Elasticsearch fuzzy match is a process that locates pages that are likely to be relevant to a search argument even when the argument does not exactly correspond to the desired information. A fuzzy search is done by means of a fuzzy matching program, which returns a list of results based on likely relevance even though search argument words and spellings may not exactly match. Exact and highly relevant matches appear near the top of the list. Subjective relevance ratings, usually as percentages, may be given.
Fuzzy query
Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.
An edit distance is the number of one-character changes needed to turn one term into another. These changes can include:
Changing a character (box → fox)
Removing a character (black → lack)
Inserting a character (sic → sick)
Transposing two adjacent characters (act → cat)
To find similar terms, the fuzzy query creates a set of all possible variations, or expansions, of the search term within a specified edit distance. The query then returns exact matches for each expansion.
A Simple Elasticsearch Fuzzy Match
Fuzzy queries can most easily be performed through additional arguments to the match query type, as seen in the example below in this paragraph. In the example, the final request, a search for ‘vaacuum’, which has an extra A, should still turn up the ‘Vacuum’ product. The fuzziness argument specifies that the results match with a maximum edit distance of 2. It should be noted that fuzziness should only be used with values of 1 and 2, meaning a maximum of 2 edits between the query and a term in a document is allowed. Larger differences are far more expensive to compute efficiently and are not processed by Lucene. The official documentation still refers to setting fuzziness to float values, like 0.5, but these values are in-fact deprecated and are harder to reason about.
Elasticsearch fuzzy match can be configured to provide fuzziness by mixing its built-in edit-distance matching and phonetic analysis with more generic analyzers and filters. However, this approach requires a complex query against multiple fields, and recall is completely determined by Lucene edit distance and Soundex/metaphone (phonetic similarity).
Fuzzy search alternative
Elasticsearch fuzzy match is very difficult to set up. If you are looking for a turnkey solution then check out ExpertRec.
This is one of the easiest setup processes of all the available options out there and is highly recommended.
- Navigate to https://cse.expertrec.com/newuser?platform=cse and signup with your Google ID.
- Enter your website’s URL when prompted. You can select a server location near you and add the URL of your sitemap if you wish to. These will be auto-detected otherwise.
- You can play around with the settings and customize the UI as the crawl runs. Once it is complete, you can check out a fully functional demo before taking the search to your website.
- You can take the search to your website with little to no effort. All you need to do is to paste the code snippet available on the dashboard on your website.
ExpertRec comes with more customization options that you can explore. You can read this article to find a more detailed guide on the installation and configuration.