
Fuzzy Query Explained


A fuzzy query is a search engine feature that lets website users find the right results even when they misspell their search terms. Since you have no control over what your users type into the search box, a fuzzy query search engine has to be robust enough to handle typos and spelling variations. Fuzzy queries usually work by computing a mathematical distance between the search query and the candidate terms, then returning the closest ones.

How do fuzzy query searches work?

Call the word that the user types A, and the candidate words in the database B1, B2, B3, and so on.


The database words are ranked by their distance from A: the word with the smallest distance is returned as result 1, the next closest as result 2, and so on. Levenshtein distance is a common way to calculate the distance between two words.
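The ranking step described above can be sketched in Python. This is a minimal illustration, not a production search engine: the function names, the `max_distance` cutoff, and the sample catalog are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only one table row in memory."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def rank_candidates(query: str, candidates: list[str], max_distance: int = 2) -> list[str]:
    """Return the candidates within max_distance of query, closest first."""
    scored = [(levenshtein(query.lower(), word.lower()), word) for word in candidates]
    return [word for distance, word in sorted(scored) if distance <= max_distance]


# Hypothetical database words standing in for B1, B2, B3, ...
catalog = ["laptop", "lamp", "tablet", "desktop"]
print(rank_candidates("labtop", catalog, max_distance=3))  # ['laptop', 'lamp']
```

A cutoff such as `max_distance` keeps unrelated words out of the results; the right value depends on how long the typical query is.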

Levenshtein Distance

The Levenshtein distance is a measure of how different two words are. It counts the minimum number of single-character changes (insertions, deletions, or substitutions) required to turn one word into the other.

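Mathematically, the Levenshtein distance between two strings $a$ and $b$ (of length $|a|$ and $|b|$ respectively) is given by $\operatorname{lev}_{a,b}(|a|, |b|)$, where

\[
\operatorname{lev}_{a,b}(i, j) =
\begin{cases}
\max(i, j) & \text{if } \min(i, j) = 0, \\
\min
\begin{cases}
\operatorname{lev}_{a,b}(i-1, j) + 1 \\
\operatorname{lev}_{a,b}(i, j-1) + 1 \\
\operatorname{lev}_{a,b}(i-1, j-1) + 1_{(a_i \neq b_j)}
\end{cases} & \text{otherwise,}
\end{cases}
\]

and $\operatorname{lev}_{a,b}(i, j)$ is the distance between the first $i$ characters of $a$ and the first $j$ characters of $b$. The indicator $1_{(a_i \neq b_j)}$ equals 0 when $a_i = b_j$ and 1 otherwise.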
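The definition translates almost line for line into code. The sketch below is a memoized recursion in which `d(i, j)` is the distance between the first i characters of a and the first j characters of b. The names `lev` and `d` are chosen for this example, and the approach is meant for short strings only, since Python limits recursion depth.

```python
from functools import lru_cache


def lev(a: str, b: str) -> int:
    """Levenshtein distance computed directly from the recursive definition."""

    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        # Base case: one prefix is empty, so the distance is the other's length.
        if min(i, j) == 0:
            return max(i, j)
        substitution_cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(d(i - 1, j) + 1,                      # delete a[i-1]
                   d(i, j - 1) + 1,                      # insert b[j-1]
                   d(i - 1, j - 1) + substitution_cost)  # substitute or match

    return d(len(a), len(b))


print(lev("kitten", "sitting"))  # 3
print(lev("flaw", "lawn"))       # 2
```

Turning "kitten" into "sitting" takes three edits: substitute k with s, substitute e with i, and append g.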

To visualize this, consider two words, S1 and S2: the Levenshtein distance is the number of single-character edits needed to obtain S2 from S1.

[Figure: Levenshtein distance between S1 and S2]

Calculate Fuzziness

If you are on a Linux operating system, calculating the Levenshtein distance is straightforward with a ready-made function. Store the two strings you want to compare in the variables w1 and w2.

The call stringdist.levenshtein(w1, w2) then returns the Levenshtein distance between the two words.


How does fuzzy search work?

Fuzzy search in Python

Here is sample code that performs a fuzzy search in Python:

import numpy as np
def levenshtein_ratio_and_distance(s, t, ratio_calc = False):
    """ levenshtein_ratio_and_distance:
        Calculates levenshtein distance between two strings.
        If ratio_calc = True, the function computes the
        levenshtein distance ratio of similarity between two strings
        For all i and j, distance[i,j] will contain the Levenshtein
        distance between the first i characters of s and the
        first j characters of t
    """
    # Initialize matrix of zeros
    rows = len(s)+1
    cols = len(t)+1
    distance = np.zeros((rows,cols),dtype = int)

    # Populate the first column and first row with the indices of each character of both strings
    for i in range(1, rows):
        distance[i][0] = i
    for k in range(1, cols):
        distance[0][k] = k

    # Iterate over the matrix to compute the cost of deletions,insertions and/or substitutions    
    for col in range(1, cols):
        for row in range(1, rows):
            if s[row-1] == t[col-1]:
                cost = 0 # If the characters are the same in the two strings in a given position [i,j] then the cost is 0
            else:
                # In order to align the results with those of the Python Levenshtein package, if we choose to calculate the ratio
                # the cost of a substitution is 2. If we calculate just distance, then the cost of a substitution is 1.
                if ratio_calc:
                    cost = 2
                else:
                    cost = 1
            distance[row][col] = min(distance[row-1][col] + 1,      # Cost of deletions
                                 distance[row][col-1] + 1,          # Cost of insertions
                                 distance[row-1][col-1] + cost)     # Cost of substitutions
    # The full-string distance sits in the bottom-right cell of the matrix
    # (indexing by size also works when one of the strings is empty)
    edits = distance[rows-1][cols-1]
    if ratio_calc:
        # Computation of the Levenshtein Distance Ratio
        total = len(s) + len(t)
        Ratio = (total - edits) / total if total else 1.0  # two empty strings are identical
        return Ratio
    else:
        # print(distance) # Uncomment if you want to see the matrix showing how the algorithm computes the cost of deletions,
        # insertions and/or substitutions
        # This is the minimum number of edits needed to convert string s to string t
        return "The strings are {} edits away".format(edits)

Source: https://www.datacamp.com/community/tutorials/fuzzy-string-python
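To make the ratio mode concrete, here is a compact, NumPy-free sketch of the same ratio formula, (len(s) + len(t) - distance) / (len(s) + len(t)), with a substitution counted as 2 edits. The function name `levenshtein_ratio` and the handling of two empty strings are assumptions made for this example.

```python
def levenshtein_ratio(s: str, t: str) -> float:
    """Similarity in [0, 1]; substitutions cost 2, insertions and deletions cost 1."""
    total = len(s) + len(t)
    if total == 0:
        return 1.0  # assumption: two empty strings count as identical
    previous = list(range(len(t) + 1))  # distance from "" to each prefix of t
    for i, char_s in enumerate(s, start=1):
        current = [i]
        for j, char_t in enumerate(t, start=1):
            cost = 0 if char_s == char_t else 2
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return (total - previous[-1]) / total


print(round(levenshtein_ratio("kitten", "sitting"), 3))  # 0.615
print(levenshtein_ratio("abc", "abc"))                   # 1.0
```

With a substitution cost of 2, "kitten" and "sitting" are 5 edits apart, so the ratio is (13 - 5) / 13, or about 0.615. A ratio of 1.0 means the strings are identical and 0.0 means they share no characters in order.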

Fuzzy search in JavaScript

https://github.com/bevacqua/fuzzysearch

Tiny and blazing-fast fuzzy search in JavaScript

Fuzzy searching allows for flexibly matching a string with partial input, useful for filtering data very quickly based on lightweight user input.
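Matching a string against partial input is a different technique from edit distance. A common way to do it is an in-order subsequence check: every character the user typed must appear in the target, in the same order, with any number of characters in between. The Python sketch below illustrates that idea. It assumes this is the matching rule in play, and the name `fuzzy_match` is invented for this example rather than taken from the JavaScript library.

```python
def fuzzy_match(needle: str, haystack: str) -> bool:
    """Return True if every character of needle appears in haystack, in order."""
    remaining = iter(haystack)
    # `char in remaining` consumes the iterator up to the first match, so each
    # later character has to be found further along in haystack.
    return all(char in remaining for char in needle)


print(fuzzy_match("twl", "cartwheel"))   # True
print(fuzzy_match("eeel", "cartwheel"))  # False
```

This check runs in a single pass over the target string, which is why it suits filtering lists as the user types. It does not tolerate wrong characters, only missing ones.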

Demo

To see fuzzysearch in action, head over to bevacqua.github.io/horsey, which is a demo of an autocomplete component that uses fuzzysearch to filter out results based on user input.

Fuzzy search with no coding

  • Go to Fuzzy query search engine creator.
  • Enter your website URL.
  • If you have a sitemap, enter the URL.
  • Once the crawl completes, add the code to your website and take it live.
  • Expertrec is a paid fuzzy search engine for websites that costs $9 per month.

 
