What is Natural Language Search?
Natural language search means using human-like language when searching on a website. Users can type full sentences in their native language, as if they were conversing with another person, and the computer transforms that human-like query into a machine-readable search query.
A typical example looks like this.
Natural language search query: ladies shirt with price less than 1k.
This should be converted to SQL (Structured Query Language) along the following lines for the computer to process:
SELECT * FROM products WHERE sex = 'women' AND product_type = 'shirt' AND price < 1000
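The conversion in this example can be sketched with a few hand-written rules. This is only a minimal illustration, not how a production engine works: the vocabularies (`GENDER_TERMS`, `PRODUCT_TYPES`, `COMPARATORS`) and the helper names are hypothetical, and the `products` table with its `sex`, `product_type` and `price` columns is taken from the example query.

```python
import re

# Hypothetical vocabularies for this sketch; a real system would derive
# them from the product catalogue.
GENDER_TERMS = {"ladies": "women", "women": "women", "gents": "men", "men": "men"}
PRODUCT_TYPES = {"shirt", "shoes", "jeans", "dress"}
COMPARATORS = {"less than": "<", "under": "<", "below": "<",
               "more than": ">", "over": ">", "above": ">"}


def parse_amount(text):
    """Turn '1k', '1.5k' or '1000' into an integer amount."""
    text = text.lower()
    if text.endswith("k"):
        return int(float(text[:-1]) * 1000)
    return int(float(text))


def to_filters(query):
    """Extract structured filters from a free-text product query."""
    q = query.lower()
    filters = {}
    for word in re.findall(r"[a-z]+", q):
        if word in GENDER_TERMS:
            filters["sex"] = GENDER_TERMS[word]
        elif word in PRODUCT_TYPES:
            filters["product_type"] = word
    # Look for "<comparator> <amount>", for example "less than 1k".
    pattern = r"(" + "|".join(COMPARATORS) + r")\s+(\d+(?:\.\d+)?k?)"
    match = re.search(pattern, q)
    if match:
        filters["price"] = (COMPARATORS[match.group(1)], parse_amount(match.group(2)))
    return filters


def to_sql(filters):
    """Build a parameterised SQL statement and its parameter list."""
    clauses, params = [], []
    for column in ("sex", "product_type"):
        if column in filters:
            clauses.append(f"{column} = ?")
            params.append(filters[column])
    if "price" in filters:
        operator, amount = filters["price"]
        clauses.append(f"price {operator} ?")
        params.append(amount)
    sql = "SELECT * FROM products"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = to_sql(to_filters("ladies shirt with price less than 1k"))
print(sql)
print(params)
```

The sketch returns placeholders and a parameter list instead of pasting user input into the SQL string, which is the usual way to avoid SQL injection.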
Modern voice assistants such as Siri and Alexa build these capabilities on natural language processing (NLP).
How does Natural Language Processing Search Work?
Search engines such as Google and Bing have become very good at handling natural language queries. Rather than matching only the keywords of a search, NLP tries to identify its topic, using techniques such as parsing, stemming and lemmatisation.
Parsing splits sentences into their components to find their meaning and their important words.
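As a rough illustration of this first step, the sketch below splits text into sentences, breaks a sentence into tokens, and keeps the words that carry meaning. The stop-word list and function names are assumptions made for this example. Real parsers go further, for instance with part-of-speech tagging and syntax trees.

```python
import re

# A tiny, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "with", "who", "what",
              "for", "to", "in", "on", "than"}


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def content_words(sentence):
    """Keep only the tokens that are not stop words."""
    return [token for token in tokenize(sentence) if token not in STOP_WORDS]


for sentence in split_sentences("Show me shirts. Who is the president of the United States?"):
    print(sentence, "->", content_words(sentence))
```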
Stemming derives the root, or stem, of a word. The stem is not necessarily a dictionary word. Stemming is also described as reducing inflected words to their root forms.
According to Wikipedia, inflection is the process through which a word is modified to express grammatical categories such as tense, case, voice, aspect, person, number, gender, and mood. A word may therefore appear in several inflected forms, and having many inflected forms of the same word in one text adds redundancy to NLP processing.
Stemming is used to remove that redundancy.
Why is Stemming Important?
English has many variations of the same word. When developing NLP or ML/AI models, this redundancy makes the models less efficient. Stemming is widely used to collapse these variations into a single normalised root word, which makes the model more robust.
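The effect on vocabulary size can be seen with a deliberately naive suffix stripper. This is a sketch for illustration only: `naive_stem` and its suffix list are assumptions, and it is much cruder than the Porter algorithm.

```python
def naive_stem(word):
    """Strip a few common English suffixes. This is NOT the Porter algorithm."""
    word = word.lower()
    # Longer suffixes are checked first so that "ions" wins over "s".
    for suffix in ("ings", "ions", "ing", "ion", "ed", "s"):
        # Only strip when a reasonably long stem remains.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["connect", "connects", "connected", "connecting", "connection", "connections"]
raw_vocabulary = set(words)
stemmed_vocabulary = {naive_stem(word) for word in words}

# Six surface forms collapse into a single stem.
print(len(raw_vocabulary), "distinct forms ->", len(stemmed_vocabulary), "stem")
```

A model trained on the stemmed text has one entry for "connect" to learn instead of six.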
Stemmer in NLTK Example:
Martin Porter invented the Porter stemmer, or Porter algorithm, in 1980. The method reduces a word in five steps, each with its own set of mapping rules. PorterStemmer is the NLTK class that implements the Porter stemming technique.
|from nltk.stem import PorterStemmer|
|porter = PorterStemmer()|
|words = ['Connects','Connecting','Connections','Connected','Connection','Connectings','Connect']|
|for word in words:|
|    print(word, "--->", porter.stem(word))|
|# ---- Output|
|Connects ---> connect|
|Connecting ---> connect|
|Connections ---> connect|
|Connected ---> connect|
|Connection ---> connect|
|Connectings ---> connect|
|Connect ---> connect|
According to Wikipedia, lemmatisation (or lemmatization) in linguistics is the process of grouping together the inflected forms of a word so they can be analyzed as a single item, identified by the word’s lemma, or dictionary form.
In any language, words generally appear in one or more inflected forms. In linguistic morphology, inflection is a process of word formation in which a word is modified to express grammatical categories such as tense, case, voice, aspect, person, number, gender, mood, animacy, and definiteness. The inflection of verbs is called conjugation; the inflection of nouns, adjectives, adverbs, pronouns, determiners, participles, prepositions and postpositions, numerals, articles, etc. is called declension.
For example, in English the verb ‘to walk’ may appear as ‘walk’, ‘walked’, ‘walks’ or ‘walking’. The base form, ‘walk’, which one might look up in a dictionary, is called the lemma of the word.
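The idea can be sketched as a dictionary lookup. The table below is hand-written for this example and is an assumption. Real lemmatizers rely on a full dictionary and usually on part-of-speech information as well.

```python
# Hypothetical, hand-written lemma table for this sketch.
LEMMA_TABLE = {
    "walked": "walk", "walks": "walk", "walking": "walk",
    "better": "good", "best": "good",
    "went": "go", "gone": "go", "goes": "go",
    "mice": "mouse", "children": "child",
}


def lemmatize(word):
    """Return the dictionary form of a word, or the word itself if unknown."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


for word in ["walk", "walked", "walks", "walking", "went", "better"]:
    print(word, "--->", lemmatize(word))
```

Unlike suffix stripping, a lookup of this kind can map irregular forms such as ‘went’ to ‘go’, and the result is always a real dictionary word.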
Natural Language Search vs Keyword Search
Keyword search is the older approach, used by early search engines: the engine matches the literal words of the query against the words in its index.
For example, to search for the president of the United States, you would type something like “United States President”.
Over time, search companies have improved their algorithms to make searching easier. Users can type queries naturally, the way they would ask another person.
Natural language query (NLQ) version of the same query: Who is the present President of the United States of America?
Search engines can now parse and understand this query.
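The difference can be illustrated on a toy index. In this sketch, strict keyword matching requires every query word to appear in a document, while the relaxed version drops stop words and ranks documents by how many of the remaining terms they contain. The documents, the stop-word list and the function names are all assumptions. Real engines do far more, such as recognising entities and intent.

```python
import re

# Hypothetical mini-index for this sketch.
DOCUMENTS = {
    1: "The President of the United States is the head of state.",
    2: "The United States has fifty states.",
    3: "A company president reports to the board.",
}
STOP_WORDS = {"who", "what", "is", "the", "of", "a", "an"}


def tokens(text):
    """Return the set of lower-cased words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(query):
    """Strict keyword search: every query word must appear in the document."""
    terms = tokens(query)
    return [doc_id for doc_id, text in DOCUMENTS.items() if terms <= tokens(text)]


def relaxed_search(query):
    """Drop stop words, then rank documents by the number of matching terms."""
    terms = tokens(query) - STOP_WORDS
    scored = []
    for doc_id, text in DOCUMENTS.items():
        score = len(terms & tokens(text))
        if score:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


question = "Who is the present President of the United States of America"
print(keyword_search("United States President"))
print(keyword_search(question))
print(relaxed_search(question))
```

The strict search finds nothing for the full question because words such as "who" and "America" are missing from every document, while the relaxed search still ranks the most relevant document first.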
Looking for Natural Language Search For E-Commerce / Website?
Are you looking to build a natural language search engine for your website or e-commerce store? Building one yourself is generally not worth the cost, time and effort. You can instead look at plug-and-play solutions such as ExpertRec Natural Language Search.
Natural Language Search Examples:
Google, the internet’s search giant and pioneer, is one of the popular search engines that provides natural language search.
The original Ask Jeeves (now Ask.com) software was implemented by Gary Chevsky, from his own design.
START ( http://start.csail.mit.edu/index.php )
START describes itself as the world’s first web-based question answering system. Users can ask it questions about geography, science, the arts and other topics. It was built by Boris Katz and his associates of the InfoLab Group at the MIT Computer Science and Artificial Intelligence Laboratory.