In this article, we will look at what indexing is and why it matters in a search engine.

Every search engine has three main components:

  1. Crawling.
  2. Indexing.
  3. Query processing.

What is indexing / an index?

Let us imagine you have a website that sells laptops online. One of your tasks is to create a search engine that can search through your inventory of laptops. We will also assume you have a list of laptops with their names, prices, etc. in a CSV file.

Before you can search through this data, you have to create a search engine index. An index lets the search engine return results faster. Without one, the search engine has to check every product one by one for each query, which takes longer and longer as the inventory grows.
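
To make that cost concrete, here is a minimal Python sketch of searching without an index. The `laptops` list and the `linear_search` function are made up for illustration and do not come from any real inventory or library.

```python
# Hypothetical inventory; in practice this would come from the CSV file.
laptops = [
    {"id": 1, "name": "Dell Inspiron 15", "price": 550},
    {"id": 2, "name": "Lenovo ThinkPad X1", "price": 1400},
    {"id": 3, "name": "Dell XPS 13", "price": 1100},
]


def linear_search(products, term):
    """Return ids of products whose name contains the term as a whole word."""
    term = term.lower()
    matches = []
    for product in products:  # every product is checked, for every query
        if term in product["name"].lower().split():
            matches.append(product["id"])
    return matches


print(linear_search(laptops, "dell"))  # [1, 3]
```

With three products this is instant, but the loop runs once per product on every query, so the work grows with the size of the inventory.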

This is similar to an index that you would see at the end of a book that helps you find content faster.

What is an inverted index?

In an inverted index, each indexed term points to a list of the documents that contain that term. In the laptop store, for example, the term "dell" would point to every laptop whose name contains "dell".
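
A minimal Python sketch of this structure, assuming a tiny made-up inventory and a simple whitespace tokenizer (real search engines apply far more elaborate text analysis):

```python
from collections import defaultdict

# Hypothetical documents: document id -> laptop name.
documents = {
    1: "Dell Inspiron 15",
    2: "Lenovo ThinkPad X1",
    3: "Dell XPS 13",
}


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


inverted_index = build_inverted_index(documents)
print(inverted_index["dell"])    # [1, 3]
print(inverted_index["lenovo"])  # [2]
```

Answering a query for "dell" is now a single dictionary lookup rather than a pass over every product.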

Compare this with a regular book index. Can you see the similarity?

How does the indexer get the data?

  1. XML or JSON feeds
  2. CSV files
  3. Web crawl
  4. RSS or Atom feeds
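
As a sketch of the CSV case, the snippet below reads laptop records with Python's standard `csv` module. The column names (`id`, `name`, `price`) and the sample rows are assumptions for illustration, and an in-memory string stands in for the real file.

```python
import csv
import io

# Stand-in for the contents of a hypothetical laptops.csv file.
csv_text = """id,name,price
1,Dell Inspiron 15,550
2,Lenovo ThinkPad X1,1400
"""


def read_documents(csv_file):
    """Turn each CSV row into a dict that an indexer could consume."""
    reader = csv.DictReader(csv_file)
    return [
        {"id": int(row["id"]), "name": row["name"], "price": float(row["price"])}
        for row in reader
    ]


docs = read_documents(io.StringIO(csv_text))
print(docs[0]["name"])  # Dell Inspiron 15
```

For a file on disk, the same function could be called with `open("laptops.csv", newline="")` in place of the `io.StringIO` object.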

How to index data?

The following open source tools let you create an index for free (you will need coding knowledge):

  1. Apache Solr
  2. Elasticsearch
  3. Sphinx

Expertrec is a paid solution that takes care of indexing once you upload a document in any of the formats mentioned above (no coding required).

How to increase indexing speed?

  1. Reduce the number of fields to be indexed.
  2. Use SSDs.
  3. Increase the RAM of the machines that do the indexing.
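
The first tip can be sketched in Python as follows. The field names and the `INDEXED_FIELDS` setting are hypothetical; engines such as Solr and Elasticsearch express the same idea through their own schema or mapping configuration.

```python
from collections import defaultdict

# Hypothetical setting: only these fields are tokenized and indexed.
INDEXED_FIELDS = ("name",)

laptops = [
    {"id": 1, "name": "Dell Inspiron 15", "description": "Budget laptop with HDD"},
    {"id": 2, "name": "Lenovo ThinkPad X1", "description": "Business laptop with SSD"},
]


def build_index(products, fields):
    """Index only the chosen fields; all other fields are skipped."""
    index = defaultdict(set)
    for product in products:
        for field in fields:
            for term in product[field].lower().split():
                index[term].add(product["id"])
    return index


index = build_index(laptops, INDEXED_FIELDS)
print(len(index))  # 6
```

Indexing the `description` field as well would double the number of terms in this small example, which is the extra work the tip avoids.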

Muthali Ganesh

Muthali loves writing about emerging technologies and easy solutions for complex tech issues.