In this article, we will see how to make the contents of a PDF searchable from the command line.

We will take a sample PDF, dc-best-practices-google.pdf, extract its text, and search that text. The download URL is given in step 2.

Here are the steps:

  1. Open a terminal in Linux.
  2. Use the wget command to download the file. By default, wget saves it in the current directory as dc-best-practices-google.pdf.
    wget "https://static.googleusercontent.com/media/www.google.com/en//corporate/datacenter/dc-best-practices-google.pdf"
    

     

  3. Use the pdftotext command to convert the file to text. With no output name given, it writes dc-best-practices-google.txt next to the PDF.
    pdftotext dc-best-practices-google.pdf 
    

     

  4. Open the resulting file, dc-best-practices-google.txt, in any editor to check the extracted text.
    vim dc-best-practices-google.txt

     

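  5. Use the grep command to search for the phrase "Green Data Center". The -F option treats the pattern as a fixed string, and -C2 prints two lines of context around each match.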
    grep -F -C2 "Green Data Center" dc-best-practices-google.txt

     

  6. grep prints each line that contains the phrase, along with two lines of context before and after it, which confirms that the PDF's text is now searchable.
  7. To create your PDF search engine, use this link.
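
The conversion and search steps above can be wrapped in a small script so they are easy to repeat for other files and phrases. This is only a minimal sketch, assuming bash and that pdftotext (from poppler-utils) and grep are installed. The function names pdf_to_text and search_text are hypothetical helpers made up for this example, and the -i (ignore case) and -n (show line numbers) options to grep are additions that the steps above do not use.

```shell
#!/usr/bin/env bash
# Sketch only: pdf_to_text and search_text are hypothetical helper names,
# not part of wget, pdftotext, or grep.

# Convert a PDF to text with pdftotext and print the path of the text file.
# The conversion is skipped when a .txt file already exists and the PDF is
# not newer than it.
pdf_to_text() {
    local pdf="$1"
    local txt="${pdf%.pdf}.txt"
    if [ ! -f "$txt" ] || [ "$pdf" -nt "$txt" ]; then
        pdftotext "$pdf" "$txt" || return 1
    fi
    printf '%s\n' "$txt"
}

# Search a text file for a fixed phrase.
#   -F  treat the phrase as a literal string, not a regular expression
#   -i  ignore case (an addition to the command used in step 5)
#   -n  prefix each output line with its line number (also an addition)
#   -C2 print two lines of context around each match
search_text() {
    local txt="$1"
    local phrase="$2"
    grep -F -i -n -C2 -- "$phrase" "$txt"
}
```

After loading these functions into the shell (for example with source), the search from step 5 could be run as:

    txt=$(pdf_to_text dc-best-practices-google.pdf) && search_text "$txt" "green data center"

Like grep itself, search_text returns a non-zero exit status when the phrase is not found, so it can be used in shell conditions.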

Muthali Ganesh

Muthali loves writing about emerging technologies and easy solutions for complex tech issues.