Warning: This Extension requires the use of exec() and also requires you to install Xpdf yourself (by uploading a file to a non-public location).
SearchWP offers the unique feature of extracting plain text from PDF files uploaded to your WordPress website. Out of the box, SearchWP attempts to do this using only PHP, but because the PDF format is complex and varies widely, content is sometimes not extracted accurately. Enter Xpdf.
Xpdf is a command line utility that must be installed on your server in order for this Extension to work. Installation is simple, and instructions are included.
Using the Xpdf Integration Extension, you can offload the work PHP does when processing your PDF files to Xpdf, which is fast and accurate at extracting content from PDFs. After activating the Extension, follow the installation instructions below. Once Xpdf is installed, SearchWP hands the PDF content extraction process over to it.
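To get a feel for what Xpdf extracts, you can run pdftotext by hand once it is installed. The following is a minimal sketch, not a description of how the Extension calls the tool internally. The function name extract_pdf_text and both paths are placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch: run pdftotext manually to preview the text it extracts from a PDF.
# extract_pdf_text is a hypothetical helper; both arguments are placeholders.
extract_pdf_text() {
    pdftotext_bin="$1"   # full path to the pdftotext binary
    pdf_file="$2"        # PDF file to read

    if [ ! -x "$pdftotext_bin" ]; then
        echo "not executable: $pdftotext_bin" >&2
        return 1
    fi

    # Passing "-" as the text-file argument sends the output to stdout.
    "$pdftotext_bin" "$pdf_file" -
}

# Usage (placeholder paths):
# extract_pdf_text /path/to/pdftotext sample.pdf | head
```

If the output looks right on the command line, a failed extraction in WordPress is more likely a path or permissions problem than a problem with the PDF itself.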
Installing Xpdf tools
Using this extension you can utilize Xpdf to extract the content from your PDFs.
IMPORTANT: Xpdf is not provided in this download. You must download Xpdf and upload it to a non-public location (outside your Web root).
Xpdf offers binary distributions of the Xpdf tools for both Windows and Linux.
- Download the archive for your platform, e.g. xpdf-tools-linux-4.00.tar.gz (the version number may be different)
- Upload the pdftotext binary (found in either the bin32 or bin64 directory after extracting) to a non-public location, outside your Web root
- Ensure you have set the proper permissions on the file so that it can be executed
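If you have shell access to the server, the steps above can be sketched as a short script instead of a manual upload. This is only an illustration under a few assumptions: the helper name install_pdftotext is hypothetical, the 64-bit binary is the one you want, and mode 755 is one common choice for an executable, not a requirement stated by SearchWP. Adjust the paths and permissions to your host.

```shell
#!/bin/sh
# Sketch: unpack the Xpdf tools archive and place pdftotext outside the Web root.
# install_pdftotext is a hypothetical helper; its arguments are placeholders.
install_pdftotext() {
    archive="$1"   # e.g. xpdf-tools-linux-4.00.tar.gz
    dest="$2"      # a directory outside the Web root

    workdir=$(mktemp -d) || return 1
    tar -xzf "$archive" -C "$workdir" || { rm -rf "$workdir"; return 1; }

    # Locate the 64-bit pdftotext binary inside the extracted tree.
    binary=$(find "$workdir" -type f -path '*/bin64/pdftotext' | head -n 1)
    if [ -z "$binary" ]; then
        echo "pdftotext not found in $archive" >&2
        rm -rf "$workdir"
        return 1
    fi

    mkdir -p "$dest" || { rm -rf "$workdir"; return 1; }
    cp "$binary" "$dest/pdftotext" || { rm -rf "$workdir"; return 1; }

    # Assumption: 755 lets the owner modify the file and everyone execute it.
    chmod 755 "$dest/pdftotext"
    rm -rf "$workdir"
}

# Usage (placeholder paths):
# install_pdftotext xpdf-tools-linux-4.00.tar.gz /path/to
```

Whatever permissions you choose, the user that PHP runs as must be able to execute the binary, since the Extension relies on exec().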
The last step is to tell SearchWP Xpdf Integration where you installed Xpdf. Add the following to your theme’s functions.php, replacing /path/to/pdftotext with the actual path to the pdftotext binary (not the folder) on your server.