I am making a SharePoint 2010 web part that basically manages a document library of PDF documents.
One component that I am working on is searching through the PDF documents. I tried a third-party PDF library (iTextSharp) to extract the text from each document, and then ran an "index_of" search on the extracted text. However, this had some bad side effects:
- requests could hit time_out errors, because processing all of the documents takes too long
- the extracted text was inconsistent, with messed-up characters and white space
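For illustration only, the search step amounts to something like the sketch below. It is Python rather than the web part's C#, it assumes the text has already been extracted from each PDF, and `find_matches` and the sample strings are hypothetical. The second sample mimics the garbled white space, which is enough to make a plain substring search miss the document.

```python
def find_matches(extracted_text_by_doc, term):
    """Return the names of documents whose extracted text contains term.

    This is a plain case-insensitive substring search (the "index_of" idea),
    so it scans the full text of every document on every query.
    """
    needle = term.lower()
    return [
        name
        for name, text in extracted_text_by_doc.items()
        if text.lower().find(needle) != -1
    ]


# Hypothetical extracted text; the second entry has broken white space.
docs = {
    "a.pdf": "Annual budget report for 2010",
    "b.pdf": "Annual  bud get\nreport for 2010",
}

# b.pdf is missed because "budget report" is split by stray white space.
print(find_matches(docs, "budget report"))
```

Every query rescans all of the text, which is also why this gets slow as the library grows.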
My SharePoint 2010 installation has an IFilter for PDF documents, so it can already index PDF files and search them through the SharePoint search box. Is there a programmatic way to utilize that in my web part? I want to be able to send a query to it and receive the results as data, maybe in XML or a similar format, so that I can build my own custom search results page.
Is this possible?
Thanks.