2009 Aug 28
OT: .doc,.xls,.pdf,.ppt (etc.) string parser/indexers
Does anyone have experience with Linux tools that extract the text from
common non-text file formats for searching? I'm trying to use the
KinoSearch add-on for TWiki, which is fine as far as the search goes, but
it takes forever to generate the index. It uses xpdf to extract strings
from PDFs, antiword for .doc, and, since it is written in Perl, the
Spreadsheet::ParseExcel module for .xls. Some