search for: pdfcontentextractor

Displaying 2 results from an estimated 2 matches for "pdfcontentextractor".

2006 Jul 25
1
RDig document processing error
...arding old parse
lib/ferret/query_parser/query_parser.y:216: warning: method redefined; discarding old clean_string
/usr/lib/ruby/gems/1.8/gems/rubyful_soup-1.0.4/lib/rubyful_soup.rb:230: warning: method redefined; discarding old attrs
discovered content extractor class: RDig::ContentExtractors::PdfContentExtractor
discovered content extractor class: RDig::ContentExtractors::WordContentExtractor
discovered content extractor class: RDig::ContentExtractors::HtmlContentExtractor
using Ferret 0.9.0
/usr/local/lib/site_ruby/1.8/rdig/url_filters.rb:116: warning: instance variable @patterns not initialized
/usr/l...
2006 Jul 14
2
RDig config file problem
Hi all, I hope it is OK to post RDig queries on this forum. I'm just trying to get RDig working (Ubuntu 6.06, RDig 0.3.0, ferret 0.9.4, rubyful_soup 1.0.4). Here is my output:
sh:~/rdigtry$ rdig -c config/rdig_config.rb
discovered content extractor class: RDig::ContentExtractors::PdfContentExtractor
discovered content extractor class: RDig::ContentExtractors::WordContentExtractor
discovered content extractor class: RDig::ContentExtractors::HtmlContentExtractor
/home/steven/rdigtry/config/rdig_config.rb:4
/usr/lib/ruby/gems/1.8/gems/rdig-0.3.0/lib/rdig.rb:113:in `configuration'
/hom...