Displaying 2 results from an estimated 2 matches for "pdfcontentextractor".
2006 Jul 25
1
RDig document processing error
...arding old parse
lib/ferret/query_parser/query_parser.y:216: warning: method redefined;
discarding old clean_string
/usr/lib/ruby/gems/1.8/gems/rubyful_soup-1.0.4/lib/rubyful_soup.rb:230:
warning: method redefined; discarding old attrs
discovered content extractor class:
RDig::ContentExtractors::PdfContentExtractor
discovered content extractor class:
RDig::ContentExtractors::WordContentExtractor
discovered content extractor class:
RDig::ContentExtractors::HtmlContentExtractor
using Ferret 0.9.0
/usr/local/lib/site_ruby/1.8/rdig/url_filters.rb:116: warning: instance
variable @patterns not initialized
/usr/l...
2006 Jul 14
2
RDig config file problem
Hi All,
Hope it is ok to post RDig queries on this forum.
Just trying to get RDig working (Ubuntu 6.06, RDig 0.3.0, ferret 0.9.4,
rubyful_soup 1.0.4)
Here is my output:
sh:~/rdigtry$ rdig -c config/rdig_config.rb
discovered content extractor class:
RDig::ContentExtractors::PdfContentExtractor
discovered content extractor class:
RDig::ContentExtractors::WordContentExtractor
discovered content extractor class:
RDig::ContentExtractors::HtmlContentExtractor
/home/steven/rdigtry/config/rdig_config.rb:4
/usr/lib/ruby/gems/1.8/gems/rdig-0.3.0/lib/rdig.rb:113:in
`configuration'
/hom...