Displaying 3 results from an estimated 3 matches for "title_tag_selector".
2007 Jan 23
3
Someone getting RDig work for Linux?
...ml parser used by default from RDig 0.3.3 upwards.
# Hpricot by far outperforms Rubyful Soup, and is at least as flexible when
# it comes to selection of portions of the html documents.
:hpricot => OpenStruct.new(
# css selector for the element containing the page title
:title_tag_selector => 'title',
# might also be a proc returning either an element or a string:
# :title_tag_selector => lambda { |hpricot_doc| ... }
:content_tag_selector => 'body'
# might also be a proc returning either an element or a string:...
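The comments above say each selector may be either a CSS selector string or a proc that receives the parsed document. A minimal sketch of how such a setting could be resolved follows. It is an illustration only, not RDig's actual code: `FakeDoc` and `extract_with` are hypothetical names, and `FakeDoc` stands in for a real Hpricot document so the example runs with the standard library alone.

```ruby
require 'ostruct'

# Hypothetical stand-in for a parsed HTML document. A real Hpricot
# document would be searched with a CSS selector instead of a hash lookup.
class FakeDoc
  def initialize(elements)
    @elements = elements
  end

  # Return the "element" registered under the given selector, or nil.
  def at(selector)
    @elements[selector]
  end
end

# Hypothetical helper (an assumption, not part of RDig): accept either a
# selector string or a proc taking the document, and return stripped text.
def extract_with(selector, doc)
  result = selector.respond_to?(:call) ? selector.call(doc) : doc.at(selector)
  result.to_s.strip
end

# Same shape as the configuration quoted above: one plain string selector
# and one proc, to show both forms side by side.
config = OpenStruct.new(
  :hpricot => OpenStruct.new(
    :title_tag_selector   => 'title',
    :content_tag_selector => lambda { |doc| doc.at('body') }
  )
)

doc = FakeDoc.new({ 'title' => ' Home ', 'body' => 'Welcome to the site.' })

title   = extract_with(config.hpricot.title_tag_selector, doc)
content = extract_with(config.hpricot.content_tag_selector, doc)

puts title
puts content
```

Checking `respond_to?(:call)` rather than the class lets a lambda, a `Proc`, or a method object all work as the selector.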
2007 Sep 27
2
Problem getting "extract" from RDig
...ler.start_urls = [ 'http://localhost:3000/login/index' ]
cfg.index.path = "C:/rails/managedsupport/index/development/rdig-index"
cfg.verbose = true
cfg.content_extraction = OpenStruct.new(
  :hpricot => OpenStruct.new(
    :title_tag_selector => 'title',
    :content_tag_selector => 'body'
  )
)
end
I created the index with this command:
rdig -c config/rdig_config.rb
Now, in my controller, I have written some code to test the functionality:
1....
2007 Sep 18
4
basic rdig setup
...rawler.start_urls = [ 'http://domain.tpl/' ]
cfg.crawler.include_hosts = [ 'domain.tpl/' ]
cfg.index.path = './rdig_index'
cfg.verbose = true
cfg.content_extraction = OpenStruct.new(
  :hpricot => OpenStruct.new(
    :title_tag_selector => 'title',
    :content_tag_selector => 'body'
  )
Both environment.rb files have:
require 'acts_as_ferret'
require 'rdig'
require 'rdig_config'
Finally, both have rdig and hpricot gems installed....