search for: start_urls

Displaying 6 results from an estimated 6 matches for "start_urls".

2007 Jan 21
4
could not install in WinXP
Directory of C:\search_app 01/21/2007 19:37 <DIR> . 01/21/2007 19:37 <DIR> .. 01/21/2007 19:36 427 008 ferret-0.10.13.gem 01/21/2007 19:07 148 992 rdig-0.3.4.gem 2 File(s) 576 000 bytes 2 Dir(s) 45 135 982 592 bytes free C:\search_app>gem install ferret Building native extensions. This could
2006 Jul 14
2
RDig config file problem
...dig-0.3.0/lib/rdig.rb:236:in `run': No Configfile found! (RuntimeError) undefined method `path=' for nil:NilClass from /usr/lib/ruby/gems/1.8/gems/rdig-0.3.0/bin/rdig:13 from /usr/bin/rdig:18 and here is my config file: RDig.configuration do |cfg| cfg.crawler.start_urls = [ 'http://bbc.co.uk' ] cfg.indexer.path = '/home/steven/rdigtry/index' cfg.verbose = true end Seems as though the RDig script can't load my config file? Any advice very gratefully received. Many Thanks, Steven -- Posted via http://www.ruby-foru...
2007 Jan 23
3
Someone getting RDig work for Linux?
...configfile I changed from config to cfg, because of maybe mistyping cfg.index.create = false RDig.configuration do |cfg| ################################################################## # options you really should set # provide one or more URLs for the crawler to start from cfg.crawler.start_urls = [ 'http://www.example.com/' ] # use something like this for crawling a file system: cfg.crawler.start_urls = [ 'file:///home/myaccount/documents/' ] # beware, mixing file and http crawling is not possible and might result in # unpredictable results....
2006 Jul 25
1
RDig document processing error
Hi all, Am having problems using RDig: With this rdig config... cfg.crawler.start_urls = ['http://www.defensetech.org'] cfg.crawler.include_hosts = ['www.defensetech.org'] cfg.index.path = '/my/path/to/index' cfg.verbose = true ...I get this output: $ rdig -c config/rdig_config.rb /usr/local/lib/site_ruby/1.8/ferret/index/term.r...
2007 Sep 27
2
Problem getting "extract" from RDig
...ble to figure out what is wrong with the code. I had the following lines in my /config/environment.rb 1. require 'rdig' 2. require 'rdig_config' I have the following code in my /config/rdig_config.rb 1. RDig.configuration do |cfg| 2. cfg.crawler.start_urls = [ 'http://localhost:3000/login/index' ] 3. cfg.index.path = "C:/rails/managedsupport/index/development/rdig-index" 4. cfg.verbose = true 5. cfg.content_extraction = OpenStruct.new( 6. :hpricot => OpenStruct.new( 7. :title_tag_sele...
2007 Sep 18
4
basic rdig setup
...le On both machines I have run the indexer with no errors using: rdig -c config/rdig_config.rb Both machines have an index dir at the rails root that has two files, segments and segments_0. Both files look like they have next to nothing in them. Both rdig_config.rb files look like: cfg.crawler.start_urls = [ 'http://domain.tpl/' ] cfg.crawler.include_hosts = [ 'domain.tpl/' ] cfg.index.path = './rdig_index' cfg.verbose = true cfg.content_extraction = OpenStruct.new( :hpricot => OpenStruct.new( :title_tag_selecto...