similar to: adding one url to rdig index?

2006 Jul 25
1
RDig document processing error
Hi all, Am having problems using RDig: With this rdig config... cfg.crawler.start_urls = ['http://www.defensetech.org'] cfg.crawler.include_hosts = ['www.defensetech.org'] cfg.index.path = '/my/path/to/index' cfg.verbose = true ...I get this output: $ rdig -c config/rdig_config.rb /usr/local/lib/site_ruby/1.8/ferret/index/term.rb:45:
2006 Mar 25
1
RDig - ferret-based website crawler/indexer
Hi! RDig is a small tool to build a Ferret index for the contents of a website or intranet. It contains a simple HTTP crawler and some support for extracting textual content from the fetched pages. I built this to implement a site-wide search for a recent project that combined a Rails application with lots of static html files generated by a CMS. Any feedback is very welcome! Rubyforge
2006 Jul 14
2
RDig config file problem
Hi All, Hope it is ok to post RDig queries on this forum. Just trying to get RDig working (Ubuntu 6.06, RDig 0.3.0, ferret 0.9.4, rubyful_soup 1.0.4) Here is my output: sh:~/rdigtry$ rdig -c config/rdig_config.rb discovered content extractor class: RDig::ContentExtractors::PdfContentExtractor discovered content extractor class: RDig::ContentExtractors::WordContentExtractor discovered
2007 Jan 23
3
Someone getting RDig work for Linux?
I got this root@linux:~# rdig -c configfile RDig version 0.3.4 using Ferret 0.10.14 added url file:///home/myaccount/documents/ waiting for threads to finish... root@linux:~# rdig -c configfile -q "Ruby" RDig version 0.3.4 using Ferret 0.10.14 executing query >Ruby< Query: total results: 0 root@linux:~# my configfile I changed from config to cfg, because of maybe
2007 Jul 29
7
RDig and AAF playing together
I have a site with two indexes. Index A is created offline by RDig and queried from the web via RDig (specifically, RDig.searcher.search). Index B is managed by AAF with :remote => true. Simple enough. However, I need to query both indexes from RDig. Usually this is ok, as I modified RDig to accept an array of search_paths with an element for index A and index B. However, when Index
2007 Jan 21
4
could not install in WinXP
Directory of C:\search_app 01/21/2007 19:37 <DIR> . 01/21/2007 19:37 <DIR> .. 01/21/2007 19:36 427 008 ferret-0.10.13.gem 01/21/2007 19:07 148 992 rdig-0.3.4.gem 2 File(s) 576 000 bytes 2 Dir(s) 45 135 982 592 bytes free C:\search_app>gem install ferret Building native extensions. This could
2007 Feb 10
5
Adding extra fields to an index (using RDig?)
Hello everyone, I am writing an application which collects a set of web sites and caches them locally for offline viewing. I want to do searches on this collection and associate extra data with each result (e.g. date collected, reason for collection, perhaps a sequence number). Now all this data exists when the harvesting is done and could be stored in a database. I want to use RDig to index my
2007 Sep 18
4
basic rdig setup
I'm developing locally on Windows and I have a remote dev box that runs Linux. I'm trying to use RDig just to index using urls, no files. Both use acts_as_ferret for an administrative search that works fine. On the Windows machine, I get no errors, but get no results. On the Linux machine, I get: File Not Found Error occured at <except.c>:93 in xraise Error occured in
2007 Sep 27
2
Problem getting "extract" from RDig
Hi All, I need a site-wide search for my current application. By search I mean I have to search the static and the dynamic contents from the database. I have been searching on this for a while on the net and RDig seems to be an apt solution. While using it I have encountered a few problems. I know these might be very basic issues but I have not been able to figure out what is wrong with
2007 Jun 23
2
End of File Error on index optmize
I was optimizing a 650MB index using ferret (0.11.3) and I received the following error. I've seen some people have similar issues but I haven't seen any resolutions. The contents of the index directory follow the error. Has anyone seen anything like this and found a resolution? Many thanks. /mnt/apps/search/releases/20070622175637/script/../config/../vendor/
2006 May 22
7
how to index the result of any instance method
Hi, One of the AAF features is to be able to index results of methods, but I haven't seen anywhere how to do this. I have a method that returns the full text of a file and I'd like for this to be indexed. Can anyone out there help me out on this one? Tom -- Posted via http://www.ruby-forum.com/.
2007 Jun 24
1
Example for using ferret search engine
Hi, Is there any application where I can see the Ferret engine in use (like an example implementation)? I have some difficulties in using it, sending a query and getting the results. Thank you, Raj. -- Posted via http://www.ruby-forum.com/.
2006 Nov 17
4
acts_as_ferret and searching word docs
I was wondering if it is possible to search word documents using ferret. The actual text in a word document isn't in a binary format - only the formatting. Surely it would be possible to parse that? -- Posted via http://www.ruby-forum.com/.
2006 Dec 15
1
acts_as_ferret: reindexing it too slow
Hi, Recently, I was trying to play around with AAF and found that reindexing a table is very slow. Then I started looking into Ferret performance, tried it myself, and found that it's very fast. Then I just used Ferret to index my table and it was also very fast. All good. So why is reindexing using AAF slow? After some time I found that in the AAF, it uses (:key => :id) in
2007 Feb 15
3
Proximity searching in rdig ferret
Lucene has a syntax "foo bar"~10 for finding foo within 10 words of bar. Does ferret support this feature? (the ~ is used for fuzzy queries) Does rdig? This could be a deal breaker for me 'cos I really need proximity searches -- Posted via http://www.ruby-forum.com/.
2007 Jun 07
5
Advise on slowness in bootstrapping?
I am looking at trying to use ferret/aaf to supplement my querying against a medium and large table with lots of columns. Some facts first: Ferret 0.11.4 AAF 0.4.0 Ruby 1.8.6 Rails 1.2.3 Medium table: 105,464 rows 168 columns (mostly varchar(20)) 11 actual columns indexed in aaf plus 40 virtual columns indexed in aaf (virtual is concat of two physical columns. e.g. cast_first_name_1 +
2006 Aug 25
7
disabling automatic indexing in acts_as_ferret
I'd like to be able to enable/disable the automatic indexing of documents acts_as_ferret does. Something like MyModel.disable_indexing MyModel.enable_indexing would be perfect. I need this because I do some indexing that requires visiting the parents of the model objects and my import method imports the children first, so the information isn't there yet. I'd like to
2007 Jun 12
5
index browser inconsistent with IndexReader
Hi, We have an index of around 1M web pages as part of our web app. The app uses ferret by way of RDig to perform searches. We have noticed anecdotally that some searches don't work the way we thought they should, as if documents were missing from the index. Yesterday we came upon a concrete instance of this. Our documents have several fields, one of which is called :keywords and
2007 Apr 06
3
Double work at Model.rebuild_index
I'm noticing that every time I run Model.rebuild_index it runs the rebuild twice. Also, in ferret_index.log there is only one small difference between the first and second time, see: First time it shows: rebuild index: [] reindexing model User After it finishes, it automatically starts the second time and shows: rebuild index: [["User"]] reindexing model User The full
2007 Feb 15
0
rdig wildcard searches
Lucene has simple wildcard syntax supporting ? and * thus ruby could be matched by rub? r*by etc. This doesn't work using rdig on the command line e.g. rdig -c config.rb -q 'data:"ru?y"' gives RDig version 0.3.4 using Ferret 0.10.14 executing query >data:"ru?y"< Query: data:"ru y"~1 which is something entirely different. The