similar to: rdig wildcard searches

Displaying 20 results from an estimated 20000 matches similar to: "rdig wildcard searches"

2007 Jan 23
3
Someone getting RDig work for Linux?
I got this: root@linux:~# rdig -c configfile RDig version 0.3.4 using Ferret 0.10.14 added url file:///home/myaccount/documents/ waiting for threads to finish... root@linux:~# rdig -c configfile -q "Ruby" RDig version 0.3.4 using Ferret 0.10.14 executing query >Ruby< Query: total results: 0 root@linux:~# my configfile I changed from config to cfg, because of maybe
2007 Feb 15
3
Proximity searching in rdig ferret
Lucene has a syntax "foo bar"~10 for finding foo within 10 words of bar. Does ferret support this feature? (the ~ is used for fuzzy queries) Does rdig? This could be a deal breaker for me 'cos I really need proximity searches
2006 Mar 25
1
RDig - ferret-based website crawler/indexer
Hi! RDig is a small tool to build a Ferret index for the contents of a website or intranet. It contains a simple HTTP crawler and some support for extracting textual content from the fetched pages. I built this to implement a site-wide search for a recent project that combined a Rails application with lots of static html files generated by a CMS. Any feedback is very welcome! Rubyforge
2006 Jul 25
1
RDig document processing error
Hi all, Am having problems using RDig: With this rdig config... cfg.crawler.start_urls = ['http://www.defensetech.org'] cfg.crawler.include_hosts = ['www.defensetech.org'] cfg.index.path = '/my/path/to/index' cfg.verbose = true ...I get this output: $ rdig -c config/rdig_config.rb /usr/local/lib/site_ruby/1.8/ferret/index/term.rb:45:
2006 Jul 14
2
RDig config file problem
Hi All, Hope it is ok to post RDig queries on this forum. Just trying to get RDig working (Ubuntu 6.06, RDig 0.3.0, ferret 0.9.4, rubyful_soup 1.0.4) Here is my output: sh:~/rdigtry$ rdig -c config/rdig_config.rb discovered content extractor class: RDig::ContentExtractors::PdfContentExtractor discovered content extractor class: RDig::ContentExtractors::WordContentExtractor discovered
2007 Jul 29
7
RDig and AAF playing together
I have a site with two indexes. Index A is created offline by RDig and queried from the web via RDig (specifically, RDig.searcher.search). Index B is managed by AAF with :remote => true. Simple enough. However, I need to query both indexes from RDig. Usually this is ok, as I modified RDig to accept an array of search_paths with an element for index A and index B. However, when Index
2007 Feb 10
5
Adding extra fields to an index (using RDig?)
Hello everyone, I am writing an application which collects a set of web sites and caches them locally for offline viewing. I want to do searches on this collection and associate extra data with each result (e.g. date collected, reason for collection, perhaps a sequence number). Now all this data exists when the harvesting is done and could be stored in a database. I want to use RDig to index my
2007 Jun 23
2
End of File Error on index optmize
I was optimizing a 650MB index using ferret (0.11.3) and I received the following error. I've seen some people have similar issues but I haven't seen any resolutions. The contents of the index directory follow the error. Has anyone seen anything like this and found a resolution? Many thanks. /mnt/apps/search/releases/20070622175637/script/../config/../vendor/
2007 Sep 27
2
Problem getting "extract" from RDig
Hi All, I have to have a site wide search for my current application. By search I mean I have to search the static and the dynamic contents from the database. I have been searching on this for a while on the net and RDig seems to be an apt solution. While using it I have encountered a few problems. I know these might be very basic issues but I have not been able to figure out what is wrong with
2007 Sep 18
4
basic rdig setup
I'm developing locally on Windows and I have a remote dev box that runs Linux. I'm trying to use RDig just to index using urls, no files. Both use acts_as_ferret for an administrative search that works fine. On the Windows machine, I get no errors, but get no results. On the Linux machine, I get: File Not Found Error occured at <except.c>:93 in xraise Error occured in
2007 Jan 21
4
could not install in WinXP
Directory of C:\search_app 01/21/2007 19:37 <DIR> . 01/21/2007 19:37 <DIR> .. 01/21/2007 19:36 427 008 ferret-0.10.13.gem 01/21/2007 19:07 148 992 rdig-0.3.4.gem 2 File(s) 576 000 bytes 2 Dir(s) 45 135 982 592 bytes free C:\search_app>gem install ferret Building native extensions. This could
2006 Nov 19
1
score for wildcard searches
Hello All, I have a rails app that maintains movie data index and uses "acts_as_ferret" for search. I ran into an issue with the scoring of wildcard searches. When I search for word "super*", the record containing the word "superman" is ranked above the one having just "super". Is this normal or am I missing something? Any ideas on how scoring can be
2007 Jan 05
1
adding one url to rdig index?
Hey there, I'm building a rails site using RDig as a site-wide search. I would like to be able to add just one URL (or possibly a list) to an existing index, so that when certain pages change I can update the index without reindexing the entire site. I looked through the documentation and didn't see an example on how to do this so I am looking for some guidance here :). Is
2007 Apr 14
3
Error on optimize leads to corrupt index?
The following exception occurred while trying to optimize a large index: vendor/gems/rdig-0.3.4/lib/rdig/index.rb:46:in `optimize': End-of-File Error occured at <except.c>:93 in xraise (EOFError) Error occured in store.c:216 - is_refill current pos = 0, file length = 0 Now, I get the following error any time I try to create a new index on the directory that I was trying
2007 Jan 05
3
Confused about Search Results
Hi everyone, I'm pretty new to Lucene and Ferret, so I feel that this is most likely myself not completely understanding the correct way to do this. I have indexed ~2200 text files (of various sizes), and I am now running searches on the index to get a feel for Lucene and Ferret. In my first program, which is using Lucene, I search for 'influenza' and get the
2006 Nov 04
0
Ferret 0.10.6 released (and some benchmarks)
Hey folks, ** Description ** Firstly for those who don't know, Ferret is a full-text search library which makes adding search to your application a breeze. It's much faster than MySQL full-text search as well as most other search libraries out there. It allows you to do Boolean (+ruby + rails -jewelry) and phrase queries ("the quick brown fox") as well as some more
2006 Aug 21
6
multiple-index searching with merged results
Hey.. i am just browsing through the lucene features and i'm wondering if this feature is available in ferret as well .. # multiple-index searching with merged results this would be nice, as i'm thinking about several indexes, as i am using a lot of wildcard queries for livesearches like google suggest. i think the performance would increase, if i split my rather big index in
2007 Aug 05
1
IO Errors on deleting documents with Ferret
I have a large index (~6GB, ~1 million docs) that was built by RDig. I wrote a script to iterate through the index to clear out some duplicate information to try to reduce the size of the index. clients.each {|client| docs = RDig.searcher.search("+supplier_id:#{client.id}") docs.each {|doc| data = doc[:data].dup #the contents of the web page new_results = {}
2007 Nov 05
6
Strange wildcard problem
Hi, Apologies for reposting this for those who read this via ruby-forum, but it didn't make it to the list before, and the list seems more active... I'm using ferret (via acts_as_ferret) in a somewhat unorthodox manner and am having a strange wildcard problem. Before anyone wonders why we're doing things this way, the answer is basically that it lets us
2007 Oct 08
1
wildcard searches with german umlauts
i just noticed a weird problem. i can successfully search with full terms like "Flächendesinfektionsstufen" or "Regionalanästhesie" for example and get correct hits. but when i search for those entries with wildcards "Flächendesinfektion*" or "Regionalanäs*" it won't find anything while "*chendesinfektionsstufen" or "*sthesie"