similar to: Error on optimize leads to corrupt index?

Displaying 20 results from an estimated 500 matches similar to: "Error on optimize leads to corrupt index?"

2007 Jun 23
2
End of File Error on index optimize
I was optimizing a 650MB index using ferret (0.11.3) and I received the following error. I've seen some people have similar issues but I haven't seen any resolutions. The contents of the index directory follow the error. Has anyone seen anything like this and found a resolution? Many thanks. /mnt/apps/search/releases/20070622175637/script/../config/../vendor/
2006 Aug 24
1
[0.10.0] Random error when big import
In a rails script (something in the "script" dir with the right require) I added many documents (around 4000) to an index globally instantiated (and built if not present) in config/environment.rb. I ran my script three times (I deleted my index every time before), and only the third was successful. That was STRANGE! :) These are the errors:
2007 Aug 05
1
IO Errors on deleting documents with Ferret
I have a large index (~6GB, ~1 million docs) that was built by RDig. I wrote a script to iterate through the index to clear out some duplicate information to try to reduce the size of the index. clients.each {|client| docs = RDig.searcher.search("+supplier_id:#{client.id}") docs.each {|doc| data = doc[:data].dup #the contents of the web page new_results = {}
2007 Jan 21
14
[ActsAsFerret] OpenSolaris (TextDrive) indexing issues
Gents, I successfully installed AAF on my TextDrive OpenSolaris Container, but I'm having some issues with indexing. I have a model called Blogs which has AAF enabled. The first time I tried to find_by_contents for a 'word' I know was in the database I got no results. Apparently the index was not ready yet. Then I waited a few hours and checked that the /index
2007 Mar 13
6
Acts_as_ferret and auto-flush
Hi, I'm using acts_as_ferret with a mongrel and I'm getting locking errors that after a while result in a corrupt database. I know about the problem with different processes writing to the index but I haven't been able to get the DRB server working properly yet. I read on this list that another solution is to set :auto_flush to true but I'm not
2007 Sep 18
4
basic rdig setup
I'm developing locally on Windows and I have a remote dev box that runs Linux. I'm trying to use RDig just to index using urls, no files. Both use acts_as_ferret for an administrative search that works fine. On the Windows machine, I get no errors, but get no results. On the Linux machine, I get: File Not Found Error occured at <except.c>:93 in xraise Error occured in
2006 Oct 17
9
Error : End-of-File Error occured at <except.c>
Everything was working fine till last night. This morning I have many errors. I am using acts_as_ferret. Last updated around a month ago on linux. There are two different types of exceptions. I have over 12 exception emails but these are the two distinct types. First exception: A EOFError occurred in home#event_info: End-of-File Error occured at <except.c>:79 in xraise Error occured in
2006 Jul 25
1
RDig document processing error
Hi all, Am having problems using RDig: With this rdig config... cfg.crawler.start_urls = ['http://www.defensetech.org'] cfg.crawler.include_hosts = ['www.defensetech.org'] cfg.index.path = '/my/path/to/index' cfg.verbose = true ...I get this output: $ rdig -c config/rdig_config.rb /usr/local/lib/site_ruby/1.8/ferret/index/term.rb:45:
2007 Jul 29
7
RDig and AAF playing together
I have a site with two indexes. Index A is created offline by RDig and queried from the web via RDig (specifically, RDig.searcher.search). Index B is managed by AAF with :remote => true. Simple enough. However, I need to query both indexes from RDig. Usually this is ok, as I modified RDig to accept an array of search_paths with an element for index A and index B. However, when Index
2011 Apr 27
6
Assignments inside lapply
Dear all I would like to ask you if an assignment can be done inside a lapply statement. For example I would like to convert a double nested for loop for (i in c(1:dimx)){ for (j in c(1:dimy)){ Powermap[i,j] <- Pr(c(i,j),c(PRX,PRY),f) } } to something like that: ij<-expand.grid(i=seq(1:dimx),j=(1:dimy)) unlist(lapply(1:nrow(ij),function(rowId) { return
2007 Feb 10
5
Adding extra fields to an index (using RDig?)
Hello everyone, I am writing an application which collects a set of web sites and caches them locally for offline viewing. I want to do searches on this collection and associate extra data with each result (e.g. date collected, reason for collection, perhaps a sequence number). Now all this data exists when the harvesting is done and could be stored in a database. I want to use RDig to index my
2007 Mar 28
7
Newbie problem on production server
Hi, I just installed ferret for the first time and integrated it with my app. On my dev machine it's fine but on my production server I get this when I call find_by_contents(): Processing LinksController#results (for 24.185.105.59 at 2007-03-28 05:28:36) [POST] Session ID: 3f2dc7c17147c0e52178ba697a119833 Parameters: {"commit"=>"Search",
2007 Jan 21
4
could not install in WinXP
Directory of C:\search_app 01/21/2007 19:37 <DIR> . 01/21/2007 19:37 <DIR> .. 01/21/2007 19:36 427 008 ferret-0.10.13.gem 01/21/2007 19:07 148 992 rdig-0.3.4.gem 2 File(s) 576 000 bytes 2 Dir(s) 45 135 982 592 bytes free C:\search_app>gem install ferret Building native extensions. This could
2007 Sep 27
2
Problem getting "extract" from RDig
Hi All, I have to have a site wide search for my current application. By search I mean I have to search the static and the dynamic contents from the database. I have been searching on this for a while on the net and RDig seems to be an apt solution. While using it I have encountered a few problems. I know these might be very basic issues but I have not been able to figure out what is wrong with
2006 Mar 25
1
RDig - ferret-based website crawler/indexer
Hi! RDig is a small tool to build a Ferret index for the contents of a website or intranet. It contains a simple HTTP crawler and some support for extracting textual content from the fetched pages. I built this to implement a site-wide search for a recent project that combined a Rails application with lots of static html files generated by a CMS. Any feedback is very welcome! Rubyforge
2007 Jan 05
1
adding one url to rdig index?
Hey there, I'm building a rails site using RDig as a site-wide search. I would like to be able to add just one URL (or possibly a list) to an existing index, so that when certain pages change I can update the index without reindexing the entire site. I looked through the documentation and didn't see an example on how to do this so I am looking for some guidance here :). Is
2007 Jan 23
3
Someone getting RDig work for Linux?
I got this root@linux:~# rdig -c configfile RDig version 0.3.4 using Ferret 0.10.14 added url file:///home/myaccount/documents/ waiting for threads to finish... root@linux:~# rdig -c configfile -q "Ruby" RDig version 0.3.4 using Ferret 0.10.14 executing query >Ruby< Query: total results: 0 root@linux:~# my configfile I changed from config to cfg, because of maybe
2006 Jul 14
2
RDig config file problem
Hi All, Hope it is ok to post RDig queries on this forum. Just trying to get RDig working (Ubuntu 6.06, RDig 0.3.0, ferret 0.9.4, rubyful_soup 1.0.4) Here is my output: sh:~/rdigtry$ rdig -c config/rdig_config.rb discovered content extractor class: RDig::ContentExtractors::PdfContentExtractor discovered content extractor class: RDig::ContentExtractors::WordContentExtractor discovered
2007 Aug 28
3
Still getting "too many open files"
We are still having problems with Ferret dying on us regularly with the error message: >> ferret server error IO Error occured at <except.c>:93 in xraise Error occured in fs_store.c:127 - fs_each doing 'each' in /var/www/web1/oms/current/script/../config/../index/production/band/20070805130005: <Too many open files> << We are running Ferret as a
2006 Jun 02
1
Indexing fails -- _ntc6.tmp exceeds 2 gigabyte maximum
Ferret 0.9.3 Ruby 1.8.2 NOT storing file contents in the index. Only indexing first 25k of each file. Very large data set (1 million files, 350 Gb) Code based on snippet from David Balmain's forum posts. After 6 hours, Ferret bails out with Ruby "exceeds max file size". Cache: -rw-r--r-- 1 bill bill 2147483647 2006-06-01 22:45 _ntc6.tmp -rw-r--r-- 1 bill bill 1690862924