Erik Morton
2007-Aug-05  17:17 UTC
[Ferret-talk] IO Errors on deleting documents with Ferret
I have a large index (~6GB, ~1 million docs) that was built by RDig.  
I wrote a script to iterate through the index to clear out some  
duplicate information to try to reduce the size of the index.
clients.each {|client|
  docs = RDig.searcher.search("+supplier_id:#{client.id}")
  docs.each {|doc|
    data = doc[:data].dup # the contents of the web page
    new_results = {}
    new_results[:client_id] = client.id
    new_results[:data] = data
    index.delete doc[:doc_id]
    index << new_results
  }
}
I've run a similar script before with no issues. However, today I
received the following error after 30 minutes or so:
/usr/lib/ruby/site_ruby/1.8/ferret/index.rb:726:in `initialize': IO Error occured at <except.c>:93 in xraise (IOError)
Error occured in index.c:901 - sis_find_segments_file
         Error reading the segment infos. Store listing was
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:726:in `ensure_reader_open'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:434:in `delete'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
         from /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:428:in `delete'
Despite the error, the index appears not to be corrupted, so I ran the
script again for fun. The following error occurred after
approximately 20 minutes:
/usr/lib/ruby/site_ruby/1.8/ferret/index.rb:723:in `close': IO Error occured at <except.c>:93 in xraise (IOError)
Error occured in fs_store.c:264 - fs_new_output
         couldn't create OutStream /mnt/apps/search/current/../../shared/indexes/final/_a4kx.prx: <Too many open files>
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:723:in `ensure_reader_open'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:434:in `delete'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
         from /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:8:in `synchrolock'
         from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:428:in `delete'
Here are the contents of the index directory:
[root@files]# ls -Al ../../shared/indexes/final/
total 5628324
-rw-------  1 initiate initiate 5713121647 Jul 31 14:22 _5d3s.cfs
-rw-------  1 root     root         115159 Aug  5 12:55 _5d3s_2yyy.del
-rw-------  1 root     root       22937900 Aug  5 11:28 _7tgc.cfs
-rw-------  1 root     root          11475 Aug  5 12:55 _7tgc_sx7.del
-rw-------  1 root     root        2220338 Aug  5 11:38 _820z.cfs
-rw-------  1 root     root        2311840 Aug  5 11:47 _8alm.cfs
-rw-------  1 root     root        2261887 Aug  5 11:56 _8j69.cfs
-rw-------  1 root     root        2089120 Aug  5 12:05 _8rqw.cfs
-rw-------  1 root     root        2244470 Aug  5 12:14 _90bj.cfs
-rw-------  1 root     root        2249160 Aug  5 12:22 _98w6.cfs
-rw-------  1 root     root        2231091 Aug  5 12:31 _9hgt.cfs
-rw-------  1 root     root        2244881 Aug  5 12:40 _9q1g.cfs
-rw-------  1 root     root        2273703 Aug  5 12:48 _9ym3.cfs
-rw-------  1 root     root         235566 Aug  5 12:49 _9zgy.cfs
-rw-------  1 root     root         220959 Aug  5 12:50 _a0bt.cfs
-rw-------  1 root     root         229074 Aug  5 12:51 _a16o.cfs
-rw-------  1 root     root         202310 Aug  5 12:52 _a21j.cfs
-rw-------  1 root     root         135823 Aug  5 12:53 _a2we.cfs
-rw-------  1 root     root         132935 Aug  5 12:54 _a3r9.cfs
-rw-------  1 root     root          14190 Aug  5 12:54 _a3uc.cfs
-rw-------  1 root     root          13868 Aug  5 12:54 _a3xf.cfs
-rw-------  1 root     root          13758 Aug  5 12:54 _a40i.cfs
-rw-------  1 root     root          14912 Aug  5 12:54 _a43l.cfs
-rw-------  1 root     root          13750 Aug  5 12:54 _a46o.cfs
-rw-------  1 root     root          14170 Aug  5 12:54 _a49r.cfs
-rw-------  1 root     root          13764 Aug  5 12:55 _a4cu.cfs
-rw-------  1 root     root          13719 Aug  5 12:55 _a4fx.cfs
-rw-------  1 root     root          13115 Aug  5 12:55 _a4j0.cfs
-rw-------  1 root     root           1826 Aug  5 12:55 _a4jb.cfs
-rw-------  1 root     root           1935 Aug  5 12:55 _a4jm.cfs
-rw-------  1 root     root           1739 Aug  5 12:55 _a4jx.cfs
-rw-------  1 root     root           1865 Aug  5 12:55 _a4k8.cfs
-rw-------  1 root     root           2072 Aug  5 12:55 _a4kj.cfs
-rw-------  1 root     root           1733 Aug  5 12:55 _a4ku.cfs
-rw-------  1 root     root            378 Aug  5 12:55 _a4kv.cfs
-rw-------  1 root     root            462 Aug  5 12:55 _a4kw.cfs
-rw-------  1 root     root            128 Aug  5 12:55 _a4kx.fdt
-rw-------  1 root     root              0 Aug  5 12:55 _a4kx.fdx
-rw-------  1 root     root              0 Aug  5 12:55 _a4kx.frq
-rw-------  1 root     root              0 Aug  5 12:55 _a4kx.tfx
-rw-------  1 root     root              0 Aug  5 12:55 _a4kx.tis
-rw-------  1 root     root              0 Aug  5 12:55 _a4kx.tix
-rw-------  1 root     root              0 Aug  5 12:55 ferret-write.lck
-rw-------  1 initiate initiate         16 Aug  5 12:55 segments
-rw-------  1 root     root           1142 Aug  5 12:55 segments_isfj
Here's my platform: Linux xenU #1 SMP Thu Nov 30 13:48:50 SAST 2006
i686 athlon i386 GNU/Linux
I'm using ruby 1.8.4 and Ferret 0.11.4, which has been hacked to add
in better large file support.
Does anyone have any idea what's going on? Many thanks in advance.
Erik
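The directory listing in the post above shows several dozen small .cfs segment files next to one very large one. As a rough diagnostic, the file names in an index directory can be grouped by extension to see how many segment files have accumulated. This is only a sketch: count_by_extension is a hypothetical helper name, not part of Ferret or RDig, and the sample names are copied from the listing.

```ruby
# Hypothetical helper (not part of Ferret or RDig): count file names per extension.
# `names` can be any list of file names, for example Dir.entries(index_dir).
def count_by_extension(names)
  counts = Hash.new(0)
  names.each do |name|
    next if name == "." || name == ".."   # skip the directory pseudo-entries
    ext = File.extname(name)
    ext = "(none)" if ext.empty?          # e.g. "segments" has no extension
    counts[ext] += 1
  end
  counts
end

# A few names taken from the listing, used here as sample input.
sample = ["_5d3s.cfs", "_5d3s_2yyy.del", "_7tgc.cfs", "_a4kx.fdt",
          "ferret-write.lck", "segments"]
count_by_extension(sample).sort.each do |ext, n|
  puts "#{ext}: #{n}"
end
```

To run it against a real directory, pass Dir.entries("../../shared/indexes/final") instead of the sample list.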
Benjamin Krause
2007-Aug-06  07:32 UTC
[Ferret-talk] IO Errors on deleting documents with Ferret
On 2007-08-05, at 7:17 PM, Erik Morton wrote:
> Error occured in index.c:901 - sis_find_segments_file
>          Error reading the segment infos. Store listing was
>
> couldn't create OutStream /mnt/apps/search/current/../../shared/indexes/final/_a4kx.prx: <Too many open files>

Hey,

Both errors might have the same cause: too many open files. I had similar errors some months ago, raised my open-files limit to 32k, and haven't had an error since.

rails@omdb.org ~ $ ulimit -n 32768

Benjamin
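
The open-files limit discussed above can also be inspected from inside the Ruby process that runs the script. The following is a minimal sketch, assuming a Ruby version that provides Process.getrlimit and Process.setrlimit (the 1.8.4 install mentioned in the thread may not). raise_nofile_limit is a hypothetical helper name, and 32768 is simply the value quoted above.

```ruby
# Read the soft and hard limits on open file descriptors for this process.
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open files: soft=#{soft} hard=#{hard}"

# Hypothetical helper: raise the soft limit towards `wanted`, never above the
# hard limit and never lowering it. Returns the soft limit now in effect.
def raise_nofile_limit(wanted)
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  target = [wanted, hard].min
  Process.setrlimit(Process::RLIMIT_NOFILE, target, hard) if target > soft
  Process.getrlimit(Process::RLIMIT_NOFILE).first
end

# Example call (commented out so the sketch only reports by default):
# raise_nofile_limit(32768)
```

An unprivileged process can only raise its soft limit up to the hard limit, so a larger value still has to be configured by root or in the shell that launches the script.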