Seth J. Morabito
2007-Jan-10 05:47 UTC
[Ferret-talk] Corrupt index and segfaults with heavy writes?
Hi everyone,

We're running a fairly heavily used Rails app that uses ferret (and acts_as_ferret) for search. We're running on mongrel+Apache, Ruby 1.8.4, and ferret 0.10.13. We're indexing a handful of attributes on our "Image" and "User" models.

After the system has been running for several days, the index gradually becomes corrupted, and ferret begins to segfault once the corruption is bad enough. We're running with ten mongrel servers balanced behind Apache, so it takes a while before they all die due to the segfaulting. The segfault spits out the following line into mongrel.log:

    /usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb:271: [BUG] Segmentation fault
    ruby 1.8.4 (2005-12-24) [i686-linux]

I haven't done a ton of deeper investigation, but we suspect it may be related to locking problems. As I said, we're getting fairly heavy use, and every time a user views an Image, a view counter is updated and the Image is saved. This causes acts_as_ferret to re-add the model to the index, so the index is getting heavy write use. As a side effect of this, we see a lot of locking errors in the logs, which cause 500 errors for our users:

    Ferret::Store::Lock::LockError (Lock Error occured at <except.c>:103 in xpop_context
    Error occured in index.c:5368 - iw_open
    Couldn't obtain write lock when opening IndexWriter

Eventually, we start seeing corruption errors like these (as an example):

    End-of-File Error occured at <except.c>:79 in xraise
    Error occured in compound_io.c:123 - cmpdi_read_i
    Tried to read past end of file. File length is <9> and tried to read to <19>

And then boom, mongrel processes start to die, slowly.

IF the locking is leading to corruption problems, one thing that would really help is if we didn't update the index on every write. We're not searching on the image view counter, so this might end up being more of an acts_as_ferret question than a ferret question (i.e., it'd be nice to tell acts_as_ferret not to reindex the model if we're not updating an attribute we search on!).

But that aside, has anyone else encountered problems with heavy writing?

Thanks much,
-Seth
Jens Kraemer
2007-Jan-10 08:29 UTC
[Ferret-talk] Corrupt index and segfaults with heavy writes?
Hi!

On Tue, Jan 09, 2007 at 09:47:03PM -0800, Seth J. Morabito wrote:
> Hi everyone,
> [..]
>
> IF the locking is leading to corruption problems, one thing that would
> really help is if we didn't update the index on every write. We're not
> searching on the image view counter, so this might end up being more of
> an acts_as_ferret question than a ferret question (i.e., it'd be nice
> to tell acts_as_ferret not to reindex the model if we're not updating an
> attribute we search on!).

If you're on aaf trunk, this is possible:

    model_instance.disable_ferret  # will disable ferret for the next save
    model_instance.save

or

    model_instance.disable_ferret do  # ferret is disabled for all saves
      model_instance.save             # occurring inside the block
    end

> But that aside, has anyone else encountered problems with heavy
> writing?

Yes, we've had the very same errors in an application not using aaf. We moved all the indexing into a single backgroundrb process, and since then everything has been fine. I have a drb indexing feature for aaf in the works, too.

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH    www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer     kraemer at webit.de
Schnorrstraße 76                           Tel +49 351 46766 0
D-01069 Dresden                            Fax +49 351 46766 66
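The single-writer idea described above can be illustrated in plain Ruby. This is only a minimal in-process sketch, not the backgroundrb setup from the reply: `FakeIndex` and `SingleWriterIndexer` are hypothetical names, and the fake index stands in for a real ferret index. With several mongrel processes, the queue and worker would have to live in one separate process (for example behind DRb) for the same guarantee to hold.

```ruby
require 'thread'

# Hypothetical stand-in for a real index; only the worker thread touches it.
class FakeIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  def add(id, fields)
    @docs[id] = fields
  end
end

# Funnels every index update through one worker thread, so there is
# never more than one writer competing for the index at a time.
class SingleWriterIndexer
  def initialize(index)
    @index = index
    @queue = Queue.new
    @worker = Thread.new do
      # Drain jobs until the :stop sentinel arrives.
      while (job = @queue.pop) != :stop
        id, fields = job
        @index.add(id, fields)
      end
    end
  end

  # Safe to call from any thread; it only enqueues the update.
  def enqueue(id, fields)
    @queue << [id, fields]
  end

  # Processes everything queued so far, then stops the worker.
  def shutdown
    @queue << :stop
    @worker.join
  end
end
```

Request handlers would call `enqueue` instead of writing to the index directly, which trades immediate index freshness for the absence of write-lock contention.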
Hi,

> IF the locking is leading to corruption problems, one thing that would
> really help is if we didn't update the index on every write. We're not
> searching on the image view counter, so this might end up being more of
> an acts_as_ferret question than a ferret question (i.e., it'd be nice
> to tell acts_as_ferret not to reindex the model if we're not updating an
> attribute we search on!).

    acts_as_ferret(:fields => [:filename, :creator, ...])

With this you can control the fields that are indexed with ferret. It will produce less overhead if you don't index fields you don't search full-text.

Regards,
Ewout
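One way to combine a field list like this with the skip-on-save idea raised earlier in the thread is to compare only the indexed attributes before and after a change. The following is a rough sketch under stated assumptions: `INDEXED_FIELDS` and `reindex_needed?` are hypothetical names, not part of acts_as_ferret, and the attribute hashes stand in for a model's state before and after an update.

```ruby
# Fields assumed to be indexed; illustrative only, not taken from a real app.
INDEXED_FIELDS = [:filename, :creator]

# Returns true when at least one indexed field differs between the
# two attribute hashes, i.e. when the index entry would be stale.
def reindex_needed?(before, after, indexed_fields = INDEXED_FIELDS)
  indexed_fields.any? { |field| before[field] != after[field] }
end

# Example: only the view counter changes, so no reindex is needed.
before = { :filename => "cat.jpg", :creator => "seth", :views => 10 }
after  = before.merge(:views => 11)
skip_index = !reindex_needed?(before, after)
```

A caller could use the result to decide whether to disable indexing for that particular save.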