Seth J. Morabito
2007-Jan-10 05:47 UTC
[Ferret-talk] Corrupt index and segfaults with heavy writes?
Hi everyone,
We're running a fairly heavily used Rails app that uses ferret (and
acts_as_ferret) for search. We're running on mongrel+Apache, Ruby
1.8.4, and ferret 0.10.13. We're indexing a handful of attributes on
our "Image" and "User" models.
After the system has been running for several days, the index gradually
becomes corrupted, and ferret begins to segfault once the corruption is
bad enough. We're running with ten mongrel servers balanced behind
Apache, so it takes a while before they all die due to the segfaulting.
The segfault spits out the following line into mongrel.log:
/usr/local/lib/ruby/gems/1.8/gems/ferret-0.10.13/lib/ferret/index.rb:271: [BUG]
Segmentation fault
ruby 1.8.4 (2005-12-24) [i686-linux]
I haven't done a ton of deeper investigation, but we suspect it may be
related to locking problems. As I said, we're getting fairly heavy use,
and every time a user views an Image, a view counter is updated and the
Image is saved. This causes acts_as_ferret to re-add the model to the
index, so the index is getting heavy write use. As a side effect of
this, we see a lot of locking errors in the logs, which cause 500 errors
for our users:
Ferret::Store::Lock::LockError (Lock Error occured at <except.c>:103 in
xpop_context
Error occured in index.c:5368 - iw_open
Couldn't obtain write lock when opening IndexWriter
Eventually, we start seeing corruption errors like these (as an example):
End-of-File Error occured at <except.c>:79 in xraise
Error occured in compound_io.c:123 - cmpdi_read_i
Tried to read past end of file. File length is <9> and tried to
read to <19>
And then boom, mongrel processes start to die, slowly.
If the locking is leading to corruption problems, one thing that would
really help is if we didn't update the index on every write. We're not
searching on the image view counter, so this might end up being more of
an acts_as_ferret question than a ferret question (i.e., it'd be nice
to tell acts_as_ferret not to reindex the model if we're not updating an
attribute we search on!). But that aside, has anyone else encountered
problems with heavy writing?
Thanks much,
-Seth
Jens Kraemer
2007-Jan-10 08:29 UTC
[Ferret-talk] Corrupt index and segfaults with heavy writes?
Hi!

On Tue, Jan 09, 2007 at 09:47:03PM -0800, Seth J. Morabito wrote:
> Hi everyone,
>
[..]
>
> IF the locking is leading to corruption problems, one thing that would
> really help is if we didn't update the index on every write. We're not
> searching on the image view counter, so this might end up being more of
> an acts_as_ferret question than a ferret question (i.e., it'd be nice
> to tell acts_as_ferret not to reindex the model if we're not updating an
> attribute we search on!).

If you're on aaf trunk, this is possible:

    model_instance.disable_ferret   # will disable ferret for the next save
    model_instance.save

or

    model_instance.disable_ferret do   # ferret is disabled for all saves
      model_instance.save              # occurring inside the block
    end

> But that aside, has anyone else encountered problems with heavy
> writing?

Yes, we've had the very same errors in an application not using aaf.
We moved all the indexing into a single backgroundrb process, and since
then everything has been fine. I have a drb indexing feature for aaf in
the works, too.

cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH          www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer       kraemer at webit.de
Schnorrstraße 76                         Tel +49 351 46766  0
D-01069 Dresden                          Fax +49 351 46766 66
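The single-writer idea in the reply above can be sketched in plain Ruby. This is only an illustration of the pattern, not the backgroundrb setup Jens describes: FakeIndex and SingleWriterIndexer are hypothetical names, the index is a Hash rather than a Ferret index, and a thread plus an in-process queue stands in for a separate indexing process. With ten mongrel processes the queue would have to live in one shared process (reached via DRb or similar), which this sketch does not attempt.

```ruby
require 'thread'

# Hypothetical stand-in for a Ferret index; a Hash keeps the sketch
# runnable without Ferret installed.
class FakeIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  def add(id, fields)
    @docs[id] = fields
  end
end

# Hypothetical wrapper: every write goes through one queue and one worker
# thread, so only a single writer ever touches the index.
class SingleWriterIndexer
  def initialize(index)
    @index = index
    @queue = Queue.new
    @worker = Thread.new do
      # Process jobs in arrival order until the :stop sentinel shows up.
      while (job = @queue.pop) != :stop
        id, fields = job
        @index.add(id, fields)
      end
    end
  end

  # Safe to call from any thread; returns without touching the index.
  def enqueue(id, fields)
    @queue << [id, fields]
  end

  # Let the worker finish the jobs already queued, then stop it.
  def shutdown
    @queue << :stop
    @worker.join
  end
end

index = FakeIndex.new
indexer = SingleWriterIndexer.new(index)

# Ten concurrent "requests" each submit one document.
threads = (1..10).map do |n|
  Thread.new { indexer.enqueue(n, { :filename => "image#{n}.jpg" }) }
end
threads.each { |t| t.join }
indexer.shutdown
```

Request threads never open the index themselves, so they cannot contend for a write lock; the trade-off is that a document becomes searchable only after the worker reaches it.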
Hi,

> IF the locking is leading to corruption problems, one thing that would
> really help is if we didn't update the index on every write. We're not
> searching on the image view counter, so this might end up being more of
> an acts_as_ferret question than a ferret question (i.e., it'd be nice
> to tell acts_as_ferret not to reindex the model if we're not updating an
> attribute we search on!).

    acts_as_ferret(:fields => [:filename, :creator, ...])

With this you can control the fields that are indexed with ferret. It
will produce less overhead if you don't index fields you don't search
full-text.

Regards,
Ewout
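As a rough sketch of the check Seth asks for, the list of indexed fields can be compared against what actually changed in a save. Everything here is hypothetical and independent of acts_as_ferret: INDEXED_FIELDS, reindex_needed? and the :view_count attribute are made-up names, and plain hashes stand in for model attributes. When the check returns false, one could call disable_ferret before saving, as described in Jens's reply (aaf trunk only).

```ruby
# Hypothetical list of indexed fields, mirroring a :fields option.
INDEXED_FIELDS = [:filename, :creator]

# Returns true when at least one indexed field differs between the
# attribute hashes taken before and after a change.
def reindex_needed?(old_attrs, new_attrs, indexed_fields = INDEXED_FIELDS)
  indexed_fields.any? { |field| old_attrs[field] != new_attrs[field] }
end

# :view_count is an assumed name for the view counter attribute.
before  = { :filename => "cat.jpg", :creator => "seth", :view_count => 41 }
viewed  = before.merge(:view_count => 42)        # only the counter changed
renamed = before.merge(:filename => "kitten.jpg") # an indexed field changed

reindex_needed?(before, viewed)   # => false
reindex_needed?(before, renamed)  # => true
```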