Hi,
I'm adding some news articles to a keyed Ferret 0.10.14 index and
encountering quite serious instability when concurrently reading and
writing to the index, even with just one writer and one reader
process.
If I recreate the index without a key, concurrent reading and writing
seem to work fine (and indexing is about 10 times quicker :)
I'm testing by running my indexing script (which retrieves up to 1000
database records using ActiveRecord, adds them to the index, and exits)
while concurrently, manually re-running a search on the index through my
Rails web interface. This is in a dev environment with only one user
(me) and about 58000 docs.
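For reference, the indexing script boils down to something like this (a
simplified sketch: the model name is real, but the field handling here
is illustrative):

```ruby
require 'ferret'

# Open the keyed index (same options as shown further down).
index = Ferret::Index::Index.new(
  :path => "#{RAILS_ROOT}/ferret_index/#{RAILS_ENV}/news_article_versions",
  :id_field => :id,
  :key => :id,
  :default_input_field => :text)

# Grab up to 1000 records and add them. With :key => :id, adding a doc
# whose :id already exists replaces the old doc rather than duplicating it.
NewsArticleVersion.find(:all, :limit => 1000).each do |record|
  index << { :id => record.id, :text => record.text }
end

index.close
```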
The error I get is along the lines of the following, with a different
filename each time:
IO Error occured at <except.c>:79 in xraise
Error occured in fs_store.c:324 - fs_open_input
couldn'ferret_index/development/news_article_versions/_2ih.tix: <No such file or directory>
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:682:in `initialize'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:682:in `ensure_reader_open'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:385:in `[]'
/usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:384:in `[]'
#{RAILS_ROOT}/app/models/news_article_version.rb:35:in `ferret_search'
#{RAILS_ROOT}/app/models/news_article_version.rb:35:in `ferret_search'
#{RAILS_ROOT}/app/controllers/news_articles_controller.rb:56:in `search'
It seems to occur roughly once per batch, usually towards the end of
the batch. I'm not using aaf (acts_as_ferret). I create my keyed index
like this:
@@ferret_index = Index::Index.new(
  :path => "#{RAILS_ROOT}/ferret_index/#{RAILS_ENV}/news_article_versions",
  :field_infos => field_infos,
  :id_field => :id,
  :key => :id,
  :default_input_field => :text)
Unkeyed, I just drop the :key option (duh). :id is just the
ActiveRecord id, from an auto_increment field in MySQL.
As a note, when concurrently searching on the keyed index, the number of
hits returned increases throughout the indexing process. With a
non-keyed index, the number of hits doesn't increase until the end.
It looks to me like Ferret commits after each record added when using a
keyed index, but only commits when the Index is closed when non-keyed.
That I don't get the error with a non-keyed index might just be because
there are fewer commits, so fewer opportunities for the "bug" to
trigger.
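In the meantime I'm considering papering over it on the search side with
a retry wrapper along these lines (plain Ruby, nothing Ferret-specific;
the method name is mine, and whether rescuing IOError actually catches
what Ferret raises here is an assumption based on the "IO Error" in the
trace):

```ruby
# Retry the block when the reader trips over a segment file that a
# concurrent commit appears to have just merged away. Names and the
# rescued exception classes are my guesses, not Ferret API.
def with_ferret_retry(attempts = 3, delay = 0.05)
  tries = 0
  begin
    yield
  rescue IOError, Errno::ENOENT
    tries += 1
    raise if tries >= attempts  # give up: re-raise the last error
    sleep delay                 # let the writer finish its commit
    retry
  end
end

# e.g.  results = with_ferret_retry { @@ferret_index.search(query) }
```

Obviously that's a stopgap, not a fix for whatever race is going on
underneath.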
Is this a bug I've come across? Is concurrent reading/writing like
this expected to work?
I'm using Ferret 0.10.14 on Ubuntu Edgy, with "ruby 1.8.4 (2005-12-24)
[i486-linux]" and "gcc version 4.1.2 20060928".
Thanks in advance!
John
--
http://johnleach.co.uk