Patrick Wright
2007-Oct-10 19:19 UTC
[Ferret-talk] Multiple index instances and ferret/acts_as_ferret
We're running Ferret and acts_as_ferret in our production environment. We have multiple mongrels talking to a single index on a separate (virtual) server over DRb. This is working ok for now, as our index updates are fairly infrequent.

I'm concerned with the lack of redundancy/scalability in this layout. Our index won't get too big - maybe 100k indexed objects, each no more than 500 words - but it needs to be highly available, like the rest of our site (www.caring.com, if you are interested).

One alternate approach I'm considering would be to do something like this:

- disable the after_save callbacks in acts_as_ferret in production mode, to stop multiple mongrels writing to the index.
- move all index writes to a centralized batch process which interacts with a 'master' index
- periodically clone out the master index to slave indexes located locally to each user-facing rails index (not using DRb)

My last company used this approach for a lucene index, with lucene running behind a custom search webapp not that different from SOLR, so the user-facing webservers retrieved search results over http from the search webapp. We had to write some fairly intricate scripting to stop and start the search webapps whilst we copied out the master index to the slaves.

Does anyone have any experience with this kind of approach? Is there some standard way to distribute and run multiple instances of an index?

Bonus question - how upset does a running mongrel get when the ferret index it talks to is suddenly replaced by a new set of files?

Thanks for any insights on how best to solve this.

Thanks,
Patrick Wright
Jens Kraemer
2007-Oct-16 08:56 UTC
[Ferret-talk] Multiple index instances and ferret/acts_as_ferret
On Wed, Oct 10, 2007 at 12:19:17PM -0700, Patrick Wright wrote:
[..]
> One alternate approach I'm considering would be to do something like this:
> - disable the after_save callbacks in acts_as_ferret in production mode, to
> stop multiple mongrels writing to the index.
> - move all index writes to a centralized batch process which interacts with
> a 'master' index
> - periodically clone out the master index to slave indexes located locally
> to each user-facing rails index (not using DRb)

should work.

[..]
> Does anyone have any experience with this kind of approach? Is there some
> standard way to distribute and run multiple instances of an index?

omdb.org uses rsync to sync index versions.

> Bonus question - how upset does a running mongrel get when the ferret index
> it talks to is suddenly replaced by a new set of files?

Not upset at all if you do it like that: have two index directories and a symlink pointing to the one in use atm. Then sync to the currently unused index, change over the symlink and tell your mongrel to re-open its searcher.

Cheers,
Jens

--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
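
The two-directory-plus-symlink switch described above could be sketched roughly as follows. This is only an illustration, not the omdb.org code: the directory names index_a, index_b and current, and the helper names, are assumptions made for the example, and the actual sync step (rsync or anything else) is left to the caller's block.

```ruby
# Hypothetical layout (names are assumptions, not from the thread):
#   <root>/index_a, <root>/index_b   two complete index directories
#   <root>/current                   symlink to whichever one is live

# Returns the index directory that the "current" symlink does NOT point to.
def inactive_index_dir(root)
  live = File.basename(File.readlink(File.join(root, "current")))
  File.join(root, live == "index_a" ? "index_b" : "index_a")
end

# Repoints <root>/current at new_dir. The new symlink is created under a
# temporary name and renamed over the old one, so readers always see either
# the old or the new target (rename(2) replaces the link itself on POSIX).
def switch_index(root, new_dir)
  link = File.join(root, "current")
  tmp  = "#{link}.tmp"
  File.unlink(tmp) if File.symlink?(tmp)
  File.symlink(new_dir, tmp)
  File.rename(tmp, link)
end

# Syncs into the unused directory (via the caller's block), then switches.
def publish_index(root)
  target = inactive_index_dir(root)
  yield target  # e.g. run rsync from the master index into target
  switch_index(root, target)
  target
end
```

A batch job would call publish_index with a block that copies the master index into the directory it is handed. Running mongrels still hold the old index open until they are told to re-open their searcher, so the switch by itself does not change what they serve.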
Benjamin Krause
2007-Oct-16 10:10 UTC
[Ferret-talk] Multiple index instances and ferret/acts_as_ferret
> > Not upset at all if you do it like that: have two index directories and
> > a symlink pointing to the one in use atm. Then sync to the currently
> > unused index, change over the symlink and tell your mongrel to re-open
> > its searcher.

as Jens said, we're doing exactly that .. take a look at the switching here:

http://bugs.omdb.org/browser/branches/2007.1/lib/omdb/ferret/lib/util.rb

we've also added a "last-switched" status file (0 byte with timestamp), to notify mongrel of a new index. So mongrels are using the old index unless the status file gets a newer timestamp.

http://bugs.omdb.org/browser/branches/2007.1/lib/omdb/ferret/searcher.rb

Ben
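
The status-file idea might look something like the sketch below. It is not taken from the omdb.org sources linked above; the class name ReloadingSearcher and its interface are hypothetical, and the block passed to it stands in for whatever actually opens a searcher on the live index directory.

```ruby
# Hypothetical wrapper: keeps one searcher open and re-opens it only when
# the status file's mtime is newer than the one seen at the last open.
class ReloadingSearcher
  # status_file: path to the zero-byte "last-switched" marker
  # opener:      block that opens and returns a fresh searcher
  def initialize(status_file, &opener)
    @status_file = status_file
    @opener      = opener
    @searcher    = nil
    @seen_mtime  = nil
  end

  # Returns the current searcher, re-opening it if the index was switched.
  def searcher
    switched_at = File.mtime(@status_file)
    if @searcher.nil? || switched_at > @seen_mtime
      @searcher.close if @searcher.respond_to?(:close)
      @searcher   = @opener.call
      @seen_mtime = switched_at
    end
    @searcher
  end
end
```

Each mongrel would create one such wrapper at startup and go through its searcher method for every query. The batch job touches the status file after switching the symlink, so the only per-request cost is a single stat call.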