Jen
2007-Feb-25 05:20 UTC
[Ferret-talk] Acts _As_Ferret - How to confirm Indexing is complete?
Hello,

I have a couple of questions and hope someone here can help answer them.

I am using acts_as_ferret on a model Item with around 10 million rows. I use Item.rebuild_index at the Ruby console to build the index. It seems to run for at least 48 hours when building.

My questions are:
1) How do you know when the indexing is over and complete?
2) How can you confirm that ALL records in the table were indexed? (especially since the table runs into millions of records)

Thanks!!

--
Posted via http://www.ruby-forum.com/.
Jens Kraemer
2007-Feb-25 10:12 UTC
[Ferret-talk] Acts _As_Ferret - How to confirm Indexing is complete?
Hi!

On Sun, Feb 25, 2007 at 06:20:55AM +0100, Jen wrote:
> Hello I have a couple of questions, Hope someone here can help answer
> them.
>
> I am using acts_as_ferret on a model Item with around 10 million rows.
> I use Item.rebuild_index at the ruby console to build the index. It
> seems to run for at least 48 hours when building.
>
> My questions are:
> 1) How do you know when the indexing is over and complete?

Indexing is done when rebuild_index returns. At the moment there is no logging of the progress a running rebuild has made. However, I'm thinking about adding some kind of logging now.

> 2) How can you confirm that ALL records in the table were indexed?
> (especially since the table runs into millions of records)

If rebuild_index returns normally and no error is raised, I'd say it was successful and indexed all your records. To make sure you have all 10 million documents in the index, you can inspect the index with a small script like this:

    require 'rubygems'
    require 'ferret'
    reader = Ferret::Index::IndexReader.new('path/to/index')
    puts "#{reader.num_docs} documents in index"

Cheers,
Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
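The document count from the index is most useful when compared with the row count of the table. The following is a minimal sketch of that comparison, not part of acts_as_ferret: the helper names index_coverage and report are hypothetical, and the commented-out console call assumes the model is named Item and that the index lives at 'path/to/index', as in the thread.

```ruby
# Hypothetical helpers (not part of acts_as_ferret or Ferret): compare the
# number of rows in the table with the number of documents in the index.
def index_coverage(db_count, index_count)
  {
    db_count: db_count,
    index_count: index_count,
    difference: db_count - index_count,   # > 0: rows missing from the index
    complete: db_count == index_count
  }
end

# Turn the comparison into a one-line, human-readable summary.
def report(coverage)
  diff = coverage[:difference]
  if coverage[:complete]
    "index complete: #{coverage[:index_count]} documents"
  elsif diff > 0
    "index incomplete: #{diff} of #{coverage[:db_count]} rows not indexed"
  else
    "index has #{-diff} more documents than the table has rows"
  end
end

# In a Rails console this might be used as follows (assumed names):
#   reader = Ferret::Index::IndexReader.new('path/to/index')
#   puts report(index_coverage(Item.count, reader.num_docs))
```

A surplus of documents would usually point to rows deleted from the table after indexing, or to duplicates, so the sketch reports that case separately instead of treating it as success.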
Jen
2007-Feb-27 17:19 UTC
[Ferret-talk] Acts _As_Ferret - How to confirm Indexing is complete?
Thanks, Jens! I will try your suggestion. It would be nice to have the logging if you plan to add it, especially for builds that take a loooong time :-)

Btw, is there any way to speed up the build process?

Thanks again...
-Jen

--
Posted via http://www.ruby-forum.com/.
On Tue, Feb 27, 2007 at 06:19:00PM +0100, Jen wrote:
> Thanks, Jens! I will try your suggestion. It would be nice to have the
> logging thing if you plan to add it in - esp for builds that take a
> loooong time :-)
>
> Btw is there any way to speed up the build process?

If you have enough RAM, you can increase the batch size used during rebuilding (declared in class_methods.rb, look for batch_size); that should result in fewer database calls.

You can also limit the number of fields you index by explicitly naming the fields you need to search in your call to acts_as_ferret, if you don't do this already.

Cheers,
Jens
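To see why a larger batch size means fewer database calls, here is a rough arithmetic sketch. It assumes one query per batch, which is a simplification; the helper batch_queries is hypothetical, and the batch sizes shown are illustrative values, not acts_as_ferret defaults.

```ruby
# Hypothetical helper: number of database queries needed to read
# total_rows when rows are fetched batch_size at a time (one query per batch).
def batch_queries(total_rows, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?
  (total_rows + batch_size - 1) / batch_size   # integer ceiling division
end

total = 10_000_000   # table size mentioned in the thread
[1_000, 10_000, 50_000].each do |size|
  puts "batch_size #{size}: #{batch_queries(total, size)} queries"
end
# prints:
#   batch_size 1000: 10000 queries
#   batch_size 10000: 1000 queries
#   batch_size 50000: 200 queries
```

Each batch is held in memory while it is indexed, which is why the advice is conditional on having enough RAM.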