Using the DRb allows me to synchronize writes to the index in a multi-mongrel environment. I was under the impression that the remote index would not block if two mongrels were searching the index. Is that the case? This line in ferret_server.rb makes me think otherwise:

    # Calls are not queued atm, so this will block until the call returned.
    #
    def method_missing(name, *args)

I see how the above would allow for synchronizing writes, but I don't see how it would allow for concurrent reads.

I'm seeing some issues with production performance of queries, and I'd like to figure out if concurrent queries against the remote server will block or if they are run in different threads. I'm running on Linux with the trunk version of AAF and Ruby 1.8.4.

Erik
On Thu, Sep 27, 2007 at 03:12:51PM -0400, Erik Morton wrote:
> Using the DRb allows me to synchronize writes to the index in a multi
> mongrel environment. I was under the impression that the remote index
> would not block if two mongrels were searching the index. Is that the
> case? This line in ferret_server.rb makes me think otherwise:
>
> # Calls are not queued atm, so this will block until the call returned.

Don't worry, it's only bad wording :-)

What this means is only that indexing is not done in an asynchronous way. So your call to Model#save, which triggers an index update, won't return until the server has finished adding that record to the index.

Other processes will get their own threads on the DRb side; synchronization is done in Ferret's Index class, which allows concurrent searches.

cheers,
Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
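The threading behaviour described above can be tried out with plain DRb. The following is only a minimal sketch, not acts_as_ferret's server: SlowIndex and its search method are hypothetical stand-ins, and it assumes a platform where fork is available (for example Linux). The server runs in a child process so that the calls really go over the wire. Each call blocks its own caller until it returns, but the two callers are served in separate threads on the DRb side. It says nothing about Ferret's own locking inside the Index class.

```ruby
require 'drb/drb'

# Hypothetical stand-in for a remote index; this is NOT AAF's server class.
class SlowIndex
  def search(query)
    sleep 0.5                      # pretend the search is slow
    "results for #{query}"
  end
end

# Run the DRb server in a child process and hand its URI back over a pipe.
reader, writer = IO.pipe
server_pid = fork do
  reader.close
  # Port 0 asks the OS for any free port; the real URI is read from the server.
  server = DRb.start_service('druby://127.0.0.1:0', SlowIndex.new)
  writer.puts server.uri
  writer.close
  DRb.thread.join
end
writer.close
remote = DRbObject.new_with_uri(reader.gets.chomp)
reader.close

# Two client threads call the remote object at the same time.
started = Time.now
results = %w[foo bar].map { |q| Thread.new { remote.search(q) } }.map(&:value)
elapsed = Time.now - started

# Shut the child server down again.
Process.kill('TERM', server_pid)
Process.wait(server_pid)

puts results.inspect
puts format('two concurrent searches took %.2fs', elapsed)
```

If the server handled calls one at a time, the two searches would take about twice the simulated search time; with one thread per caller the total stays close to a single search.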
Thanks Jens.

I'm seeing very strange behavior. The ferret/drb server is running on the web server. I disable AAF in my environment.rb like so:

    MyModel.disable_ferret

Then, every hour I run a script that grabs the records that have been updated and I call

    MyModel.bulk_index array_of_changed_objects

When there are, say, 100 or so objects that have changed, the server where DRb runs will have approximately 50% of the CPU waiting on IO, and will drop from 120MB of free memory to 10MB free, though the ruby/drb process doesn't seem to actually consume that memory -- or at least it's not reported by top.

While the batch update is happening it seems like my entire site is locked up. Requests usually hang until the indexing completes. Further, I can't even run script/console from a different machine until the indexing completes. I'm doing the bulk index on two indexes. One is about 41K records, the other is 1 million records. In both cases there have been at most 100 or so objects that needed to be indexed in bulk. The fact that script/console, when run from a different server, doesn't load until the indexing stops makes me think that either something is blocking in the ferret/drb server, or the optimization of the 3GB index after the bulk_index of 100 records is consuming all of the web server's resources.

Any idea what is going on or how I can debug this issue?

Thanks in advance.
On Fri, Sep 28, 2007 at 03:42:30PM -0400, Erik Morton wrote:
> Thanks Jens.
>
> I'm seeing very strange behavior. The ferret/drb server is running on
> the web server. I disable AAF in my environment.rb like so:
>
> MyModel.disable_ferret
>
> Then, every hour I run a script that grabs the records that have been
> updated and I call
>
> MyModel.bulk_index array_of_changed_objects
>
> When there are, say, 100 or so objects that have changed the server
> where DRB is will have approximately 50% of the CPU waiting on IO,
> and will drop from 120MB free of memory to 10MB free, though the
> ruby/drb process doesn't seem to actually consume that memory -- or at
> least it's not reported to by top.

Where do you get these numbers from? Possibly it's just the OS using unused RAM for filesystem buffers?

> While the batch update is happening it seems like my entire site is
> locked up. Requests usually hang until the indexing completes.
> Further, I can't even run script/console from a different machine
> until the indexing completes. I'm doing the bulk index on two
> indexes. One is about 41K records, the other is 1 million records. In
> both cases there has been at most 100 or so objects that needed to be
> indexed in bulk. The fact that script/console, when run from a
> different server, doesn't load until the index stops makes me think
> that either something is blocking in the ferret/drb server, or the
> optimization of the 3GB index after the bulk_index of 100 records is
> consuming all of the web server's resources.

While the batch update is running, nobody else is able to update the index. So yes, every other request that wants to update the index will hang. However, requests not using AAF at all, or searches, should do fine.

If you feel like the DRb server takes too much CPU, use renice or nice when you start it to lower its priority.

Another possibility that comes to mind is database locks; however, I can't imagine where these should come from.

With your index size, the optimizing might be the culprit: just comment out that portion (in ferret_extensions.rb) and see how it goes without it.

cheers,
Jens

--
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
We are going to hack AAF a bit to make the optimize optional. Should we flush every time we bulk index if we don't optimize?

I'm running this application on EC2, so I think part of the problem is the poor IO performance on the VPS.

Thanks.
On Sat, Sep 29, 2007 at 12:27:50PM -0400, Erik Morton wrote:
> We are going to hack AAF a bit to make the optimize optional. Should
> we flush every time we bulk index if we don't optimize?

Yes, I'd do so. It will be done when the index class closes the underlying writer, anyway.

I just made bulk_index a bit more configurable; you may now pass :optimize => false to skip the optimization step.

> I'm running this application on EC2, so I think part of the problem
> is the poor IO performance on the VPS.

Yes, I guess poor IO performance and optimizing a 3GB index aren't an optimal combination :-)

cheers,
Jens
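For readers who want to see the shape of the call, here is a rough sketch of how an :optimize option on bulk_index might be used. It runs without acts_as_ferret installed, so FakeIndexedModel, its log, and the flush/optimize steps recorded there are hypothetical stand-ins and do not show how AAF implements bulk_index internally. Only the option name :optimize => false comes from the message above.

```ruby
# Hypothetical stand-in for an indexed model; NOT acts_as_ferret's code.
class FakeIndexedModel
  @log = []   # records which steps were taken, for illustration only

  class << self
    attr_reader :log

    # Assumed behaviour: add every record, flush, then optimize unless
    # the caller passes :optimize => false.
    def bulk_index(records, options = {})
      options = { :optimize => true }.merge(options)
      records.each { |record| @log << "add #{record}" }
      @log << 'flush'
      @log << 'optimize' if options[:optimize]
    end
  end
end

# Hourly batch job: index the changed records but skip the costly optimize.
FakeIndexedModel.bulk_index(%w[doc1 doc2], :optimize => false)
puts FakeIndexedModel.log.inspect
```

With a real model the call would presumably look like MyModel.bulk_index(array_of_changed_objects, :optimize => false), with a separate optimize run scheduled for a quiet period if one is still wanted.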