I am using AAF trunk, and I want a way to rebuild an index on a production site with little or no interruption to service. The DRb server documentation* states that when an index is rebuilt, it is built in a separate location and then swapped into place when finished. So to do a complete rebuild on a live site, one must take into account objects which have been created or modified in the meantime. To achieve this, I have come up with the following solution:

http://pastie.textmate.org/66602

[1] Does this look like a complete solution? I suppose it relies on timestamp consistency between system components... it is possible that between setting "start = ..." and performing the rebuild, another thread in the system will have created an earlier timestamp for an object that did not get committed until after the rebuild began. Is it possible to do a perfect rebuild, or would that require building a layer of concurrency logic into AAF?

[2] Is the behavior described in the DRb server documentation different from AAF when not using the DRb server?

Thanks,
John

* http://projects.jkraemer.net/acts_as_ferret/wiki/DrbServer#AAFtrunk
> [1] Does this look like a complete solution? I suppose it relies on
> timestamp consistency between system components... it is possible
> that between setting "start = ..." and performing the rebuild,
> another thread in the system will have created an earlier timestamp
> for an object that did not get committed until after the rebuild
> began. Is it possible to do a perfect rebuild, or would that require
> building a layer of concurrency logic into AAF?

You can sync your server clocks using ntpd, and you can always re-index records from a few extra seconds back to work around latency.

-Kyle
<shameless thread bump/>

Jens, any thoughts on this?

On May 31, 2007, at 2:30 PM, John Bachir wrote:
> [..]
On Fri, Jun 08, 2007 at 06:13:51AM -0400, John Bachir wrote:
> <shameless thread bump/>

yeah, that's ok, I still didn't catch up with the list ;-)

> Jens, any thoughts on this?

see below.

> On May 31, 2007, at 2:30 PM, John Bachir wrote:
> [..]
> > [1] Does this look like a complete solution? I suppose it relies on
> > timestamp consistency between system components... it is possible
> > that between setting "start = ..." and performing the rebuild,
> > another thread in the system will have created an earlier timestamp
> > for an object that did not get committed until after the rebuild
> > began. Is it possible to do a perfect rebuild, or would that require
> > building a layer of concurrency logic into AAF?

The scenario you describe might happen and cause a record not to be indexed, but I'd implement it just like you did.

To be safe you can subtract a minute or so from your recorded start time ;-)

If it is really critical for you to have all records indexed and relying on the timestamps is a no-go, you'll have to implement your own synchronisation mechanism, maybe by checking for a running rebuild on each index update and recording the corresponding records somewhere for later indexing.

> > [2] Is the behavior described in the DRb server documentation
> > different from AAF when not using the DRb server?

Without the DRb server, aaf won't use index versions but will rebuild the index in place. I didn't introduce the versioning there because the usual non-DRb scenarios (test cases and development systems) don't require it. With non-DRb multi-process scenarios it would be hard to implement anyway.

Jens

-- 
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
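The synchronisation mechanism Jens outlines (check for a running rebuild on each index update, and record the affected records for later indexing) might look roughly like the sketch below. It is only an illustration: `RebuildTracker` and its methods are hypothetical names, not part of acts_as_ferret, and the state lives in one Ruby process. In a multi-process deployment the flag and the pending list would have to be kept somewhere shared, such as a database table.

```ruby
require 'set'

# Hypothetical helper, not part of acts_as_ferret. While a rebuild is
# running it remembers the ids of records that change, so they can be
# indexed again once the new index is in place.
class RebuildTracker
  def initialize
    @mutex = Mutex.new
    @rebuilding = false
    @pending = Set.new
  end

  # Call this from every index update. Returns true if the id was
  # deferred because a rebuild is in progress, false otherwise.
  def record_update(id)
    @mutex.synchronize do
      @pending << id if @rebuilding
      @rebuilding
    end
  end

  # Runs the given block (the actual rebuild) with the flag set and
  # returns the ids that changed while it ran. The flag is cleared and
  # the pending set emptied in one critical section, so an update is
  # either collected here or happens after the rebuild has finished.
  def rebuild
    @mutex.synchronize { @rebuilding = true }
    begin
      yield
    ensure
      pending = @mutex.synchronize do
        @rebuilding = false
        ids = @pending.to_a
        @pending.clear
        ids
      end
    end
    pending
  end
end

tracker = RebuildTracker.new
tracker.record_update(1)      # no rebuild running, nothing deferred
missed = tracker.rebuild do
  tracker.record_update(2)    # updates arriving during the rebuild
  tracker.record_update(3)
  tracker.record_update(2)    # duplicate ids are stored once
end
# `missed` now holds the ids that should be indexed again.
```

The caller would then re-index every id in `missed` against the new index. Because this does not compare timestamps, it avoids the clock and commit-order problem from question [1], at the cost of hooking into every index update.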
John Bachir
2008-Jan-26 00:23 UTC
[Ferret-talk] rebuilding the index completely and consistently
Hey folks.

Here's an update to my Super Duper Ferret Single Index Rebuild that we were discussing back in June.

On Jun 8, 2007, at 7:54 AM, Jens Kraemer wrote:
> On May 31, 2007, at 2:30 PM, John Bachir wrote:
>
>> [..]
>> [1] Does this look like a complete solution? I suppose it relies on
>> timestamp consistency between system components... it is possible
>> that between setting "start = ..." and performing the rebuild,
>> another thread in the system will have created an earlier timestamp
>> for an object that did not get committed until after the rebuild
>> began. Is it possible to do a perfect rebuild, or would that require
>> building a layer of concurrency logic into AAF?
>
> The scenario you describe might happen and cause a record not to be
> indexed, but I'd implement it just like you did.
>
> To be safe you can subtract a minute or so from your recorded start
> time ;-)

I've come up with this rake task:

http://pastie.textmate.org/private/4xyk2o0obibzi2tmpbog

Jens, what do you think? Anyone have any improvements to offer?

Cheers,
John
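The pastie itself is not reproduced in this thread, so the following is not John's rake task. It is only a minimal sketch of the timestamp approach discussed above: record the start time, rebuild, then re-index everything modified since the start time minus a safety margin. `Record`, `SAFETY_MARGIN` and `stale_after_rebuild` are hypothetical names, and a plain array stands in for what would be a database query on `updated_at` in a real application.

```ruby
# Hypothetical stand-in for a model with an updated_at column.
Record = Struct.new(:id, :updated_at)

# "A minute or so", as suggested in the thread (assumed value).
SAFETY_MARGIN = 60 # seconds

# Returns the records that may be missing from, or stale in, an index
# whose rebuild started at rebuild_started_at. Anything modified at or
# after (start - margin) is treated as suspect.
def stale_after_rebuild(records, rebuild_started_at, margin = SAFETY_MARGIN)
  cutoff = rebuild_started_at - margin
  records.select { |r| r.updated_at >= cutoff }
end

start = Time.utc(2008, 1, 26, 0, 0, 0) # recorded just before the rebuild
records = [
  Record.new(1, start - 3600), # untouched for an hour: covered by the rebuild
  Record.new(2, start - 30),   # inside the safety margin: re-index to be safe
  Record.new(3, start + 120)   # changed during the rebuild: must be re-indexed
]
to_reindex = stale_after_rebuild(records, start)
```

The margin trades a little redundant indexing for protection against clock skew and late commits. It narrows the window described in question [1] but does not close it completely.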
Jens Kraemer
2008-Jan-26 09:23 UTC
[Ferret-talk] rebuilding the index completely and consistently
Hi!

On Fri, Jan 25, 2008 at 07:23:06PM -0500, John Bachir wrote:
> Hey folks.
>
> Here's an update to my Super Duper Ferret Single Index Rebuild that
> we were discussing back in June.
>
[..]
>
> I've come up with this rake task:
>
> http://pastie.textmate.org/private/4xyk2o0obibzi2tmpbog
>
> Jens, what do you think? Anyone have any improvements to offer?

looks great. Mind if I add this as an example to acts_as_ferret?

Cheers,
Jens

-- 
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database