thr3ads.net - Ferret talk - [Ferret-talk] How to make a single Writer [Mar 2006]

Tom Davies

2006-Mar-05 20:47 UTC

[Ferret-talk] How to make a single Writer

I am using lighttpd with two procs and occasionally the .lock file
will not be properly removed by Ferret at which point my application
will end up throwing nothing but 500 errors.

Therefore, I have decided to go with a single writer thread... which
is probably a better long term solution anyways.

I would like some feedback on the best way to structure this.  My app
is hosted on TextDrive, so drb (distributed ruby) is not allowed.

The only other solution I can come up with is to write all pending
updates to a shared file.  This could involve either:

1) serializing each object to a file using something like YAML, and
having the writer deserialize them during updates.

2) just write the ids that need to be updated in the index and then
read each object fresh from the database using its id when updating
the index.

I am leaning towards solution 2, as it is easier to implement, should
be faster to write and read from the intermediate file and will be
easier to remove duplicate index updates.  The only drawback to 2 is
it will require one additional database read for every index update...
but this could be minimized by batch reading with a where id in (...).

Also, both 1 and 2 will require a lockfile for managing concurrent
access to the intermediate file.  I am thinking of just using this
lockfile library: http://raa.ruby-lang.org/project/lockfile/
Does anyone have any experience with this?

Thanks,
Tom
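
Editor's sketch of option 2, for illustration only. It assumes Ruby's built-in File#flock in place of the lockfile library mentioned above (a swapped-in technique, not what Tom used), and the IndexQueue class, the file name and the "documents" table are all hypothetical:

```ruby
require "tmpdir"

# Hypothetical helper: a shared queue file holding one record id per line.
class IndexQueue
  def initialize(path)
    @path = path
  end

  # Called by any web process after it saves a record.
  def push(id)
    File.open(@path, File::WRONLY | File::APPEND | File::CREAT, 0o644) do |f|
      f.flock(File::LOCK_EX)   # wait until no other process holds the lock
      f.puts(id)
      f.flush                  # make sure the id is written before unlocking
    end                        # closing the file releases the lock
  end

  # Called only by the single writer. Returns the pending ids with
  # duplicates removed and empties the file while still holding the lock.
  def drain
    File.open(@path, File::RDWR | File::CREAT, 0o644) do |f|
      f.flock(File::LOCK_EX)
      ids = f.each_line.map(&:strip).reject(&:empty?).map { |s| Integer(s) }.uniq
      f.truncate(0)
      ids
    end
  end
end

queue = IndexQueue.new(File.join(Dir.mktmpdir, "pending_ids.txt"))
[3, 7, 3].each { |id| queue.push(id) }

ids = queue.drain
# One batched read instead of one query per id (table name is made up).
sql = "SELECT * FROM documents WHERE id IN (#{ids.join(', ')})"
puts sql
```

Note that drain empties the file before anything is indexed, so a writer that crashes mid-update would lose those ids; a more careful version would only truncate after the index update succeeds.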

Alex Young

2006-Mar-05 21:07 UTC

[Ferret-talk] How to make a single Writer

Tom Davies wrote:
> 2) just write the ids that need to be updated in the index and then
> read each object fresh from the database using its id when updating
> the index.
>
> I am leaning towards solution 2, as it is easier to implement, should
> be faster to write and read from the intermediate file and will be
> easier to remove duplicate index updates.  The only drawback to 2 is
> it will require one additional database read for every index update...
> but this could be minimized by batch reading with a where id in (...).

Why not add a needs_indexing column to your object table?  That way, not
only do you not have to care about concurrent intermediate file access
(because the DB takes care of that for you), but you can also do all
your pending database reads at once, if that's appropriate.  If you've
got a single writer thread, it can write the flag back either on all
once it's done, or on each as it goes.  It seems much simpler all round
to me...  Of course, if you don't want to change your object table
schema, then you could create a separate table specifically for this.

-- 
Alex
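
Editor's sketch of the needs_indexing idea, for illustration only. To stay self-contained it uses an in-memory array in place of the database table and a hash in place of the Ferret index; only the needs_indexing name comes from the thread, and Record, FakeIndex and index_pending are hypothetical:

```ruby
# Stand-in for a table row; a real app would have a boolean
# needs_indexing column on the model's table.
Record = Struct.new(:id, :body, :needs_indexing)

# Stand-in for the search index: maps record id => indexed text.
class FakeIndex
  attr_reader :docs

  def initialize
    @docs = {}
  end

  def update(id, text)
    @docs[id] = text
  end
end

# One pass of the single writer: read every flagged row at once,
# index it, then clear its flag as it goes.
def index_pending(table, index)
  pending = table.select(&:needs_indexing)  # ~ SELECT ... WHERE needs_indexing
  pending.each do |rec|
    index.update(rec.id, rec.body)
    rec.needs_indexing = false              # ~ UPDATE ... SET needs_indexing = false
  end
  pending.size
end

table = [
  Record.new(1, "first post", true),
  Record.new(2, "already indexed", false),
  Record.new(3, "edited post", true),
]
index = FakeIndex.new
count = index_pending(table, index)
puts "indexed #{count} records"
```

Against a real database, a record edited between being read and having its flag cleared could lose its dirty state, so the clearing step would need some care.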

Tom Davies

2006-Mar-05 21:40 UTC

[Ferret-talk] How to make a single Writer

That is an excellent idea Alex.  Not sure why I didn't think of that :)

Basically, your concept is like adding a dirty flag to my table.

I like this approach much better.  However, for my particular case, I
will modify it slightly to just use the existing updated_at columns
that I have for each of my models that need indexing.  Then my index
writer won't have to lock the model database tables to reset the dirty
flag.  It will just keep track of the last time it updated the index.

Thanks for finding a much simpler solution.  That .lock file way was
making me nervous :)

Tom

On 3/5/06, Alex Young <alex at blackkettle.org> wrote:
> Tom Davies wrote:
> > 2) just write the ids that need to be updated in the index and then
> > read each object fresh from the database using its id when updating
> > the index.
> >
> > I am leaning towards solution 2, as it is easier to implement, should
> > be faster to write and read from the intermediate file and will be
> > easier to remove duplicate index updates.  The only drawback to 2 is
> > it will require one additional database read for every index update...
> > but this could be minimized by batch reading with a where id in (...).
>
> Why not add a needs_indexing column to your object table?  That way, not
> only do you not have to care about concurrent intermediate file access
> (because the DB takes care of that for you), but you can also do all
> your pending database reads at once, if that's appropriate.  If you've
> got a single writer thread, it can write the flag back either on all
> once it's done, or on each as it goes.  It seems much simpler all round
> to me...  Of course, if you don't want to change your object table
> schema, then you could create a separate table specifically for this.
>
> --
> Alex

Alex Young

2006-Mar-05 22:49 UTC

[Ferret-talk] How to make a single Writer

Tom Davies wrote:
> Basically, your concept is like adding a dirty flag to my table.

Pretty much - it's dirty within a specific context.

> I like this approach much better.  However, for my particular case, I
> will modify it slightly to just use the existing updated_at columns
> that I have for each of my models that need indexing.  Then my index
> writer won't have to lock the model database tables to reset the dirty
> flag.  It will just keep track of the last time it updated the index.

Sounds good.  Just remember to record the *start* of the write, not the
end - otherwise you'll get records being marked as updated while your
write's happening, and they'll get missed by the next update.

> Thanks for finding a much simpler solution.  That .lock file way was
> making me nervous :)

No worries :-)

-- 
Alex
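
Editor's sketch of the updated_at approach with the start time recorded, for illustration only. The table is again an in-memory array and the index a hash; only the updated_at name comes from the thread, while Doc, WatermarkIndexer and the injectable `now` argument are hypothetical:

```ruby
# Stand-in for a model row that has an updated_at column.
Doc = Struct.new(:id, :body, :updated_at)

class WatermarkIndexer
  attr_reader :last_run, :indexed

  def initialize
    @last_run = Time.at(0)  # nothing has been indexed yet
    @indexed = {}           # stand-in for the search index
  end

  # `now` is injectable so the example runs deterministically.
  def run(table, now: Time.now)
    started = now  # capture the START of the pass, before reading anything
    changed = table.select { |d| d.updated_at >= @last_run }
    changed.each { |d| @indexed[d.id] = d.body }
    @last_run = started     # not the time the pass finished
    changed.map(&:id)
  end
end

t0 = Time.at(1_000)
table = [Doc.new(1, "a", t0), Doc.new(2, "b", t0)]
indexer = WatermarkIndexer.new

first = indexer.run(table, now: t0 + 10)   # pass starts at t0+10

# Record 2 is edited at t0+12, while the first pass is notionally still running.
table[1].body = "b2"
table[1].updated_at = t0 + 12

second = indexer.run(table, now: t0 + 20)  # picks up the edit to record 2
puts "first pass: #{first.inspect}, second pass: #{second.inspect}"
```

Had the first pass stored its finish time instead (say t0+15), the edit at t0+12 would fall before the watermark and never be re-indexed. Using >= means a record can occasionally be indexed twice, which is harmless here.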
