Hello,

I use xapian-core-1.0.6 with the corresponding Perl bindings. I run a 1 writer/N reader setup on 5 databases; that is, the writer as well as each reader holds 5 open db handles at a time. Since I use the Perl bindings and am therefore not able to catch a DatabaseModifiedError, I reopen() a database handle before every query.

Nevertheless I occasionally get DatabaseModifiedErrors.

They occur if my writer (manually) flush()es two databases with few cached changes consecutively while there is high load on the searchers.

I know that DatabaseModifiedErrors get thrown if a database version gets incremented at least two times while a search is running. In fact there *are* two very fast db updates in this case, and there *are* queries that might easily exceed the runtime of those updates. But my updates are on different databases.

It feels to me as if my reader processes are mixing up the version information of the different databases, or something like that?

My workaround does a 1-second sleep() after each (manual) flush(), which gives the readers enough time to finish before the next flush() occurs. This works.

Unfortunately, I was not able to reproduce the issue with a simple test case so far, but I think there has to be a bug in Xapian for it to display this behaviour? Or is it a misunderstanding on my part concerning database versions?

Any ideas?

Regards,
mrks
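The version rule described above can be pictured with a toy model. This is only a sketch of the reasoning in the post, not Xapian's storage code: the names ToyDatabase, ToyReader and may_fail are hypothetical, and the "two commits since open" threshold is taken from the description above rather than from the library source. It shows why one quick flush on each of two different databases would not, on its own, be expected to put a reader at risk.

```python
class ToyDatabase:
    """Toy stand-in for one database with its own revision counter."""

    def __init__(self, name):
        self.name = name
        self.revision = 0

    def flush(self):
        # A commit bumps only this database's revision.
        self.revision += 1


class ToyReader:
    """Toy reader that remembers the revision it saw when (re)opened."""

    def __init__(self, db):
        self.db = db
        self.opened_at = db.revision

    def reopen(self):
        # Move the reader forward to the latest committed revision.
        self.opened_at = self.db.revision

    def may_fail(self):
        # Model assumption: a search is at risk once the writer has
        # committed at least twice since this reader was (re)opened.
        return self.db.revision - self.opened_at >= 2


db_a, db_b = ToyDatabase("a"), ToyDatabase("b")
reader_a, reader_b = ToyReader(db_a), ToyReader(db_b)

# One quick flush on each of two *different* databases.
db_a.flush()
db_b.flush()
risk_after_separate_flushes = (reader_a.may_fail(), reader_b.may_fail())

# A second flush on the *same* database while reader_a is still searching.
db_a.flush()
risk_after_double_flush = reader_a.may_fail()
```

In this model each counter is independent, so the symptom described would need either two commits on the same database within one search or some sharing of state that the model deliberately leaves out.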
On Wed, May 28, 2008 at 11:33:39AM +0200, Markus W?rle wrote:
> I use xapian-core-1.0.6 with the corresponding perl bindings. I run a
> 1 writer/N reader setup on 5 databases, that is the writer as well as
> each reader holds 5 open db handles at a time. Since I use the perl-
> bindings and am therefore not able to catch a DatabaseModifiedError, I
> do reopen() a database-handle before every query.

Do you hold 5 distinct db handles in the readers, or are they aggregated into one Xapian::Database object that you access?

J

--
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org
Andreas Marienborg
2008-May-28 10:07 UTC
[Xapian-discuss] another DatabaseModifiedError issue
On May 28, 2008, at 11:33 AM, Markus W?rle wrote:
> Hello,
>
> I use xapian-core-1.0.6 with the corresponding perl bindings. I run a
> 1 writer/N reader setup on 5 databases, that is the writer as well as
> each reader holds 5 open db handles at a time. Since I use the perl-
> bindings and am therefore not able to catch a DatabaseModifiedError, I
> do reopen() a database-handle before every query.
>
> Nevertheless I casually get DatabaseModifiedErrors.
>
> They occur, if my writer (manually) flush()es two databases with
> little cached changes consecutively while having high load on the
> searchers.
>
> I know that DatabaseModifiedErrors get thrown if a database version
> get incremented at least two times while a search is running. In fact
> there *are* two very fast db updates in this case, and there *are*
> queries that might easily exceed the runtime of those updates. But my
> updates are on different databases.
>
> It feels to me, like my reader processes mixing up the version
> information of the different databases or something like that?
>
> My workaround does a 1-second sleep() after each (manual) flush(),
> which gives the readers enough time to finish before the next flush()
> occurs. This works.
>
> Unfortunately, I was not able to reproduce the issue with a simple
> test-case so far, but i think that in fact there has to be a bug in
> xapian to display this behaviour? Or is it a misunderstanding by me
> concerning database versions?
>
> Any ideas?
>
> Regards,
> mrks

http://trac.xapian.org/ticket/230

This may not apply cleanly to 1.0.6, as I last updated it for 1.0.5.

- andreas
On Wed, May 28, 2008 at 11:33:39AM +0200, Markus W?rle wrote:
> They occur, if my writer (manually) flush()es two databases with
> little cached changes consecutively while having high load on the
> searchers.
>
> I know that DatabaseModifiedErrors get thrown if a database version
> get incremented at least two times while a search is running.

Well, *can* get thrown. Sometimes you'll get away with it (if the reader only looks at parts of the Btrees which survive the update).

> In fact there *are* two very fast db updates in this case, and there
> *are* queries that might easily exceed the runtime of those updates.
> But my updates are on different databases.
>
> It feels to me, like my reader processes mixing up the version
> information of the different databases or something like that?

That seems extremely unlikely, as they are tracked in separate objects - it's not like we have a central pool of information here. I struggle to think of any way this could happen which wouldn't have manifested in much worse ways.

The only thing I can think of is that perhaps DatabaseModifiedError is being thrown "too eagerly" - i.e. in situations where the database isn't actually modified. But I've not heard any other reports.

Could this be due to the automatic flushing which happens (by default every 10000 changed documents)? You can avoid that by using transactions, or by setting the environment variable XAPIAN_FLUSH_THRESHOLD to a very large value.

Otherwise, what's the minimum time between explicit flushes on the *same* database object? And what's the maximum search time? Perhaps add logging around the flush() and reopen()+get_mset() calls to make sure these times are as you think they are?

Cheers,
Olly
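The logging suggested above could look roughly like the following. It is a minimal, language-neutral sketch in Python rather than Perl, and it does not call Xapian at all: timed, max_duration and min_gap are hypothetical helper names, the labels are made up, and the time.sleep() calls are stand-ins for the real flush() and reopen()+get_mset() calls.

```python
import time
from contextlib import contextmanager

events = []  # (label, start, end) tuples, in completion order


@contextmanager
def timed(label):
    """Record start and end times of the wrapped block, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        events.append((label, start, time.perf_counter()))


def max_duration(label):
    """Longest recorded run for this label, or None if there is none."""
    return max((end - start for l, start, end in events if l == label),
               default=None)


def min_gap(label):
    """Shortest time between consecutive starts for this label, or None."""
    starts = sorted(start for l, start, _ in events if l == label)
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return min(gaps) if gaps else None


# Stand-ins for the real calls; one label per database keeps them apart.
for _ in range(2):
    with timed("flush db1"):
        time.sleep(0.01)   # writer: the explicit flush() would go here
with timed("search db1"):
    time.sleep(0.02)       # reader: reopen() + get_mset() would go here
```

Comparing min_gap("flush db1") against max_duration("search db1") answers the two questions directly. Since the writer and the readers are separate processes, a real setup would log wall-clock timestamps (for example from time.time()) to a file in each process and compare them afterwards; perf_counter() values are only meaningful within one process.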