Cedric Jeanneret
2009-Apr-14 14:15 UTC
[Xapian-discuss] Python bindings - xapian.Database.reopen
Hello,

I'm using xapian in a Pylons application, with the Python libs/bindings...

My indexes are created on other servers, then rsync-ed to my search engine... It seems that sometimes this process makes a mess, as my Pylons app returns a big error:

Error - xapian.DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
[snip useless trace]
DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation

Ok... so I'm trying to call xapian.Database.reopen()... but how??

Trying to do so:

    try:
        d = xapian.Database('my/db')
    except xapian.DatabaseModifiedError:
        d = xapian.Database()
        d.reopen('my/db')

doesn't work... Trying "d = xapian.Database('my/db').reopen()" fails as well.

So... how can we call this function? I'm unable to find an example or any doc about it.

Thanks in advance,

C.

--
Cédric Jeanneret | System Administrator
021 619 10 32 | Camptocamp SA
cedric.jeanneret at camptocamp.com | PSE-A / EPFL
Richard Boulton
2009-Apr-14 14:35 UTC
[Xapian-discuss] Python bindings - xapian.Database.reopen
On Tue, Apr 14, 2009 at 04:15:10PM +0200, Cedric Jeanneret wrote:
> Hello,
>
> I'm using xapian in a Pylons application, with the Python libs/bindings...
>
> My indexes are created on other servers, then rsync-ed to my search
> engine... It seems that sometimes this process makes a mess, as my Pylons
> app returns a big error:

This is the problem - if you rsync a database which is being modified, you'll get half the old database and half the new database. It is not safe to rsync a database which is in the process of being modified, because rsync is not an atomic copy operation. In fact, even if the database isn't being modified, you'll get errors like the one you report if you try to search while the rsync is happening (though at least in that case, once the rsync is finished, the database should be valid again).

This is why the 1.1.0 release will have support for replication, in a safe way. See http://trac.xapian.org/browser/trunk/xapian-core/docs/replication.rst for details (it has a long section on alternative approaches to replication, including rsync, which may interest you). If you want to try this out, use SVN trunk (which is very close to release, though no promises that we won't need to change something at the last minute).

If you must rsync, you need to stop the indexer, take a copy of the database on the client, rsync to update the copy, and then swap that copy in place of the old database on the client. Preferably, use a stub database to control which database is live on the client, and use "rename" to update that stub database file (rename, when used to move a file to replace another file, is an atomic operation. Unless you're on Windows.).

> Error - xapian.DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
> [snip useless trace]
> DatabaseModifiedError: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation

This error is slightly misleading in this situation - in fact, due to the rsync, your copy of the database is corrupt.

> Ok... so I'm trying to call xapian.Database.reopen()... but how??
>
> Trying to do so:
> try:
>     d = xapian.Database('my/db')
> except xapian.DatabaseModifiedError:
>     d = xapian.Database()
>     d.reopen('my/db')

Just to note: if the error had occurred due to local modifications, you'd only need to call reopen() if you were re-using a database handle. Here, you're making a new database, so you just need to retry the operation.

It's academic, though, because the use of rsync has left you with an invalid database which no amount of calling reopen() will fix.

--
Richard
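
[Editor's sketch] The stub-file swap described above might look roughly like the following in Python. This is only an illustration under assumptions, not code from the thread: the helper name point_stub_at and the example paths are hypothetical, and the single-line "auto <path>" stub format is my understanding of Xapian's stub database files, so check the documentation for your version. It uses os.replace (Python 3.3+), which renames over an existing file.

```python
import os
import tempfile


def point_stub_at(stub_path, db_path):
    """Rewrite the stub file so that it names db_path as the live database.

    The new contents are written to a temporary file in the same directory
    and then renamed over the stub, so a reader sees either the old stub or
    the new one, never a half-written file.
    """
    stub_dir = os.path.dirname(os.path.abspath(stub_path))
    # Same directory on purpose: a rename is only atomic within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=stub_dir, prefix=".stub-")
    try:
        with os.fdopen(fd, "w") as tmp:
            # Assumed stub format: one "<type> <path>" line, with type "auto".
            tmp.write("auto %s\n" % os.path.abspath(db_path))
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates the file with mode 0600; widen it so that the
        # search process (possibly another user) can read the stub.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, stub_path)
    except BaseException:
        # Do not leave a stray temporary file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

In use, the rsync would update a copy that is not live (say /srv/index/db.2, a hypothetical path), and only once it has finished would you call point_stub_at('/srv/index/live.stub', '/srv/index/db.2'). The search application opens the stub path rather than a database directory, so newly opened handles should see the new database, while a long-lived handle would presumably still need reopen() or to be re-created.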