Hi guys,

We've been in discussion with Richard and Olly on this issue, in various forums, but as the correct answer isn't immediately obvious, I'm opening it up for wider discussion and comment.

The problem is that a xapian tcp-server in 'writable' mode makes no attempt to ensure only one 'active' connection at a time is trying to modify the database. If multiple connections are made to a writable server, the behaviour is undefined (or even if it was defined, it is unlikely to be defined in a way that would make it useful).

While some applications can ensure internally that only a single connection is made to such a server, some applications are architected such that multiple processes, possibly even multiple machines, must coordinate this "single writer" approach. This becomes quite difficult without support inside xapian itself.

It seems there are 2 general solutions we can implement:

* Only ever allow a single connection to the writable server. When a second connection is attempted, we either refuse the connection, or allow the connection just to send back an authoritative 'writer already connected' response, and then close the connection.

* Implement a kind of 'queue' or some other way to block the incoming connection. In this case we would accept the connection, respond with a message indicating they are in a queue (your call is important!) and then block until the first writer is complete. The client side of the connection then has a choice regarding waiting in the queue, or hanging up and trying again later.

In my opinion, the second option sounds the "best", but the first option seems "good enough" and easier to implement. I'm sure there are both other options and opinions on this topic, so I'm soliciting all feedback on this, with the intention of opening a xapian bug to track the status, and ultimately end up with a patch.

Thanks,

Mark
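To make the two options concrete, here is a minimal sketch in Python rather than in the tcp-server's own code. `WriterGate`, `handle_connection` and the reply strings are hypothetical names invented for illustration, and a `threading.Lock` stands in for whatever the server would really use to track the active writer. Note that a plain lock does not guarantee first-come-first-served ordering, so option 2 here is only an approximation of a real queue.

```python
import threading


class WriterGate:
    """Hypothetical gate that admits at most one writer at a time."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_enter(self):
        # Option 1: never wait. Returns False if a writer is already connected.
        return self._lock.acquire(blocking=False)

    def enter_queued(self, timeout=None):
        # Option 2: wait for the current writer to finish. A timeout models
        # the client giving up ("hanging up and trying again later").
        if timeout is None:
            return self._lock.acquire()
        return self._lock.acquire(timeout=timeout)

    def leave(self):
        # Called when the active writer disconnects.
        self._lock.release()


def handle_connection(gate, queue=False, timeout=None):
    """Return the reply a server might send to a new writer connection."""
    admitted = gate.enter_queued(timeout) if queue else gate.try_enter()
    return "OK" if admitted else "writer already connected"
```

With option 1 a second `handle_connection(gate)` call returns the refusal at once; with `queue=True` it blocks until `gate.leave()` is called or the timeout expires.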
Richard Boulton
2007-May-01 08:27 UTC
[Xapian-devel] multiple writers and remotetcp backends
Mark Hammond wrote:
> It seems there are 2 general solutions we can implement:
>
> * Only ever allow a single connection to the writable server. When a second
> connection is attempted, we either refuse the connection, or allow the
> connection just to send back an authoritative 'writer already connected'
> response, and then close the connection.
>
> * Implement a kind of 'queue' or some other way to block the incoming
> connection. In this case we would accept the connection, respond with a
> message indicating they are in a queue (your call is important!) and then
> block until the first writer is complete. The client side of the connection
> then has a choice regarding waiting in the queue, or hanging up and trying
> again later.
>
> In my opinion, the second option sounds the "best", but the first option
> seems "good enough" and easier to implement.

I agree. If we implement the first solution, I imagine that many clients will need to emulate the behaviour of the second solution by repeatedly opening a connection to poll for the lock.

While the first option is slightly simpler to implement, I think that the majority of the implementation work is likely to be in enforcing the single-writer constraint. I recommend attempting to implement the first solution first anyway, since the second solution requires everything that the first does: once we've got reliable checking of the locks on the connection, we can implement a queue on top of that.

One thing which both of these will probably need is a change to the remote protocol to allow a connection to specify whether it is writable or readonly at the time the connection is opened. This would allow the lock on the database to be checked and obtained at this point for a writable database, rather than waiting for the connection to attempt a write operation.
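As a rough illustration of declaring the mode when the connection is opened, the sketch below checks and takes the write lock during the opening exchange rather than on the first write. The message format (`OPEN readonly` / `OPEN writable`), the reply strings and the function names are all made up for this example; this is not the actual remote protocol, and a `threading.Lock` stands in for the database write lock.

```python
import threading

# Stand-in for the database write lock (an assumption for this sketch).
_writer_lock = threading.Lock()


def open_connection(first_message):
    """Handle a hypothetical opening message such as b"OPEN writable"."""
    parts = first_message.decode("ascii", errors="replace").split()
    if len(parts) != 2 or parts[0] != "OPEN" or parts[1] not in ("readonly", "writable"):
        return "ERROR bad open request"
    if parts[1] == "readonly":
        # Readers never touch the write lock.
        return "OK readonly"
    # Writers obtain the lock now, so a conflict is reported at open time.
    if _writer_lock.acquire(blocking=False):
        return "OK writable"
    return "ERROR writer already connected"


def close_writable_connection():
    """Release the write lock when the writable connection closes."""
    _writer_lock.release()
```

A client that polls for the lock, as described above, would simply retry `OPEN writable` until it stops receiving the error reply.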
In theory, Xapian itself is meant to prevent there being two instances of a Writable database for the same path in existence concurrently: however, the tcpserver avoids the check for this by opening the database in its server process, and then passing it through a fork to the sub-processes. I don't think there's any way we can check for this state in the core Xapian library, so the tcpserver itself needs to enforce the single-writer constraint.

I'm not sure what happens for the Windows variant of the server (which is a threaded implementation), but I imagine that there the same instance of the database is accessed by each connection: if this is correct, there are likely to be other problems since the database is not safe for concurrent access. A threaded implementation needs to create a new instance of the database for each thread.

> I'm sure there are both other
> options and opinions on this topic, so I'm soliciting all feedback on this,
> with the intention of opening a xapian bug to track the status, and
> ultimately end up with a patch.

Opening a bug sooner rather than later would probably be advisable, so that the history of this discussion is easy to find in future. In particular, discussion of this bug may make it clear whether it should block the 1.0.0 release, and if so we

--
Richard
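One way to picture why a lock taken before the fork does not help: with `flock`-style advisory locks, forked children inherit the parent's open file description and therefore share its lock. The sketch below is an assumption-laden illustration, not Xapian's own locking code: each connection handler opens a fresh descriptor on a hypothetical lock file and tries to lock it, so a second writer is detected. It relies on Python's `fcntl` module and so is Unix-only; `try_writer_lock` and `release_writer_lock` are names invented here.

```python
import fcntl
import os


def try_writer_lock(lock_path):
    """Open a fresh descriptor on lock_path and try to lock it exclusively.

    Returns the descriptor on success, or None if another holder has the
    lock. Each call opens its own descriptor, so the lock is not silently
    shared the way a descriptor inherited across fork() would be.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # The lock is held through a different open file description.
        os.close(fd)
        return None
    return fd


def release_writer_lock(fd):
    """Drop the lock and close the descriptor when the writer disconnects."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

In a forking server, the child handling a writable connection would call `try_writer_lock` after the fork and refuse the connection if it returns `None`.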
On Tue, May 01, 2007 at 12:36:54PM +1000, Mark Hammond wrote:
> The problem is that a xapian tcp-server in 'writable' mode makes no attempt
> to ensure only one 'active' connection at a time is trying to modify the
> database. If multiple connections are made to a writable server, the
> behaviour is undefined (or even if it was defined, it is unlikely to be
> defined in a way that would make it useful).

I'd not appreciated from the previous discussion that this happened - this is certainly a bug. I understood the issue was just that of trying to marshal multiple processes wanting to write to the same remote server in a sane way.

Looking at the code, I believe it's also wrong that we open the database and then fork multiple processes which can make use of it, even for a read-only Database. We certainly don't promise that you can use the same Xapian object from different threads. I think similar rules ought to apply over fork.

But this matters much more for writers - with the current backends, it happens to work OK for readers I think. So we should probably leave the reader issue for now, as it can be fixed without API or ABI changes, but fix the writer issue.

Cheers,
Olly