Steije van Schelt
2007-Dec-13 14:54 UTC
[Xapian-discuss] Omega datevalue search fails during scriptindex flush
Hi,

I'm experiencing weird behaviour in Xapian/Omega. Here's my situation:

* I'm indexing data through scriptindex; the output is as follows:

*****************
Replace: 6706476
Replace: 6706477
Replace: 6706478
*****************

Since there is no further output after pressing Enter several times, I
assume scriptindex is indexing data.

* The search I perform on omega is as follows:

omega?P=harry&B=XINja&B=XAR0&DEFAULTOP=and&DB=...&FMT=customxml&xDB=...&xFILTERS=--O&TOPDOC=0&HITSPERPAGE=20&MINHITS=50&DATEVALUE=4&START=20071206

* After a while (20-30 seconds), omega just returns a blank page. No
errors, nothing (not even in the Apache error logs).

If I remove the DATEVALUE=4&START=20071206 everything just works fine.
If scriptindex is replacing or adding records the search works fine as
well. Only during flush it seems Xapian won't give any results.

I've been searching through the mailing list but haven't found any
relevant posts. Did I just stumble over a bug or is something else wrong?

Kind regards,
Steije van Schelt.
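The comparison described above (the same search with and without the date filter) can be reproduced mechanically. What follows is only a minimal sketch, not part of the original message: it uses Python's standard library to split the posted query string and drop the date-related parameters. The helper name `without_date_filter` is hypothetical, and the `...` values are the poster's own elisions, kept exactly as posted.

```python
from urllib.parse import parse_qsl, urlencode

# Query string exactly as posted; "..." is the poster's elision, not a real value.
QUERY = ("P=harry&B=XINja&B=XAR0&DEFAULTOP=and&DB=...&FMT=customxml"
         "&xDB=...&xFILTERS=--O&TOPDOC=0&HITSPERPAGE=20&MINHITS=50"
         "&DATEVALUE=4&START=20071206")

# Parameters assumed to make up the date filter in this query.
DATE_PARAMS = {"DATEVALUE", "START"}


def without_date_filter(query):
    """Return the query string with the date-filter parameters removed.

    Parameter order and repeated names (the two B= filters) are preserved.
    """
    pairs = parse_qsl(query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name not in DATE_PARAMS]
    return urlencode(kept)


if __name__ == "__main__":
    print("with date filter:   ", QUERY)
    print("without date filter:", without_date_filter(QUERY))
```

Requesting both variants while scriptindex is silent, and timing each, would show whether the delay is tied to the date parameters alone.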
James Aylett
2007-Dec-13 15:12 UTC
[Xapian-discuss] Omega datevalue search fails during scriptindex flush
On Thu, Dec 13, 2007 at 03:54:52PM +0100, Steije van Schelt wrote:

> Since there is no further output after pressing Enter several times, I
> assume scriptindex is indexing data.

Try attaching ptrace (or your system equivalent) to it and see if it's
hitting any system calls. Also, what's the process state (via top), and
is it using CPU time at all? It might be blocked on something.

> * The search I perform on omega is as follows:
>
> omega?P=harry&B=XINja&B=XAR0&DEFAULTOP=and&DB=...&FMT=customxml&xDB=...&xFILTERS=--O&TOPDOC=0&HITSPERPAGE=20&MINHITS=50&DATEVALUE=4&START=20071206
>
> * After a while (20-30 seconds), omega just returns a blank page. No
> errors, nothing (not even in the Apache error logs).

Sounds like you're hitting the Apache internal timeout. You can
reconfigure this, but it would be better to find out what's going on. If
you run omega on the command line, you get a testing interface - does
the same kind of query take as long then? (I expect it will.)

[Diversion sidebar

Apache's timeout makes it an awkward web server to use on its own in
this kind of situation. You can jack up the timeout, but that leaves you
vulnerable to DoS attacks (deliberate or not) if you have URIs that will
take a long time to return. It's not easy to come up with a really nice
alternative, because you'd need to get omega running under FastCGI, or
something like that. (Then you could use something that doesn't block
its process/thread on content generation in order to front end
everything, which should scale better.)

]

> If I remove the DATEVALUE=4&START=20071206 everything just works
> fine. If scriptindex is replacing or adding records the search works
> fine as well. Only during flush it seems Xapian won't give any
> results.

We'd need to identify that scriptindex is actually flushing before
coming to that conclusion, I think.

Note that when scriptindex shows that it's replacing/adding records, it
isn't going to be hitting the database on disk, so it won't affect the
search process (which does lend weight to the idea that it's in flush).
However, the way Xapian flushes is designed to allow a reader to run
concurrently with the writer, so something isn't quite right here. If
you can find out what scriptindex is actually doing while it's sitting
there, that should help a little.

There's something in the date stuff that clearly isn't helping, but I
don't know why not. (The reader shouldn't have to block on the value
table, should it?)

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james@tartarus.org                               uncertaintydivision.org
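One way to check what scriptindex is doing while it sits there, as suggested above, is to sample its process state and CPU counters. This is a rough sketch only: it assumes a Linux system with /proc mounted, and the function names `parse_proc_stat` and `read_proc_stat` are hypothetical helpers, not part of Xapian or Omega. Other systems would need their own equivalent (top or ps report the same information).

```python
def parse_proc_stat(line):
    """Parse one line in Linux /proc/<pid>/stat format.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so the remaining fields are split after the LAST ')'.
    """
    end = line.rindex(")")
    comm = line[line.index("(") + 1:end]
    rest = line[end + 1:].split()
    return {
        "comm": comm,
        # R = running, S = sleeping, D = uninterruptible sleep (usually I/O)
        "state": rest[0],
        # CPU time in clock ticks: fields 14 (utime) and 15 (stime) of the line
        "utime_ticks": int(rest[11]),
        "stime_ticks": int(rest[12]),
    }


def read_proc_stat(pid):
    """Read and parse /proc/<pid>/stat for a running process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())
```

Calling `read_proc_stat` on the scriptindex PID twice, a few seconds apart, gives a hint: counters that keep growing suggest it is busy on CPU, while a steady `D` state with flat counters suggests it is waiting on disk I/O, which would be consistent with (though not proof of) a flush in progress.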