Rocco Caputo
2006-Jun-28 02:07 UTC
[Xapian-discuss] redundant disk access in xapian-tcpsrv
Hi. SocketServer::run_getdocument() triggers two record_table->get_record (did) calls in the flint backend. One when the database's open_document() is called in non-lazy mode, and another when run_getdocument() calls doc->get_data(). I noticed this while looking through an strace of xapian-tcpsrv to determine the reason it's slower at uncached fetches than my stand- alone Perl client. To be fair, my stand-alone test client also opens the database in non-lazy mode, so it suffers from the same inefficiency that xapian-tcpsrv seems to. Also, calling open_document () in lazy mode removes the redundant lseek() and read() calls, but this doesn't improve performance in my situation. non-lazy open_document() strace: lseek(28, 6414336, SEEK_SET) = 6414336 read(28, "\0\0\3\306\1\6\20\6\26\3\177\37\363\37\346\f\317\37\331"... lseek(28, 363642880, SEEK_SET) = 363642880 read(28, "\0\0\3\232\0\5N\v\245\0\35\36\365\6\305\33\"\31\355\27"... lseek(28, 363659264, SEEK_SET) = 363659264 read(28, "\0\0\3\232\0\10@\25\331\0\31\10Y\27<\25\367\24\233\23\36"... lseek(28, 363642880, SEEK_SET) = 363642880 read(28, "\0\0\3\232\0\5N\v\245\0\35\36\365\6\305\33\"\31\355\27"... lseek(28, 363659264, SEEK_SET) = 363659264 read(28, "\0\0\3\232\0\10@\25\331\0\31\10Y\27<\25\367\24\233\23\36"... gettimeofday({1151443453, 519154}, NULL) = 0 gettimeofday({1151443453, 519212}, NULL) = 0 select(55, NULL, [54], [54], {14, 999942}) = 1 (left {15, 0}) write(54, "O!)!*!\'!$!$!$!,!.6xxxxxxxx!D|!DN"..., 3684) = 3684 gettimeofday({1151443453, 519416}, NULL) = 0 gettimeofday({1151443453, 519472}, NULL) = 0 select(55, NULL, [54], [54], {14, 999944}) = 1 (left {15, 0}) write(54, "Z\n", 2) = 2 lazy strace: lseek(28, 6414336, SEEK_SET) = 6414336 read(28, "\0\0\3\306\1\6\20\6\26\3\177\37\363\37\346\f\317\37\331"... lseek(28, 363642880, SEEK_SET) = 363642880 read(28, "\0\0\3\232\0\5N\v\245\0\35\36\365\6\305\33\"\31\355\27"... lseek(28, 363659264, SEEK_SET) = 363659264 read(28, "\0\0\3\232\0\10@\25\331\0\31\10Y\27<\25\367\24\233\23\36"... gettimeofday({1151450138, 328166}, NULL) = 0 gettimeofday({1151450138, 328222}, NULL) = 0 select(55, NULL, [54], [54], {14, 999944}) = 1 (left {15, 0}) write(54, "O!)!*!\'!$!$!$!,!.6xxxxxxxx!D|!DN"..., 3684) = 3684 gettimeofday({1151450138, 328426}, NULL) = 0 gettimeofday({1151450138, 328484}, NULL) = 0 select(55, NULL, [54], [54], {14, 999942}) = 1 (left {15, 0}) write(54, "Z\n", 2) = 2 -- Rocco Caputo - rcaputo@pobox.com
On Tue, Jun 27, 2006 at 09:07:13PM -0400, Rocco Caputo wrote:> I noticed this while looking through an strace of xapian-tcpsrv to > determine the reason it's slower at uncached fetches than my stand- > alone Perl client. To be fair, my stand-alone test client also opens > the database in non-lazy mode, so it suffers from the same > inefficiency that xapian-tcpsrv seems to.This "lazy" flag isn't available at the API level - it's just there so that the matcher can look at document values without the backend checking that the document exists (since it must exist if we have a posting list entry for it).> Also, calling open_document() in lazy mode removes the redundant > lseek() and read() calls, but this doesn't improve performance in my > situation.The re-read disk blocks are almost certain to be cached so the overhead will be tiny compared to everything else going on. The code's here is a bit odd. If lazy is false, it reads the document data when the document object is created, and then promptly discards it. Looking at when this was added, it looks like this is done simply to check that this document actually exists in the database (and throw an exception if not). We can definitely do that better! Slightly odd that you're getting llseek and read from strace though. Xapian should use pread instead if it's available, and I'd expect pread to be a syscall. What platform is this? What do the PREAD related bits of config.h say? Cheers, Olly