Greets,

I've encountered the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed:

    xapian-compact -b64k -m src1 src2.... tmp_dst

-- works as expected, exit code 0.

    xapian-check tmp_dst

-- produces the following error for the postlist:

    postlist:
    baseB blocksize=64K items=28175410 lastblock=117541 revision=1 levels=2 root=3493
    B-tree checked okay
    document id 68511: length 32400 doesn't match 27468 in the termlist table
    postlist table errors found: 1

Using "delve -d -r68511" I can identify the source index which I then delveilter out so automated batch merging can continue. However, since this was the first time I encountered this particular error, I decided to check the source index (id'd with delve) with xapian-check and interestingly it reports no errors.

While composing this msg, I decided to run xapian-check on tmp_dst/postlist.DB and *that* reports no problem (reminder, tmp_dst is the index which failed the previous overall xapian-check):

    xapian-check tmp_dst/postlist.DB
    baseB blocksize=64K items=28175410 lastblock=117541 revision=1 levels=2 root=3493
    B-tree checked okay
    postlist table structure checked OK

So,... perhaps the title of this msg should be "xapian-check idx fails on postlist, but xapian-check idx/postlist[.DB] succeeds".

Is this possibly just a bug in xapian-check? I can query the tmp_dst index and it seems ok otherwise, but I hate to ignore this error in my code since it might not always be this (possibly) innocuous.

Using svn 15862 (1.3.0).

Thanks
Henry
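
For automating the step described above (spotting the offending document id so it can be passed to delve), a minimal sketch of a parser for the xapian-check output might look like the following. The regular expression is modelled only on the single error line quoted in this message, so the wording in other xapian-check versions is an assumption, and the function name `find_length_mismatches` is hypothetical.

```python
import re

# Modelled on the one error line quoted in the message; other xapian-check
# versions may word this differently (assumption).
MISMATCH_RE = re.compile(
    r"document id (\d+): length (\d+) doesn't match (\d+) in the termlist table"
)


def find_length_mismatches(check_output):
    """Return (docid, reported_length, termlist_length) tuples found in the text."""
    return [
        tuple(int(group) for group in match.groups())
        for match in MISMATCH_RE.finditer(check_output)
    ]


# Sample text copied from the xapian-check run quoted above.
sample = (
    "postlist:\n"
    "baseB blocksize=64K items=28175410 lastblock=117541 revision=1 levels=2 root=3493\n"
    "B-tree checked okay\n"
    "document id 68511: length 32400 doesn't match 27468 in the termlist table\n"
    "postlist table errors found: 1\n"
)

for docid, reported, termlist in find_length_mismatches(sample):
    print(docid, reported, termlist)
```

Each document id returned this way could then be fed to "delve -d -r<docid>" on the source indexes, as done by hand above.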
On Tue, Jul 19, 2011 at 11:38:23AM +0200, Henry C. wrote:
> Using "delve -d -r68511" I can identify the source index which I then
> delveilter out so automated batch merging can continue. However, since
> this was the first time I encountered this particular error, I decided
> to check the source index (id'd with delve) with xapian-check and
> interestingly it reports no errors.

"delveilter" -> "filter"?

This sounds like there's a bug in compaction which is mangling a document length in some cases. I can't see anything obviously wrong.

Did you check all the source databases in case you misidentified which it came from?

I had a quick look at the code, but there's nothing obviously wrong. So some way to reproduce this would be really useful.

> While composing this msg, I decided to run xapian-check on
> tmp_dst/postlist.DB
> and *that* reports no problem (reminder, tmp_dst is the index which
> failed the previous overall xapian-check):

This isn't surprising, since it failed the cross-check with the termlist table:

> document id 68511: length 32400 doesn't match 27468 in the termlist table
> postlist table errors found: 1

If you only check one table, then the cross-checks aren't performed, only the lower level checks.

Cheers,
Olly
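
Since the cross-checks only run when the whole database is checked, an automated merge script would want to point xapian-check at the database directory rather than at a single table file. A rough sketch, assuming that a failed check shows up either as a non-zero exit status or as an "errors found" line like the one quoted above (the exit-status behaviour is an assumption, not something confirmed in this thread); `check_whole_database` and its `run` parameter are hypothetical names:

```python
import subprocess


def check_whole_database(db_dir, run=subprocess.run):
    """Run xapian-check on the database directory, not on one table file.

    Returns (ok, output). `run` is injectable so the logic can be exercised
    without xapian-check being installed.
    """
    result = run(["xapian-check", db_dir], capture_output=True, text=True)
    output = (result.stdout or "") + (result.stderr or "")
    # Assumption: failure is signalled by a non-zero exit status and/or an
    # "errors found" line such as "postlist table errors found: 1".
    ok = result.returncode == 0 and "errors found" not in output
    return ok, output
```

Typical use would be something like `ok, output = check_whole_database("tmp_dst")` right after the xapian-compact step, refusing to continue the batch when `ok` is false.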