oscaruser@programmer.net
2006-Jun-12 20:36 UTC
[Xapian-discuss] xapian-compact seg faulting & Re: [Xapian-discuss] Error msg xapian-compact: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
Folks,

I am receiving strange breakage from the indexer tools (xapian-compact) and from the flint back end, in the form of a seg fault from xapian-compact and the error "The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation".

When I first started receiving these issues, I was trying to compact about 150 separate flint dbs. To try to isolate the problem, I ran xapian-compact against ten (10) dbs at a time. Within those runs of ten, I received seg faults again for some groups. I then ran it against five (5) at a time within the broken groups and discovered that it ran OK; however, the merge of the results threw a "revision discarded" exception.

Since I am trying to develop a production system based on the Xapian/Flint system, this implies I will need to dig into the source and start to pinpoint where it is going awry on both counts. This will be a rather steep learning curve for me because I will need to start tracing through the code and learning about the flint design. I think it may also be time to develop a tool to validate the integrity of a flint index.

Thanks,
OSC

oscar@gamma:/svr/hda1/gigablast/db$ ~/xapian/bin/xapian-compact -F -m /svr/hda1/omega/data/bsp008[0123456789]/default /tmp/xapian0008
postlist: Reduced by 63.2412% 38056K (60176K -> 22120K)
record: Reduced by 52.7859% 2880K (5456K -> 2576K)
termlist: Reduced by 52.8348% 15432K (29208K -> 13776K)
Segmentation fault
oscar@gamma:/svr/hda1/gigablast/db$ ~/xapian/bin/xapian-compact -F -m /svr/hda1/omega/data/bsp008[01234]/default /tmp/xapian0008a
postlist: Reduced by 63.817% 16960K (26576K -> 9616K)
record: Reduced by 52.8302% 1344K (2544K -> 1200K)
termlist: Reduced by 52.9586% 7160K (13520K -> 6360K)
position: Reduced by 0.184162% 40K (21720K -> 21680K)
value: Size unchanged (0K)
oscar@gamma:/svr/hda1/gigablast/db$ ~/xapian/bin/xapian-compact -F -m /svr/hda1/omega/data/bsp008[56789]/default /tmp/xapian0008b
postlist: Reduced by 62.5952% 21032K (33600K -> 12568K)
record: Reduced by 52.4725% 1528K (2912K -> 1384K)
termlist: Reduced by 52.5752% 8248K (15688K -> 7440K)
position: Reduced by 0.458996% 120K (26144K -> 26024K)
value: Size unchanged (0K)
oscar@gamma:/svr/hda1/gigablast/db$ ~/xapian/bin/xapian-compact -F -m /tmp/xapian0008a /tmp/xapian0008b /tmp/xapian0008
postlist: Reduced by 0.288496% 64K (22184K -> 22120K)
record: Reduced by 0.309598% 8K (2584K -> 2576K)
termlist .../home/oscar/xapian/bin/xapian-compact: The revision being read has been discarded - you should call Xapian::Database::reopen()\ and retry the operation
oscar@gamma:/svr/hda1/gigablast/db$

> ----- Original Message -----
> From: oscaruser@programmer.net
> To: xapian-discuss@lists.xapian.org
> Subject: Re: [Xapian-discuss] xapian-compact seg faulting
> Date: Sat, 10 Jun 2006 12:11:57 -0800
>
>
> gamma:/svr/hda1/xapian/default# gdb /home/oscar/xapian/bin/xapian-compact
> GNU gdb 6.3-debian
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-linux"...Using host libthread_db
> library "/lib/tls/libthread_db.so.1".
>
> (gdb) set args -F -m /svr/hda1/omega/data/bsp*/default
> /svr/hda1/xapian/default
> (gdb) run
> Starting program: /usr/home/oscar/xapian/bin/xapian-compact -F -m
> /svr/hda1/omega/data/bsp*/default /svr/hda1/xapian/default
> postlist: Reduced by 62.2293% 640728K (1029624K -> 388896K)
> record: Reduced by 52.3299% 55880K (106784K -> 50904K)
> termlist: Reduced by 52.9244% 309032K (583912K -> 274880K)
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x40337e84 in mallopt () from /lib/tls/libc.so.6
> (gdb) bt
> #0 0x40337e84 in mallopt () from /lib/tls/libc.so.6
> #1 0x40336dcb in free () from /lib/tls/libc.so.6
> #2 0x4026b681 in operator delete () from /usr/lib/libstdc++.so.6
> #3 0x4026b6dc in operator delete[] () from /usr/lib/libstdc++.so.6
> #4 0x400bc525 in FlintTable::close (this=0xbfffc730) at flint_table.cc:1626
> #5 0x400bc79e in ~FlintTable (this=0xbfffc730) at flint_table.cc:1612
> #6 0x0804dd9b in main (argc=274880, argv=0xbfffca74) at ostream.tcc:63
> (gdb) frame 4
> #4 0x400bc525 in FlintTable::close (this=0xbfffc730) at flint_table.cc:1626
> 1626        delete [] C[j].p;
> (gdb) list
> 1621        if (!dont_close_handle) (void)::close(handle);
> 1622        handle = -1;
> 1623    }
> 1624
> 1625    for (int j = level; j >= 0; j--) {
> 1626        delete [] C[j].p;
> 1627    }
> 1628    delete [] split_p;
> 1629
> 1630    delete [] kt.get_address();
> (gdb) p level
> $1 = 2
> (gdb) p j
> $2 = 134633720
> (gdb)
>
> > ----- Original Message -----
> > From: "Olly Betts" <olly@survex.com>
> > To: oscaruser@programmer.net
> > Subject: Re: [Xapian-discuss] xapian-compact seg faulting
> > Date: Sat, 10 Jun 2006 02:13:04 +0100
> >
> >
> > On Fri, Jun 09, 2006 at 03:34:35PM -0800, oscaruser@programmer.net wrote:
> > > strace shows:
> > > [...]
> > > termlist: Reduced by 52.9244% 309032K (583912K -> 274880K)
> > > ) = 60
> > > close(3) = 0
> > > --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > > +++ killed by SIGSEGV +++
> >
> > That doesn't narrow it down too well.
> >
> > Can you run under gdb:
> >
> > gdb --args /home/oscar/xapian/bin/xapian-compact -F -m
> > bsp*/default /svr/hda1/omega/data/default
> >
> > And then at the "(gdb)" prompt type:
> >
> > run
> >
> > and then once it dies:
> >
> > bt
> >
> > To give a backtrace. Then post the backtrace.
> >
> > Cheers,
> > Olly
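
The narrowing-down described in the message above (compact ten databases, then five at a time within a failing group) could be scripted roughly as follows. This is only a sketch, not something posted in the thread: `compact_ok` and `smallest_failing_group` are hypothetical helper names, the scratch destination is an assumption, and the only xapian-compact options used are the `-F -m` flags shown in the session above.

```python
import subprocess
from typing import Callable, List, Sequence


def compact_ok(sources: Sequence[str], dest: str) -> bool:
    """Run xapian-compact with the -F -m flags used in the thread.

    Assumes xapian-compact is on PATH and that dest is a scratch location.
    """
    result = subprocess.run(["xapian-compact", "-F", "-m", *sources, dest])
    # subprocess reports death by signal (e.g. SIGSEGV) as a negative code.
    return result.returncode == 0


def smallest_failing_group(
    dbs: Sequence[str], run: Callable[[List[str]], bool]
) -> List[str]:
    """Halve a group that is known to fail until neither half fails alone."""
    group = list(dbs)
    while len(group) > 1:
        mid = len(group) // 2
        left, right = group[:mid], group[mid:]
        if not run(left):
            group = left
        elif not run(right):
            group = right
        else:
            # Both halves compact cleanly, so the failure only shows up
            # when they are combined (the situation reported above).
            break
    return group
```

A call such as `smallest_failing_group(paths, lambda srcs: compact_ok(srcs, "/tmp/scratch"))` would return either a single failing database or the smallest group whose halves each succeed on their own; the scratch path here is a placeholder.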
Olly Betts
2006-Jun-13 02:03 UTC
[Xapian-discuss] xapian-compact seg faulting & Re: [Xapian-discuss] Error msg xapian-compact: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
This is very odd. Flint currently seems rock solid for everybody using it apart from you where it seems very flaky. If I could reproduce the problem, I should be able to pin down what's wrong and fix it, but I can't from the information you've provided so far.

For example, Gmane uses xapian-compact to merge 2 databases every night, and even uses the -F option like you do. For a full rebuild (which I must have done at least 10 times) it merges one database per million documents, so that's more than 30.

Tweakers.net use flint on a large system and run xapian-compact frequently and they're very happy with the stability. I know of several other happy flint users (and there are probably more I don't know of).

I don't understand what's the difference which is causing you these problems. I suspect if we can work that out this will be fairly easy to resolve.

On Mon, Jun 12, 2006 at 11:35:53AM -0800, oscaruser@programmer.net wrote:
> I think it may also be time to develop a tool to validate the integrity
> of a flint index.

If you can run copydatabase on a database, it's in good shape with the possible exception of the postlist table. If you want something more specific, quartzcheck is probably a good starting point. The flint and quartz formats still have quite a bit in common.

Cheers,
Olly
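
The copydatabase check suggested above could be run over many source databases with something like the sketch below. It assumes copydatabase is on PATH and is invoked as source followed by destination; `copies_cleanly` and `unreadable_databases` are hypothetical names introduced for illustration. As noted in the reply, a clean copy does not fully vouch for the postlist table.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def copies_cleanly(db_path: str, runner: Callable = subprocess.run) -> bool:
    """Copy one database into a throwaway directory and report success.

    Assumption: `copydatabase SOURCE DESTINATION` creates the destination.
    """
    with tempfile.TemporaryDirectory() as scratch:
        dest = str(Path(scratch) / "copy")
        result = runner(["copydatabase", db_path, dest])
        # A non-zero or negative (signal) return code counts as a failure.
        return result.returncode == 0


def unreadable_databases(
    db_paths: Iterable[str], check: Callable[[str], bool] = copies_cleanly
) -> List[str]:
    """Return the databases that could not be copied, in input order."""
    return [path for path in db_paths if not check(path)]
```

Running `unreadable_databases` over the roughly 150 source databases before compacting would separate databases that cannot even be read from ones that only fail during the merge.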