Eric Parusel
2004-Oct-25 22:49 UTC
[Xapian-discuss] xapian + plpgsql crash? have backtrace...
Hello, I've installed the following onto two boxes, and observed crashes of the postgresql child process (sig11, wrote a core file)... on FreeBSD 4.9: perl-5.6.1 from ports postgresql-7.4.5 from ports p5-postgresql-plperl-7.4.5_1 from ports Xapian 0.8.3 compiled on Debian stable: perl 5.6.1 postgresql-7.4.5 compiled, --with-perl libxapian3 0.8.3-1.0stabl package Attached is the backtrace gathered on my FreeBSD system. Also attached is the pl/pgsql function, and the query which ran the function. I noticed that it crashes the *2nd* time it runs. The first time the function is run, it works fine, returning the expected values. The Xapian db is quite small, only a few entries were added for testing (3 or 4). Is there something I'm doing wrong? Regardless I suppose it should be causing a crash... It seems to have failed at: #0 Xapian::Query::~Query (this=0x0, __in_chrg=2) at ../include/xapian/base.h:129 *shrug* Oh, I've also noticed that when running this separately through a test script from the command line, I can't seem to replicate the crash. Possibly it's because the script remains resident when in pl/pgsql? It does only crash the 2nd time the function is run... Thanks, Eric -------------- next part -------------- bash-2.05# gdb /usr/local/bin/postgres postgres.core GNU gdb 4.18 (FreeBSD) Copyright 1998 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-unknown-freebsd"...(no debugging symbols found)... Core was generated by `postgres'. Program terminated with signal 11, Segmentation fault. Reading symbols from /usr/local/lib/libgnugetopt.so.1...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/libintl.so.6...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/libssl.so.3...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/libcrypto.so.3...(no debugging symbols found)...done. Reading symbols from /usr/lib/libz.so.2...(no debugging symbols found)...done. Reading symbols from /usr/lib/libreadline.so.4...(no debugging symbols found)...done. Reading symbols from /usr/lib/libcrypt.so.2...(no debugging symbols found)...done. Reading symbols from /usr/lib/libm.so.2...(no debugging symbols found)...done. Reading symbols from /usr/lib/libutil.so.3...(no debugging symbols found)...done. Reading symbols from /usr/lib/libc.so.4...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/libiconv.so.3...(no debugging symbols found)...done. Reading symbols from /usr/lib/libncurses.so.5...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/postgresql/plperl.so...(no debugging symbols found)...done. Reading symbols from /usr/local/lib/perl5/5.6.1/mach/auto/Opcode/Opcode.so...(no debugging symbols found)...done. ---Type <return> to continue, or q <return> to quit--- Reading symbols from /usr/local/lib/perl5/site_perl/5.6.1/mach/auto/Search/Xapian/Xapian.so... (no debugging symbols found)...done. Reading symbols from /usr/local/lib/libxapianqueryparser.so.4...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf done. Reading symbols from /usr/local/lib/libxapian.so.4...btdone. Reading symbols from /usr/lib/libstdc++.so.3...done. Reading symbols from /usr/libexec/ld-elf.so.1...done. #0 Xapian::Query::~Query (this=0x0, __in_chrg=2) at ../include/xapian/base.h:129 129 if (dest && --dest->ref_count == 0) { (gdb) bt #0 Xapian::Query::~Query (this=0x0, __in_chrg=2) at ../include/xapian/base.h:129 #1 0x2871a0b5 in vector<Xapian::Query, allocator<Xapian::Query> >::operator= (this=0x82896d0, __x=@0xbfbfb428) at /usr/include/g++/stl_construct.h:43 #2 0x28717250 in yylex () at /usr/share/bison/bison.simple:640 #3 0x2870c3bb in yyparse () at /usr/share/bison/bison.simple:573 #4 0x28718780 in Xapian::QueryParser::parse_query (this=0x845ecc0, q_=@0xbfbfe700) at /u1/olly/xapian-cvs-snapshot/HEAD/xapian/xapian-core/extra/queryparser.yy:591 #5 0x286dbd7e in XS_Search__Xapian__QueryParser_parse_query () from /usr/local/lib/perl5/site_perl/5.6.1/mach/auto/Search/Xapian/Xapian.so #6 0x28663db4 in Perl_pp_entersub () from /usr/local/lib/postgresql/plperl.so #7 0x2865e65e in Perl_runops_standard () from /usr/local/lib/postgresql/plperl.so #8 0x28623f72 in perl_call_sv () from /usr/local/lib/postgresql/plperl.so #9 0x28623d52 in perl_call_sv () from /usr/local/lib/postgresql/plperl.so #10 0x2861f832 in plperl_call_handler () from /usr/local/lib/postgresql/plperl.so #11 0x2861fa72 in plperl_call_handler () from /usr/local/lib/postgresql/plperl.so #12 0x2861f2ad in plperl_call_handler () from /usr/local/lib/postgresql/plperl.so #13 0x80f2a1e in ExecMakeFunctionResult () #14 0x80f3046 in ExecMakeTableFunctionResult () #15 0x80f4209 in ExecEvalExpr () #16 0x80f4e11 in ExecCleanTargetListLength () #17 0x80f5015 in ExecProject () #18 0x80fbeb4 in ExecResult () #19 0x80f156d in ExecProcNode () ---Type <return> to continue, or q <return> to quit--- #20 0x80f0284 in ExecEndPlan () #21 0x80ef7da in ExecutorRun () #22 0x8157742 in PortalRun () #23 0x815761b in PortalRun () #24 0x8154684 in pg_plan_queries () #25 0x8156a53 in PostgresMain () #26 0x813597c in ClosePostmasterPorts () #27 0x8135187 in ClosePostmasterPorts () #28 0x8133a33 in PostmasterMain () #29 0x81334ac in PostmasterMain () #30 0x8108a83 in main () #31 0x806d5ea in _start () Current language: auto; currently c++ (gdb) -------------- next part -------------- CREATE OR REPLACE FUNCTION find_keywords_xapian(varchar, integer, integer, varchar) RETURNS varchar AS ' use Search::Xapian; my $dbname = shift; my $year = shift; my $stemming = shift; my $query = shift; if ($stemming ne "1" && $stemming ne "0") { elog ERROR, "Invalid stemming specification, either 1 or 0"; } if ($dbname =~ /\\.\\.|\\//) { elog ERROR, "Invalid dbname specification"; } $words =~ s/^{//; $words =~ s/}$//; my $db = Search::Xapian::Database->new(q|/data1/xapian/| . $dbname . q|-| . $year); my $qp = Search::Xapian::QueryParser->new($db); $qp->set_stemming_options(q|english|,$stemming); # Replace 1 with 0 if you want to disable stemming $qp->set_default_op(OP_AND); my $enq = $db->enquire($qp->parse_query($query)); #elog NOTICE, q|Parsing query (| . $enq->get_query()->get_description() . qq|)\\n|; my @matches = $enq->matches(0,1000000); #elog NOTICE, scalar(@matches) . qq| results found\\n|; my $docid_string = q|{|; my $count = 1; foreach my $match ( @matches ) { if ($count == 1) { $docid_string .= $match->get_docid; } else { $docid_string .= q|,| . $match->get_docid; } ++$count; } $docid_string .= q|}|; return $docid_string; ' LANGUAGE plperlu STRICT; SELECT find_keywords_xapian('domain_com', 2004, '0', '{word}');
Olly Betts
2004-Oct-26 02:41 UTC
[Xapian-discuss] xapian + plpgsql crash? have backtrace...
On Mon, Oct 25, 2004 at 02:50:18PM -0700, Eric Parusel wrote:> It seems to have failed at: > #0 Xapian::Query::~Query (this=0x0, __in_chrg=2) at > ../include/xapian/base.h:129It appears to be trying to delete an object with address 0. That shouldn't happen, but the backtrace doesn't really reveal how we get to this point.> Oh, I've also noticed that when running this separately through a test > script from the command line, I can't seem to replicate the crash. > Possibly it's because the script remains resident when in pl/pgsql? > It does only crash the 2nd time the function is run...I'd guess something isn't coping properly with the persistent perl interpreter, but I can't think what. Perhaps Alex will have an idea. Cheers, Olly
Olly Betts
2004-Oct-30 05:42 UTC
[Xapian-discuss] xapian + plpgsql crash? have backtrace...
On Fri, Oct 29, 2004 at 03:17:34PM -0700, Eric Parusel wrote:> Thanks for the instructions -- done, I've attached two files.Looks like the list ate the message (probably over the size limit) but I got the Cc:-ed copy.> The query I used ('testa') return 0 "hits" (intended) > First debug log is for the query that worked > Second debug log is for the same query run right after, which crashes > the pgsql child process. > > Hopefully this makes things a little more clear.. :sWell, I think we can see what's happening, but I'm not quite sure why. --- xapian-plperl-debug-1worked.log Sat Oct 30 04:21:33 2004 +++ xapian-plperl-debug-2crashed.log Sat Oct 30 04:21:31 2004 @@ -1,6 +1 @@ -{0}Xapian debug build initialised -{7}Xapian 36476: [0x2871e67c] void Xapian::Query::Query() called -{7}Xapian 36476: [0x2871e67c] void Xapian::Query::Query() returning -{7}Xapian 36476: [0x82896cc] void Xapian::Query::Query() called -{7}Xapian 36476: [0x82896cc] void Xapian::Query::Query() returning {7}Xapian 36476: [0x845c5f0] void Database::Database() called Here the library debug logger is only initialised once, and 2 Query objects are only constructed the first time. The address of the first is at a very different address to most of the objects, so I suspect it's the static Query object in QueryParser. I think the second is a Query which is a member of a static U object in QueryParser. Neither object is destroyed during either run so that's not the problem. @@ -106,3 +101,3 @@ {2}Xapian 36476: [0x8357018] void Btree::block_to_cursor(0x83570a0, 2, 1010) called -{2}Xapian 36476: [0x8357018] void Btree::read_block(1010, ) called +{2}Xapian 36476: [0x8357018] void Btree::read_block(1010, E) called {2}Xapian 36476: [0x8357018] void Btree::read_block() returning An irrelevant difference - the second parameter to read_block is the address of the buffer to read the data into, so its contents on entry is irrelevant (and shouldn't really be logged - I'll fix that). @@ -132,4 +127,4 @@ {7}Xapian 36476: [0x845c600] void Xapian::Stem::Stem() returning -{7}Xapian 36476: [0x845c6b0] void Xapian::Query::Query() called -{7}Xapian 36476: [0x845c6b0] void Xapian::Query::Query() returning +{7}Xapian 36476: [0x845c6f0] void Xapian::Query::Query() called +{7}Xapian 36476: [0x845c6f0] void Xapian::Query::Query() returning {7}Xapian 36476: [0xbfbfc4b4] void Xapian::Query::Query() called Looks like new returned a different address here for the second run. Odd, but not actually a problem. A quick grep for that address suggest that the first block is allocated to an RSet object after it stops being a Query object. @@ -552,901 +547,3 @@ {7}Xapian 36476: [0x82896cc] Xapian::Query & Xapian::Query::operator=() returning Xapian::Query(testa:(pos=1)) -{7}Xapian 36476: [0xbfbfb3c4] void Xapian::Query::~Query() called -{7}Xapian 36476: [0xbfbfb3c4] void Xapian::Query::~Query() returning -{11}Xapian 36476: [0x82896cc] std::string Xapian::Query::get_description() called [snip] +{7}Xapian 36476: [0x0] void Xapian::Query::~Query() called +{7}Xapian 36476: [0x0] void Xapian::Query::~Query() returning And here we destroy a Query object the first time, but the second time the Query object is at address 0. But if you grep for 0xbfbfb3c4 in the logs it is allocated both times. I'm a bit puzzled as to why though! Is Xapian being used in a threaded way here? QueryParser currently uses a few global static objects (because it uses Bison) which means it's currently not safely reentrant. I plan to fix that, but it's probably rather involved and may even require dropping Bison (but I hope we can solve it by moving to use Bison's C++ skeleton). QueryParser does play with Query objects a lot, so it's a plausible explanation. If that's the issue, a quick fix would be to add a mutex to the XS wrapper for QueryParser and use it to marshal access. If not, I'd suggest trying Search::Xapian 0.8.3.0 which Alex uploaded a day or two ago (thanks Alex!) It mostly adds wrappers for a load of methods we weren't wrapping, but I fixed some miswrapped methods too so it's conceivable it might help... Cheers, Olly