Steije van Schelt
2007-Aug-13 13:58 UTC
[Xapian-discuss] Random results for the same query
Hello,

I'm experiencing weird behaviour with the search results from Xapian.

For example, when I search for 'php' in the database I made (about 4.000.000 documents, 12 GB in size) I get about 562 results. When I change the offset in the TOPDOC argument in Omega to 20, I get 592 results (and so on and so on...).

I've been searching the mailing list; the only post I found about it was:

http://thread.gmane.org/gmane.comp.search.xapian.general/2329

The weird thing is that if I disable all extra boolean terms (as stated in the discussion above) I get the same behaviour.

I've never experienced this before; it just occurred today. Is my database corrupt or is there something weird going on in Xapian?

Thanks in advance,

Steije van Schelt.
Steije van Schelt wrote:
> Hello,
>
> I'm experiencing weird behaviour with the search results from Xapian.
>
> For example, when I search for 'php' in the database I made (about
> 4.000.000 documents, 12 GB in size) I get about 562 results. When I
> change the offset in the TOPDOC argument in Omega to 20, I get 592
> results (and so on and so on...).
>
> I've been searching the mailing list; the only post I found about it was:
>
> http://thread.gmane.org/gmane.comp.search.xapian.general/2329
>
> The weird thing is that if I disable all extra boolean terms (as stated
> in the discussion above) I get the same behaviour.
>
> I've never experienced this before; it just occurred today. Is my
> database corrupt or is there something weird going on in Xapian?

I think the key word here is 'about' as in 'about 562 results'. AFAIK, to save time an exact number of results isn't calculated, so it's not surprising that you're getting different numbers each time.

Cheers

Charlie
Steije van Schelt
2007-Aug-13 15:06 UTC
[Xapian-discuss] Random results for the same query
Charlie Hull wrote:
> I think the key word here is 'about' as in 'about 562 results'. AFAIK
> to save time an exact number of results isn't calculated, so it's not
> surprising that you're getting different numbers each time.

I agree my choice of words was a bit funny... But still, the fact remains that this hasn't occurred before, even with the same query on the database. I can't explain why Xapian randomly gives correct or incorrect counts for the result set of a single P query.

The only thing that has changed is that I've added documents to the index and changed documents already in the index.

My guess is that something became corrupt in the index. Has anybody experienced the same behaviour?

Thanks in advance,

Steije van Schelt
Andreas Marienborg
2007-Aug-15 14:58 UTC
[Xapian-discuss] Random results for the same query
On Aug 15, 2007, at 3:52 PM, Richard Boulton wrote:
> Andreas Marienborg wrote:
>> Where and how do I set this? I can't find anything on Google about
>> it except this mail.
>
> Set the MINHITS CGI parameter: see docs/cgiparams.txt for more
> information.

Is this possible without Omega? I use Search::Xapian (the Perl bindings) with my own setup here.

- andreas
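On the question of doing this without Omega: as far as I can tell, the library-level counterpart of MINHITS is the "check at least" argument of Enquire::get_mset(). Below is a minimal sketch using the Python bindings rather than Search::Xapian. The helper name match_counts and the default of 1000 are made up for illustration, and whether a given release of the Perl bindings accepts the same third argument is an assumption to verify.

```python
def match_counts(enquire, first, page_size, check_at_least=1000):
    """Return (lower bound, estimate, upper bound) for the match count.

    `enquire` is expected to be a xapian.Enquire with its query already set.
    The helper name and the default of 1000 are illustrative assumptions.
    """
    # Asking the matcher to check at least this many documents means the
    # count is exact whenever fewer than that number of documents match.
    mset = enquire.get_mset(first, page_size, check_at_least)
    return (
        mset.get_matches_lower_bound(),
        mset.get_matches_estimated(),
        mset.get_matches_upper_bound(),
    )
```

When the lower and upper bounds are equal, the estimate is exact. A larger check_at_least trades query time for a count that stays stable from page to page.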
On Mon, Aug 13, 2007 at 02:57:17PM +0200, Steije van Schelt wrote:
> I'm experiencing weird behaviour with the search results from Xapian.
>
> For example, when I search for 'php' in the database I made (about
> 4.000.000 documents, 12 gb in size) I get about 562 results. When I
> change the offset in the TOPDOC argument in Omega to 20, I get 592
> results (and so on and so on...).
>
> I've been searching the mailing list, the only post I found about it was:
>
> http://thread.gmane.org/gmane.comp.search.xapian.general/2329
>
> The weird thing is that if I disable all extra boolean terms (as stated
> in the discussion above) I get the same behaviour.

That sounds wrong. A single term query should give an exact number of results. If the term "php" occurs in 562 documents, then exactly 562 documents will match a query for "php".

> I've never experienced this before, it just occured today. Is my
> database corrupt or is there something weird going on in Xapian?

It's hard to say what's happening from the information given. What are the CGI parameters for the two searches you mention above? If you add $querydescription to your query template, what does it show the query to be?

Cheers,

Olly
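The point about single-term queries can be checked against the database directly: the number of documents indexed by a term is what Database.get_termfreq() reports, so it should fall within the bounds the matcher returns. A minimal sketch, in Python; the helper name is hypothetical, and the exact term to pass (for example a stemmed or prefixed form) depends on how the index was built.

```python
def single_term_count_is_consistent(db, enquire, term):
    """Compare a term's document frequency with the reported match bounds.

    `db` is expected to be a xapian.Database and `enquire` a xapian.Enquire
    whose query consists of just `term`. The helper name is illustrative.
    """
    termfreq = db.get_termfreq(term)  # number of documents indexed by term
    mset = enquire.get_mset(0, 10)
    # For a single-term query the true count is termfreq, so it must lie
    # within the bounds the matcher reports.
    return (
        mset.get_matches_lower_bound()
        <= termfreq
        <= mset.get_matches_upper_bound()
    )
```

If this returns False, the query actually being run is probably not the single term expected, which is what the $querydescription output would show.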