Daniel Ménard
2008-Oct-28 18:27 UTC
[Xapian-discuss] add_prefix() versus add_boolean_prefix()
Hello,

Until now, I was using QueryParser.add_prefix() for all my fields, but I realized that some of them were just "filters", so I'm now experimenting with add_boolean_prefix() for those fields (I'm using Xapian core 1.0.8 with PHP bindings 1.0.8).

If I add a boolean prefix 'A' for the field 'author', a request like [test author:john author:doe] gives me the following query:

Xapian::Query((test:(pos=1) FILTER (Ajohn OR Adoe)))

which looks good to me.

But if the same request is written like this: [test author:(john doe)], I get the following query:

Xapian::Query(((test:(pos=1) OR doe:(pos=2)) FILTER A(john))

which looks strange to me ("doe" is not a filter anymore, and there is an extra parenthesis before "john").

A similar problem appears if I try a phrase search: [test author:"john doe"] gives

Xapian::Query(((test:(pos=1) OR doe:(pos=2)) FILTER A"john))

Is it a bug? (These requests are parsed correctly if add_prefix is used instead of add_boolean_prefix.)

Thanks!

PS: In case it is useful, here is the PHP script I used for my tests:

<?php
require_once('/lib/xapian/xapian.php');

$request='test author:(john doe)';

$parser=new XapianQueryParser();
$parser->add_prefix('author', 'A');
echo 'with add_prefix : ', $parser->parse_query($request)->get_description(), "\n";

$parser=new XapianQueryParser();
$parser->add_boolean_prefix('author', 'A');
echo 'with add_boolean_prefix : ', $parser->parse_query($request)->get_description(), "\n";
?>

--
Daniel Ménard
Olly Betts
2008-Nov-03 00:03 UTC
[Xapian-discuss] add_prefix() versus add_boolean_prefix()
On Tue, Oct 28, 2008 at 07:27:25PM +0100, Daniel Ménard wrote:
> But if the same request is written like this : [test author:(john doe)],
> I get the following query :
> Xapian::Query(((test:(pos=1) OR doe:(pos=2)) FILTER A(john))
> which looks strange for me ("doe" is not a filter anymore, extra
> parenthesis before "john").

It's a bad example to use "author:" here, since that would naturally be a free-text search, and it means that examples which look reasonable don't necessarily make much sense in the actual boolean prefix case.

Anyway, this behaviour is as expected currently: you can't apply a boolean prefix to a subexpression, so it parses the "(" as part of the term.

In this case the subexpression isn't boolean, so as a better example, it's like this, where "type:" is a boolean prefix:

type:(html pdf)

I'm not really sure that makes a lot of sense (all I can think of is to treat it as we would type:html type:pdf, which is to OR filters with the same prefix, which isn't totally obvious behaviour either).

I can see that there's a natural meaning for this case, which I don't think we currently handle:

type:(html OR pdf)

> A similar problem appear if I try a phrase search: [test author:"john
> doe"] gives
> Xapian::Query(((test:(pos=1) OR doe:(pos=2)) FILTER A"john))

I'm not really sure what you expect this to mean: a phrase isn't a boolean sub-expression, and I wouldn't expect boolean filter terms to have positional information. Looking at a better example, what would you expect this to mean?

type:"html pdf"

Incidentally, http://trac.xapian.org/ticket/128 suggests it should be a single filter term with a space in it, which seems a reasonable way to allow that to be specified. So in this case, the term would be:

XTYPEhtml pdf

Cheers,
Olly
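
Since the parser does not handle type:(html OR pdf) itself, one possible workaround is to expand such groups in the request string before it reaches parse_query(), relying on the behaviour described above that filters with the same prefix are ORed together. The following is only a minimal sketch under stated assumptions: expand_boolean_groups() is a hypothetical helper, not part of Xapian; it assumes PHP 5.3 or later (for closures); it ignores nested parentheses and quoted phrases; and it leaves a group unchanged if it contains any operator other than OR. A group without OR, such as type:(html pdf), is expanded the same way, which is a design choice rather than settled parser behaviour.

```php
<?php
// Hypothetical helper (not part of Xapian): rewrites prefix:(a OR b) into
// "prefix:a prefix:b" for the listed boolean prefixes, so the result can be
// passed to XapianQueryParser::parse_query() as ordinary filter terms.
function expand_boolean_groups($request, array $booleanPrefixes)
{
    foreach ($booleanPrefixes as $prefix) {
        // Matches e.g. type:(html OR pdf); no nested parentheses or quotes.
        $pattern = '/(?<![\w:])' . preg_quote($prefix, '/') . ':\(([^()"]*)\)/';
        $request = preg_replace_callback(
            $pattern,
            function ($m) use ($prefix) {
                $words = preg_split('/\s+/', trim($m[1]), -1, PREG_SPLIT_NO_EMPTY);
                // Leave the group untouched if it uses operators other than OR.
                if (array_intersect($words, array('AND', 'NOT', 'XOR', 'NEAR', 'ADJ'))) {
                    return $m[0];
                }
                $terms = array();
                foreach ($words as $word) {
                    if ($word !== 'OR') {
                        $terms[] = $prefix . ':' . $word;
                    }
                }
                // An empty group is also left as it was.
                return $terms ? implode(' ', $terms) : $m[0];
            },
            $request
        );
    }
    return $request;
}

echo expand_boolean_groups('test type:(html OR pdf)', array('type')), "\n";
```

The rewritten string would then be given to a parser on which add_boolean_prefix() has been called for the same field names. Prefixes not listed in the second argument are left alone, so a free-text field such as author:(john doe) still reaches the parser unchanged.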