Michel Pelletier
2006-Mar-25 00:05 UTC
[Xapian-discuss] bug in query parser or wrong usage?
Hi,

I noticed this little oddity when I was trying to construct certain queries against my various boolean terms. It can be illustrated with the QueryParser class alone:

>>> from xapian import QueryParser
>>> p = QueryParser()
>>> p.add_boolean_prefix('x', 'X')
>>> p.add_boolean_prefix('y', 'Y')

These two work:

>>> p.parse_query("x:1 AND y:2").get_description()
'Xapian::Query((X1 AND Y2))'
>>> p.parse_query("x:1 y:2").get_description()
'Xapian::Query((X1 AND Y2))'

but this one is different; it seems to be treating OR like a positional term:

>>> p.parse_query("(x:3 OR x:4) x:1 y:2").get_description()
'Xapian::Query((or:(pos=1) FILTER (X3 AND X4 AND X1 AND Y2)))'

Same with this one:

>>> p.parse_query("(x:3 OR x:4) AND x:1 OR y:2").get_description()
'Xapian::Query(((or:(pos=1) OR and:(pos=2) OR or:(pos=3)) FILTER (X3 AND X4 AND X1 AND Y2)))'

but these two work as expected:

>>> p.parse_query("x:3 OR x:4").get_description()
'Xapian::Query((X3 OR X4))'
>>> p.parse_query("x:3 OR x:4 AND x:5").get_description()
'Xapian::Query((X3 OR (X4 AND X5)))'

It seems to be related to having the OR in a parenthesised subexpression:

>>> p.parse_query("(x:3 OR x:4) AND x:5").get_description()
'Xapian::Query(((or:(pos=1) OR and:(pos=2)) FILTER (X3 AND X4 AND X5)))'
>>> p.parse_query("x:5 AND (x:3 OR x:4)").get_description()
'Xapian::Query(((and:(pos=1) OR or:(pos=2)) FILTER (X5 AND X3 AND X4)))'
>>>

Am I using the parser wrong, or is this a bug?

-Michel
On Fri, Mar 24, 2006 at 04:03:56PM -0800, Michel Pelletier wrote:
> but this one is different, seems to be treating OR like a positional term:
>
> >>> p.parse_query("(x:3 OR x:4) x:1 y:2").get_description()
> 'Xapian::Query((or:(pos=1) FILTER (X3 AND X4 AND X1 AND Y2)))'

This is how boolean filter terms currently work. You can't build boolean expressions from them.

I'd say this is a bug, but fixing it properly is a little involved, as I think it requires changes to the Query class itself, so it'll take a bit of time.

A sort-of workaround is to use non-boolean prefixes instead of boolean ones.

Cheers,
Olly
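
The workaround mentioned in the reply could look roughly like the sketch below, which registers the same two prefixes with add_prefix() instead of add_boolean_prefix(). The helper name make_parser and the PREFIXES table are made up for illustration, and the sketch assumes the xapian Python bindings are installed; the exact query descriptions will depend on the Xapian version, so none are shown.

```python
# Illustrative sketch only; make_parser and PREFIXES are hypothetical names.
# Field-to-term-prefix mapping taken from the original post.
PREFIXES = {"x": "X", "y": "Y"}


def make_parser(boolean=False):
    """Return a QueryParser with PREFIXES registered.

    boolean=True reproduces the setup from the original post
    (add_boolean_prefix); boolean=False applies the suggested
    workaround (add_prefix, i.e. non-boolean prefixes).
    """
    # Imported here so that the module loads even without xapian installed.
    from xapian import QueryParser

    parser = QueryParser()
    for field, prefix in PREFIXES.items():
        if boolean:
            parser.add_boolean_prefix(field, prefix)
        else:
            parser.add_prefix(field, prefix)
    return parser
```

With this in place, make_parser().parse_query("(x:3 OR x:4) AND x:5").get_description() can be compared against the boolean-prefix output quoted above. One trade-off to keep in mind: non-boolean terms are treated as ordinary search terms, so they would normally contribute to document weighting rather than acting as pure filters.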