I have found that lowercase boolean operators such as "and" or "or"
do not work, even though I set the FLAG_BOOLEAN_ANY_CASE flag.
QP seems to treat them as terms.
Look at the parsed queries in the following tests (ignore the number of results):
$ python search.py -v woman AND man
Performing query 'Xapian::Query((woman:(pos=1) AND man:(pos=2)))'
0 results found
$ python search.py -v woman and man
Performing query 'Xapian::Query((woman:(pos=1) AND and:(pos=2) AND man:(pos=3)))'
0 results found
In the following code in queryparser.lemony, I think every "term"
inside the if () conditions should be changed to "lcterm". I attach a
patch file to fix this.
    } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
        string lcterm = downcase_term(term);
        if (term == "and") {
            Parse(pParser, AND, NULL, &state);
            continue;
        } else if (term == "or") {
            Parse(pParser, OR, NULL, &state);
            continue;
        } else if (term == "not") {
            Parse(pParser, NOT, NULL, &state);
            continue;
        } else if (term == "near") {
            Parse(pParser, NEAR, NULL, &state);
            continue;
        } else if (term == "xor") {
            Parse(pParser, XOR, NULL, &state);
            continue;
        }
    }
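
As a rough illustration of the difference between comparing "term" and comparing "lcterm", here is a small Python model of the comparison inside that branch. It is only a sketch, not the Xapian source: the names OPERATORS, match_operator_current and match_operator_patched are made up for this example, and it ignores the surrounding conditions that decide whether the branch is reached at all.

```python
# Simplified model of the operator comparison; NOT the actual Xapian code.
# OPERATORS and both function names are hypothetical, for illustration only.
OPERATORS = {"and": "AND", "or": "OR", "not": "NOT", "near": "NEAR", "xor": "XOR"}


def match_operator_current(term):
    """Mirror the quoted code: compare the raw term with lowercase names."""
    return OPERATORS.get(term)


def match_operator_patched(term):
    """Proposed behaviour: compare the lowercased term (lcterm) instead."""
    lcterm = term.lower()  # stands in for downcase_term(term)
    return OPERATORS.get(lcterm)


if __name__ == "__main__":
    for word in ("and", "And", "woman"):
        print(word, match_operator_current(word), match_operator_patched(word))
```

In this model a mixed-case word such as "And" is only recognised by the patched comparison, while an all-lowercase "and" is recognised by both.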
For better Xapian,
Sungsoo Kim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: queryparser-flag_boolean_any_case.patch
Type: application/octet-stream
Size: 872 bytes
Desc: not available
URL:
<http://lists.xapian.org/pipermail/xapian-devel/attachments/20060307/daceb76a/attachment.obj>
On Tue, Mar 07, 2006 at 06:39:29PM +0900, Sungsoo Kim wrote:
> In the following code in queryparser.lemony, I think every "term"
> within if () condition should be changed to "lcterm". I attach a patch
> file to fix this.
>
> } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
>     string lcterm = downcase_term(term);
>     if (term == "and") {
>         Parse(pParser, AND, NULL, &state);
>         continue;
>     } else if (term == "or") {

As you say, it ought to be using lcterm (at least if we want 'woman And
man' to work). But I can't see why your example ('woman and man')
doesn't already work since we check '(term == "and")'.

How are you setting FLAG_BOOLEAN_ANY_CASE?

Cheers,
Olly
> As you say, it ought to be using lcterm (at least if we want 'woman And
> man' to work. But I can't see why your example ('woman and man')
> doesn't already work since we check '(term == "and")'.

Yes, you are right. I missed the fact that "and" was already lowercase,
so it does not need to be converted when I search for "woman and man".

> How are you setting FLAG_BOOLEAN_ANY_CASE?

I set the flags as shown below.

    query = qp.parse_query(input,
                           xapian.QueryParser.FLAG_BOOLEAN |
                           xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE |
                           xapian.QueryParser.FLAG_PHRASE |
                           xapian.QueryParser.FLAG_LOVEHATE |
                           xapian.QueryParser.FLAG_WILDCARD)

I've figured out why QP does not accept "and" as an operator. It is
caused by xapian-qp-utf8-0.9.2.patch. The U_isupper(term[0]) test in the
outer condition should be changed back to what it was before the patch.
With U_isupper there, a term that starts with a lowercase letter never
reaches the FLAG_BOOLEAN_ANY_CASE branch.

original xapian-0.9.4

    if (prefix.empty() && !term.empty() && C_isalpha(term[0])) {
        if (C_isupper(term[0])) {
            ...
        } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
            ...
        }
    }

after xapian-qp-utf8-0.9.2.patch

    if (prefix.empty() && !term.empty() && U_isupper(term[0])) {
        if (C_isupper(term[0])) {
            ...
        } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
            ...
        }
    }

For better Xapian,
Sungsoo Kim
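
To make the effect of the changed outer condition easier to see, here is a minimal Python model of the control flow in the two versions above. It is a sketch under assumptions, not Xapian code: reaches_any_case_branch is a hypothetical helper, str.isalpha and str.isupper merely stand in for C_isalpha and U_isupper, and the prefix is assumed to be empty.

```python
# Toy model of the outer guard; NOT the actual Xapian code.
# reaches_any_case_branch is a hypothetical helper for illustration only.
def reaches_any_case_branch(term, guard, any_case=True):
    """Return True if `term` would reach the FLAG_BOOLEAN_ANY_CASE branch.

    `guard` plays the role of the test applied to term[0] in the outer
    if () condition; the prefix is assumed to be empty.
    """
    if not term or not guard(term[0]):
        return False  # outer condition fails, so the word stays a plain term
    if term[0].isupper():
        return False  # taken by the C_isupper(term[0]) branch instead
    return any_case   # the else-if branch runs only when the flag is set


original_guard = str.isalpha  # stands in for C_isalpha (xapian-0.9.4)
patched_guard = str.isupper   # stands in for U_isupper (after the utf8 patch)

if __name__ == "__main__":
    for word in ("and", "AND"):
        print(word,
              reaches_any_case_branch(word, original_guard),
              reaches_any_case_branch(word, patched_guard))
```

In this model a lowercase "and" reaches the any-case branch only with the original guard, which matches the behaviour seen in the search.py tests.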