I have found that lowercase boolean operators such as "and" or "or" do
not work, even though I set the FLAG_BOOLEAN_ANY_CASE flag. QP seems to
treat them as terms. Look at the parsed queries in the following tests
(the search results themselves do not matter):

    $ python search.py -v woman AND man
    Performing query 'Xapian::Query((woman:(pos=1) AND man:(pos=2)))'
    0 results found

    $ python search.py -v woman and man
    Performing query 'Xapian::Query((woman:(pos=1) AND and:(pos=2) AND man:(pos=3)))'
    0 results found

In the following code in queryparser.lemony, I think every "term" within
an if () condition should be changed to "lcterm". I attach a patch file
to fix this.

    } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
        string lcterm = downcase_term(term);
        if (term == "and") {
            Parse(pParser, AND, NULL, &state);
            continue;
        } else if (term == "or") {
            Parse(pParser, OR, NULL, &state);
            continue;
        } else if (term == "not") {
            Parse(pParser, NOT, NULL, &state);
            continue;
        } else if (term == "near") {
            Parse(pParser, NEAR, NULL, &state);
            continue;
        } else if (term == "xor") {
            Parse(pParser, XOR, NULL, &state);
            continue;
        }
    }

For better Xapian,
Sungsoo Kim

-------------- next part --------------
A non-text attachment was scrubbed...
Name: queryparser-flag_boolean_any_case.patch
Type: application/octet-stream
Size: 872 bytes
Desc: not available
URL: <http://lists.xapian.org/pipermail/xapian-devel/attachments/20060307/daceb76a/attachment.obj>

On Tue, Mar 07, 2006 at 06:39:29PM +0900, Sungsoo Kim wrote:
> In the following code in queryparser.lemony, I think every "term"
> within if () condition should be changed to "lcterm". I attach a patch
> file to fix this.
>
> } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
>     string lcterm = downcase_term(term);
>     if (term == "and") {
>         Parse(pParser, AND, NULL, &state);
>         continue;
>     } else if (term == "or") {

As you say, it ought to be using lcterm (at least if we want 'woman And
man' to work). But I can't see why your example ('woman and man')
doesn't already work, since we check '(term == "and")'.

How are you setting FLAG_BOOLEAN_ANY_CASE?

Cheers,
    Olly
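
The difference between comparing "term" and comparing "lcterm" can be
sketched outside Xapian. The Python below is an illustration only, not
Xapian's parser: BOOLEAN_OPERATORS, match_operator and the use_lowercased
switch are hypothetical names, and str.lower() stands in for
downcase_term(). It models just the string comparison inside the
FLAG_BOOLEAN_ANY_CASE branch, not how the parser decides which terms
reach that branch.

```python
# Illustrative sketch only; these names are hypothetical, not Xapian API.
BOOLEAN_OPERATORS = {
    "and": "AND",
    "or": "OR",
    "not": "NOT",
    "near": "NEAR",
    "xor": "XOR",
}


def match_operator(term, use_lowercased):
    """Return the operator token for term, or None for an ordinary term.

    use_lowercased=False mimics comparing `term` directly (the quoted code);
    use_lowercased=True mimics comparing `lcterm` (the proposed change).
    """
    key = term.lower() if use_lowercased else term
    return BOOLEAN_OPERATORS.get(key)


# Show both comparisons side by side for a few inputs.
for word in ("and", "And", "man"):
    print(word, match_operator(word, False), match_operator(word, True))
```

Under this sketch an already-lowercase "and" matches either way, which is
why the comparison alone should not explain the 'woman and man' failure.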

> As you say, it ought to be using lcterm (at least if we want 'woman And
> man' to work). But I can't see why your example ('woman and man')
> doesn't already work since we check '(term == "and")'.

Yes, you are right. I missed the fact that "and" was already lowercase,
so it does not need to be converted to lowercase when I search for
"woman and man".

> How are you setting FLAG_BOOLEAN_ANY_CASE?

I set the flag as shown below.

    query = qp.parse_query(input,
                           xapian.QueryParser.FLAG_BOOLEAN |
                           xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE |
                           xapian.QueryParser.FLAG_PHRASE |
                           xapian.QueryParser.FLAG_LOVEHATE |
                           xapian.QueryParser.FLAG_WILDCARD)

I've figured out why QP does not accept "and" as an operator. It results
from xapian-qp-utf8-0.9.2.patch: the outer U_isupper(term[0]) check
should be changed back to what it was before the patch. With U_isupper
in the outer condition, a term that starts with a lowercase letter, such
as "and", never reaches the FLAG_BOOLEAN_ANY_CASE branch.

Original xapian-0.9.4:

    if (prefix.empty() && !term.empty() && C_isalpha(term[0])) {
        if (C_isupper(term[0])) {
            ...
        } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
            ...
        }
    }

After xapian-qp-utf8-0.9.2.patch:

    if (prefix.empty() && !term.empty() && U_isupper(term[0])) {
        if (C_isupper(term[0])) {
            ...
        } else if (flags & FLAG_BOOLEAN_ANY_CASE) {
            ...
        }
    }

For better Xapian,
Sungsoo Kim
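
The effect of the outer condition can be sketched in a few lines of
Python. This models only the two guards quoted above, on the assumption
that C_isalpha and C_isupper are ASCII letter checks and that U_isupper
agrees with C_isupper for ASCII input. reaches_any_case_branch and its
guard parameter are hypothetical names, not part of Xapian.

```python
def is_ascii_upper(ch):
    """Assumed stand-in for C_isupper / U_isupper on ASCII input."""
    return "A" <= ch <= "Z"


def is_ascii_alpha(ch):
    """Assumed stand-in for C_isalpha."""
    return is_ascii_upper(ch) or "a" <= ch <= "z"


def reaches_any_case_branch(term, guard, prefix="", any_case_flag=True):
    """Return True if term would reach the FLAG_BOOLEAN_ANY_CASE branch.

    guard="isalpha" models the outer check in the original code;
    guard="isupper" models the outer check after the UTF-8 patch.
    """
    if prefix or not term:
        return False
    first = term[0]
    outer_ok = is_ascii_alpha(first) if guard == "isalpha" else is_ascii_upper(first)
    if not outer_ok:
        return False  # the whole block is skipped
    if is_ascii_upper(first):
        return False  # taken by the upper-case branch instead
    return any_case_flag


for guard in ("isalpha", "isupper"):
    print(guard, reaches_any_case_branch("and", guard))
```

In this model the "isupper" guard makes the else-if branch unreachable
for any term, since the inner upper-case test always succeeds whenever
the outer one does.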