David
2010-Nov-02 10:41 UTC
[Xapian-discuss] How to make QueryParser select entire word like "H.O.T"
Hi,

I'm using Xapian to build my search engine, but I've run into a problem.
The code snippet is:

----------------------Code begin-------------------------------------------------------------
Xapian::QueryParser qp;
qp.add_prefix("Singer", "S");
Xapian::Query query = qp.parse_query("Singer:s.h.e",
    Xapian::QueryParser::FLAG_PARTIAL |
    Xapian::QueryParser::FLAG_AUTO_MULTIWORD_SYNONYMS |
    Xapian::QueryParser::FLAG_PHRASE);
cout << "Performing query `" << query.get_description() << "'" << endl;
----------------------Code end---------------------------------------------------------------

This is the output on stdout:

----------------------Output begin---------------------------------------------------------
Performing query `Xapian::Query((Ss:(pos=1) PHRASE 3 Sh:(pos=2) PHRASE 3 Se:(pos=3)))'
----------------------Output end-----------------------------------------------------------

See the problem? "s.h.e" is actually a music band from Taiwan, and I want to
use it as a single query term when searching the singer field.

Does anyone know how to make the parser produce "Ss.h.e" rather than a split
query?

Thanks :)
Olly Betts
2010-Nov-09 02:56 UTC
[Xapian-discuss] How to make QueryParser select entire word like "H.O.T"
On Tue, Nov 02, 2010 at 06:41:24PM +0800, David wrote:
> I'm using xapian to build my search engine, but met with a problem.
> The code snippet is like:
> ----------------------Code begin-------------------------------------------------------------
> Xapian::QueryParser qp;
> qp.add_prefix("Singer", "S");
> Xapian::Query query = qp.parse_query("Singer:s.h.e", Xapian::QueryParser::FLAG_PARTIAL|Xapian::QueryParser::FLAG_AUTO_MULTIWORD_SYNONYMS |Xapian::QueryParser::FLAG_PHRASE );
> cout << "Performing query `" << query.get_description() << "'" << endl;
> ----------------------Code end---------------------------------------------------------------
>
> See the output from the stdio,
> ----------------------Output begin---------------------------------------------------------
> Performing query `Xapian::Query((Ss:(pos=1) PHRASE 3 Sh:(pos=2) PHRASE 3 Se:(pos=3)))'
> ----------------------Output end-----------------------------------------------------------
>
> See the problem? Actually "s.h.e" is a music band from Taiwan, and I want to
> use this as an entire query word to search in the singer field.
> So any one who know how to let the parser get "Ss.h.e" rather than splitted query ?

QueryParser doesn't allow you to customise how it interprets word
boundaries currently - this is ticket #113:

http://trac.xapian.org/ticket/113

There's currently some special handling for acronyms punctuated with ".",
but only if capitalised, so this would work as you want:

Singer:S.H.E

Perhaps this handling should also work for lower case acronyms. I can't
think of a good reason for it not to, except that we'd need to fix
TermGenerator to match (or else "s.h.e" in the query wouldn't match "s.h.e"
in a document), and that is an incompatible change, so would really need to
wait for 1.3.0.

Cheers,
    Olly
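
Until lower case acronyms are handled, one possible stopgap is to capitalise
dotted acronyms in the query string before handing it to QueryParser, so that
"Singer:s.h.e" is parsed as "Singer:S.H.E". The sketch below uses only the
C++ standard library. The helper name uppercase_dotted_acronyms is
hypothetical and not part of Xapian, and the rule it applies (single ASCII
letters separated by ".") is an assumption about what should count as an
acronym. The same normalisation would presumably need to be applied to the
text passed to TermGenerator at index time, for the matching reason given
above.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not a Xapian API): upper-cases runs of single ASCII
// letters separated by '.', e.g. "s.h.e" -> "S.H.E". Tokens such as
// "example.com" or "a.b.com" are left alone because the run must not be
// followed by further letters or digits.
std::string uppercase_dotted_acronyms(const std::string& query) {
    static const std::regex acronym(
        R"(\b[A-Za-z](\.[A-Za-z])+(?![A-Za-z0-9])(?!\.[A-Za-z0-9]))");

    std::string out;
    std::size_t last = 0;  // end of the previously copied region
    for (std::sregex_iterator it(query.begin(), query.end(), acronym), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position());
        const std::size_t len = static_cast<std::size_t>(m.length());

        // Copy the text between the previous match and this one unchanged.
        out.append(query, last, pos - last);

        // Copy the acronym itself in upper case ('.' is unaffected).
        for (char c : m.str()) {
            out += static_cast<char>(
                std::toupper(static_cast<unsigned char>(c)));
        }
        last = pos + len;
    }
    // Copy whatever follows the last match.
    out.append(query, last, std::string::npos);
    return out;
}
```

With this in place the original call would become
qp.parse_query(uppercase_dotted_acronyms("Singer:s.h.e"), flags), where flags
stands for the same flag combination as in the original snippet.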