David Versmisse
2009-Apr-07 08:33 UTC
[Xapian-discuss] Search docs with terms that match a pattern
Hello,

I have another small question: is it possible to search for a pattern like "*foo*"? I saw that we can use QueryParser::FLAG_WILDCARD, but only with "foo*" patterns. Is there a trick to do that?

And another question about wildcards: I tested FLAG_WILDCARD with the QueryParser, but I have some problems with special characters (for example "/"). Is it possible to use escape characters for that? I tried "\/", but without good results. The problem is that our terms are often "id", or something like that, with a lot of special characters :-(

Thank you in advance for your answer.

Best regards,
David V.
Olly Betts
2009-Apr-08 01:05 UTC
[Xapian-discuss] Search docs with terms that match a pattern
On Tue, Apr 07, 2009 at 10:33:12AM +0200, David Versmisse wrote:
> I have another small question: is it possible to search for a pattern
> like "*foo*"? I saw that we can use QueryParser::FLAG_WILDCARD, but
> only with "foo*" patterns. Is there a trick to do that?

No, only trailing wildcards are supported. To support simultaneous leading and trailing wildcards without the performance sucking, we'd need to maintain a copy of the termlist structured in a way which allows efficient regexp searching.

> And another question about wildcards: I tested FLAG_WILDCARD with the
> QueryParser, but I have some problems with special characters (for
> example "/"). Is it possible to use escape characters for that? I
> tried "\/", but without good results. The problem is that our terms
> are often "id", or something like that, with a lot of special
> characters :-(

An id term makes a lot more sense as a boolean term rather than a probabilistic one - it's not a word and you don't want it stemmed. And boolean terms can contain "/".

But wildcards on boolean terms aren't supported. They could be, but a boolean term can of course contain literal "*", so you'd need to support some sort of escaping. There's some discussion of that here:

http://trac.xapian.org/ticket/128

Cheers,
Olly
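
To illustrate the performance point above, here is a minimal sketch in plain Python. It is not Xapian code and does not reflect how Xapian stores its termlist: it only assumes a sorted list of terms, and the function names expand_trailing and expand_infix are hypothetical. A trailing wildcard can jump straight to the matching range of a sorted list, while a "*foo*" pattern has to look at every term.

```python
from bisect import bisect_left


def expand_trailing(sorted_terms, prefix):
    """Expand "prefix*": binary-search to the first candidate, then walk
    forward until a term no longer starts with the prefix."""
    matches = []
    start = bisect_left(sorted_terms, prefix)
    for i in range(start, len(sorted_terms)):
        if not sorted_terms[i].startswith(prefix):
            break  # sorted order: no later term can match either
        matches.append(sorted_terms[i])
    return matches


def expand_infix(sorted_terms, fragment):
    """Expand "*fragment*": sort order gives no shortcut here, so every
    term in the list has to be checked."""
    return [term for term in sorted_terms if fragment in term]


# Hypothetical term list, kept sorted as the sketch assumes.
terms = sorted(["bar", "food", "foo", "foobar", "seafood", "tofoo", "zebra"])

print(expand_trailing(terms, "foo"))  # ['foo', 'foobar', 'food']
print(expand_infix(terms, "foo"))     # ['foo', 'foobar', 'food', 'seafood', 'tofoo']
```

The same kind of full scan could presumably be run over a real database's term list, with the matching terms combined into one OR query, but on a large index that is exactly the slow path described above.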