Marcus Ramberg
2007-Mar-05 13:25 UTC
[Xapian-discuss] Making '.' a valid search term character
Hey,

Is there any way to make the queryparser not split queries on '.'? I'm using Xapian for the search on http://osx.iusethis.com/, and some Mac apps have dots in their names. I've been able to index terms with '.' in them, but the queryparser splits the search on them.

Marcus Ramberg
marcus@startsiden.no
Olly Betts
2007-Mar-06 01:22 UTC
[Xapian-discuss] Making '.' a valid search term character
On Mon, Mar 05, 2007 at 02:25:45PM +0100, Marcus Ramberg wrote:
> Is there any way to make the queryparser not split queries on '.'?

It's not currently configurable, but it's not hard to patch the code to do this. Look in queryparser/queryparser_internal.cc for where '&' is handled. We treat a single embedded '&' as a word character so things like "AT&T" are a single word (but C code like "a&&b" isn't). If you change that to check for '&' or '.' then you'll probably get the effect you want.

There's also code to handle "initialisms" specially, so "I.B.M." is treated the same as "IBM". That only applies to single capitals with '.' in between, but you might want to disable that if it's likely to be a problem for you (it's just above where '&' is checked for).

Cheers,
Olly
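
As a rough illustration of the rule described in this reply, the following standalone sketch treats a single '&' or '.' as part of a word only when it sits between two word characters. It is not the code from queryparser/queryparser_internal.cc: the names `is_embedded_joiner`, `is_word_char`, and `split_terms` are hypothetical, the sketch handles ASCII only, and it does not model the initialism handling mentioned above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: characters allowed once between two word characters.
// Removing '.' here gives the '&'-only behaviour described in the reply.
static bool is_embedded_joiner(char ch) {
    return ch == '&' || ch == '.';
}

// Hypothetical helper: ASCII letters and digits count as word characters.
static bool is_word_char(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) != 0;
}

// Split a query into terms. A joiner is kept inside a term only if a term is
// already in progress and the very next character is a word character, so
// "AT&T" stays whole while "a&&b" splits into "a" and "b".
std::vector<std::string> split_terms(const std::string& query) {
    std::vector<std::string> terms;
    std::string current;
    for (std::size_t i = 0; i < query.size(); ++i) {
        const char ch = query[i];
        if (is_word_char(ch)) {
            current += ch;
            continue;
        }
        const bool embedded = is_embedded_joiner(ch) && !current.empty() &&
                              i + 1 < query.size() &&
                              is_word_char(query[i + 1]);
        if (embedded) {
            current += ch;
            continue;
        }
        // Any other character ends the term in progress, if there is one.
        if (!current.empty()) {
            terms.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        terms.push_back(current);
    }
    return terms;
}
```

With this rule, a query such as "Last.fm player" yields the terms "Last.fm" and "player", and a trailing full stop as in "end." is dropped rather than attached to the term. Terms would need to be indexed with the same rule for such queries to match.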