On Sat, Jan 06, 2007 at 12:34:25PM -0800, Alexander Lind wrote:
> Currently I pre-parse every incoming query in my own code before I send
> it on to Xapian.
> Among other things, I determine what parts of a query are actual search
> words, and I then manually stem each one of them with Xapian's
> Stem::stem_word().
>
> I was wondering if this is the correct approach, or is there a better
> way of doing this, i.e. have Xapian automatically do it on the words in
> the query itself, when the query string is passed on to
> QueryParser::parse_query()?

Yes, it's much better to let the QueryParser do the stemming. Just
call Xapian::QueryParser::set_stemmer().

I'd say it's a mistake to try to manipulate user-specified query strings
before passing them to the QueryParser. You'd essentially need to
duplicate how the QueryParser parses a query string, and the exact
handling of corner cases (especially oddly formed queries, such as a
phrase search with unmatched quotes) is open to change. So even if you
reverse engineer the current behaviour, it might change in a future
release.

Cheers,
Olly