thr3ads.net - Xapian discuss - [Xapian-discuss] Queryparser problem.. [Dec 2007]

Jesper Krogh

2007-Dec-09 14:06 UTC

[Xapian-discuss] Queryparser problem..

Hi list.

The queryparser in my setup is using strategy STEM_SOME, which seems to
give the best handling of the data in our setup.

But the queryparser doesn't really seem to be consistent.
doc:test
Running query 'Xapian::Query(ZDOCTYPEtest:(pos=1))'

Here it applies stemming to the term before running the query (Z-prefix)

doc:1234
Running query 'Xapian::Query(DOCTYPE1234:(pos=1))'

There it skips the stemming.

What is the reason for behaving differently based on user input?

Thanks.
-- 
Jesper

Olly Betts

2007-Dec-09 14:06 UTC

On Sun, Dec 09, 2007 at 08:16:17AM +0100, Jesper Krogh wrote:
> The queryparser in my setup is using strategy STEM_SOME, which seems to
> give the best handling of the data in our setup.
> 
> But the queryparser doesn't really seem to be consistent.
> doc:test
> Running query 'Xapian::Query(ZDOCTYPEtest:(pos=1))'
> 
> Here it applies stemming to the term before running the query (Z-prefix)
> 
> doc:1234
> Running query 'Xapian::Query(DOCTYPE1234:(pos=1))'
> 
> There it skips the stemming.
> 
> What is the reason for behaving differently based on user input?

http://www.xapian.org/docs/termgenerator.html

    Now we index all terms lowercased with positional information, and
    also stemmed with a 'Z' prefix (unless they start with a digit)
[...]

Indexing terms which start with a digit twice just bloats the database.
I'm not aware of a language where words can start with a digit, and it
can actually harm retrieval if we attempt to stem part numbers and other
codes.

Cheers,
    Olly
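
For readers who want to see the rule in isolation, here is a minimal Python sketch of the behaviour described above. It is not Xapian code and does not use the Xapian API: `toy_stem` is a hypothetical stand-in for a real stemmer, `query_term` is an illustrative helper name, and only the "starts with a digit" condition is modelled. The real query parser may apply further conditions that are not shown here.

```python
def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def query_term(prefix: str, word: str) -> str:
    """Model only the digit rule from the thread (an assumption-laden sketch)."""
    term = word.lower()
    if term[:1].isdigit():
        # Starts with a digit: treated like a part number or code, so it is
        # looked up unstemmed and without the 'Z' marker.
        return prefix + term
    # Otherwise the stemmed form is used, marked with a leading 'Z'.
    return "Z" + prefix + toy_stem(term)


for user_input in ("test", "tests", "1234"):
    print(user_input, "->", query_term("DOCTYPE", user_input))
```

With the prefix DOCTYPE, the inputs "test" and "1234" give the same two terms as the queries quoted in the thread, ZDOCTYPEtest and DOCTYPE1234.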
