Displaying 2 results from an estimated 2 matches for "stopwords_ignore".
2010 May 27
Problem with stop words by indexing
...ow to improve TermGenerator
> in 1.3.x.
The 1.3.x release is a little too far away for my use case (I'm speaking here only about the ability to remove unstemmed stop words).
In termgenerator_internal.cc (line 129), I changed the default value of stop_mode from STOPWORDS_INDEX_UNSTEMMED_ONLY to STOPWORDS_IGNORE, and Xapian now does exactly what I want.
Wouldn't it be possible to simply add a "stopper_strategy" property to the TermGenerator (or Stopper) class, along with a method to modify it (something like set_stopper_strategy())?
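To make the difference between the two modes concrete, here is a toy sketch in plain Python. It is not Xapian's code: only the two mode names come from the source file mentioned above, while generate_terms(), toy_stem() and the "Z" prefix handling are hypothetical stand-ins that assume stemmed terms get a Z prefix and unstemmed terms get none.

```python
# Toy model only, NOT Xapian's implementation. The two mode names mirror
# the constants mentioned above; every function here is hypothetical.
STOPWORDS_INDEX_UNSTEMMED_ONLY = "index_unstemmed_only"
STOPWORDS_IGNORE = "ignore"


def toy_stem(word):
    """Crude stand-in for a real stemmer: strip one trailing 's'."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def generate_terms(text, stopwords, stop_mode):
    """Return the list of terms a toy term generator would index."""
    terms = []
    for word in text.lower().split():
        is_stop = word in stopwords
        if is_stop and stop_mode == STOPWORDS_IGNORE:
            # Drop the stop word entirely: no unstemmed and no stemmed term.
            continue
        # The unstemmed (unprefixed) term is kept in both remaining cases.
        terms.append(word)
        if not is_stop:
            # Only non-stop words get a stemmed, Z-prefixed term.
            terms.append("Z" + toy_stem(word))
    return terms


if __name__ == "__main__":
    stop = {"the"}
    print(generate_terms("the cats sleep", stop, STOPWORDS_INDEX_UNSTEMMED_ONLY))
    print(generate_terms("the cats sleep", stop, STOPWORDS_IGNORE))
```

In this sketch the first mode still emits the unstemmed "the", while the second emits nothing for it, which is the behaviour I am after.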
Emmanuel
2007 Jun 28
TermGenerator and SimpleStopper
Hi,
I'm using SimpleStopper with TermGenerator in a Python indexing
script, in an attempt to keep my index size down (currently 30K per
doc, and I have 200 million docs to index, which I think implies
6TB.) However, unprefixed (positional?) terms are not affected by
the stopper, though Z-prefixed terms are.
I assume this is intentional for phrase queries, but I need to reduce
my