Displaying 2 results from an estimated 2 matches for "stopwords_ignore".

2010 May 27
Problem with stop words by indexing
...ow to improve TermGenerator in 1.3.x. The 1.3.x release is a little too far away for my use case (I am speaking here only about the ability to remove unstemmed stop words). In termgenerator_internal.cc, line 129, I have changed the default value of stop_mode from STOPWORDS_INDEX_UNSTEMMED_ONLY to STOPWORDS_IGNORE, and Xapian now does exactly what I want. Wouldn't it be possible to simply add a "stopper_strategy" property to the TermGenerator (or the stopper) class, with a method to modify it (like set_stopper_strategy())? Emmanuel
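To illustrate the difference between the two stop modes named in the post, here is a minimal pure-Python sketch. It is not Xapian's code: the names StopMode, toy_stem and generate_terms are hypothetical, the stemmer is a crude stand-in, and the only things borrowed from the thread are the mode names and the convention that stemmed terms carry a Z prefix.

```python
from enum import Enum


class StopMode(Enum):
    # Hypothetical names, modeled on the constants quoted in the post.
    NONE = 0                   # stop words are indexed like any other word
    INDEX_UNSTEMMED_ONLY = 1   # stop words keep their unstemmed term only
    IGNORE = 2                 # stop words produce no terms at all


def toy_stem(word):
    # Crude stand-in for a real stemmer: strip one trailing "s".
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def generate_terms(text, stopwords, mode):
    """Return the list of terms a toy term generator would emit."""
    terms = []
    for word in text.lower().split():
        is_stop = word in stopwords
        if is_stop and mode is StopMode.IGNORE:
            continue  # drop the stop word entirely
        terms.append(word)  # unprefixed (unstemmed) term
        if is_stop and mode is StopMode.INDEX_UNSTEMMED_ONLY:
            continue  # no stemmed form for stop words in this mode
        terms.append("Z" + toy_stem(word))  # Z-prefixed stemmed term
    return terms


if __name__ == "__main__":
    for mode in StopMode:
        print(mode.name, generate_terms("the cats sleep", {"the"}, mode))
```

In this sketch, INDEX_UNSTEMMED_ONLY still emits the unprefixed term for a stop word, so only the IGNORE mode actually shrinks the term list by removing stop words completely.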
2007 Jun 28
TermGenerator and SimpleStopper
Hi, I'm using SimpleStopper with TermGenerator in a Python indexing script, in an attempt to keep my index size down (currently 30 KB per doc, and I have 200 million docs to index, which I think implies 6 TB). However, unprefixed (positional?) terms are not affected by the stopper, though Z-prefixed terms are. I assume this is intentional for phrase queries, but I need to reduce my