search for: simplestopp

Displaying 12 results from an estimated 12 matches for "simplestopp".

2007 Jun 28
1
TermGenerator and SimpleStopper
Hi, I'm using SimpleStopper with TermGenerator in a Python indexing script, in an attempt to keep my index size down (currently 30K per doc, and I have 200 million docs to index, which I think implies 6TB.) However, unprefixed (positional?) terms are not affected by the stopper, though Z-prefixed terms are. I assu...
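The size estimate in the post above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: it assumes "30K" means 30,000 bytes and that every document costs the same amount of index space, neither of which the post confirms. The function name is made up for illustration.

```python
# Back-of-envelope index size, using the figures quoted in the post.
BYTES_PER_DOC = 30_000       # "30K per doc", read here as 30,000 bytes (assumption)
DOC_COUNT = 200_000_000      # "200 million docs"


def estimated_index_bytes(bytes_per_doc: int, doc_count: int) -> int:
    """Total index size if every document costs the same number of bytes."""
    return bytes_per_doc * doc_count


total = estimated_index_bytes(BYTES_PER_DOC, DOC_COUNT)
total_tb = total / 10**12    # decimal terabytes
print(f"{total_tb:.1f} TB")
```

With these assumptions the total comes out at 6 decimal terabytes, which matches the poster's figure.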
2008 Mar 12
1
how can i use stopwords?
Hi, I do not understand the stopword function... I've set up the termgenerator like this: $self->{'Stemmer'} = new Search::Xapian::Stem(german2); $self->{'Stopper'} = new Search::Xapian::SimpleStopper(); $self->{'TermGenerator'} = new Search::Xapian::TermGenerator; $self->{'TermGenerator'}->set_stemmer( $self->{'Stemmer'} ); $self->{'TermGenerator'}->set_stopper( $self->{'Stopper'} ); I thought that xapian would now exclude the sto...
2010 Apr 05
1
Problem with stop words by indexing
...process and I have no stemming. I have tried with a simple example but it does not work at all. I have my writableDatabase and my termGenerator (indexer) and they work well together: I can index texts and search through the database correctly. But if I add (before indexing my texts): Xapian::SimpleStopper stopper; stopper.add("testword"); indexer.set_stopper(&stopper); ... the result is exactly the same as before. I have checked with delve and "testword" is indexed. Am I using SimpleStopper the right way? Regards Emmanuel
2007 Jun 11
3
Xapian 1.0.1 released
I've now uploaded Xapian 1.0.1, which you can download from the usual place: http://www.xapian.org/download.php This release mainly comprises bug fixes and performance improvements. The "simple" examples (for both C++ and the bindings) have also been overhauled and now use the QueryParser and TermGenerator classes, which makes for simpler examples and should better reflect
2017 Jun 14
2
KMeans Clusterer - Going forward
...of terms). Getting the useful terms within a document in its document vector can improve its accuracy, due to fewer noise terms. Two important things to be done in this direction are: 1) Stemming. This is easier because xapian already provides stemmed terms. 2) Stopword removal. Use either Xapian::SimpleStopper or create a subclass of Xapian::Stopper to determine whether a term that is fed to it is a stopword or not. But for determining which terms are stopwords, I was wondering whether we'd be using the stopword list within xapian/languages/stopwords or will we have to create one within the cluster...
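The stopword-removal step described in the post above could look roughly like the sketch below. It is a toy model in plain Python, not the Xapian bindings: the `Stopper` and `SetStopper` classes and the `document_vector` helper are hypothetical names that only mirror the idea of "a callable that says whether a term is a stopword", and the word list is invented for the example.

```python
from collections import Counter


class Stopper:
    """Hypothetical base class: call it with a term, get True for stopwords."""

    def __call__(self, term: str) -> bool:
        raise NotImplementedError


class SetStopper(Stopper):
    """Hypothetical stopper backed by a plain set of words."""

    def __init__(self, words=()):
        self._words = set(words)

    def add(self, word: str) -> None:
        self._words.add(word)

    def __call__(self, term: str) -> bool:
        return term in self._words


def document_vector(terms, stopper):
    """Count the terms of one document, skipping anything the stopper rejects."""
    return Counter(t for t in terms if not stopper(t))


stopper = SetStopper(["the", "of"])
stopper.add("a")
vec = document_vector("the cat sat on a mat of the cat".split(), stopper)
print(dict(vec))
```

Keeping the stopper as a callable means a custom subclass (for example one that loads a per-language word list) can be swapped in without touching the vector-building code.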
2009 Apr 23
1
Expanding the search in PHP
I tried using the simpleexpand.php from http://xapian.org/docs/bindings/php/examples/simpleexpand.php5 I get different results between PHP and the Omega expand (see below), I'd like to have the same functionality in PHP. Could anyone suggest how to do it? Is there an example I could use? Thanks, Frank And got the following results from PHP: Zdefin: weight = 46.963883268652 Zconfigur:
2010 May 27
1
Problem with stop words by indexing
> ...t it does not work at all. I have my writableDatabase and my termGenerator (indexer) and they work well both together: I can index texts and search trough the database correctly. But if I add (before indexing my texts): Xapian::SimpleStopper stopper; stopper.add("testword"); indexer.set_stopper(&stopper); ... the result is exactly the same as before. I have checked with delve and "testword" is indexed. http://article.gmane.org/gmane.comp.search.xapi...
2012 Jun 04
1
Search not finding queries with stop words.
...stemmer(new Search::Xapian::Stem("english")); $qp->set_stemming_strategy(STEM_SOME); $qp->set_default_op($defaultop); ... my $par = $qp->parse_query($query); my $enq = $xDatabase->enquire( $par ); and in the db create script: my $stopper = Search::Xapian::SimpleStopper->new(); foreach my $word (@ar) { $stopper->add($word); } ... my $doc = Search::Xapian::Document->new(); my $indexer = Search::Xapian::TermGenerator->new(); my $stemmer = Search::Xapian::Stem->new('english'); $doc...
2010 Jul 26
2
related documents
Hi All, I would like to take a doc in the xapian DB and find all related documents by relevance e.g. so when you view one document it says "Related entries X Y Z". I'm aware of the "Morelikethis" Lucene plugin that is supposed to do something like this, by generating a query from a document based on term frequency. Has anyone developed a tool to generate a query from a
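The idea mentioned in the post above, generating a query from a document based on term frequency, can be sketched in plain Python. This is a minimal illustration under stated assumptions: it ranks by raw term frequency only, the `top_terms` and `or_query` helpers are hypothetical, and the sample document and stopword set are invented. It does not use Xapian or reproduce the Lucene plugin.

```python
from collections import Counter


def top_terms(doc_terms, stopwords, k=3):
    """Return the k most frequent non-stopword terms of one document."""
    counts = Counter(t for t in doc_terms if t not in stopwords)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def or_query(terms):
    """Join terms into a simple OR query string."""
    return " OR ".join(terms)


doc = "xapian index search xapian query search xapian the the the".split()
terms = top_terms(doc, stopwords={"the"}, k=2)
query = or_query(terms)
print(query)
```

The resulting string could then be fed to whatever query parser the application already uses; filtering stopwords first matters here, because otherwise the most frequent terms are usually the least informative ones.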
2008 Mar 27
2
Proper noun stemming
Hi All, I was wondering if anyone had a solution for the following problem. I use QueryParser to stem my documents before adding them to a database. During the stemming process I would like to find a way of keeping proper nouns that span two or more words together as a phrase. For example "New York" or "Gordon Brown" or "Prime Minister" get split up. I see
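One way to keep multi-word proper nouns together, as asked in the post above, is to merge known phrases into single tokens before any stemming happens. The sketch below is plain Python and assumes a hand-maintained phrase list; the `tokenize_keeping_phrases` function is a hypothetical helper, not part of Xapian, and the post does not say which approach was eventually used.

```python
def tokenize_keeping_phrases(text, phrases):
    """Split text on whitespace, but emit each known phrase as one token."""
    words = text.split()
    phrase_set = {tuple(p.lower().split()) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so longer phrases win over shorter ones.
        for n in range(max_len, 1, -1):
            candidate = tuple(w.lower() for w in words[i:i + n])
            if len(candidate) == n and candidate in phrase_set:
                tokens.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            # No phrase starts here: emit the single word.
            tokens.append(words[i])
            i += 1
    return tokens


tokens = tokenize_keeping_phrases(
    "Gordon Brown visited New York",
    ["New York", "Gordon Brown", "Prime Minister"],
)
print(tokens)
```

A stemmer would then be applied only to the single-word tokens, leaving the merged phrases untouched.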
2006 May 10
1
Documentation for the PHP OO wrapper
...e to resolve : - some methods are documented but do not exist in the wrapper (e.g. empty(), max_size(), swap(), create()...) - some forms of methods use arguments types which are not in the wrapper and should not be documented (e.g : get_mset with a MatchDecider, get_eset with an ExpandDecider, SimpleStopper::__construct() with iterators...) - some methods exist in the wrapper but are not documented. This is everything which is specific to the bindings : http://svn.xapian.org/trunk/xapian-bindings/php/docs/bindings.html?view=co and some which are not documented like (Simple)Stopper::apply(). - some...