search for: termgenerator

Displaying 20 results from an estimated 62 matches for "termgenerator".

2015 Jun 10
1
make check xapian-bindings-1.2.21 & Search-Xapian-1.2.21.0
...s=1, Tests=2, 1 wallclock secs ( 0.13 usr 0.01 sys + 0.23 cusr 0.01 csys = 0.38 CPU) Result: PASS PASS: t/stem.t ./t/symbol-test.t .. ok All tests successful. Files=1, Tests=3, 7 wallclock secs ( 0.13 usr 0.02 sys + 5.26 cusr 0.44 csys = 5.85 CPU) Result: PASS PASS: t/symbol-test.t ./t/termgenerator.t .. 1/28 # Failed test 2 in ./t/termgenerator.t at line 26 # ./t/termgenerator.t line 26 is: ok( $ti ne $doc->termlist_end()); # Failed test 5 in ./t/termgenerator.t at line 30 # ./t/termgenerator.t line 30 is: ok( $pi ne $ti->positionlist_end() ); # Test 7 got: "Xapian::PositionItera...
2007 Dec 29
3
Term-Flags
Hi, Is it necessary to set the flag below on the TermGenerator if I want the "Did you mean ..." spelling corrections? Xapian::TermGenerator::flags::FLAG_SPELLING Thank you very much Markus
2008 Mar 12
1
how can i use stopwords?
Hi, I do not understand the stopword function... I've set the termgenerator like this: $self->{'Stemmer'} = new Search::Xapian::Stem(german2); $self->{'Stopper'} = new Search::Xapian::SimpleStopper(); $self->{'TermGenerator'} = new Search::Xapian::TermGenerator; $self->{'TermGenerator'}->set_stemmer( $self->{'Stemmer...
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All, I'm using the TermGenerator::index_text() on version 1.0.4 with the FLAG_SPELLING turned on, because the new spelling suggestion stuff seems awesome, but I'm getting a segv. (gdb) bt #0 0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/ base.h:1...
2012 Nov 26
1
Word missing after stemmed with Norwegian in Search::Xapian::TermGenerator
Hi all Xapian-devel, Gist: https://gist.github.com/10d2222d8bffe8d7631d I'm using Xapian-TermGenerator to extract Norwegian sentences to vsm (vector space model) using TermGenerator. But when I test generating vsm from 'Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus likte svært dårlig kundens misfornøyde leserbrev.' It doesn't return 'asus' result in vsm. So I'...
2007 Jun 28
1
TermGenerator and SimpleStopper
Hi, I'm using SimpleStopper with TermGenerator in a Python indexing script, in an attempt to keep my index size down (currently 30K per doc, and I have 200 million docs to index, which I think implies 6TB.) However, unprefixed (positional?) terms are not affected by the stopper, though Z-prefixed terms are. I assume this is intentiona...
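The behaviour described in this post can be pictured with a small toy model. The sketch below is plain Python and does not use the Xapian bindings; `toy_stem`, `toy_index_text`, and the stopword set are hypothetical names invented for illustration, and the one-line "stemmer" is only a stand-in for a real one. It assumes, as the post reports, that the stopper suppresses only the Z-prefixed stemmed terms while the unstemmed positional terms are always generated.

```python
import re

def toy_stem(word):
    """Crude stand-in for a real stemmer: strip a trailing 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word

def toy_index_text(text, stopwords):
    """Return (positional_terms, stemmed_terms) for a piece of text.

    positional_terms maps each unstemmed term to its list of positions.
    stemmed_terms is the set of Z-prefixed stemmed terms.
    """
    positional_terms = {}
    stemmed_terms = set()
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        # Unstemmed terms are recorded with positions regardless of the stopper.
        positional_terms.setdefault(word, []).append(pos)
        # Only the stemmed (Z-prefixed) form is filtered by the stopper.
        if word not in stopwords:
            stemmed_terms.add("Z" + toy_stem(word))
    return positional_terms, stemmed_terms

positional, stemmed = toy_index_text("The cats and the dogs", {"the", "and"})
```

In this model the stopwords still appear in `positional`, which is why a stopper alone would not shrink the positional part of an index.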
2007 Dec 17
1
Crashes with spelling enabled and perl.
...o enable spelling correction: use strict; use warnings; use Search::Xapian; my $db = Search::Xapian::WritableDatabase->new("test.db", Search::Xapian::DB_CREATE_OR_OPEN); if (!defined($db)) { die("Failed to open xapian_database: $!"); } my $indexer = Search::Xapian::TermGenerator->new(); $indexer->set_flags(Search::Xapian::FLAG_SPELLING); my $document = Search::Xapian::Document->new(); $indexer->set_document($document); $indexer->index_text(lc('test'), 1); $db->add_document($document); undef $db; Here's the patch to enable spelling against Sea...
2007 Jun 15
1
TermGenerator in PHP4
(xapian 1.0.1) Should TermGenerator in the PHP4 bindings be called XapianTermGenerator? Thanks, Tim.
2014 Feb 27
2
Summer of Code help
I think there is a development in the bug #616. The exception obtained is: Exception in thread "main" java.lang.IllegalArgumentException: No enum class org.xapian.TermGenerator$flags with value 0 at org.xapian.TermGenerator$flags.swigToEnum(TermGenerator.java:143) at org.xapian.TermGenerator.setFlags(TermGenerator.java:71) at org.xapian.examples.SimpleIndex.main(SimpleIndex.java:54) Error seems to occur in the swigToEnum method. So I checked http://www.swig.org/Doc2.0...
2008 Mar 27
2
Proper noun stemming
Hi All I was wondering if anyone had a solution for the following problem. I use QueryParser to stem my documents before adding them to a database. During the stemming process I would like to find a way of keeping proper nouns that span two or more words together as a phrase. For example "New York" or "Gordon Brown" or "Prime Minister" get split up. I see
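One possible approach, sketched here as an assumption rather than anything the thread confirms, is to join known multi-word names into single tokens before the text is indexed. The example is plain Python with no Xapian calls; `join_phrases` and the phrase list are hypothetical, and the underscore joiner is an arbitrary choice.

```python
def join_phrases(words, phrases):
    """Merge runs of words that match a known phrase into one token.

    `phrases` is a set of tuples of lowercased words, e.g. ("new", "york").
    Longer phrases are tried first so the longest match wins.
    """
    max_len = max((len(p) for p in phrases), default=1)
    out = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 1, -1):
            candidate = tuple(w.lower() for w in words[i:i + n])
            if candidate in phrases:
                out.append("_".join(candidate))
                i += n
                break
        else:
            # No phrase starts here: keep the word as a separate token.
            out.append(words[i].lower())
            i += 1
    return out

known = {("new", "york"), ("gordon", "brown"), ("prime", "minister")}
tokens = join_phrases("Prime Minister Gordon Brown visited New York".split(), known)
```

If tokens are joined like this at index time, the same joining would have to be applied to query text so that both sides produce matching terms.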
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
..._php bindings. I run Apache 2.2 on Windows using PHP 5.2.13 with the pre build xapian bindings from Flax: Xapian Support enabled Xapian Compiled Version @PACKAGE_VERSION@ Xapian Linked Version 1.2.0 The problem is that after indexing text which contains special characters like ä, ö, ü and ß, using TermGenerator::index_text ( http://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html#b358784fa685139e8bdd71d37f39573e), terms get cut off (stopped) after the special character. For example the term gesundheitsschädlich is indexed as gesundheitsschä and Zgesundheitsschä (stemmed). All character en...
2015 Jul 26
1
Get term from document by position
...st, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>. > > Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to that for QueryParser), it still gives me the opening rather than a useful snippet. Sorry, my mistake. The modified test.cpp file should be this (I just added indexer.set_stemming_strategy(Xapian::TermGenerator::STEM_ALL_Z), line 34): ============= Begin of the modified test.cpp file=======...
2010 May 27
1
Problem with stop words by indexing
...om wrote: > On Mon, Apr 05, 2010 at 07:13:02PM +0200, Emmanuel Engelhart wrote: > > I try to remove stop words during the index process > and I have no stemming. > I have tried with a simple example but it does not > work at all. > > > I have my writableDatabase and my termGenerator > (indexer) and they work > well both together: I can index texts and search > through the database > correctly. > > > > But if I add (before indexing my texts): > > Xapian::SimpleStopper stopper; > > stopper.add("testword"); > > indexer.set_stopp...
2007 May 30
1
QueryParser prefixing terms when stemming?
I'm new to Xapian and we just recently upgraded to version 1.0.0.0. However, something seems to have changed during the upgrade and I need help figuring out how my code should be written. In version 0.9.9.1 of Search::Xapian, the following code results in this output "Xapian::Query(pet:(pos=1))". my $qp = new Search::Xapian::QueryParser; $qp->set_stemmer(new
2010 Apr 05
1
Problem with stop words by indexing
Hi, I try to remove stop words during the index process and I have no stemming. I have tried with a simple example but it does not work at all. I have my writableDatabase and my termGenerator (indexer) and they work well both together: I can index texts and search through the database correctly. But if I add (before indexing my texts): Xapian::SimpleStopper stopper; stopper.add("testword"); indexer.set_stopper(&stopper); ... the result is exactly the same as before. I hav...
2011 Sep 14
1
Integrated Chinese tokenizer SCWS in xapian-core
Xapian is an excellent open source search engine library, but there is no native support for Chinese word segmentation in queryparser and termgenerator. Therefore, I modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and developed by myself. Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and Xapian::Termgenerator::index_text will support Chinese...
2014 Feb 24
2
Summer of Code help
Hello Olly, I read about Xapian and SWIG and the bindings that Xapian has with other languages. According to what I've read, I understand that Xapian is a search engine library written in C/C++. It can be integrated with web applications which handle large amount of data. But since the web applications may be written in a variety of languages a binding is required for the web app to be able
2008 Sep 16
1
Some Questions From the beginner of Xapian
...ue) What's the purpose of this method? A document is related to its terms, but what's the purpose of this? (2) The add_posting method adds a term to a document: void add_posting (const std::string &tname, Xapian::termpos tpos, Xapian::termcount wdfinc=1) I noticed that Xapian::TermGenerator has the following method: void index_text (const Xapian::Utf8Iterator &itor, Xapian::termcount weight=1, const std::string &prefix="") What are the differences and the relationship between these two functions? Thanks a lot! Sam
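As a rough mental model of how the two functions relate: `add_posting` records one term at one position, while `index_text` splits text into words and records a posting for each. The toy class below is plain Python, not the Xapian API; `ToyDocument` and `toy_index_text` are hypothetical names, and real tokenization, stemming, and weighting are more involved than shown.

```python
import re
from collections import defaultdict

class ToyDocument:
    """Minimal stand-in for a document: term -> positions and wdf."""

    def __init__(self):
        self.positions = defaultdict(list)
        self.wdf = defaultdict(int)

    def add_posting(self, tname, tpos, wdfinc=1):
        # Record one occurrence of a term at a given position.
        self.positions[tname].append(tpos)
        self.wdf[tname] += wdfinc

def toy_index_text(doc, text, weight=1, prefix=""):
    """Split text into words and add one posting per word."""
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        doc.add_posting(prefix + word, pos, weight)

doc = ToyDocument()
toy_index_text(doc, "Hello world hello", prefix="S")
```

In this picture `index_text` is a convenience layer: it decides what the terms and positions are, and the per-term bookkeeping is the same as calling `add_posting` by hand.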
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
...g wrong? It seems that the API is used incorrectly, but I cannot find the problem... --- 8< --- #!/usr/bin/perl use Search::Xapian qw(:all); use strict; my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian", DB_CREATE_OR_OVERWRITE); my $indexer = Search::Xapian::TermGenerator->new(); $indexer->set_flags(Search::Xapian::FLAG_SPELLING); my $doc = new Search::Xapian::Document; $indexer->set_document($doc); $indexer->index_text("hello 123 blah blah"); $xa->add_document($doc); --- >8 --- Output: terminate called after throwing an instance of &...
2017 Feb 08
1
searching for " in phrase and other special chars
Hello, I'm reading xapian-core/docs/queryparser.rst and haven't been able to find a way to escape " (double-quote) inside quoted phrases. Is this possible? I'm also wondering if searching for other special characters, such as a literal '*', is possible without triggering a wildcard match. It would be helpful for some source code searches. Thanks!