Displaying 20 results from an estimated 62 matches for "termgenerator".
2015 Jun 10
1
make check xapian-bindings-1.2.21 & Search-Xapian-1.2.21.0
...s=1, Tests=2, 1 wallclock secs ( 0.13 usr 0.01 sys + 0.23 cusr 0.01 csys = 0.38 CPU)
Result: PASS
PASS: t/stem.t
./t/symbol-test.t .. ok
All tests successful.
Files=1, Tests=3, 7 wallclock secs ( 0.13 usr 0.02 sys + 5.26 cusr 0.44 csys = 5.85 CPU)
Result: PASS
PASS: t/symbol-test.t
./t/termgenerator.t .. 1/28 # Failed test 2 in ./t/termgenerator.t at line 26
# ./t/termgenerator.t line 26 is: ok( $ti ne $doc->termlist_end());
# Failed test 5 in ./t/termgenerator.t at line 30
# ./t/termgenerator.t line 30 is: ok( $pi ne $ti->positionlist_end() );
# Test 7 got: "Xapian::PositionItera...
2007 Dec 29
3
Term-Flags
Hi,
Is it necessary to set the flag below on the TermGenerator
if I want the "Did you mean ..." spelling corrections?
Xapian::TermGenerator::flags::FLAG_SPELLING
Thank you very much
Markus
2008 Mar 12
1
how can i use stopwords?
Hi,
I do not understand the stopword function...
I've set the termgenerator like this:
$self->{'Stemmer'} = new Search::Xapian::Stem('german2');
$self->{'Stopper'} = new Search::Xapian::SimpleStopper();
$self->{'TermGenerator'} = new Search::Xapian::TermGenerator;
$self->{'TermGenerator'}->set_stemmer( $self->{'Stemmer...
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All,
I'm using the TermGenerator::index_text() on version 1.0.4 with the
FLAG_SPELLING turned on, because the new spelling suggestion stuff
seems awesome, but I'm getting a segv.
(gdb) bt
#0 0xb7ae153c in Xapian::WritableDatabase::add_spelling
(this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/
base.h:1...
2012 Nov 26
1
Word missing after stemmed with Norwegian in Search::Xapian::TermGenerator
Hi all Xapian-devel,
Gist: https://gist.github.com/10d2222d8bffe8d7631d
I'm using Xapian-TermGenerator to extract Norwegian sentences to vsm
(vector space model) using TermGenerator. But when I test generating vsm
from 'Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus likte
svært dårlig kundens misfornøyde leserbrev.' It doesn't return 'asus' result
in vsm.
So I'...
2007 Jun 28
1
TermGenerator and SimpleStopper
Hi,
I'm using SimpleStopper with TermGenerator in a Python indexing
script, in an attempt to keep my index size down (currently 30K per
doc, and I have 200 million docs to index, which I think implies
6TB.) However, unprefixed (positional?) terms are not affected by
the stopper, though Z-prefixed terms are.
I assume this is intentiona...
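The behaviour described in this post (Z-prefixed stemmed terms are dropped for stop words, while unprefixed positional terms remain) can be pictured with a small toy model. This is a plain-Python sketch, not Xapian code: `toy_stem` and `toy_index` are hypothetical helpers invented for illustration, and the stemmer is deliberately naive.

```python
import re


def toy_stem(word):
    # Hypothetical, deliberately naive stemmer: strip one trailing "s".
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def toy_index(text, stopwords):
    """Return a list of (term, position) pairs for a toy index."""
    terms = []
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        # The unstemmed term is always recorded with its position,
        # so phrase searches over stop words would still be possible.
        terms.append((word, pos))
        # The stemmed, Z-prefixed term is only recorded for non-stop words.
        if word not in stopwords:
            terms.append(("Z" + toy_stem(word), None))
    return terms


terms = toy_index("the cats and the dogs", {"the", "and"})
names = [term for term, _ in terms]
```

In this model the stopper saves only the stemmed entries, which matches the observation that index size is not reduced for the unprefixed terms.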
2007 Dec 17
1
Crashes with spelling enabled and perl.
...o enable spelling correction:
use strict;
use warnings;
use Search::Xapian;
my $db = Search::Xapian::WritableDatabase->new("test.db",
Search::Xapian::DB_CREATE_OR_OPEN);
if (!defined($db)) {
die("Failed to open xapian_database: $!");
}
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_flags(Search::Xapian::FLAG_SPELLING);
my $document = Search::Xapian::Document->new();
$indexer->set_document($document);
$indexer->index_text(lc('test'), 1);
$db->add_document($document);
undef $db;
Here's the patch to enable spelling against Sea...
2007 Jun 15
1
TermGenerator in PHP4
(xapian 1.0.1)
Should TermGenerator in the PHP4 bindings be called XapianTermGenerator?
Thanks,
Tim.
2014 Feb 27
2
Summer of Code help
I think there is a development in the bug #616.
The exception obtained is:
Exception in thread "main" java.lang.IllegalArgumentException: No enum
class org.xapian.TermGenerator$flags with value 0
at org.xapian.TermGenerator$flags.swigToEnum(TermGenerator.java:143)
at org.xapian.TermGenerator.setFlags(TermGenerator.java:71)
at org.xapian.examples.SimpleIndex.main(SimpleIndex.java:54)
Error seems to occur in the swigToEnum method.
So I checked
http://www.swig.org/Doc2.0...
2008 Mar 27
2
Proper noun stemming
Hi All
I was wondering if anyone had a solution for the following problem.
I use QueryParser to stem my documents before adding them to a
database. During the stemming process I would like to find a way of
keeping proper nouns that span two or more words together as a phrase.
For example "New York" or "Gordon Brown" or "Prime Minister" get split
up. I see
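One possible approach to the question above, sketched under assumptions: pre-process the text so that each known multi-word proper noun becomes a single token before it is handed to the indexer. The helper `merge_phrases` and the phrase list are hypothetical, and how the merged tokens are then indexed (for instance, added directly as single terms) is left open.

```python
import re


def merge_phrases(text, phrases):
    """Join each known multi-word phrase into one underscore-separated token."""
    # Longest phrases first, so a longer phrase wins over a shorter one it contains.
    for phrase in sorted(phrases, key=len, reverse=True):
        token = "_".join(phrase.split())
        # Case-sensitive match on whole words only.
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", token, text)
    return text


# Hypothetical list of proper nouns to keep together.
KNOWN_PHRASES = ["New York", "Gordon Brown", "Prime Minister"]

merged = merge_phrases("Prime Minister Gordon Brown visited New York", KNOWN_PHRASES)
```

The same transformation would have to be applied to query text, so that the search side and the indexing side agree on the merged tokens.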
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
..._php bindings. I
run Apache 2.2 on Windows using PHP 5.2.13 with the prebuilt xapian
bindings from Flax:
Xapian Support enabled Xapian
Compiled Version @PACKAGE_VERSION@
Xapian Linked Version 1.2.0
The problem is that after indexing text which contains special characters
like ä, ö, ü and ß, using TermGenerator::index_text (
http://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html#b358784fa685139e8bdd71d37f39573e),
terms get cut off (stopped) after the special character. For example the
term gesundheitsschädlich is indexed as gesundheitsschä and Zgesundheitsschä
(stemmed).
All character en...
2015 Jul 26
1
Get term from document by position
...st, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>.
>
> Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to that for QueryParser), it still gives me the opening rather than a useful snippet.
Sorry, my mistake. The modified test.cpp file should be this (I just added
indexer.set_stemming_strategy(Xapian::TermGenerator::STEM_ALL_Z) at line 34):
============= Begin of the modified test.cpp file=======...
2010 May 27
1
Problem with stop words by indexing
...om wrote:
> On Mon, Apr 05, 2010 at 07:13:02PM +0200, Emmanuel Engelhart wrote:
> > I try to remove stop words during the index process
> and I have no stemming.
> I have tried with a simple example but it does not
> work at all.
>
> > I have my writableDatabase and my termGenerator
> (indexer) and they work
> well together: I can index texts and search
> through the database
> correctly.
> >
> > But if I add (before indexing my texts):
> > Xapian::SimpleStopper stopper;
> > stopper.add("testword");
> > indexer.set_stopp...
2007 May 30
1
QueryParser prefixing terms when stemming?
I'm new to Xapian and we just recently upgraded to version 1.0.0.0.
However, something seems to have changed during the upgrade and I
need help figuring out how my code should be written.
In version 0.9.9.1 of Search::Xapian, the following code results in
this output "Xapian::Query(pet:(pos=1))".
my $qp = new Search::Xapian::QueryParser;
$qp->set_stemmer(new
2010 Apr 05
1
Problem with stop words by indexing
Hi,
I try to remove stop words during the index process and I have no stemming.
I have tried with a simple example but it does not work at all.
I have my writableDatabase and my termGenerator (indexer) and they work
well together: I can index texts and search through the database
correctly.
But if I add (before indexing my texts):
Xapian::SimpleStopper stopper;
stopper.add("testword");
indexer.set_stopper(&stopper);
... the result is exactly the same as before. I hav...
2011 Sep 14
1
Integrated Chinese tokenizer SCWS in xapian-core
Xapian is an excellent open source search engine library, but there is no native support for Chinese word segmentation in queryparser and termgenerator.
I therefore modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and was developed by me.
Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and Xapian::TermGenerator::index_text will support Chinese...
2014 Feb 24
2
Summer of Code help
Hello Olly,
I read about Xapian and SWIG and the bindings that Xapian has with other
languages.
According to what I've read, I understand that
Xapian is a search engine library written in C/C++. It can be
integrated with web applications which handle large amounts of data.
But since the web applications may be written in a variety of languages a
binding is required for the web app to be able
2008 Sep 16
1
Some Questions From the beginner of Xapian
...ue)
What's the purpose of this method? A document is related to its terms, but what is this method for?
(2) The add_posting method adds a term to a document.
void add_posting (const std::string &tname, Xapian::termpos tpos, Xapian::termcount wdfinc=1)
I noticed that
Xapian::TermGenerator has the following method:
void index_text (const Xapian::Utf8Iterator &itor, Xapian::termcount weight=1, const std::string &prefix="")
What are the differences and the relationship between these two functions?
Thanks a lot!
Sam
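The relationship asked about here can be sketched with a toy model: `add_posting` records one occurrence of one term at a given position, while `index_text` splits a run of text into words and records a posting for each word at successive positions. The code below is plain Python, not the Xapian API; `ToyDocument` and this `index_text` are hypothetical stand-ins that only mirror the parameter names in the signatures quoted above, and they leave out stemming and other processing a real indexer performs.

```python
import re
from collections import defaultdict


class ToyDocument:
    """Toy stand-in for a document: per-term frequency and position list."""

    def __init__(self):
        self.wdf = defaultdict(int)
        self.positions = defaultdict(list)

    def add_posting(self, tname, tpos, wdfinc=1):
        # Low-level call: record ONE occurrence of a term at one position.
        self.wdf[tname] += wdfinc
        self.positions[tname].append(tpos)


def index_text(doc, text, weight=1, prefix="", start_pos=0):
    # Higher-level call: tokenize the text, then add one posting per word,
    # incrementing the position for each word.
    pos = start_pos
    for word in re.findall(r"\w+", text.lower()):
        pos += 1
        doc.add_posting(prefix + word, pos, weight)
    return pos  # last position used, so a later call can continue from it


doc = ToyDocument()
last_pos = index_text(doc, "Blah hello blah")
```

In this picture `index_text` is a convenience layer for free text, and `add_posting` is what you would call directly when you already have the terms and positions yourself.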
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
...g wrong? It seems that the API is used
incorrectly, but I cannot find the problem...
--- 8< ---
#!/usr/bin/perl
use Search::Xapian qw(:all);
use strict;
my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian",
DB_CREATE_OR_OVERWRITE);
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_flags(Search::Xapian::FLAG_SPELLING);
my $doc = new Search::Xapian::Document;
$indexer->set_document($doc);
$indexer->index_text("hello 123 blah blah");
$xa->add_document($doc);
--- >8 ---
Output:
terminate called after throwing an instance of &...
2017 Feb 08
1
searching for " in phrase and other special chars
Hello,
I'm reading xapian-core/docs/queryparser.rst and haven't been
able to find a way to escape " (double-quote) inside quoted
phrases.
Is this possible?
I'm also wondering if searching for other special characters,
such as a literal '*', is possible without triggering a wildcard
match. It would be helpful for some source code searches.
Thanks!