Displaying 20 results from an estimated 7000 matches similar to: "patch - add SnippetGenerator class"
2016 Jul 29
3
Pull requests: CJK words and Snippet generator
Hi James,
thanks for the feedback.
On Thu, Jul 28, 2016, at 00:22, James Aylett wrote:
> This sounds great! I know sufficiently little about CJK that I won't
> try to comment on that at all :)
I've just opened a pull request for the CJK tokenizer:
https://github.com/xapian/xapian/pull/114
> I wonder if we can arrange suitable defaults to use your
> implementation with the
2015 Jun 10
1
make check xapian-bindings-1.2.21 & Search-Xapian-1.2.21.0
Eric Lindblad
http://www.ericlindblad.blogspot.com
- - -
Slackware-14.0
bash-4.2# make check
Making check in perl
make[1]: Entering directory `/home/eric/xapian-bindings-1.2.21/perl'
make check-am
make[2]: Entering directory `/home/eric/xapian-bindings-1.2.21/perl'
make check-TESTS
make[3]: Entering directory `/home/eric/xapian-bindings-1.2.21/perl'
./t/01use.t .. ok
All tests
2015 Jul 26
1
Get term from document by position
mple (see attachment).
>
> Attachments get stripped out by the mailing list, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>.
>
> Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to
2008 Mar 12
1
how can i use stopwords?
Hi,
I do not understand the stopword function...
I've set the termgenerator like this:
$self->{'Stemmer'} = new Search::Xapian::Stem('german2');
$self->{'Stopper'} = new Search::Xapian::SimpleStopper();
$self->{'TermGenerator'} = new Search::Xapian::TermGenerator;
$self->{'TermGenerator'}->set_stemmer( $self->{'Stemmer'} );
2007 Dec 17
1
Crashes with spelling enabled and perl.
Hi Guys,
Here's a simple test case that causes a segfault with the perl
bindings patched to enable spelling correction:
use strict;
use warnings;
use Search::Xapian;
my $db = Search::Xapian::WritableDatabase->new("test.db",
Search::Xapian::DB_CREATE_OR_OPEN);
if (!defined($db)) {
die("Failed to open xapian_database: $!");
}
my $indexer =
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All,
I'm using the TermGenerator::index_text() on version 1.0.4 with the
FLAG_SPELLING turned on, because the new spelling suggestion stuff
seems awesome, but I'm getting a segv.
(gdb) bt
#0 0xb7ae153c in Xapian::WritableDatabase::add_spelling
(this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/
base.h:154
#1 0xb7becf47 in
2014 Feb 27
2
Summer of Code help
I think there is a development on bug #616.
The exception obtained is:
Exception in thread "main" java.lang.IllegalArgumentException: No enum
class org.xapian.TermGenerator$flags with value 0
at org.xapian.TermGenerator$flags.swigToEnum(TermGenerator.java:143)
at org.xapian.TermGenerator.setFlags(TermGenerator.java:71)
at org.xapian.examples.SimpleIndex.main(SimpleIndex.java:54)
2010 May 27
1
Problem with stop words by indexing
On Thu 15/04/10 02:36, "Olly Betts" olly at survex.com wrote:
> On Mon, Apr 05, 2010 at 07:13:02PM +0200, Emmanuel Engelhart wrote:
> > I try to remove stop words during the index process
> and I have no stemming.
> I have tried with a simple example but it does not
> work at all.
>
> > I have my writableDatabase and my termGenerator
> (indexer) and they
2007 Dec 29
3
Term-Flags
Hi,
Is it necessary to set the flag below on the TermGenerator,
if I want the "Did you mean ..." spelling corrections?
Xapian::TermGenerator::flags::FLAG_SPELLING
Thank you very much
Markus
2012 Nov 26
1
Word missing after stemmed with Norwegian in Search::Xapian::TermGenerator
Hi all Xapian-devel,
Gist: https://gist.github.com/10d2222d8bffe8d7631d
I'm using Xapian's TermGenerator to convert Norwegian sentences to a VSM
(vector space model). But when I test generating the VSM
from 'Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus likte
svært dårlig kundens misfornøyde leserbrev.' it doesn't return 'asus'
in the VSM.
2011 Sep 14
1
Integrated Chinese tokenizer SCWS in xapian-core
Xapian is an excellent open source search engine library, but it has no native support for Chinese word segmentation in queryparser and termgenerator.
Therefore, I modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and was developed by me.
Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and
2018 Nov 30
1
Xapian Benchmark results
Hi,
I am currently trying to benchmark a multithreaded Xapian implementation,
written in C++, on a Chameleon bare-metal instance. My workload is a 3 GB
Wikipedia XML dump consisting of ~286 files of different sizes. My results
show that indexing with Xapian is an order of magnitude faster than
my Lucene and Lucene++ implementations. This is a result that I did
not expect. Just want to
2013 May 15
1
Match positions of a queryresult
Hello,
I've just started learning Xapian and I'm facing the following problem.
I've indexed many text files (using a TermGenerator from std::string), each
document in my database is a single file on the disk.
The search works pretty well and finds the files that match the query
string, but I can't figure out how I can determine the location of the
actual matched terms. I want to
2007 Jun 15
1
TermGenerator in PHP4
(xapian 1.0.1)
Should TermGenerator in the PHP4 bindings be called XapianTermGenerator?
Thanks,
Tim.
2017 Jan 04
0
Formulating Advanced Queries with Xapian-Omega
On Thu, Dec 29, 2016 at 05:44:50PM +0100, Giulio Teslano wrote:
> a. What other types of extended wild card(s) options are there ?
>
> or is this still currently limited to these two characters '*?' ?
As I said, the branch "adds support for arbitrary glob-style wildcard
patterns (where * matches 0 or more characters and ? a single
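
The glob semantics quoted in this reply (`*` for zero or more characters, `?` for a single one) can be tried outside Xapian with Python's standard `fnmatch` module. This is only a sketch of the pattern semantics, not of how the Xapian branch implements them; the term list and the `expand` helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical index terms, invented for illustration only.
terms = ["xapian", "xapians", "xap", "omega"]

def expand(pattern, candidates):
    """Return the candidates that match a glob pattern (case-sensitive)."""
    return [t for t in candidates if fnmatchcase(t, pattern)]

star = expand("xap*", terms)        # '*' matches zero or more characters
single = expand("xapian?", terms)   # '?' matches exactly one character
print(star)
print(single)
```

Note that `fnmatch` also gives `[...]` a special meaning, which may or may not correspond to what the branch supports.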
2010 Apr 05
1
Problem with stop words by indexing
Hi,
I am trying to remove stop words during indexing, and I am not using stemming.
I have tried with a simple example but it does not work at all.
I have my writableDatabase and my termGenerator (indexer) and they work
well together: I can index texts and search through the database
correctly.
But if I add (before indexing my texts):
Xapian::SimpleStopper stopper;
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
This is my test case, what am I doing wrong? It seems that the API is used
incorrectly, but I cannot find the problem...
--- 8< ---
#!/usr/bin/perl
use Search::Xapian qw(:all);
use strict;
my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian",
DB_CREATE_OR_OVERWRITE);
my $indexer = Search::Xapian::TermGenerator->new();
2008 Sep 16
1
Some Questions From the beginner of Xapian
Dear guys,
I am a beginner with Xapian, and while reading the documentation I ran into the following questions.
(1) I see that Xapian::Document has a method
void add_value (Xapian::valueno valueno, const std::string &value)
What is the purpose of this method? A document is associated with its terms, but what are values for?
(2) The add_posting method adds a term to a document.
void
2007 Jun 28
1
TermGenerator and SimpleStopper
Hi,
I'm using SimpleStopper with TermGenerator in a Python indexing
script, in an attempt to keep my index size down (currently 30K per
doc, and I have 200 million docs to index, which I think implies
6TB.) However, unprefixed (positional?) terms are not affected by
the stopper, though Z-prefixed terms are.
I assume this is intentional for phrase queries, but I need to reduce
my
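
The size estimate in this post can be checked with a few lines of arithmetic. The sketch assumes that "30K" means 30,000 bytes and that "TB" is a decimal terabyte; the post states neither convention.

```python
# Figures quoted in the post: about 30K of index per document, 200 million documents.
bytes_per_doc = 30 * 1000           # assumption: "30K" = 30,000 bytes
doc_count = 200_000_000

total_bytes = bytes_per_doc * doc_count
total_tb = total_bytes / 1000**4    # assumption: decimal terabytes (10^12 bytes)
print(f"{total_tb:.1f} TB")
```

Under those assumptions the result agrees with the 6TB figure given in the post.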
2016 Jul 26
2
Pull requests: CJK words and Snippet generator
Hi,
The Cyrus IMAP mail server uses Xapian as search engine. Recently,
FastMail has sponsored implementation of two Xapian features: CJK word
splitting and a generator for search snippets. I've been working on both
features and we would be happy to get them merged into Xapian master.
The CJK word tokenizer uses the word segmentation algorithms of the
International Components for Unicode