similar to: Cannot index with dynamic spelling data (Perl/Search::Xapian)

Displaying 20 results from an estimated 600 matches similar to: "Cannot index with dynamic spelling data (Perl/Search::Xapian)"

2007 Dec 29
3
Term-Flags
Hi, is it necessary to set the flag below on the TermGenerator if I want the "Did you mean ..." spelling corrections? Xapian::TermGenerator::flags::FLAG_SPELLING Thank you very much, Markus
2007 Dec 17
1
Crashes with spelling enabled and perl.
Hi Guys, Here's a simple test case that causes a segfault with the perl bindings patched to enable spelling correction: use strict; use warnings; use Search::Xapian; my $db = Search::Xapian::WritableDatabase->new("test.db", Search::Xapian::DB_CREATE_OR_OPEN); if (!defined($db)) { die("Failed to open xapian_database: $!"); } my $indexer =
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All, I'm using the TermGenerator::index_text() on version 1.0.4 with the FLAG_SPELLING turned on, because the new spelling suggestion stuff seems awesome, but I'm getting a segv. (gdb) bt #0 0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/base.h:154 #1 0xb7becf47 in
2008 Mar 27
2
Proper noun stemming
Hi All, I was wondering if anyone had a solution for the following problem. I use QueryParser to stem my documents before adding them to a database. During the stemming process I would like to find a way of keeping proper nouns that span two or more words together as a phrase. For example "New York" or "Gordon Brown" or "Prime Minister" get split up. I see
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
Dear Xapian users, I try to index some German text with Xapian using the xapian_php bindings. I run Apache 2.2 on Windows using PHP 5.2.13 with the prebuilt xapian bindings from Flax: Xapian Support enabled Xapian Compiled Version @PACKAGE_VERSION@ Xapian Linked Version 1.2.0 The problem is that after indexing text which contains special characters like ä, ö, ü and ß, using
2008 Jan 15
7
PHP indexing, what's the PHP method for indexscript
Currently I have the following indexscript: pid : unique=Q boolean=Q field=pid postdate : field=startdate author_name: unhtml boolean=XAUTHORNAME field=author author_id: boolean=XAUTHORID field=authorid url : field=url sample : weight=1 index field=sample How can I create the same indexing using PHP? With this, I can get a searchable index, but I have no idea how to set the fields, so that I
2014 Jan 27
4
Perl Search::Xapian
Hi, Trying to learn Search::Xapian and be better at perl at the same time, I'm stuck, at the DB_CREATE_OR_OPEN error. Perl says this: ~/dev/sandbox/Xapian-perl$ ./Index1-Xap.pl 100-objects-v1.csv db "db" is not exported by the Search::Xapian module Can't continue after import errors at ./Index1-Xap.pl line 7. BEGIN failed--compilation aborted at ./Index1-Xap.pl line 7. What I
2018 Jun 20
2
Welcome to the "Xapian-discuss" mailing list
Hi, I'm new to Xapian and wanted to know if it has a specific feature. I want to be able to check the relation between two terms on a page based on how close they are together on the page. I want to use a combination of n-gram based labeling and the "slop" feature found in Elasticsearch. Does Xapian have this/a similar feature? I haven't been able to find any programs that have
2014 Feb 27
2
Summer of Code help
I think there is a development in the bug #616. The exception obtained is: Exception in thread "main" java.lang.IllegalArgumentException: No enum class org.xapian.TermGenerator$flags with value 0 at org.xapian.TermGenerator$flags.swigToEnum(TermGenerator.java:143) at org.xapian.TermGenerator.setFlags(TermGenerator.java:71) at org.xapian.examples.SimpleIndex.main(SimpleIndex.java:54)
2018 Nov 30
1
Xapian Benchmark results
Hi, I am currently trying to benchmark a multithreaded Xapian implementation, written in C++, on a Chameleon bare-metal instance. My workload is a 3 GB Wikipedia XML dump consisting of ~286 files of different sizes. My results are showing me that indexing on Xapian is an order of magnitude faster than my Lucene and Lucene++ implementations. This is a result that I did not expect. Just want to
2015 Jul 26
1
Get term from document by position
mple (see attachment). > > Attachments get stripped out by the mailing list, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>. > > Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to
2010 Oct 21
2
In-memory databases vs PHP Bindings
I can't quite connect the dots on this, perhaps someone can help. I'm simply trying to create an in-memory database comprising a single document, so that I can run a load of queries against it and see if any of them match the new document (this is to enable users to have 'subscriptions' to saved searches and be alerted every time a new item is published that matches their
2012 Jun 04
1
Search not finding queries with stop words.
I have a search in perl that looks a bit like: my $qp = new Search::Xapian::QueryParser(); $qp->set_stemmer(new Search::Xapian::Stem("english")); $qp->set_stemming_strategy(STEM_SOME); $qp->set_default_op($defaultop); ... my $par = $qp->parse_query($query); my $enq = $xDatabase->enquire( $par ); and in the db create script: my $stopper =
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly, Wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1) with the line below automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); If it does not, then we should include that part in the patch too. I would like to create a patch for xapian-letor for resolving common code of xapian.
2012 Nov 03
1
get the title from the document
Dear all, I am working on a very simple project, in which I wanna get the title from the document. For instance, this is what I have done so far. ///////////// code for building the index file # Load content content = open(filePath).read() # Prepare document document = xapian.Document() document.set_data(content) # Store fileName
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote: > > > > On current trunk, we index the title with prefix "S" by default in > > omindex, though with a wdf inc of 5 rather than 1: > > > > indexer.index_text(title, 5, "S"); > > > > So I don't think you need that change to omindex now. > > Yes, but please
2018 Jun 21
0
Welcome to the "Xapian-discuss" mailing list
Please keep replies on the mailing list — more people can help (and benefit) that way :) So OP_NEAR looks for its terms close to each other (hence "near"). The window is how far away they can be. Probably the easiest way to play with this is using the NEAR syntax in the query parser. So if you had a plain text document: I am walking, always walking. And index it in a very simple
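The window idea described in this reply can be illustrated without Xapian at all. The following is a minimal pure-Python sketch, not Xapian's implementation: the helper names `positions` and `near` are hypothetical, and the rule used here (the chosen positions may differ by at most `window`) is an assumption made for illustration; Xapian's own window arithmetic may count differently.

```python
import re
from itertools import product


def positions(text):
    """Map each lowercased word to the list of its 1-based word positions."""
    index = {}
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        index.setdefault(word, []).append(pos)
    return index


def near(index, terms, window):
    """Return True if every term occurs and some choice of one position per
    term spans at most `window` positions (assumed rule, see the note above)."""
    lists = [index.get(term.lower(), []) for term in terms]
    if not all(lists):
        # At least one term is absent from the document.
        return False
    return any(max(combo) - min(combo) <= window for combo in product(*lists))


# The example sentence from the reply: I=1, am=2, walking=3, always=4, walking=5
doc = positions("I am walking, always walking.")
print(near(doc, ["am", "always"], 2))  # positions 2 and 4 are 2 apart
print(near(doc, ["i", "always"], 2))   # positions 1 and 4 are 3 apart
```

With a window of 2, the first check succeeds and the second fails, which shows how widening or narrowing the window changes what counts as "near".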
2011 Jun 20
1
Revision: 15699: $tg->index_text ($text, $weight) fails with "No matching function for overloaded 'TermGenerator_index_text'"
Hi, I've been out of touch recently, so perhaps I've missed something (the last time I checked the svn pulse the Perl code was under search-xapian/ - looks like things have moved to swig). The latest trunk (revision 15699) has a problem with Perl: $tg->index_text ($text, $weight); It fails with "No matching function for overloaded 'TermGenerator_index_text'..." I
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote: > During the indexing with omindex, only you need to make sure is indexing > with prefix 'S' for title as explained here in Letor documentation: > xapian-letor/docs/letor.rst > > Previously when I edited omindex.cc it was modified as can be seen >
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
Hi Parth, I've implemented the SVMRanker class and also sorted out most of the current Letor APIs. Now I'm trying to use the INEX dataset to verify my implementation, but I am stuck on the indexing part. You said in the documentation that we have to add a prefix when indexing. Also I notice that you set some metadata in omindex.cc of your version. But the omindex.cc has changed since 2011. I think that's why my result