thr3ads.net - similar to: "No subject"

Displaying 20 results from an estimated 11000 matches similar to: "No subject"

2011 May 04

Problem in Indexing

Hello All, I am trying to index a collection of files. Details of the collection are given below. Types of files: text files with .txt extension. Size of the collection: 11400 documents [1.6 GB]. This takes a lot of time to index; it has been indexing for the last 20 hrs or so. I am using omindex. I notice that around 2900 docs are indexed very smoothly, and suddenly after that indexing becomes very sluggish.

2011 Jun 20

Revision: 15699: $tg->index_text ($text, $weight) fails with "No matching function for overloaded 'TermGenerator_index_text'"

Hi, I've been out of touch recently, so perhaps I've missed something (the last time I checked the svn pulse the Perl code was under search-xapian/ - looks like things have moved to swig). The latest trunk (revision 15699) has a problem with Perl: $tg->index_text ($text, $weight); It fails with "No matching function for overloaded 'TermGenerator_index_text'..." I

2007 Nov 14

Problem indexing text with spelling enabled in Perl

Hi All, I'm using the TermGenerator::index_text() on version 1.0.4 with the FLAG_SPELLING turned on, because the new spelling suggestion stuff seems awesome, but I'm getting a segv. (gdb) bt #0 0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/base.h:154 #1 0xb7becf47 in

2007 Dec 17

Crashes with spelling enabled and perl.

Hi Guys, Here's a simple test case that causes a segfault with the perl bindings patched to enable spelling correction: use strict; use warnings; use Search::Xapian; my $db = Search::Xapian::WritableDatabase->new("test.db", Search::Xapian::DB_CREATE_OR_OPEN); if (!defined($db)) { die("Failed to open xapian_database: $!"); } my $indexer =

2014 Mar 17

[GSOC 2014] Indexing INEX dataset

Hi Olly, wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1) with the line below automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); If it does not, then we should include that part in the patch too. I would like to create a patch for xapian-letor for resolving common code of xapian.
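To make the wdf question above concrete, here is a minimal sketch in plain Python. It does not use Xapian; `index_text` below is a hypothetical toy function that only mimics the argument order of the call quoted in the post (text, wdf increment, prefix), on the assumption that each word occurrence adds the increment to the prefixed term's wdf. Stemming, positions, and the real document-length bookkeeping are left out.

```python
import re
from collections import Counter


def index_text(doc_terms, text, wdf_inc=1, prefix=""):
    """Toy model: add wdf_inc to the wdf of prefix+word for every word in text."""
    for word in re.findall(r"\w+", text.lower()):
        doc_terms[prefix + word] += wdf_inc
    return doc_terms


title = "Indexing INEX dataset"

# Title indexed with a wdf increment of 5 versus 1, both under prefix "S".
boosted = index_text(Counter(), title, 5, "S")
normal = index_text(Counter(), title, 1, "S")

print(boosted["Sindexing"], normal["Sindexing"])    # 5 1
print(sum(boosted.values()), sum(normal.values()))  # 15 3
```

In this toy model the sum of wdfs for the title field scales with the increment, which is why a length-normalised weighting scheme would see a different field length for the same title text. Whether the real library behaves identically is exactly what the poster is asking, so treat this only as an illustration of the concern.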

2013 Sep 22

How to filter search result with query with has white space.

Hello, #include <iostream> #include <string> #include <xapian.h> struct document { std::string title; std::string content; std::string url; }; void indexData(document d) { try { Xapian::WritableDatabase db("/Users/ramesh/Desktop/xapian", Xapian::DB_CREATE_OR_OPEN); Xapian::TermGenerator indexer; Xapian::Stem

2010 Oct 24

Cannot index with dynamic spelling data (Perl/Search::Xapian)

This is my test case, what am I doing wrong? It seems that the API is used incorrectly, but I cannot find the problem... --- 8< --- #!/usr/bin/perl use Search::Xapian qw(:all); use strict; my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian", DB_CREATE_OR_OVERWRITE); my $indexer = Search::Xapian::TermGenerator->new();

2010 Jun 09

TermGenerator incorrectly tokenizes German text which contains special characters

Dear Xapian users, I am trying to index some German text with Xapian using the xapian_php bindings. I run Apache 2.2 on Windows using PHP 5.2.13 with the prebuilt Xapian bindings from Flax: Xapian Support enabled Xapian Compiled Version @PACKAGE_VERSION@ Xapian Linked Version 1.2.0 The problem is that after indexing text which contains special characters like ä, ö, ü and ß, using

2012 Jun 04

Search not finding queries with stop words.

I have a search in perl that looks a bit like: my $qp = new Search::Xapian::QueryParser(); $qp->set_stemmer(new Search::Xapian::Stem("english")); $qp->set_stemming_strategy(STEM_SOME); $qp->set_default_op($defaultop); ... my $par = $qp->parse_query($query); my $enq = $xDatabase->enquire( $par ); and in the db create script: my $stopper =

2014 Jan 27

Perl Search::Xapian

Hi, Trying to learn Search::Xapian and be better at perl at the same time, I'm stuck, at the DB_CREATE_OR_OPEN error. Perl says this: ~/dev/sandbox/Xapian-perl$ ./Index1-Xap.pl 100-objects-v1.csv db "db" is not exported by the Search::Xapian module Can't continue after import errors at ./Index1-Xap.pl line 7. BEGIN failed--compilation aborted at ./Index1-Xap.pl line 7. What I

2014 Mar 11

[GSOC 2014] Indexing INEX dataset

On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote: > During the indexing with omindex, only you need to make sure is indexing > with prefix 'S' for title as explained here in Letor documentation: > xapian-letor/docs/letor.rst > > Previously when I edited omindex.cc it was modified as can be seen >

2005 Mar 31

omindex and scriptindex question

Hi, I was researching indexing of text in omindex and scriptindex. While indexing text with omindex.cc, the position of terms is saved with a gap. This is not happening with scriptindex.cc. Why is this happening? Another question is why in omindex.cc the term position starts with 0 while in scriptindex it starts from 1? Code snippet from omindex.cc // Add postings for terms to the document

2011 Jul 27

Searching using prefixes

Hi guys I'm trying to figure out how I can use probabilistic searching on a given field within a document; I've written to the list about this before, but haven't quite figured out what's required and, following a little research, I think I understand what I need to do but I'd like a clarification on this. o We have a database of a number of documents, with fields: title,

2014 Mar 22

[GSOC 2014] Indexing INEX dataset

For unsupervised approaches like BM25 this approach works well, but letor does not need special weighting for the title in this form, as it itself assigns weights to title features separately. But I see your concern: it would be a problem when BM25 is used on the index with this setup. Hence it's preferable to take a note of this uplift in title weight for xapian-letor and normalize it everywhere

2008 Sep 16

Some Questions From the beginner of Xapian

Dear guys, I am a beginner with Xapian. When reading the documentation, I encountered the following questions. (1) I see that Xapian::Document has a method void add_value (Xapian::valueno valueno, const std::string &value). What's the purpose of this method? A document is related to its terms, but what's the purpose of this? (2) The add_posting method adds a term to a document. void

2011 Sep 14

Integrated Chinese tokenizer SCWS in xapian-core

Xapian is an excellent open source search engine library, but there is no native support for Chinese word segmentation in queryparser and termgenerator. Therefore, I modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and developed by myself. Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and

2010 Jan 28

Problem getting Xapian working with Burmese

On Fri, Aug 21, 2009 at 02:44:44PM +0200, emmanuel at engelhart.org wrote: >> I want to update my request. >> Is my question bad formulated? too trivial? ... or maybe pretty >> complicated/unclear? > >I think nobody answered as it was hard to follow your example because >the Burmese characters seem to have been mangled (at least the message I >received wasn't

2014 Mar 11

[GSOC 2014] Indexing INEX dataset

On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote: > > > > On current trunk, we index the title with prefix "S" by default in > > omindex, though with a wdf inc of 5 rather than 1: > > > > indexer.index_text(title, 5, "S"); > > > > So I don't think you need that change to omindex now. > > Yes, but please

2018 Jun 21

Welcome to the "Xapian-discuss" mailing list

Please keep replies on the mailing list — more people can help (and benefit) that way :) So OP_NEAR looks for its terms close to each other (hence "near"). The window is how far away they can be. Probably the easiest way to play with this is using the NEAR syntax in the query parser. So if you had a plain text document: I am walking, always walking. And index it in a very simple
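The idea of a proximity window described above can be sketched in plain Python using the example sentence from the post. This is not Xapian code: `term_positions` and `near` are hypothetical helpers, and the rule used here (two terms match if some pair of their positions differs by at most `window`) is an assumption for illustration. The exact boundary handling of OP_NEAR in the library may differ.

```python
import re


def term_positions(text):
    """Map each lowercased word to the list of 1-based positions where it occurs."""
    positions = {}
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        positions.setdefault(word, []).append(pos)
    return positions


def near(positions, a, b, window):
    """Assumed semantics: some occurrence of a and of b lie at most `window` apart."""
    return any(
        abs(pa - pb) <= window
        for pa in positions.get(a, [])
        for pb in positions.get(b, [])
    )


doc = term_positions("I am walking, always walking.")

print(doc["walking"])                  # [3, 5]
print(near(doc, "am", "always", 2))    # True: positions 2 and 4
print(near(doc, "i", "always", 2))     # False: positions 1 and 4
```

Widening the window to 3 would make the second query match as well, which is the "how far away they can be" behaviour the post describes.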
