similar to: Problem in Indexing

Displaying 20 results from an estimated 1200 matches similar to: "Problem in Indexing"

2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly, Wouldn't setting the weight of title terms back to normal (e.g. from 5 to 1) with the line below automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); If it does not, then we should include that part in the patch too. I would like to create a patch for xapian-letor for resolving common code of xapian.
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote: > During indexing with omindex, the only thing you need to make sure of is indexing > with prefix 'S' for title as explained here in Letor documentation: > xapian-letor/docs/letor.rst > > Previously when I edited omindex.cc it was modified as can be seen >
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote: > > > > On current trunk, we index the title with prefix "S" by default in > > omindex, though with a wdf inc of 5 rather than 1: > > > > indexer.index_text(title, 5, "S"); > > > > So I don't think you need that change to omindex now. > > Yes, but please
2011 Jun 20
1
Revision: 15699: $tg->index_text ($text, $weight) fails with "No matching function for overloaded 'TermGenerator_index_text'"
Hi, I've been out of touch recently, so perhaps I've missed something (the last time I checked the svn pulse the Perl code was under search-xapian/ - looks like things have moved to swig). The latest trunk (revision 15699) has a problem with Perl: $tg->index_text ($text, $weight); It fails with "No matching function for overloaded 'TermGenerator_index_text'..." I
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
Hi Parth, I've implemented the SVMRanker class and also sorted out most of the current Letor APIs. Now I'm trying to use the INEX dataset to verify my implementation, but I'm stuck on the indexing part. You said in the documentation that we have to add a prefix when indexing. I also noticed that you set some metadata in omindex.cc in your version, but omindex.cc has changed since 2011. I think that's why my result
2014 Mar 22
2
[GSOC 2014] Indexing INEX dataset
For unsupervised approaches like BM25 this approach works well, but letor does not need special weighting for the title in this form, as it assigns weights to title features separately itself. But I see your concern: it would be a problem when BM25 is used on an index with this setup. Hence it's preferable to take note of this uplift in title weight for xapian-letor and normalize it everywhere
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All, I'm using the TermGenerator::index_text() on version 1.0.4 with the FLAG_SPELLING turned on, because the new spelling suggestion stuff seems awesome, but I'm getting a segv. (gdb) bt #0 0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/base.h:154 #1 0xb7becf47 in
2011 Feb 20
0
No subject
feeling many PHP (only) users out there would probably not understand why their code is failing, and therefore waste a lot of time trying to figure out what's going on, or just give up altogether. I think that there'd be a place for a better designed xapian wrapper to make it accessible to the wider PHP community ('the masses'), so if I ever get a chance I might take a look at
2005 Mar 31
1
omindex and scriptindex question
Hi, I was researching indexing of text in omindex and scriptindex. While indexing text with omindex.cc, the position of terms is saved with a gap. This does not happen with scriptindex.cc. Why is this happening? Another question: why does the term position start at 0 in omindex.cc, while in scriptindex it starts at 1? Code snippet from omindex.cc // Add postings for terms to the document
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
Dear Xapian users, I try to index some German text with Xapian using the xapian_php bindings. I run Apache 2.2 on Windows using PHP 5.2.13 with the pre build xapian bindings from Flax: Xapian Support enabled Xapian Compiled Version @PACKAGE_VERSION@ Xapian Linked Version 1.2.0 The problem is that after indexing text which contains special characters like ?, ?, ? and ?, using
2014 Jan 27
4
Perl Search::Xapian
Hi, Trying to learn Search::Xapian and be better at perl at the same time, I'm stuck, at the DB_CREATE_OR_OPEN error. Perl says this: ~/dev/sandbox/Xapian-perl$ ./Index1-Xap.pl 100-objects-v1.csv db "db" is not exported by the Search::Xapian module Can't continue after import errors at ./Index1-Xap.pl line 7. BEGIN failed--compilation aborted at ./Index1-Xap.pl line 7. What I
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
This is my test case, what am I doing wrong? It seems that the API is used incorrectly, but I cannot find the problem... --- 8< --- #!/usr/bin/perl use Search::Xapian qw(:all); use strict; my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian", DB_CREATE_OR_OVERWRITE); my $indexer = Search::Xapian::TermGenerator->new();
2007 Jun 01
2
Is aaf multi_search broken?
Hi all, I want to use acts_as_ferret's multi_search to search two model classes (Reviewable and Blog) at a time like @results = Reviewable.multi_search("jemen", [Blog]) and I'm always getting the error You have a nil object when you didn't expect it! You might have expected an instance of Array. The error occurred while evaluating nil.map
2020 Feb 08
2
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote: > On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote: > > Hey all, I've been using ->add_database for a few years > > to tie sharded DBs together and it works great. > > > > Now, I want to be able to search across several DBs > > which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB. > >
2020 Feb 07
2
prioritizing aggregated DBs
Hey all, I've been using ->add_database for a few years to tie sharded DBs together and it works great. Now, I want to be able to search across several DBs which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB. I want to search for something across all of them, but prioritize results to favor one or some of those DBs over others. Is there a way to do that without reindexing? Or
2010 Jan 28
3
Problem getting Xapian working with Burmese
On Fri, Aug 21, 2009 at 02:44:44PM +0200, emmanuel at engelhart.org wrote: >> I want to update my request. >> Is my question badly formulated? too trivial? ... or maybe pretty >> complicated/unclear? > >I think nobody answered as it was hard to follow your example because >the Burmese characters seem to have been mangled (at least the message I >received wasn't
2008 Jan 15
7
PHP indexing, what's the PHP method for indexscript
Currently I have the following indexscript: pid : unique=Q boolean=Q field=pid postdate : field=startdate author_name: unhtml boolean=XAUTHORNAME field=author author_id: boolean=XAUTHORID field=authorid url : field=url sample : weight=1 index field=sample How can I create the same indexing using PHP? With this, I can get a searchable index, but I have no idea how to set the fields, so that I
2008 Sep 16
1
Some Questions From the beginner of Xapian
Dear guys, I am a beginner with Xapian. When reading the documents, I encountered the following questions. (1) I see that Xapian::Document has a method void add_value (Xapian::valueno valueno, const std::string &value) What's the purpose of this method? A document is related to its terms, but what's the purpose of this? (2) The add_posting method will add a term to a document. void
2014 Mar 11
2
[GSOC 2013] Question about indexing INEX dataset
Hi, I'm trying to use Omega to index the INEX dataset for Letor, but omindex told me these XML files are unknown. Olly told me I could tell omindex to handle them as HTML. (Thanks Olly :) ) Is that appropriate? Parth, could you give me some suggestions? Thank you! Jiarong Wei
2011 Sep 14
1
Integrated Chinese tokenizer SCWS in xapian-core
Xapian is an excellent open source search engine library, but there is no native support for Chinese word segmentation in queryparser and termgenerator. Therefore, I modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and developed by myself. Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and