search for: index_text

Displaying 20 results from an estimated 43 matches for "index_text".

2011 Jun 20
1
Revision: 15699: $tg->index_text ($text, $weight) fails with "No matching function for overloaded 'TermGenerator_index_text'"
Hi, I've been out of touch recently, so perhaps I've missed something (the last time I checked the svn pulse the Perl code was under search-xapian/ - looks like things have moved to swig). The latest trunk (revision 15699) has a problem with Perl: $tg->index_text ($text, $weight); It fails with "No matching function for overloaded 'TermGenerator_index_text'..." I take it the missing code in xapian-bindings/perl/Search/Xapian.pm is the issue? Regards Henry
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All, I'm using the TermGenerator::index_text() on version 1.0.4 with the FLAG_SPELLING turned on, because the new spelling suggestion stuff seems awesome, but I'm getting a segv. (gdb) bt #0 0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/base.h:154 #1 0xb7b...
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly, Wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1) with the line below automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); If it does not, then we should include that part in the patch too. I'd like to create a patch for xapian-letor for resolving common code of xapian. Cheers, Parth. On Wed, Mar 12, 2014 at 3:13 AM, Jiarong Wei <vcam...
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote: > > > > On current trunk, we index the title with prefix "S" by default in > > omindex, though with a wdf inc of 5 rather than 1: > > > > indexer.index_text(title, 5, "S"); > > > > So I don't think you need that change to omindex now. > > Yes, but please make sure to change 5 to 1, otherwise divide the final count > statistics by 5. :) We really need to resolve any instances where letor requires code in other parts...
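The "divide the final count statistics by 5" remark can be illustrated with a small sketch. This is a toy model only, not the Xapian API: `toy_index_text`, `raw_count` and `WdfMap` are hypothetical names, and tokenisation is plain whitespace splitting with no stemming or case folding. It assumes each occurrence of a word adds `wdf_inc` to that term's wdf, so a title indexed with an increment of 5 yields wdfs five times the raw occurrence counts.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Toy model, NOT the Xapian API: maps prefixed term -> accumulated wdf.
using WdfMap = std::map<std::string, unsigned>;

// Hypothetical helper: split text on whitespace and add wdf_inc to the wdf
// of prefix + word for every occurrence of word.
void toy_index_text(WdfMap& wdfs, const std::string& text,
                    unsigned wdf_inc, const std::string& prefix) {
    assert(wdf_inc > 0);  // the toy model only handles positive increments
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        wdfs[prefix + word] += wdf_inc;
    }
}

// Hypothetical helper: recover the raw occurrence count from a wdf that was
// accumulated with a fixed increment of wdf_inc.
unsigned raw_count(unsigned wdf, unsigned wdf_inc) {
    assert(wdf_inc > 0);
    return wdf / wdf_inc;
}
```

In this model, indexing the title "xapian search xapian" with an increment of 5 and prefix "S" gives "Sxapian" a wdf of 10, and `raw_count(10, 5)` recovers the 2 occurrences. Indexing with an increment of 1 makes the division unnecessary.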
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
...I run Apache 2.2 on Windows using PHP 5.2.13 with the pre-built xapian bindings from Flax: Xapian Support enabled Xapian Compiled Version @PACKAGE_VERSION@ Xapian Linked Version 1.2.0 The problem is that after indexing text which contains special characters like ä, ö, ü and ß, using TermGenerator::index_text ( http://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html#b358784fa685139e8bdd71d37f39573e), terms get cut off (stopped) after the special character. For example the term gesundheitsschädlich is indexed as gesundheitsschä and Zgesundheitsschä (stemmed). All character encodings are...
2014 Jan 27
4
Perl Search::Xapian
...>) { my $description = $csvline->{DESCRIPTION}; my $title = $csvline->{TITLE}; my $identifier = $csvline->{id_NUMBER}; # We make a doc and tell the term generator to use this. my $doc = Search::Xapian::Document->new(); $tg->set_document($doc); $tg->index_text($title, 1, 'S'); $tg->index_text($description, 1, 'XD'); # index fields without prefixes for general search. $tg->index_text($title); $tg->increase_termpos(); $tg->index_text($description); # Store all the fields for display purposes. # this...
2005 Mar 31
1
omindex and scriptindex question
...ap. This is not happening with scriptindex.cc. Why is this happening? Another question is why in omindex.cc the term position starts with 0 while in scriptindex it starts from 1? Code snippet from omindex.cc // Add postings for terms to the document Xapian::termpos pos = 1; pos = index_text(title, newdocument, stemmer, pos); pos = index_text(dump, newdocument, stemmer, pos + 100); pos = index_text(keywords, newdocument, stemmer, pos + 100); Code snippet from scriptindex.cc Xapian::termpos wordcount = 0; ........... for (i = v.begin(); i != v.end(); ++i) { ......................
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
...= new Search::Xapian::WritableDatabase ("/tmp/xapian", DB_CREATE_OR_OVERWRITE); my $indexer = Search::Xapian::TermGenerator->new(); $indexer->set_flags(Search::Xapian::FLAG_SPELLING); my $doc = new Search::Xapian::Document; $indexer->set_document($doc); $indexer->index_text("hello 123 blah blah"); $xa->add_document($doc); --- >8 --- Output: terminate called after throwing an instance of 'Xapian::InvalidOperationError' Aborted It works fine without "$indexer->set_flags(Search::Xapian::FLAG_SPELLING);", but then spelling correct...
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
...838 and block 1532-1559. > > But now we have the same as xapian-letor/bin/xapian-letor-update.cc so > before starting with questletor.cc you need to run it once for each db and > in this case all you need to make sure is below line in omindex.cc while > indexing. > > indexer.index_text(title, 1,"S"); On current trunk, we index the title with prefix "S" by default in omindex, though with a wdf inc of 5 rather than 1: indexer.index_text(title, 5, "S"); So I don't think you need that change to omindex now. Cheers, Olly
2007 Jun 01
2
Is aaf multi_search broken?
...et/lib/class_methods.rb:131:in `id_multi_search' #{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:113:in `multi_search' #{RAILS_ROOT}/app/controllers/search_controller.rb:53:in `search' I have configured indexing like this: acts_as_ferret :fields => [:index_text, :index_locations], :single_index => true Maybe I'm doing something wrong? Thanks, Starburger -- Posted via http://www.ruby-forum.com/.
2007 Dec 17
1
Crashes with spelling enabled and perl.
...N); if (!defined($db)) { die("Failed to open xapian_database: $!"); } my $indexer = Search::Xapian::TermGenerator->new(); $indexer->set_flags(Search::Xapian::FLAG_SPELLING); my $document = Search::Xapian::Document->new(); $indexer->set_document($document); $indexer->index_text(lc('test'), 1); $db->add_document($document); undef $db; Here's the patch to enable spelling against Search-Xapian-1.0.4.0: http://rusty.devel.infogears.com/xap-perl-spelling.diff Here's the backtrace against 1.0.4: Program received signal SIGSEGV, Segmentation fault. [Switch...
2012 Jun 04
1
Search not finding queries with stop words.
...>new(); my $stemmer = Search::Xapian::Stem->new('english'); $doc->set_data($jsonText); $indexer->set_stemmer($stemmer); $indexer->set_stopper($stopper); $indexer->set_document($doc); $indexer->index_text($docBody); $indexer->increase_termpos(); $indexer->index_text($subject); ... (other index_text and add_value calls) $xdb->add_document($doc); If I look for something like index of elements, I get no results even though that phrase exists (no, I don...
2020 Feb 08
2
prioritizing aggregated DBs
...ght > contribution from the PostingSource for matching documents). Cool. I'll keep that in mind down the line. That could be a while since some users are still on 1.2 and tend to stick to what's provided by enterprise/LTS distros. > > Or would I fiddle with wdf_inc for all ->index_text and ->add_term > > calls on a per-DB basis? > > That would probably work if you don't want to be able to vary the > prioritisation dynamically. That's a compromise I'll have to make, for now. Thanks for the response!
2020 Feb 07
2
prioritizing aggregated DBs
...rch across several DBs which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB. I want to search for something across all of them, but prioritize results to favor one or some of those DBs over others. Is there a way to do that without reindexing? Or would I fiddle with wdf_inc for all ->index_text and ->add_term calls on a per-DB basis? Thanks.
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
Hi Parth, I've implemented the SVMRanker class and also sorted out most of the current Letor APIs. Now I'm trying to use the INEX dataset to verify my implementation. But I am stuck in the indexing part. You said in the documentation that we have to add a prefix when indexing. Also I notice that you set some metadata in omindex.cc of your version. But the omindex.cc has changed since 2011. I think that's why my result
2013 Sep 22
2
How to filter search result with query with has white space.
...rs/ramesh/Desktop/xapian", Xapian::DB_CREATE_OR_OPEN); Xapian::TermGenerator indexer; Xapian::Stem stemmer("english"); indexer.set_stemmer(stemmer); Xapian::Document doc; doc.set_data(d.title); indexer.set_document(doc); indexer.index_text(d.title,1,"title"); indexer.index_text(d.content,1,"content"); indexer.index_text(d.url,1,"url"); doc.add_boolean_term("title"+d.title); db.replace_document(d.url,doc); db.commit(); } catch (const Xapian::Error &...
2013 Sep 22
2
How to filter search result with query with has white space.
...rs/ramesh/Desktop/xapian", Xapian::DB_CREATE_OR_OPEN); Xapian::TermGenerator indexer; Xapian::Stem stemmer("english"); indexer.set_stemmer(stemmer); Xapian::Document doc; doc.set_data(d.title); indexer.set_document(doc); indexer.index_text(d.title,1,"title"); indexer.index_text(d.content,1,"content"); indexer.index_text(d.url,1,"url"); doc.add_boolean_term("title"+d.title); db.replace_document(d.url,doc); db.commit(); } catch (const Xapian::Error &...
2011 Jul 27
3
Searching using prefixes
...required and, following a little research, I think I understand what I need to do but I'd like a clarification on this. o We have a database of a number of documents, with fields: title, subtitle, summary and table of contents o By default, we pass these fields into the TermGenerator::index_text function to generate terms and add these to a Xapian::Document, applying a weighting where required o We then search these fields using Xapian::QueryParser::parse_query o This gives a result which searches all of the fields for the required string I'd like to add the ability to search J...
2011 May 04
1
Problem in Indexing
...of the collection: 11400 documents [1.6 GB] This takes a lot of time to index and has been indexing for the last 20 hrs or so. I am using omindex. I notice that around 2900 docs are indexed very smoothly and suddenly after that indexing becomes very sluggish. I have tried a couple of tricks like replacing the index_text() call with index_text_without_positions(). I also tried setting XAPIAN_FLUSH_THRESHOLD to 1500 documents from the default of 10000. The time mentioned above is after these tricks. Any help will be appreciated. Thanks, Parth.
2008 Sep 16
1
Some Questions From the beginner of Xapian
...hod? Document will be related to the terms, but what's the purpose of this? (2) The add_posting method will add a term to a document. void add_posting (const std::string &tname, Xapian::termpos tpos, Xapian::termcount wdfinc=1) I noticed that Xapian::TermGenerator has the following method void index_text (const Xapian::Utf8Iterator &itor, Xapian::termcount weight=1, const std::string &prefix="") What are the differences and relationship between these two functions? Thanks a lot! Sam
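One way to picture the relationship asked about above is the following toy sketch. It is not the Xapian implementation: `ToyDocument`, `ToyPosting` and `toy_index_text` are hypothetical names, and the tokeniser is a plain whitespace split with no stemming, stopping or case folding. It assumes that an index_text-style call is a convenience layer that splits text into words and records one posting per word at consecutive positions, whereas an add_posting-style call records exactly one term at one position that the caller chooses.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for the per-term data a document holds: the accumulated wdf
// and the list of positions at which the term occurs.
struct ToyPosting {
    unsigned wdf = 0;
    std::vector<unsigned> positions;
};

// Toy stand-in for a document, NOT Xapian::Document.
struct ToyDocument {
    std::map<std::string, ToyPosting> terms;

    // Low-level call: record ONE term at ONE caller-chosen position.
    void add_posting(const std::string& tname, unsigned tpos,
                     unsigned wdfinc = 1) {
        ToyPosting& p = terms[tname];
        p.wdf += wdfinc;
        p.positions.push_back(tpos);
    }
};

// High-level call: tokenise the text and call add_posting once per word,
// numbering positions from last_pos + 1. Returns the last position used.
unsigned toy_index_text(ToyDocument& doc, const std::string& text,
                        unsigned weight = 1, const std::string& prefix = "",
                        unsigned last_pos = 0) {
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        doc.add_posting(prefix + word, ++last_pos, weight);
    }
    return last_pos;
}
```

In this model, `toy_index_text(doc, "hello big hello")` is equivalent to three add_posting calls: "hello" at positions 1 and 3 (wdf 2) and "big" at position 2 (wdf 1).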