Displaying 20 results from an estimated 43 matches for "index_text".
2011 Jun 20
1
Revision: 15699: $tg->index_text ($text, $weight) fails with "No matching function for overloaded 'TermGenerator_index_text'"
Hi,
I've been out of touch recently, so perhaps I've missed something (the last
time I checked the svn pulse the Perl code was under search-xapian/ - looks
like things have moved to swig).
The latest trunk (revision 15699) has a problem with Perl:
$tg->index_text ($text, $weight);
It fails with "No matching function for overloaded 'TermGenerator_index_text'..."
I take it the missing code in xapian-bindings/perl/Search/Xapian.pm is the issue?
Regards
Henry
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All,
I'm using the TermGenerator::index_text() on version 1.0.4 with the
FLAG_SPELLING turned on, because the new spelling suggestion stuff
seems awesome, but I'm getting a segv.
(gdb) bt
#0 0xb7ae153c in Xapian::WritableDatabase::add_spelling
(this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/
base.h:154
#1 0xb7b...
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly,
Wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1)
with the line below automatically adjust the wdfs and field lengths?
indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S");
If it does not, then we should include that part in the patch too. I'd like to
create a patch for xapian-letor to resolve the code it has in common with xapian.
Cheers,
Parth.
On Wed, Mar 12, 2014 at 3:13 AM, Jiarong Wei <vcam...
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote:
> >
> > On current trunk, we index the title with prefix "S" by default in
> > omindex, though with a wdf inc of 5 rather than 1:
> >
> > indexer.index_text(title, 5, "S");
> >
> > So I don't think you need that change to omindex now.
>
> Yes, but please make sure to change 5 to 1; otherwise, divide the final count
> statistics by 5. :)
We really need to resolve any instances where letor requires code in
other parts...
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
...I
run Apache 2.2 on Windows using PHP 5.2.13 with the pre-built xapian
bindings from Flax:
Xapian Support enabled Xapian
Compiled Version @PACKAGE_VERSION@
Xapian Linked Version 1.2.0
The problem is that after indexing text which contains special characters
like ?, ?, ? and ?, using TermGenerator::index_text (
http://xapian.org/docs/sourcedoc/html/classXapian_1_1TermGenerator.html#b358784fa685139e8bdd71d37f39573e),
terms get cut off (stopped) after the special character. For example the
term gesundheitsschädlich is indexed as gesundheitssch? and Zgesundheitssch?
(stemmed).
All character encodings are...
2014 Jan 27
4
Perl Search::Xapian
...>) {
my $description = $csvline->{DESCRIPTION};
my $title = $csvline->{TITLE};
my $identifier = $csvline->{id_NUMBER};
# We make a doc and tell the term generator to use this.
my $doc = Search::Xapian::Document->new();
$tg->set_document($doc);
$tg->index_text($title, 1, 'S');
$tg->index_text($description, 1, 'XD');
# index fields without prefixes for general search.
$tg->index_text($title);
$tg->increase_termpos();
$tg->index_text($description);
# Store all the fields for display purposes.
# this...
2005 Mar 31
1
omindex and scriptindex question
...ap.
This is not happening with scriptindex.cc
Why is this happening?
Another question: why does the term position start at 0 in omindex.cc, while
in scriptindex it starts at 1?
Code snippet from omindex.cc
// Add postings for terms to the document
Xapian::termpos pos = 1;
pos = index_text(title, newdocument, stemmer, pos);
pos = index_text(dump, newdocument, stemmer, pos + 100);
pos = index_text(keywords, newdocument, stemmer, pos + 100);
Code snippet from scriptindex.cc
Xapian::termpos wordcount = 0;
...........
for (i = v.begin(); i != v.end(); ++i) {
......................
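The `pos + 100` in the omindex.cc snippet above leaves a gap in term positions between fields. A common reason for such a gap is to stop phrase and proximity matches from spanning a field boundary. The following is a toy sketch of that idea in standard C++ only; `Positions`, `toy_index_text` and `adjacent` are hypothetical names, loosely modelled on the quoted call, and are not Xapian's implementation.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Toy positional index: term -> position of its most recent occurrence.
using Positions = std::map<std::string, unsigned>;

// Hypothetical helper: index whitespace-separated words starting at `pos`
// and return the next free position (an assumption of this sketch).
unsigned toy_index_text(const std::string& text, Positions& doc, unsigned pos) {
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        doc[word] = pos++;
    }
    return pos;
}

// True if `second` sits directly after `first`, which is what an exact
// two-word phrase match needs.
bool adjacent(const Positions& doc, const std::string& first,
              const std::string& second) {
    // Toy limitation: only one position is kept per term, so comparing a
    // term with itself is meaningless.
    assert(first != second);
    auto a = doc.find(first);
    auto b = doc.find(second);
    return a != doc.end() && b != doc.end() && b->second == a->second + 1;
}
```

Indexing "xapian search" and then "engine notes" back to back makes "search engine" look like a phrase; starting the second field at the returned position plus 100 does not.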
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
...= new Search::Xapian::WritableDatabase ("/tmp/xapian",
DB_CREATE_OR_OVERWRITE);
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_flags(Search::Xapian::FLAG_SPELLING);
my $doc = new Search::Xapian::Document;
$indexer->set_document($doc);
$indexer->index_text("hello 123 blah blah");
$xa->add_document($doc);
--- >8 ---
Output:
terminate called after throwing an instance of 'Xapian::InvalidOperationError'
Aborted
It works fine without "$indexer->set_flags(Search::Xapian::FLAG_SPELLING);", but
then spelling correct...
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
...838 and block 1532-1559.
>
> But now we have the same as xapian-letor/bin/xapian-letor-update.cc so
> before starting with questletor.cc you need to run it once for each db and
> in this case all you need to make sure is below line in omindex.cc while
> indexing.
>
> indexer.index_text(title, 1,"S");
On current trunk, we index the title with prefix "S" by default in
omindex, though with a wdf inc of 5 rather than 1:
indexer.index_text(title, 5, "S");
So I don't think you need that change to omindex now.
Cheers,
Olly
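The 5-versus-1 point in this thread can be illustrated with a toy model: if every title term is added with a wdf increment of 5, each wdf and the total length come out five times larger than with an increment of 1, which is why dividing the final counts by 5 was suggested as the alternative. This is a sketch in standard C++ only; `TermWdfs`, `toy_index_text` and `toy_length` are hypothetical names and do not reproduce Xapian's tokenisation.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Toy stand-in for a document's term list: term -> wdf.
using TermWdfs = std::map<std::string, unsigned>;

// Hypothetical helper: split `text` on whitespace and add `wdf_inc` to the
// wdf of prefix+word for every occurrence.
void toy_index_text(TermWdfs& doc, const std::string& text,
                    unsigned wdf_inc, const std::string& prefix) {
    // Toy restriction so that dividing by wdf_inc later is meaningful.
    assert(wdf_inc > 0);
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        doc[prefix + word] += wdf_inc;
    }
}

// Sum of all wdfs, i.e. the length this toy document contributes.
unsigned toy_length(const TermWdfs& doc) {
    unsigned total = 0;
    for (const auto& entry : doc) {
        total += entry.second;
    }
    return total;
}
```

For the title "open source search search", an increment of 5 gives "Ssearch" a wdf of 10 and a total of 20, while an increment of 1 gives 2 and 4.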
2007 Jun 01
2
Is aaf multi_search broken?
...et/lib/class_methods.rb:131:in
`id_multi_search'
#{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/class_methods.rb:113:in
`multi_search'
#{RAILS_ROOT}/app/controllers/search_controller.rb:53:in `search'
I have configured indexing like this:
acts_as_ferret :fields => [:index_text, :index_locations], :single_index
=> true
Maybe I'm doing something wrong?
Thanks,
Starburger
2007 Dec 17
1
Crashes with spelling enabled and perl.
...N);
if (!defined($db)) {
die("Failed to open xapian_database: $!");
}
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_flags(Search::Xapian::FLAG_SPELLING);
my $document = Search::Xapian::Document->new();
$indexer->set_document($document);
$indexer->index_text(lc('test'), 1);
$db->add_document($document);
undef $db;
Here's the patch to enable spelling against Search-Xapian-1.0.4.0:
http://rusty.devel.infogears.com/xap-perl-spelling.diff
Here's the backtrace against 1.0.4:
Program received signal SIGSEGV, Segmentation fault.
[Switch...
2012 Jun 04
1
Search not finding queries with stop words.
...>new();
my $stemmer = Search::Xapian::Stem->new('english');
$doc->set_data($jsonText);
$indexer->set_stemmer($stemmer);
$indexer->set_stopper($stopper);
$indexer->set_document($doc);
$indexer->index_text($docBody);
$indexer->increase_termpos();
$indexer->index_text($subject);
... (other index_text and add_value calls)
$xdb->add_document($doc);
If I look for something like index of elements, I get no results even
though that phrase exists (no, I don...
2020 Feb 08
2
prioritizing aggregated DBs
...ght
> contribution from the PostingSource for matching documents).
Cool. I'll keep that in mind down the line. That could be a
while since some users are still on 1.2 and tend to stick to
what's provided by enterprise/LTS distros.
> > Or would I fiddle with wdf_inc for all ->index_text and ->add_term
> > calls on a per-DB basis?
>
> That would probably work if you don't want to be able to vary the
> prioritisation dynamically.
That's a compromise I'll have to make, for now. Thanks for the
response!
2020 Feb 07
2
prioritizing aggregated DBs
...rch across several DBs
which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB.
I want to search for something across all of them, but
prioritize results to favor one or some of those DBs over
others. Is there a way to do that without reindexing?
Or would I fiddle with wdf_inc for all ->index_text and
->add_term calls on a per-DB basis?
Thanks.
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
Hi Parth,
I've implemented the SVMRanker class and also sorted out most of the current Letor APIs.
Now I'm trying to use the INEX dataset to verify my implementation. But I'm stuck on the indexing part. You said in the documentation that we have to add a prefix when indexing. I also noticed that you set some metadata in omindex.cc of your version. But omindex.cc has changed since 2011. I think that's why my result
2013 Sep 22
2
How to filter search result with query with has white space.
...rs/ramesh/Desktop/xapian",
Xapian::DB_CREATE_OR_OPEN);
Xapian::TermGenerator indexer;
Xapian::Stem stemmer("english");
indexer.set_stemmer(stemmer);
Xapian::Document doc;
doc.set_data(d.title);
indexer.set_document(doc);
indexer.index_text(d.title,1,"title");
indexer.index_text(d.content,1,"content");
indexer.index_text(d.url,1,"url");
doc.add_boolean_term("title"+d.title);
db.replace_document(d.url,doc);
db.commit();
} catch (const Xapian::Error &...
2011 Jul 27
3
Searching using prefixes
...required and, following a
little research, I think I understand what I need to do but I'd like a
clarification on this.
o We have a database of a number of documents, with fields: title,
subtitle, summary and table of contents
o By default, we pass these fields into the
TermGenerator::index_text function to generate terms and add these to a
Xapian::Document, applying a weighting where required
o We then search these fields using XapianQueryParser::parse_query
o This gives a result which searches all of the fields for the
required string
I'd like to add the ability to search J...
2011 May 04
1
Problem in Indexing
...of the collection: 11400 documents [1.6 GB]
This takes a lot of time to index; it has been indexing for the last 20 hrs or so. I am
using omindex.
I notice that around 2900 docs are indexed very smoothly, and then indexing
suddenly becomes very sluggish.
I have tried a couple of tricks, such as replacing the index_text() call with
index_text_without_positions(). I also tried setting
XAPIAN_FLUSH_THRESHOLD to 1500 documents instead of the default 10000. The time
mentioned above is with these tricks applied.
Any help will be appreciated.
Thanks,
Parth.
2008 Sep 16
1
Some Questions From the beginner of Xapian
...hod? The document will be related to the terms, but what's the purpose of this?
(2) The add_posting method adds a term to a document.
void add_posting (const std::string &tname, Xapian::termpos tpos, Xapian::termcount wdfinc=1)
I noticed that
Xapian::TermGenerator has the following method:
void index_text (const Xapian::Utf8Iterator &itor, Xapian::termcount weight=1, const std::string &prefix="")
What are the differences and the relationship between these two functions?
Thanks a lot!
Sam
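One way to picture the relationship asked about above: add_posting records a single occurrence of a term the caller has already chosen, while a term generator decides what the terms are and then records each one with the next position. The following toy sketch in standard C++ shows that division of labour. `ToyDocument` and `toy_index_text` are hypothetical, the tokenisation (lowercased runs of ASCII alphanumerics) is an assumption of this sketch, and stemming is left out entirely.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy document: for each term, its wdf and the positions it occurs at.
struct ToyDocument {
    struct Entry {
        unsigned wdf = 0;
        std::vector<unsigned> positions;
    };
    std::map<std::string, Entry> terms;

    // Same parameter shape as the add_posting signature quoted above:
    // record one occurrence of an already-chosen term.
    void add_posting(const std::string& tname, unsigned tpos,
                     unsigned wdfinc = 1) {
        assert(!tname.empty());  // toy precondition
        Entry& e = terms[tname];
        e.wdf += wdfinc;
        e.positions.push_back(tpos);
    }
};

// Hypothetical stand-in for a term generator: split the text into terms and
// call add_posting once per term. Returns the last position used.
unsigned toy_index_text(ToyDocument& doc, const std::string& text,
                        unsigned wdf_inc = 1, const std::string& prefix = "",
                        unsigned termpos = 0) {
    std::string word;
    for (std::size_t i = 0; i <= text.size(); ++i) {
        // Treat the end of the text as a separator so the last word is flushed.
        unsigned char c =
            i < text.size() ? static_cast<unsigned char>(text[i]) : ' ';
        if (std::isalnum(c)) {
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            doc.add_posting(prefix + word, ++termpos, wdf_inc);
            word.clear();
        }
    }
    return termpos;
}
```

In this model, indexing "Hello, hello world!" with prefix "S" yields two terms: "Shello" with wdf 2 at positions 1 and 2, and "Sworld" with wdf 1 at position 3.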