Displaying 20 results from an estimated 53 matches for "add_docu".

2006 Sep 14
2
Possible Bug? indexWriter#doc_count counts deleted docs after #commit
Hi David, > Deleted documents don't get deleted until commit is called OK, but FYI, my experiments show that #commit doesn't affect #doc_count, even across Ruby sessions. On a different note, I'd like to request a variation of #add_document which returns the doc_id of the document added, as opposed to self. I'm trying to track down an issue with a large test index [600MB, 500k docs] in which I need to update a document. The old document is deleted then added again, but doesn't show up in my searches. A #doc_cou...
2006 Aug 26
4
[0.10.0] Index#add_document bug with strange value?
Perhaps I found where my problem is (during a big import). Why does this silly (really silly :)) example crash? http://pastie.caboo.se/10357 /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:211:in `add_document': IO Error occured at <except.c>:79 in xraise (IOError) Error occured in fs_store.c:225 - fso_flush_i flushing src of length -2 from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:211:in `<<' from /usr/lib/ruby/1.8/monitor.rb:229:in `synchron...
2007 Jan 29
2
Segmentation fault in Index::Index#add_document
Hello, Here's the code that segfaults: http://pastie.caboo.se/36467 I could have submitted a patch, but I'm not sure whether this segfault is caused by Ferret or Ruby. This seems to be triggered only when combining a split and a gsub on an empty string of the returned array, and trying to insert it directly into the index. However, there's no problem when you
2007 May 15
1
Document ID 0 is invalid... but not always...
...http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=143 $doc->add_term('metadata'); $db->replace_document(-1, $doc); // or 4294967295 = (2^32)-1 $doc=new XapianDocument(); $doc->set_data('data'); $doc->add_term('data'); $docId=$db->add_document($doc); // get_lastdocid()+1 overflows, will return 0 echo "doc #$docId added\n"; The two docs are added without error, but the second one will get the docId 0 and of course won't be accessible. Another variant will cause xapian to overwrite an existing record : (pseudo cod...
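The wraparound described in this snippet can be reproduced with plain arithmetic. The following is a minimal Python sketch, not Xapian code: it assumes document IDs are unsigned 32-bit integers, as the snippet's `(2^32)-1` suggests, and `next_docid` is a hypothetical helper standing in for `get_lastdocid()+1`.

```python
# Illustration only: models unsigned 32-bit docid arithmetic, not the Xapian API.
DOCID_BITS = 32  # assumption taken from the "(2^32)-1" value in the snippet
DOCID_MASK = (1 << DOCID_BITS) - 1  # 4294967295


def next_docid(last_docid: int) -> int:
    """Hypothetical stand-in for get_lastdocid() + 1 with 32-bit wraparound."""
    return (last_docid + 1) & DOCID_MASK


# replace_document(-1, ...) targets the largest possible ID, because -1
# reinterpreted as an unsigned 32-bit value is 4294967295.
last_docid = -1 & DOCID_MASK

# The next automatically assigned ID then wraps around to 0, which is invalid.
new_docid = next_docid(last_docid)
print(last_docid, new_docid)
```

Under these assumptions the second document is assigned ID 0, which matches the behaviour reported in the snippet.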
2010 Jul 26
2
related documents
Hi All, I would like to take a doc in the xapian DB and find all related documents by relevance e.g. so when you view one document it says "Related entries X Y Z". I'm aware of the "Morelikethis" Lucene plugin that is supposed to do something like this, by generating a query from a document based on term frequency. Has anyone developed a tool to generate a query from a
2007 Feb 07
2
My new record: Indexing 20 million docs = 79m9.378s
Gentoo Linux 2.6 8 AMD Opteron 64-bit Processors 32GB Memory -------------------------------------------------------------------------------- Environment: ------------------ XAPIAN_FLUSH_THRESHOLD=21000000 XAPIAN_FLUSH_THRESHOLD_LENGTH=16000000 XAPIAN_PREFER_FLINT=True Indexing 20 million documents: --stemmer=none ------------------------------------------- real 79m9.378s user 77m28.696s
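To put the reported timing in perspective, the wall-clock figure can be converted into a throughput number. This is a small Python sketch that uses only the two values quoted in the snippet (20 million documents, `real 79m9.378s`); the variable names are illustrative.

```python
# Convert the reported wall-clock time ("real 79m9.378s") into a throughput figure.
docs_indexed = 20_000_000
real_seconds = 79 * 60 + 9.378  # 79 minutes and 9.378 seconds

docs_per_second = docs_indexed / real_seconds
print(f"{real_seconds:.3f} s total, about {docs_per_second:.0f} docs/s")
```

That works out to roughly 4,200 documents per second on the hardware listed in the snippet.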
2007 Apr 12
2
Ferret 0.11.4.win32 indexing speed vs Ferret 0.10.9.win32
Firstly, thanks Dave for all your hard work. Ferret Rocks!, I am just testing 0.11.4.win32 and it seems to work just fine, however the index creation phase of my app is perhaps 3x slower under 0.11.4 vs 0.10.9 Details follow: System: windows xp sp2, index on local hard disk, Ruby 1.8.6 Run #1, Ferret 0.10.9 - Reboot - Build index, 35,000 rows added in 297 seconds - Run #2, Ferret 0.11.4 -
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
...ian", DB_CREATE_OR_OVERWRITE); my $indexer = Search::Xapian::TermGenerator->new(); $indexer->set_flags(Search::Xapian::FLAG_SPELLING); my $doc = new Search::Xapian::Document; $indexer->set_document($doc); $indexer->index_text("hello 123 blah blah"); $xa->add_document($doc); --- >8 --- Output: terminate called after throwing an instance of 'Xapian::InvalidOperationError' Aborted It works fine without "$indexer->set_flags(Search::Xapian::FLAG_SPELLING);", but then spelling correction does not work. The error/exception occurs at in...
2006 Aug 17
3
Ferret locks up when adding items to an index
...2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:88:in `invert_document' from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:58:in `add_document' from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/index_writer.rb:158:in `add_document' from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9...
2006 Nov 07
1
Memory consumption too high
...ry doesn't move up at all. However, sometimes it blows up horribly with a NoMemoryError. I'm running it from the script\console. Here is the stack trace from running MyObject.rebuild_index D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277:in `add_document': failed to allocate memory (NoMemoryError) from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277:in `<<' from D:/dev/ruby/lib/ruby/1.8/monitor.rb:229:in `synchronize' from D:/dev/ruby/lib/ruby/gems/1.8/...
2007 Apr 09
5
highlight crashes
I am trying to use highlight, but I am getting this kind of thing: /usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:197:in `highlight': IO Error occured at <except.c>:93 in xraise (IOError) Error occured in index.c:1222 - lazy_df_get_bytes len = -5, but should be greater than 0 from
2010 Jun 10
0
Exception: Key too long
...ons (ie, "Exception: Key too long: length >> was...") > > You are hitting the Btree key size limit. For flint and chert, this > translates to a term length limit of 245 bytes. > If you are using Xapian >= 1.0.3 then the term limit should be checked > when you call add_document() or replace_document(). I'm using trunk, r13989. Ok, I have my stupid hat on this morning, so please bear with me: ... # $raw_text could contain up to 110k of text. $analyzer->index_text ($raw_text, ...); $index->add_spelling(...foreach word in $raw_text...); ... $index->add_d...
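The limit quoted in the reply is measured in bytes, so a pre-check before indexing has to look at the encoded length rather than the character count. The following Python sketch illustrates that idea only; it does not call Xapian. `split_by_term_length` is a hypothetical helper, and both the UTF-8 encoding and treating exactly 245 bytes as acceptable are assumptions.

```python
# Illustration only: a pre-check one could run before handing terms to an indexer.
MAX_TERM_BYTES = 245  # limit quoted in the reply for flint and chert


def split_by_term_length(terms, limit=MAX_TERM_BYTES):
    """Hypothetical helper: separate terms that fit the byte limit from those that do not."""
    ok, too_long = [], []
    for term in terms:
        # The limit is in bytes, so measure the encoded form (UTF-8 assumed).
        size = len(term.encode("utf-8"))
        (ok if size <= limit else too_long).append(term)
    return ok, too_long


# "é" occupies two bytes in UTF-8, so 123 of them exceed the limit
# even though the string is only 123 characters long.
ok, too_long = split_by_term_length(["hello", "x" * 245, "x" * 246, "é" * 123])
print(len(ok), len(too_long))
```

A check like this would let an indexer skip or truncate oversized terms up front instead of hitting the exception at add_document() time.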
2008 Apr 20
1
Exception DatabaseCorruptError under php
...t it throws: Fatal error: Uncaught exception 'Exception' with message 'DatabaseCorruptError: Failed to unlink /var/lib/xapian/trade.ar/termlist.baseA: No such file or directory' in /usr/share/php5/xapian.php:1140 Stack trace: #0 /usr/share/php5/xapian.php(1140): writabledatabase_add_document(Resource id #18, Object(XapianDocument)) #1 /home/indexer/CDetails.php(431): XapianWritableDatabase->add_document(Object(XapianDocument)) #2 /home/indexer/CDetails.php(379): CDetails->postDetails('E', '2000-01-07', '001', '-1', '4', '200BA...
2013 Jun 19
2
Compact databases and removing stale records at the same time
...ostingIterator it; for (it = srcdb.postlist_begin(""); it != srcdb.postlist_end(""); it++) { Xapian::docid did = *it; Xapian::Document doc = srcdb.get_document(did); std::string cyrusid = doc.get_value(SLOT_CYRUSID); if (cb(cyrusid.c_str(), rock)) { destdb.add_document(doc); } } /* commit all changes explicitly */ destdb.commit(); } FYI: SLOT_CYRUSID is just 0. Thanks heaps for your help on this. Honestly, it's not a deal-breaker for us to use this much CPU. It's a pain, but it's still heaps cheaper than re-indexing everything, an...
2014 Apr 13
2
Adding an external library to Xapian
...&& strcmp(*argv, "--") != 0) { - query_string += ' '; - query_string += *argv++; + query_string += ' '; + query_string += *argv++; } // Create an RSet with the listed docids in. Xapian::RSet rset; if (*argv) { - while (*++argv) { - rset.add_document(atoi(*argv)); - } + while (*++argv) { + rset.add_document(atoi(*argv)); + } } + // Log the query + db.log(query_string); + // DB syns + Xapian::TermIterator tmiter = db.synonyms_begin(query_string); + Xapian::TermIterator tmiterend = db.synonyms_end(query_string); + for(;tmiter != tm...
2014 Apr 13
2
Adding an external library to Xapian
My code is not on GitHub. I am using the tarball as of now. The following is the error that occurred: http://pastebin.com/cVJrjUZX On Sun, Apr 13, 2014 at 8:16 PM, James Aylett <james-xapian at tartarus.org> wrote: > On 13 Apr 2014, at 15:37, Pallavi Gudipati <pallavigudipati at gmail.com> > wrote: > > > A linker error is encountered even after following the above
2010 Oct 21
2
In-memory databases vs PHP Bindings
...()->get_document(); // Create a database that just contains the one document // TODO:AB:20101020: Work out how to build an in-memory Xapian database via PHP bindings $xdb_doc = new XapianWritableDatabase(PROJROOT.'/tmp/xapian/doc'.$postid, Xapian::DB_CREATE_OR_OVERWRITE); $xdb_doc->add_document($xdoc); $xdb_doc->commit(); Also, FYI, the documentation here seems incomplete: http://xapian.org/docs/apidoc/html/classXapian_1_1TermIterator.html I had to inspect the bindings to find the rather useful get_term() method of the TermIterator class! It does mention the use of the * opera...
2018 Mar 30
2
sorting large msets
...->begin_transaction; for my $j (0..2000) { my $doc = Search::Xapian::Document->new; my $num = Search::Xapian::sortable_serialise(($i * 1000) + $j); $doc->add_value(0, $num); $doc->set_data("$i $j"); $doc->add_boolean_term('T' . 'mail'); $xdb->add_document($doc); $doc = Search::Xapian::Document->new; $doc->add_value(0, $num); $doc->set_data("$i $j"); $doc->add_boolean_term('T' . 'ghost'); $xdb->add_document($doc); } $xdb->commit_transaction; } my $enquire = Search::Xapian::Enquire->new($...
2008 Sep 27
3
Query::MatchAll
Why is there still a rank when using Query::MatchAll()?
2006 May 02
4
Indexing Speed?
..." sort of code provided by both libraries. My ruby code is: (abridged) @index = Index::Index.new(:path => inIndexPath) def createIndex(inRepositoryPath) Find.find(inRepositoryPath) do |path| if FileTest.file?(path) File.open(path) do |file| @index.add_document(:file =>path, :content => file.readlines) end My Java code is basically a direct port. Has anyone else noticed this difference in speed? Am I doing something wrong? Is this speed normal? Any advice gratefully received. Thanks, Steven -- Posted via http://www.ruby-forum.com/.