Displaying 20 results from an estimated 53 matches for "add_docu".
2006 Sep 14
2
Possible Bug? indexWriter#doc_count counts deleted docs after #commit
Hi David,
> Deleted documents don't get deleted until commit is called
Ok, but FYI, my experiments show that #commit doesn't affect #doc_count,
even across ruby sessions.
On a different note, I'd like to request a variation of #add_document
which returns the doc_id of the document added, as opposed to self.
I'm trying to track down an issue with a large test index [600MB, 500k
docs] in which I need to update a document. The old document is deleted
then added again, but doesn't show up in my searches.
A #doc_cou...
2006 Aug 26
4
[0.10.0] Index#add_document bug with strange value ?
Perhaps I have found where my problem is (during a big import).
Why does this silly (really silly :)) example crash?
http://pastie.caboo.se/10357
/usr/lib/ruby/site_ruby/1.8/ferret/index.rb:211:in `add_document': IO Error occured at <except.c>:79 in xraise (IOError)
Error occured in fs_store.c:225 - fso_flush_i
flushing src of length -2
from /usr/lib/ruby/site_ruby/1.8/ferret/index.rb:211:in `<<'
from /usr/lib/ruby/1.8/monitor.rb:229:in `synchron...
2007 Jan 29
2
Segmentation fault in Index::Index#add_document
Hello,
Here's the code that segfaults:
http://pastie.caboo.se/36467
I could have submitted a patch, but I'm not sure
whether this segfault is caused by Ferret or Ruby.
This seems to be triggered only when combining
a split and a gsub on an empty string of the returned
array, and trying to insert it directly into the
index.
However, there's no problem when you
2007 May 15
1
Document ID 0 is invalid... but not always...
...http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=143
$doc->add_term('metadata');
$db->replace_document(-1, $doc); // or 4294967295 = (2^32)-1
$doc=new XapianDocument();
$doc->set_data('data');
$doc->add_term('data');
$docId=$db->add_document($doc); // get_lastdocid()+1 overflows, will return 0
echo "doc #$docId added\n";
The two docs are added without error, but the second one will get the
docId 0 and of course won't be accessible.
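The wraparound described here can be illustrated with plain integer arithmetic. This is a minimal sketch, assuming docids are unsigned 32-bit values as the `(2^32)-1` comment suggests; `next_docid` and `MAX_DOCID` are hypothetical names for illustration, not part of Xapian's API.

```ruby
# Largest value an unsigned 32-bit docid can hold (assumption: 32-bit docids).
MAX_DOCID = 2**32 - 1 # 4294967295

# Hypothetical helper: mimics unsigned 32-bit addition, which is assumed
# to be how the next docid is derived from the last one.
def next_docid(last_docid)
  (last_docid + 1) & MAX_DOCID # keep only the low 32 bits
end

puts next_docid(41)        # ordinary case: prints 42
puts next_docid(MAX_DOCID) # wraps around: prints 0, the invalid docid
```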
Another variant will cause xapian to overwrite an existing record :
(pseudo cod...
2010 Jul 26
2
related documents
Hi All,
I would like to take a doc in the xapian DB and find all related
documents by relevance e.g. so when you view one document it says
"Related entries X Y Z".
I'm aware of the "Morelikethis" Lucene plugin that is supposed to do
something like this, by generating a query from a document based on term
frequency.
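The idea of generating a query from a document's term frequencies might be sketched as follows. This is only an illustration under stated assumptions: `related_query`, the tokenisation, and the stopword list are all hypothetical, and the resulting string assumes a query parser that accepts `OR` between terms.

```ruby
# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = %w[the a an and or of to in is it by].freeze

# Build an OR query string from the most frequent terms in a text.
def related_query(text, max_terms = 3)
  counts = Hash.new(0)
  text.downcase.scan(/[a-z0-9]+/).each do |term|
    counts[term] += 1 unless STOPWORDS.include?(term)
  end
  # Highest frequency first; ties broken alphabetically for stable output.
  top = counts.sort_by { |term, n| [-n, term] }.first(max_terms)
  top.map(&:first).join(" OR ")
end

puts related_query("cat cat dog dog dog bird", 2) # prints "dog OR cat"
```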
Has anyone developed a tool to generate a query from a
2007 Feb 07
2
My new record: Indexing 20 million docs = 79m9.378s
Gentoo Linux 2.6
8 AMD Opteron 64-bit Processors
32GB Memory
--------------------------------------------------------------------------------
Environment:
------------------
XAPIAN_FLUSH_THRESHOLD=21000000
XAPIAN_FLUSH_THRESHOLD_LENGTH=16000000
XAPIAN_PREFER_FLINT=True
Indexing 20 million documents:
--stemmer=none
-------------------------------------------
real 79m9.378s
user 77m28.696s
2007 Apr 12
2
Ferret 0.11.4.win32 indexing speed vs Ferret 0.10.9.win32
Firstly, thanks Dave for all your hard work. Ferret Rocks!
I am just testing 0.11.4.win32 and it seems to work just fine, however
the index creation phase of my app is perhaps 3x slower under 0.11.4 vs
0.10.9
Details follow:
System: windows xp sp2, index on local hard disk, Ruby 1.8.6
Run #1, Ferret 0.10.9
- Reboot
- Build index, 35,000 rows added in 297 seconds
-
Run #2, Ferret 0.11.4
-
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
...ian",
DB_CREATE_OR_OVERWRITE);
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_flags(Search::Xapian::FLAG_SPELLING);
my $doc = new Search::Xapian::Document;
$indexer->set_document($doc);
$indexer->index_text("hello 123 blah blah");
$xa->add_document($doc);
--- >8 ---
Output:
terminate called after throwing an instance of 'Xapian::InvalidOperationError'
Aborted
It works fine without "$indexer->set_flags(Search::Xapian::FLAG_SPELLING);", but
then spelling correction does not work. The error/exception occurs at
in...
2006 Aug 17
3
Ferret locks up when adding items to an index
...2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:88:in `invert_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/document_writer.rb:58:in `add_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9.5/lib/ferret/index/index_writer.rb:158:in `add_document'
from /Applications/Locomotive2/Bundles/rails112.locobundle/i386/lib/ruby/gems/1.8/gems/ferret-0.9...
2006 Nov 07
1
Memory consumption too high
...ry doesn't move up at all. However,
sometimes it blows up horribly with a NoMemoryError. I'm running it
from the script\console.
Here is the stack trace from running MyObject.rebuild_index
D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277:in `add_document': failed to allocate memory (NoMemoryError)
from D:/dev/ruby/lib/ruby/gems/1.8/gems/ferret-0.10.9-mswin32/lib/ferret/index.rb:277:in `<<'
from D:/dev/ruby/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from D:/dev/ruby/lib/ruby/gems/1.8/...
2007 Apr 09
5
highlight crashes
I am trying to use highlight, but I am getting this kind of thing:
/usr/local/lib/ruby/gems/1.8/gems/ferret-0.11.4/lib/ferret/index.rb:197:in `highlight': IO Error occured at <except.c>:93 in xraise (IOError)
Error occured in index.c:1222 - lazy_df_get_bytes
len = -5, but should be greater than 0
from
2010 Jun 10
0
Exception: Key too long
...ons (ie, "Exception: Key too long: length
>> was...")
>
> You are hitting the Btree key size limit. For flint and chert, this
> translates to a term length limit of 245 bytes.
> If you are using Xapian >= 1.0.3 then the term limit should be checked
> when you call add_document() or replace_document().
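A pre-check against the limit quoted above could look like the sketch below. It assumes the 245-byte figure applies to the raw byte length of each term; `MAX_TERM_BYTES` and `indexable_terms` are hypothetical names for illustration, not part of any Xapian binding.

```ruby
# Term length limit quoted above for flint and chert (in bytes, not characters).
MAX_TERM_BYTES = 245

# Hypothetical helper: keep only the terms short enough to be stored.
def indexable_terms(terms)
  terms.select { |term| term.bytesize <= MAX_TERM_BYTES }
end

terms = ["hello", "x" * 245, "y" * 246]
puts indexable_terms(terms).length # prints 2: the 246-byte term is dropped
```

Checking `bytesize` rather than `length` matters for multi-byte text, where a term can exceed the byte limit while staying under 245 characters.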
I'm using trunk, r13989.
Ok, I have my stupid hat on this morning, so please bear with me:
...
# $raw_text could contain up to 110k of text.
$analyzer->index_text ($raw_text, ...);
$index->add_spelling(...foreach word in $raw_text...);
...
$index->add_d...
2008 Apr 20
1
Exception DatabaseCorruptError under php
...t it throws:
Fatal error: Uncaught exception 'Exception' with message
'DatabaseCorruptError: Failed to unlink
/var/lib/xapian/trade.ar/termlist.baseA: No such file or directory'
in /usr/share/php5/xapian.php:1140
Stack trace:
#0 /usr/share/php5/xapian.php(1140):
writabledatabase_add_document(Resource id #18, Object(XapianDocument))
#1 /home/indexer/CDetails.php(431):
XapianWritableDatabase->add_document(Object(XapianDocument))
#2 /home/indexer/CDetails.php(379): CDetails->postDetails('E',
'2000-01-07', '001', '-1', '4', '200BA...
2013 Jun 19
2
Compact databases and removing stale records at the same time
...ostingIterator it;
    for (it = srcdb.postlist_begin(""); it != srcdb.postlist_end(""); it++) {
        Xapian::docid did = *it;
        Xapian::Document doc = srcdb.get_document(did);
        std::string cyrusid = doc.get_value(SLOT_CYRUSID);
        if (cb(cyrusid.c_str(), rock)) {
            destdb.add_document(doc);
        }
    }
    /* commit all changes explicitly */
    destdb.commit();
}
FYI: SLOT_CYRUSID is just 0.
Thanks heaps for your help on this. Honestly, it's not a deal-breaker for us to use this much CPU. It's a pain, but it's still heaps cheaper than re-indexing everything, an...
2014 Apr 13
2
Adding an external library to Xapian
...&& strcmp(*argv, "--") != 0) {
- query_string += ' ';
- query_string += *argv++;
+ query_string += ' ';
+ query_string += *argv++;
}
// Create an RSet with the listed docids in.
Xapian::RSet rset;
if (*argv) {
- while (*++argv) {
- rset.add_document(atoi(*argv));
- }
+ while (*++argv) {
+ rset.add_document(atoi(*argv));
+ }
}
+ // Log the query
+ db.log(query_string);
+ // DB syns
+ Xapian::TermIterator tmiter = db.synonyms_begin(query_string);
+ Xapian::TermIterator tmiterend = db.synonyms_end(query_string);
+ for(;tmiter != tm...
2014 Apr 13
2
Adding an external library to Xapian
My code is not on GitHub. I am using the tarball as of now. The following
is the error that occurred:
http://pastebin.com/cVJrjUZX
On Sun, Apr 13, 2014 at 8:16 PM, James Aylett <james-xapian at tartarus.org> wrote:
> On 13 Apr 2014, at 15:37, Pallavi Gudipati <pallavigudipati at gmail.com>
> wrote:
>
> > A linker error is encountered even after following the above
2010 Oct 21
2
In-memory databases vs PHP Bindings
...()->get_document();
// Create a database that just contains the one document
// TODO:AB:20101020: Work out how to build an in-memory Xapian database via PHP bindings
$xdb_doc = new XapianWritableDatabase(PROJROOT.'/tmp/xapian/doc'.$postid,
Xapian::DB_CREATE_OR_OVERWRITE);
$xdb_doc->add_document($xdoc);
$xdb_doc->commit();
Also, FYI, the documentation here seems incomplete:
http://xapian.org/docs/apidoc/html/classXapian_1_1TermIterator.html
I had to inspect the bindings to find the rather useful get_term() method of
the TermIterator class! It does mention the use of the * opera...
2018 Mar 30
2
sorting large msets
...->begin_transaction;
    for my $j (0..2000) {
        my $doc = Search::Xapian::Document->new;
        my $num = Search::Xapian::sortable_serialise(($i * 1000) + $j);
        $doc->add_value(0, $num);
        $doc->set_data("$i $j");
        $doc->add_boolean_term('T' . 'mail');
        $xdb->add_document($doc);
        $doc = Search::Xapian::Document->new;
        $doc->add_value(0, $num);
        $doc->set_data("$i $j");
        $doc->add_boolean_term('T' . 'ghost');
        $xdb->add_document($doc);
    }
    $xdb->commit_transaction;
}
my $enquire = Search::Xapian::Enquire->new($...
2008 Sep 27
3
Query::MatchAll
Why is there still a rank when using Query::MatchAll()?
2006 May 02
4
Indexing Speed?
..." sort of code provided by both
libraries.
My ruby code is: (abridged)
@index = Index::Index.new(:path => inIndexPath)
def createIndex(inRepositoryPath)
  Find.find(inRepositoryPath) do |path|
    if FileTest.file?(path)
      File.open(path) do |file|
        @index.add_document(:file => path, :content => file.readlines)
      end
    end
  end
end
My Java code is basically a direct port.
Has anyone else noticed this difference in speed? Am I doing something
wrong? Is this speed normal?
Any advice gratefully received.
Thanks,
Steven