Displaying 20 results from an estimated 30 matches for "replace_docu".
2007 Jul 24
1
Xapian::DocNotFoundError on replace_document? (Called from Search::Xapian)
...0.2 (flint) and matching Search::Xapian.
I'm getting:
terminate called after throwing an instance of
'Xapian::DocNotFoundError', which dumps core.
At first it was after adding my 2nd document (to an empty db, although
I don't know if that has any bearing) to the database with a
replace_document() call.
I shifted the first document off the processing stack (moved the file
away), and then it crashed doing a replace_document on the 4th
(previously 5th) document.
Any idea how I can further debug this? I can't get debugging symbols
in backtraces (there's a core, and I don't...
2004 May 11
2
"Error reading block xxx: got end of file"
Xapian (0.7.5) is spitting out this error on a regular basis:
org.xapian.errors.DatabaseError: Error reading block 136618: got end of file
        at org.xapian.XapianJNI.writabledatabase_repalce_document(Native Method)
        at org.xapian.WritableDatabase.replaceDocument(WritableDatabase.java:67)
I don't have a gdb backtrace, only the Java
2024 Dec 13
1
Using a document id as metadata key and merges
...because of the
> document id collisions (I was just using add_document() on the temporary
> dbs). It was not immediately obvious because this only affects snippets
> generation.
I assume you're merging using Xapian::Database::compact() (or the
xapian-compact tool)?
> Would using replace_document() on the temporary dbs, with unique document
> ids (modulo) ensure that the document ids are preserved during the merge so
> that the metadata keys remain valid ?
Compaction maps document ids by adding/subtracting a per-source-database
offset. By default this is calculated to abut the r...
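As a rough illustration of the offset mapping described in the reply above, here is a sketch in plain Python rather than the Xapian API. The function name `abutting_offsets` and the sample docid ranges are hypothetical, and the sketch assumes each source database is summarised by its lowest and highest used docid. It is not Xapian's actual compaction code.

```python
def abutting_offsets(ranges):
    """Given (first_docid, last_docid) per source, return one offset per
    source so that each shifted range starts right after the previous one."""
    offsets = []
    next_free = 1  # docids start at 1; 0 is not a valid docid
    for first, last in ranges:
        offset = next_free - first   # may be negative (a subtraction)
        offsets.append(offset)
        next_free = last + offset + 1
    return offsets

# Two temporary databases that both used docids 1..3 (hypothetical data).
sources = [(1, 3), (1, 3)]
offsets = abutting_offsets(sources)

# Docids as they would appear after the merge under this mapping.
merged = [[docid + off for docid in range(first, last + 1)]
          for (first, last), off in zip(sources, offsets)]
```

Under this mapping the second source's docids shift, so any metadata key built from an original docid would no longer match the merged docid.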
2024 Dec 12
1
Using a document id as metadata key and merges
...It
brings a significant speed increase in some cases.
I just realised that the merge lost many metadata entries because of the
document id collisions (I was just using add_document() on the temporary
dbs). It was not immediately obvious because this only affects snippets
generation.
Would using replace_document() on the temporary dbs, with unique document
ids (modulo) ensure that the document ids are preserved during the merge so
that the metadata keys remain valid ?
Or is there another obvious approach which I am missing ?
Cheers,
J.F. Dockes
2007 May 15
1
Document ID 0 is invalid... but not always...
...$db=new XapianWritableDatabase('pathtodb',
Xapian::DB_CREATE_OR_OVERWRITE);
$doc=new XapianDocument();
$doc->set_data('metadata'); // waiting for
http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=143
$doc->add_term('metadata');
$db->replace_document(-1, $doc); // or 4294967295 = (2^32)-1
$doc=new XapianDocument();
$doc->set_data('data');
$doc->add_term('data');
$docId=$db->add_document($doc); // get_lastdocid()+1 overflows, will
return 0
echo "doc #$docId added\n";
The two docs are...
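The overflow mentioned in the code comment can be checked with plain arithmetic. This sketch assumes a 32-bit unsigned docid type, as the post's (2^32)-1 implies; `next_docid` is a hypothetical helper for illustration, not a Xapian function.

```python
DOCID_BITS = 32                      # assumption: 32-bit unsigned docids
MAX_DOCID = (1 << DOCID_BITS) - 1    # 4294967295

def next_docid(last_docid):
    """Emulate an unsigned 32-bit increment, including wraparound."""
    return (last_docid + 1) & MAX_DOCID

# -1 reinterpreted as an unsigned 32-bit value is the maximum docid.
as_unsigned = -1 & MAX_DOCID

# Incrementing the maximum docid wraps around to 0, the invalid docid.
wrapped = next_docid(MAX_DOCID)
```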
2012 Jan 08
1
Testing document size preallocation.
...55753923d
So here I am creating a database with no values for each small
document and one with a bunch of blank values (uuid_blank). Once
those are flushed then I reopen them and start replacing the documents
of each with identical documents that have an identical large set of
values. I am using replace_document and a specific document ID.
Is there a specific problem that I'm up against that shows that
preallocation is up to 2 times slower for replacing an identically
sized document rather than adding to its final serialized size?
- Shane
2018 Apr 06
1
sorting large msets
...n just stop once we've found 200
> > > matches.
With a few million documents, that ENQ_ASCENDING sounds promising :)
So, it looks like if I had ideal ordering, I could do something
along the lines of:
my $doc_id = $db->get_metadata('last_doc_id') || 0xffffffff;
$db->replace_document($doc_id--, $_) foreach (@doc);
$db->set_metadata('last_doc_id', $doc_id);
And get killer performance.
Olly Betts <olly at survex.com> wrote:
> On Sat, Mar 31, 2018 at 12:58:19AM +0000, Eric Wong wrote:
> > Would it be possible to teach Xapian to optimize its storag...
2018 Feb 27
1
modifying the DB while iterating is user error, right?
Hello, I noticed a problem with DatabaseCorruptError exceptions
with public-inbox and I guess it's user error...
The problem is public-inbox was calling replace_document to
modify the DB while iterating through a PostingIterator. At
first I thought it was a glass problem, but I've hit it with
chert on my dataset, too.
I have a standalone Perl script to reproduce the problem at
https://yhbt.net/skel.bug.perl and 81M gzipped dataset which
reproduces the pro...
2006 Oct 19
1
Writing with xapian-tcpsrv and php
Hi,
I think there is a missing constructor function supporting remote
writing for XapianWritableDatabase class in the php bindings (0.9.7).
This code:
$db = new XapianWritableDatabase(remote_open($db_host, $db_port),
$action);
returns:
Fatal error: No matching function for overloaded
'new_XapianWritableDatabase' (...)
$db = new XapianWritableDatabase($path, $action); works fine.
2016 Jan 14
3
Strange index consistency issue
..., the data is replaceable without too much effort, so that
reliable detection of an issue is almost as good as assurance that it won't
occur. The latter seems very difficult to attain when running in an
uncontrolled environment.
There is one weird thing though, which is why, in this situation,
replace_document() appears to repeatedly accept data which goes into a
black hole.
Cheers,
jf
2009 Feb 12
1
problem when using xapian's static libs in windows
...nt.obj) : error LNK2001: unresolved external symbol "public: virtual void __thiscall RemoteDatabase::delete_document(unsigned int)" (?delete_document@RemoteDatabase@@UAEXI@Z)
libbackend.lib(dbfactory_remote.obj) : error LNK2001: unresolved external symbol "public: virtual unsigned int __thiscall RemoteDatabase::replace_document(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class Xapian::Document const &)" (?replace_document@RemoteDatabase@@UAEIABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@ABVDocument@Xapia...
2008 Jan 15
7
PHP indexing, what's the PHP method for indexscript
Currently I have the following indexscript:
pid : unique=Q boolean=Q field=pid
postdate : field=startdate
author_name: unhtml boolean=XAUTHORNAME field=author
author_id: boolean=XAUTHORID field=authorid
url : field=url
sample : weight=1 index field=sample
How can I create the same indexing using PHP?
With this, I can get a searchable index, but I have no idea how to set the fields, so that I
2016 Jan 08
2
Strange index consistency issue
...ots, misc Recoll bugs, etc.
The strange thing here is that xapian-check does not seem to detect anything.
In a nutshell, some document numbers seem to point to a data blackhole: the
docids are returned when searching for the file/doc unique identifying
term, but then get_document() fails. A later replace_document() succeeds,
but on the next indexing pass, same issue.
// success
docid = db.postlist_begin(uniterm)
// then failure:
xdoc = db.get_document(*docid)
In this situation, Recoll will try to update the doc. replace_document()
then succeeds, and this repeats...
2014 Jan 27
4
Perl Search::Xapian
...prefixes for general search.
$tg->index_text($title);
$tg->increase_termpos();
$tg->index_text($description);
# Store all the fields for display purposes.
# this is a TODO
my $idterm = "Q".$identifier;
$doc->add_boolean_term($idterm);
$db->replace_document($idterm, $doc);
}
close $fh;
----------------snip---------------
(\ /)
( . .) Jon's website is here:
c(")(") http://www.securityrabbit.com
2004 Sep 09
2
InMemory backend
I've just added a feature test for the new WritableDatabase methods -
replace_document() and delete_document() with a unique term. This
initially failed for inmemory due to bugs in the backend. They weren't
trivial to fix and my initial attempt at a fix caused other tests to
fail.
I've come to the conclusion that the code there probably should be
retired. It was writt...
2005 Jul 15
2
Problem with Perl bindings (enquire)
Hello list,
looks like one can open a Xapian database in read-only mode and do the
following:
$db = Search::Xapian::Database->new("/foo/bar/");
$enq = $db->enquire("XIDblub");
the same doesn't seem to be possible with a database opened in read-write
mode:
$db = Search::Xapian::WritableDatabase->new("/foo/bar/",
2008 Aug 19
1
Fwd: Strange error with PHP bindings [some more details]
Finally I noticed something suspect:
[2008-08-19 09:11:25] [DEBUG] DAO_Articles::add_xindex() - document added id
: 255, title : Gli anelli con sigil...
This is a debug line from my application; the add_xindex function simply adds
the document to the Xapian database. The error always happens when I try to add
an article with id = 255, which cannot be a coincidence (I also tried to change
the order of
2009 Jun 18
1
delete and update
Hi All,
I need to update or delete some documents from a Xapian database, and
I haven't been able to find anything in the API. Is there a way to do it?
What would be the easiest way to do it?
Thanks.
2010 Jun 10
0
Exception: Key too long
...eption: Key too long: length
>> was...")
>
> You are hitting the Btree key size limit. For flint and chert, this
> translates to a term length limit of 245 bytes.
> If you are using Xapian >= 1.0.3 then the term limit should be checked
> when you call add_document() or replace_document().
I'm using trunk, r13989.
Ok, I have my stupid hat on this morning, so please bear with me:
...
# $raw_text could contain up to 110k of text.
$analyzer->index_text ($raw_text, ...);
$index->add_spelling(...foreach word in $raw_text...);
...
$index->add_document($xpdoc);
......
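The 245-byte term limit quoted in the reply above could be screened for before indexing. This is a minimal sketch in plain Python, assuming terms are measured as UTF-8 bytes and that 245 is the applicable limit for the backend in use; `split_by_term_limit` is a hypothetical helper, not part of any Xapian binding.

```python
MAX_TERM_BYTES = 245  # limit quoted in the reply for flint and chert

def split_by_term_limit(terms, limit=MAX_TERM_BYTES):
    """Separate terms that fit within the byte limit from those that don't."""
    ok, too_long = [], []
    for term in terms:
        # Measure the encoded length: non-ASCII characters take several bytes.
        if len(term.encode("utf-8")) <= limit:
            ok.append(term)
        else:
            too_long.append(term)
    return ok, too_long

# Hypothetical sample: 246 ASCII bytes, and 123 two-byte characters (246 bytes).
ok, too_long = split_by_term_limit(["short", "x" * 246, "é" * 123])
```

Counting bytes rather than characters matters here, since a term of 123 accented characters already exceeds the limit.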
2011 May 30
1
Most efficient update of already existing document?
Hello,
What is the most efficient way to update some content of a document with new info gathered after it's first indexed?
For example, I first index a lot of documents' text (let's say it's a mailbox), and after all documents are indexed I determine each document's unique ID (which I wasn't able to determine on initial indexing), and I want to update all documents with this ID.
As I