Do.
2011-May-30 21:50 UTC
[Xapian-discuss] Most efficient update of already existing document?
Hello,

What is the most efficient way to update some of a document's content with new information gathered after it is first indexed? For example, I first index the text of a lot of documents (let's say it's a mailbox), and after all documents are indexed I determine each document's unique ID (which I wasn't able to determine during initial indexing) and want to update every document with this ID.

As I understand the API, the only way to update a document is to replace it, but I don't know how efficient that replacement is. Which is the better way to store the new ID in the document (it will be a single change) so that the replace is most efficient: term, data, or value?

Best regards.
Olly Betts
2011-May-31 13:10 UTC
[Xapian-discuss] Most efficient update of already existing document?
On Tue, May 31, 2011 at 01:50:29AM +0400, Do. wrote:
> What is the most efficient way to update some content of document with
> new info gathered later after it's first indexed?

    Xapian::Document doc = db.get_document(did);
    // make changes to doc.
    db.replace_document(did, doc);

The Document and Database objects together keep track of which classes of things you changed, and then compare within those classes to make minimal changes to the database, so it's pretty efficient.

The most recent improvements to this were in 1.1.4, backported to 1.0.18, so you can probably rely on having them these days.

> As I understood API there is only way to update document is to replace
> it. But I don't know how effective/fast is that replacement. What is
> better way to store new ID in document (that will be single change)
> that will be updated most efficiently by replace: term, data, or
> value?

How to store it should really depend on how you intend to use it. If you want to be able to find a particular document by its id efficiently then you really need to index it as a term.

Cheers,
Olly