Alec Thomas
2007-Feb-09 00:18 UTC
[Xapian-devel] Fetching document content by Q term in Python
Hello,
I'd like to be able to retrieve the index's stored copy of the document
text and tried the following:
    terms = self.db.allterms()
    terms.skip_to('Q' + uri.encode('utf-8'))
    term = terms.next()
    doc = self.db.get_document(term[1])
    print doc.get_data()
I just wildly guessed that [1] was the docid, but of course it isn't. So the
question is, how do I get a docid out of a term?
Or if I'm completely on the wrong track, how do I get the document from
a Q term?
Thanks,
Alec
Olly Betts
2007-Feb-09 08:01 UTC
[Xapian-devel] Fetching document content by Q term in Python
On Fri, Feb 09, 2007 at 11:18:10AM +1100, Alec Thomas wrote:
> I'd like to be able to retrieve the index's stored copy of the document
> text and tried the following:
>
>     terms = self.db.allterms()
>     terms.skip_to('Q' + uri.encode('utf-8'))
>     term = terms.next()
>     doc = self.db.get_document(term[1])
>     print doc.get_data()
>
> I just wildly guessed that [1] was the docid, but of course it isn't. So the
> question is, how do I get a docid out of a term?

This will print the data from each document indexed by a particular term:

    term = 'Q' + uri.encode('utf-8')
    for docid in self.db.postlist(term):
        doc = self.db.get_document(docid)
        print doc.get_data()

You get a PostingIter from db.postlist(term) - see
python/docs/bindings.html for details.

> Or if I'm completely on the wrong track, how do I get the document from
> a Q term?

Alternatively, you can run a search for the Q-prefixed term. The above
is a little less work though.

Cheers,
    Olly
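
For readers on newer Python bindings, the postlist approach above might be wrapped up roughly as follows. This is only a sketch under stated assumptions: the helper name data_for_term is hypothetical, and db is assumed to be an already opened xapian.Database. In newer bindings each item yielded by db.postlist(term) is expected to carry a docid attribute, so the helper reads that and falls back to the item itself, as in the loop above.

```python
def data_for_term(db, term):
    """Yield the stored data of every document indexed by `term`.

    `db` is assumed to behave like an opened xapian.Database, i.e. it
    provides postlist(term) and get_document(docid).
    """
    for posting in db.postlist(term):
        # Assumption: newer bindings yield items with a .docid attribute;
        # otherwise treat the item itself as the docid, as in the reply above.
        docid = getattr(posting, "docid", posting)
        yield db.get_document(docid).get_data()
```

A call such as data_for_term(db, 'Q' + uri) would then yield the data of each matching document. Since a Q term is normally a unique ID, that should be at most one item. How the term string has to be encoded depends on the Python and bindings versions in use.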