search for: get_docu

Displaying 20 results from an estimated 61 matches for "get_docu".

2015 Mar 11
2
stub-file and get_doccount
Hello, I switched from one big index to a stub file with many indexes and am running into a problem. I have a tool that fetches a random document via: get_doccount, pick a random ID up to get_doccount, then get_document with that ID. After changing to a stub file this fails. Is there a nice way to get a random document from a stub file? Regards, Felix Ostmann
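One likely cause, sketched under assumptions: when Xapian combines several sub-databases it interleaves their document IDs rather than numbering them contiguously. With shards of unequal size, some IDs up to get_doccount do not exist and some documents have IDs above it. The pure-Python model below does not call Xapian, and the names `combined_docid`, `split_docid` and `random_docid` are hypothetical. It illustrates the mapping and a retry-based pick bounded by the highest ID, which in Xapian would come from `get_lastdocid()`.

```python
import random


def combined_docid(shard_docid, shard_index, num_shards):
    """Map a shard-local docid (1-based) to its ID in the combined database."""
    return (shard_docid - 1) * num_shards + shard_index + 1


def split_docid(docid, num_shards):
    """Inverse mapping: combined docid -> (shard_index, shard_docid)."""
    return (docid - 1) % num_shards, (docid - 1) // num_shards + 1


def random_docid(existing, last_docid, rng=random):
    """Pick a random existing docid by retrying over 1..last_docid.

    `existing` stands in for "get_document succeeded"; with real Xapian the
    retry would instead catch the document-not-found error.
    """
    if not existing:
        raise ValueError("database is empty")
    while True:
        candidate = rng.randint(1, last_docid)
        if candidate in existing:
            return candidate


# Two shards: shard 0 holds 3 documents, shard 1 holds only 1.
sizes = [3, 1]
ids = {combined_docid(d, s, len(sizes))
       for s, size in enumerate(sizes)
       for d in range(1, size + 1)}
```

In this model the document count is 4, but docid 4 is unused and docid 5 exists, so drawing from 1..get_doccount can both miss documents and hit gaps.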
2023 Aug 23
1
DatabaseModifiedError while iterating on mset
I'm already retrying the ->get_mset operations; but now I'm wondering where I'd hit DatabaseModifiedErrors while inside a Xapian::MSetIterator loop. I assume ->get_document is a place where it gets thrown; but once a document is retrieved, can iterating through terms in one document (using TermIterator) also throw DB modified? I'm dumping multiple terms per-document to a stream. While retrying ->get_document seems straightforward, retrying midway through a...
2007 Feb 09
1
Fetching document content by Q term in Python
Hello, I'd like to be able to retrieve the index's stored copy of the document text and tried the following: terms = self.db.allterms() terms.skip_to('Q' + uri.encode('utf-8')) term = terms.next() doc = self.db.get_document(term[1]) print doc.get_data() I just wildly guessed that [1] was the docid, but of course it isn't. So the question is, how do I get a docid out of a term? Or if I'm completely on the wrong track, how do I get the document from a Q term? Thanks, Alec
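A minimal sketch of one way to approach this, assuming Python bindings in which `db.postlist(term)` yields posting items with a `docid` attribute: the term list only names terms, while the posting list of a term names the documents indexed by it. The helper name `document_for_unique_term` is hypothetical.

```python
def document_for_unique_term(db, term):
    """Return the first document indexed by `term`, or None if there is none."""
    for posting in db.postlist(term):
        # A unique-ID term should have exactly one posting, so stop at the first.
        return db.get_document(posting.docid)
    return None
```

With a real database this might be called as `document_for_unique_term(db, 'Q' + uri)`, followed by `get_data()` on the result.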
2010 Jun 24
1
Quickest way to retrieve data for a large match set?
...ebug; my $e = $self->index->enquire ($q); #my $hits = $e->get_mset(0, $self->index->get_doccount, $self->index->get_doccount); my (@hits) = $e->matches (0, $self->index->get_doccount, $self->index->get_doccount); my (@results) = map +thaw($_->get_document->get_data), @hits; return \@results; } I'd like to know if there's anything I can do to improve the speed of fetching the results (in other words, am I doing it wrong)?
2007 Sep 30
1
Perl example of using TermIterator?
...uire->get_matching_terms_end(id);termIt++) { string term = *termIt; } Or something similar. However when I attempt to translate that into perl, I am trying:( I am working in the blind here) foreach my $match ( @matches ) { my %hit; my %ht; my $doc = $match->get_document(); my $per = $match->get_percent(); my $id = $match->get_docid(); my $bterm = $enq->get_matching_terms_begin($id); for(my $xit=$bterm;$xit != $enq->get_matching_terms_end($id);$xit++) { my $term=$xit; print $term; } W...
2007 Mar 03
1
Error handling in the bindings
...bindings catch exceptions raised by Xapian and rethrow them using the standard SWIG error types: for example, if any Xapian::DatabaseError exception is raised, SWIG will use its standard SWIG_IOError error type to report the error. In Python, this leads to code like the following: try: db.get_document(1) except RuntimeError, e: if str(e).startswith('DocNotFoundError:'): # Handle a DocNotFoundError (Incidentally, the error raised in python for this actually is a RuntimeError, not an IOError as I would have expected from reading the SWIG code, but that's not a majo...
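The prefix check described in this snippet can be factored into a small helper. This is only a sketch of the pattern as quoted: the helper name `xapian_error_name` is hypothetical, and the message text raised below is an invented stand-in, not real Xapian output.

```python
def xapian_error_name(exc):
    """Return the text before the first ':' of the message, or None if absent."""
    name, sep, _rest = str(exc).partition(':')
    return name if sep else None


try:
    # Stand-in for db.get_document(1) failing in the bindings described above.
    raise RuntimeError('DocNotFoundError: stand-in message')
except RuntimeError as e:
    handled = xapian_error_name(e) == 'DocNotFoundError'
```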
2013 Jun 19
2
Compact databases and removing stale records at the same time
...and hadn't even looked at. I don't think I've ever actually seen it called. I'll simplify it. > > /* copy all matching documents to the new DB */ > > for (Xapian::MSetIterator i = matches.begin() ; i != matches.end() ; ++i) { > > Xapian::Document doc = i.get_document(); > > This requires creating an in-memory structure of size get_doccount(), so > won't scale well to really big databases. My test DB is about 90k documents. Lots of terms though, particularly some of the emails which contain thousands of lines of syslog output. [brong at imap...
2007 Feb 12
0
[859] trunk/wxruby2/doc/textile/docchildframe.txtl: Added 'methods' section; removed C++ members
...Methods -"Document/view overview":docviewoverview.html, "Frame":frame.html +<div id="methods"> +* "DocChildFrame.new":#DocChildFrame_new +* "DocChildFrame#get_document":#DocChildFrame_getdocument +* "DocChildFrame#get_view":#DocChildFrame_getview +* "DocChildFrame#on_activate":#DocChildFrame_onactivate +* "DocChildFrame#on_close_window":#DocChildFrame_onclosewindow +* "DocChildFrame#set_document":#DocChildFrame_set...
2007 Feb 12
0
[858] trunk/wxruby2/doc/textile/docmdichildframe.txtl: Added 'methods' section; removed C++ members
...+<div id="methods"> -h3(#DocMDIChildFrame_mchilddocument). DocMDIChildFrame#m__child_document +* "DocMDIChildFrame.new":#DocMDIChildFrame_new +* "DocMDIChildFrame#get_document":#DocMDIChildFrame_getdocument +* "DocMDIChildFrame#get_view":#DocMDIChildFrame_getview +* "DocMDIChildFrame#on_activate":#DocMDIChildFrame_onactivate +* "DocMDIChildFrame#on_close_window":#DocMDIChildFrame_onclosewindow +* "DocMDIChildFrame#set_document...
2023 Aug 27
1
DatabaseModifiedError while iterating on mset
On Wed, Aug 23, 2023 at 01:53:27PM +0000, Eric Wong wrote: > I'm already retrying the ->get_mset operations; but now I'm > wondering where I'd hit DatabaseModifiedErrors while inside a > Xapian::MSetIterator loop. > > I assume ->get_document is a place where it gets thrown; > but once a document is retrieved, can iterating through > terms in one document (using TermIterator) also throw DB modified? If you only look at the terms and wdfs then you could only get DatabaseModifiedError on the call to create the TermIterator sinc...
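The retry idea discussed in this thread can be sketched generically. This is an assumption-laden illustration, not the poster's code: `with_reopen_retry` is a hypothetical helper, `modified_error` would be `xapian.DatabaseModifiedError` with the Python bindings, and `db.reopen()` is the call that moves a reader to the latest revision.

```python
def with_reopen_retry(db, operation, modified_error, attempts=3):
    """Run operation(db); on modified_error, reopen the database and retry.

    The error from the last attempt is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation(db)
        except modified_error:
            if attempt == attempts - 1:
                raise
            db.reopen()
```

A call might look like `with_reopen_retry(db, lambda d: d.get_document(docid), xapian.DatabaseModifiedError)`. Whether results obtained before the reopen are still meaningful afterwards depends on the application, so a retry at a coarser level (re-running the query) may be the safer choice.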
2006 Jan 06
1
Xapian binding for C#.
Xapians! Anyone know when Xapian bindings for C# will be ready? Thanks, Kevin
2013 Jun 19
2
Compact databases and removing stale records at the same time
...atabase */ Xapian::WritableDatabase *destdb = new Xapian::WritableDatabase(dest, Xapian::DB_CREATE_OR_OPEN); destdb->begin_transaction(); /* copy all matching documents to the new DB */ for (Xapian::MSetIterator i = matches.begin() ; i != matches.end() ; ++i) { Xapian::Document doc = i.get_document(); std::string cyrusid = doc.get_value(SLOT_CYRUSID); if (cb(cyrusid.c_str(), rock)) { destdb->add_document(doc); count++; /* commit occasionally */ if (count % 1024 == 0) { destdb->commit_transaction(); destdb->begin_transaction(); } } } /* comm...
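The batching pattern in this C++ snippet (commit every 1024 added documents, then start a new transaction) can be sketched in Python as follows. The function name `copy_in_batches` and the `keep` callback are hypothetical, and the final commit after the loop is an assumption, since the quoted code is cut off at that point. `begin_transaction`, `add_document` and `commit_transaction` are the same method names the snippet uses.

```python
def copy_in_batches(docs, destdb, keep, batch_size=1024):
    """Copy documents passing keep(doc) into destdb, committing every batch_size adds."""
    count = 0
    destdb.begin_transaction()
    for doc in docs:
        if keep(doc):
            destdb.add_document(doc)
            count += 1
            # Commit occasionally so a single transaction does not grow unbounded.
            if count % batch_size == 0:
                destdb.commit_transaction()
                destdb.begin_transaction()
    # Assumed final step: commit whatever remains in the last partial batch.
    destdb.commit_transaction()
    return count
```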
2023 Aug 28
1
DatabaseModifiedError while iterating on mset
...wrote: > On Wed, Aug 23, 2023 at 01:53:27PM +0000, Eric Wong wrote: > > I'm already retrying the ->get_mset operations; but now I'm > > wondering where I'd hit DatabaseModifiedErrors while inside a > > Xapian::MSetIterator loop. > > > > I assume ->get_document is a place where it gets thrown; > > but once a document is retrieved, can iterating through > > terms in one document (using TermIterator) also throw DB modified? > > If you only look at the terms and wdfs then you could only get > DatabaseModifiedError on the call to cre...
2007 Jun 12
5
index browser inconsistent with IndexReader
Hi, We have an index of around 1M web pages as part of our web app. The app uses ferret by way of RDig to perform searches. We have noticed anecdotally that some searches don't work the way we thought they should, as if documents were missing from the index. Yesterday we came upon a concrete instance of this. Our documents have several fields, one of which is called :keywords and
2010 Oct 21
2
In-memory databases vs PHP Bindings
...far, using a disk-based index with an automatic backend (third line from the end is the critical one): // Find the document in the posts index $xenq = new XapianEnquire($xdb_posts); $xenq->set_query(new XapianQuery("UIDpost".$postid)); $xdoc = $xenq->get_mset(0, 1)->begin()->get_document(); // Create a database that just contains the one document // TODO:AB:20101020: Work out how to build an in-memory Xapian database via PHP bindings $xdb_doc = new XapianWritableDatabase(PROJROOT.'/tmp/xapian/doc'.$postid, Xapian::DB_CREATE_OR_OVERWRITE); $xdb_doc->add_document($x...
2012 Jan 20
3
get_docid???
my $mset = $enq->get_mset($nstart,$nrecords); for(my $mit=$mset->begin(); $mit != $mset->end();$mit++) { my $doc = $mit->get_document(); my $dat = $doc->get_data(); my $id = $doc->get_docid(); } [Fri Jan 20 10:35:06 2012] newmail.cgi: Can't locate auto/Search/Xapian/Document/get_docid.al in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl...
2016 Jan 08
2
Strange index consistency issue
...coll, because of crashes, reboots, misc Recoll bugs, etc. The strange thing here is that xapian-check does not seem to detect anything. In a nutshell, some document numbers seem to point to a data blackhole: the docids are returned when searching for the file/doc unique identifying term, but then get_document() fails. A later replace_document() succeeds, but on the next indexing pass, same issue. // success docid = db.postlist_begin(uniterm) // then failure: xdoc = db.get_document(*docid) In this situation, Recoll will try to update the doc. replace_document...
2016 May 05
2
GSoC 2016 - Introduction
Hello, Thanks James for the reply. That cleared a few things up. Apologies for replying late because of exams going on. I was going through the previous clustering API to understand how it worked and it seems like the approach for construction of the termlists which are used for distance metrics uses TF-IDF weighting with cosine similarity, which is very similar to the approach I would need
2006 May 10
1
Documentation for the PHP OO wrapper
...ml" by using a modified version of combine.xslt. It would allow saying things like: - ignore methods xxx which have the signature yyy - add documentation for method PositionIterator::get_termpos from what doxygen generated for "operator *" - add documentation for a new method MSet::get_document_percentage (doc in Doxygen format would follow...) I started to test that approach: the code is between comments in the xslt file (search for 'mappings'), and the xml file I used is here (to be copied in "all.xml"): http://www.bdsp.tm.fr/aed/xapian/wrapperDoc/mappings.xm...