similar to: How to check docid

Displaying 20 results from an estimated 1000 matches similar to: "How to check docid"

2007 Apr 05
1
Re: [Xapian-commits] 8107: trunk/xapian-core/ trunk/xapian-core/backends/
olly wrote: > Log message (7 lines): > backends/database.cc: Database::Internal can't call the > PostingIterator(PostingIterator::Internal*) ctor (at least under > g++ 3.3.5) because it isn't a friend (only class Database is). For the record, Mark just reported this to me under windows so it was a problem there too, but it does work under GCC 4.1. No idea which compiler is
2007 Feb 09
1
Fetching document content by Q term in Python
Hello, I'd like to be able to retrieve the index's stored copy of the document text and tried the following: terms = self.db.allterms() terms.skip_to('Q' + uri.encode('utf-8')) term = terms.next() doc = self.db.get_document(term[1]) print doc.get_data() I just wildly guessed that [1] was the docid, but of course it isn't. So the question is, how do I
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2020 Aug 27
4
Xapian on Android?
Friends, I would like to hear from anyone who has experience deploying Xapian on Android. I'm new to Xapian, but I know it is used by a couple partners for offline projects on Linux and Windows. Our small nonprofit, WiderNet, provides off-line access to thousands of Web sites for people who lack Internet connectivity (www.widernet.org). Over 2,000 universities, schools, health care sites,
2007 Jun 19
2
Deleted documents not deleted
I seem to be seeing cases where I call db.delete_document(somedocid) with no error, then flush() and delete the database object, but the document is still there after process exit. The write lock is normally deleted, so it appears that the database close finished normally. If I then call delete_document(somedocid) from another command/process, this time it goes away. I've been seeing
2004 May 11
2
"Error reading block xxx: got end of file"
Xapian (0.7.5) is spitting out this error on a regular basis: org.xapian.errors.DatabaseError: Error reading block 136618: got end of file at org.xapian.XapianJNI.writabledatabase_repalce_document(Native Method) at org.xapian.WritableDatabase.replaceDocument(WritableDatabase.java:67) I don't have a gdb backtrace, only the Java
2005 Jul 20
1
docid type redifine
Hello all. I need to redefine a docid type (and all dependent types) like this: typedef unsigned long long docid; I think it would be enough to edit "include/xapian/types.h", but it isn't so. 1) I've added : string om_tostring(unsigned long long val) { CONVERT_TO_STRING("%llu") } in common/utils.{h,cc} 2) In include/enquire.h (line 438) I've found the
2005 Jun 29
2
Sort by docid
Hello, I wonder if there is a way to cause Xapian to order a result set purely by docid. In other words, once the result set has been determined, I'd like the results to be returned to me ordered by their docid, as opposed to by their match relevance. The problem at hand is that I'm building a search engine for a mailing list and I would like to return matches sorted by date; ordering by
2023 May 03
1
manual flushing thresholds for deletes?
Olly Betts <olly at survex.com> wrote: > On Mon, Mar 27, 2023 at 11:22:09AM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > 10 seems too long. You want the mean word length weighted by frequency > > > of occurrence. For English that's typically around 5 characters, which > > > is 5 bytes. If we go for +1 that's:
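The "mean word length weighted by frequency of occurrence" mentioned in this message can be illustrated with a small sketch. It is plain Python, not Xapian code; the function name and the sample word list are hypothetical and chosen only to show the arithmetic, and lengths are measured in UTF-8 bytes on the assumption that byte length is what matters for storage.

```python
from collections import Counter


def weighted_mean_word_length(words):
    """Mean word length in UTF-8 bytes, weighted by how often each word occurs.

    `words` is any iterable of tokens (hypothetical input for illustration).
    """
    counts = Counter(words)
    total_occurrences = sum(counts.values())
    total_bytes = sum(len(word.encode("utf-8")) * n for word, n in counts.items())
    return total_bytes / total_occurrences


# Frequent short words pull the weighted mean below the plain average
# over distinct words.
sample = ["the", "the", "the", "xapian"]
weighted = weighted_mean_word_length(sample)
distinct = sum(len(w) for w in set(sample)) / len(set(sample))
```

Here the weighted mean is lower than the average over distinct words because the short word occurs three times.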
2009 Apr 23
1
PHP Total document
I was also wondering if someone could tell me how to extract the total number of documents contained in a database via PHP. Thanks, Frank
2010 Jun 21
1
How to search in many database?
Hi, I'm a newbie to Xapian. I started using Xapian a few weeks ago and I would like to know: how can I search many databases at once? Please send me an answer. P.S. Sorry about my English. Regards, Mr.T
2012 Jan 20
3
get_docid???
my $mset = $enq->get_mset($nstart,$nrecords); for(my $mit=$mset->begin(); $mit != $mset->end();$mit++) { my $doc = $mit->get_document(); my $dat = $doc->get_data(); my $id = $doc->get_docid(); } [Fri Jan 20 10:35:06 2012] newmail.cgi: Can't locate auto/Search/Xapian/Document/get_docid.al in @INC (@INC contains: /etc/perl
2018 Jan 03
2
Storing the documents text: data record or value ?
Hi, Following the Recoll snippets generation performance problem caused by the new positions list storage scheme in Xapian 1.4, I am experimenting with generating snippets from the complete document text stored in the index. This increases the index size much less than I would have expected (around 10-15% apparently with my home directory data), which is good news obviously. I have tried
2007 Apr 09
1
Re: [Xapian-commits] 8157: trunk/xapian-core/ trunk/xapian-core/backends/flint/ trunk/xapian-core/backends/quartz/
olly wrote: > Log message (6 lines): > backends/flint/flint_database.cc: Delete the corresponding entry > (if any) from doclens in delete_document(). Add assertion to > add_document_() that the corresponding entry in doclens isn't > already set, but in a non-debug build overwrite any existing > entry as that's more likely to be correct. >
2023 Aug 19
1
does Xapian::Enquire hold an MVCC revision?
Olly Betts <olly at survex.com> wrote: > On Fri, Aug 18, 2023 at 10:41:52AM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > While the match is running, get_mset(2000, 1000) needs to track > > > 3000 entries so this won't reduce your heap usage (at least not > > > peak usage). > > > > > > Is the heap
2013 Mar 08
2
Gsoc-2013
Hi, I am Chinmay Naik, an undergraduate in Computer Science at Bangalore Institute of Technology, Bangalore. I am an experienced programmer and good with C, C++, Python, Java, OpenGL and would love to participate in Gsoc-13. From the ideas listed, I am interested to work on the project "posting list encoding improvements". I am a newbie to Xapian but would like to get involved and get a
2013 Jun 19
2
Compact databases and removing stale records at the same time
I'm trying to compact (or at least merge) multiple databases, while stripping search records which are no longer required. Backstory: I've inherited the Cyrus IMAPd xapian-based search code from Greg Banks when he left Opera. One of the unfinished parts was removing expunged emails from the search database. We moved from having a single search database to supporting multiple
2010 Oct 22
1
overlapping docids when searching on multiple databases?
Just a quick question - it seems to me that it's entirely possible to get overlapping docids when searching on multiple databases? For instance: open database1 add database2 to database1 search db1+db2 if docid 10 exists in both databases, is there any way of telling which database to retrieve the document from? /Per Jessen, Zürich
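One way to reason about this question: as far as I understand Xapian's documented behaviour, a combined database interleaves the docids of its sub-databases, so docid 10 in each sub-database maps to a different docid in the combined view. The sketch below shows only that arithmetic in plain Python. The function names are hypothetical helpers, not part of the Xapian API, and the interleaving formula is an assumption that should be checked against the documentation for the Xapian version in use.

```python
def combined_docid(shard_docid, shard_index, n_shards):
    """Map a docid in one sub-database to its docid in the combined database.

    Assumes the interleaving scheme: docids are 1-based, shard_index is
    0-based in the order the sub-databases were added.
    """
    return (shard_docid - 1) * n_shards + shard_index + 1


def split_docid(docid, n_shards):
    """Inverse mapping: return (shard_index, shard_docid) for a combined docid."""
    shard_index = (docid - 1) % n_shards
    shard_docid = (docid - 1) // n_shards + 1
    return shard_index, shard_docid


# docid 10 in database1 (index 0) and in database2 (index 1), two databases total
first = combined_docid(10, 0, 2)
second = combined_docid(10, 1, 2)
```

Under this assumption the two documents get distinct combined docids, and `split_docid` recovers which sub-database a combined docid came from.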
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory. I'm indexing about 5 GB of data, averaging 2 MB per document. The documents are plain text. I notice that omindex's memory footprint gets bigger and bigger, then the machine starts to swap and it all slows down to a crawl. Regarding export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000. Am I right in saying that for my setup
2023 Mar 27
1
manual flushing thresholds for deletes?
On Mon, Mar 27, 2023 at 11:22:09AM +0000, Eric Wong wrote: > Olly Betts <olly at survex.com> wrote: > > 10 seems too long. You want the mean word length weighted by frequency > > of occurrence. For English that's typically around 5 characters, which > > is 5 bytes. If we go for +1 that's: > > Actually, 10 may be too short in my case since there's a