thr3ads.net - search: "postlists"

Displaying 20 results from an estimated 99 matches for "postlists".

postlist: Tag containing meta information is corrupt.

2010 Jan 18

Greetings, Using latest svn. I've noticed the following error when performing index merging: postlist: baseB blocksize=8K items=33962 lastblock=534 revision=1 levels=2 root=459 B-tree checked okay Tag containing meta information is corrupt. postlist table errors found: 1 I can still search on this index (I've only checked very small indexes), but merging is now a problem since I check

xapian-compact ok, xapian-check failure

2011 Jul 19

Greets, I've encountered the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed: xapian-compact -b64k -m src1 src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410
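
For automating the merge-then-verify sequence described in this post, a minimal sketch might look like the following. The function name `compact_and_check` and the injectable `run` parameter are hypothetical; the command lines are the ones quoted in the post. It assumes that `xapian-check` reports problems through a non-zero exit status, which should be confirmed against the installed version.

```python
import subprocess


def compact_and_check(sources, dest, run=subprocess.run):
    """Merge `sources` into `dest`, then verify `dest`.

    Returns True only if both commands exit with status 0.
    `run` is injectable so the logic can be exercised without Xapian installed.
    """
    # Same invocation as in the post: 64K blocks, multipass merge.
    compact = run(["xapian-compact", "-b64k", "-m", *sources, dest])
    if compact.returncode != 0:
        return False
    # Assumption: xapian-check exits non-zero when it finds table errors.
    check = run(["xapian-check", dest])
    return check.returncode == 0
```

A caller could keep `tmp_dst` out of service whenever this returns False, rather than trusting the exit code of `xapian-compact` alone.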

postlist chunking

2004 Aug 23

Postlists are split up into chunks, so that skip_to can avoid reading all the postlist. Currently the chunk threshold is 2048, but this is checked before adding an entry, so the postlist chunk can actually grow a little larger. Something like 2060 at most. Unfortunately this isn't a good threshold with...
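
The chunking behaviour described in this post can be illustrated with a small, self-contained model. This is a sketch and not Xapian's actual code: `build_chunks` and `skip_to` are hypothetical names, entries are modelled as `(docid, encoded_size)` pairs, and the threshold is assumed to be measured in bytes. It shows why checking the threshold before appending lets a chunk overshoot slightly, and how per-chunk first docids let a skip avoid scanning earlier chunks.

```python
from bisect import bisect_left, bisect_right

CHUNK_THRESHOLD = 2048  # figure quoted in the post; unit assumed to be bytes


def build_chunks(entries, threshold=CHUNK_THRESHOLD):
    """Group (docid, encoded_size) entries into lists of docids.

    The size check runs *before* an entry is appended, so a chunk can
    exceed the threshold by up to one entry's size minus one.
    """
    chunks, current, current_size = [], [], 0
    for docid, size in entries:
        if current and current_size >= threshold:
            chunks.append(current)
            current, current_size = [], 0
        current.append(docid)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


def skip_to(chunks, target):
    """Return the first docid >= target, or None if there is none.

    Only one chunk is searched; the others are skipped using their
    first docid as a key.
    """
    if not chunks:
        return None
    first_ids = [chunk[0] for chunk in chunks]
    idx = max(bisect_right(first_ids, target) - 1, 0)
    pos = bisect_left(chunks[idx], target)
    if pos < len(chunks[idx]):
        return chunks[idx][pos]
    # Target falls in the gap after this chunk: the answer is the next chunk's first docid.
    return chunks[idx + 1][0] if idx + 1 < len(chunks) else None
```

With 12-byte entries, for example, a chunk is closed only once it has reached 2052 bytes, which is consistent with the post's remark that chunks end up a little larger than the nominal threshold.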

Xapian 1.3.5 snapshot performance and index size

2016 Apr 12

Olly Betts writes: > On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote: > > The question which remains for me is if I should run xapian-compact > > after an initial indexing operation. I guess that this depends on the > > amount of expected updates and that there is no easy answer ? > > I think it's not obvious whether it's a good plan

Compact databases and removing stale records at the same time

2013 Jun 19

On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum

notmuch: Xapian exception during database creation

2017 Dec 29

Running notmuch from git on Debian testing[1] with the mail and database sitting on a ZFS filesystem, adding mail to a new database: > agrajag-testing ~/s/notmuch % ./notmuch new > Found 605510 total files (that's not much mail). > add_file: A Xapian exception occurred36m 37s remaining). > A Xapian exception occurred adding message: Unexpected end of posting list for

Segmentation fault in matcher/queryoptimiser

2017 Jul 31

For a couple of weeks we have been experiencing occasional segmentation faults within Xapian 1.5. We can't reproduce the crashes, but we have strong hints that they are due to memory corruption. We have narrowed down our root cause analysis to phrase searches on multi-databases that fail on reading the `hint` field in the `QueryOptimiser` class [1]. We'd appreciate any hints on how to fix this.

notmuch: Xapian exception during database creation

2017 Dec 29

On Fri, Dec 29, 2017 at 03:00:47PM +0000, David Edmondson wrote: > Running notmuch from git on Debian testing[1] with the mail and database > sitting on a ZFS filesystem, adding mail to a new database: > > > agrajag-testing ~/s/notmuch % ./notmuch new > > Found 605510 total files (that's not much mail). > > add_file: A Xapian exception occurred36m 37s remaining).

Xapian index size 475GB = 170 million documents (URLs)

2010 Dec 18

Xapians, I maintain two indexes for my search engines, each approximately the same size. I would like to share this knowledge with you, since many of you have never seen a Xapian index of this size. And of course you can search the index yourself at - http://myhealthcare.com/ - http://find1friend.com/ I need to get 2 x 100 million more documents into each index, and I hope it will

Optimized VSEncoding

2014 Mar 13

...rds). It is true that position lists don't need unpacking and repacking, but if encoding actually takes 150 times longer in practice, that's probably not something we want to do by default. It could be an option for users who want to squeeze out every last bit of search speed though. For postlists, we would need to decode and reencode chunks when documents are deleted or modified - there I think the extra encode time would also be too much for general use. But linux.postlist is rather large - we'll probably never encode such a big chunk of data in one go. Our postlist chunks are curren...

Amount of writes during index creation

2019 Feb 03

On Thu, Jan 31, 2019 at 08:44:44PM +0100, Jean-Francois Dockes wrote: > I have run a number of tests, with data mostly from a project gutenberg dvd > and other books, with relatively modest index sizes, from 1 to 24 GB. > > Quite curiously, in this zone, with all Xapian versions I tried, the ratio > from index size to the amount of writes is roughly proportional to the index >

Compact databases and removing stale records at the same time

2013 Jun 19

I'm trying to compact (or at least merge) multiple databases, while stripping search records which are no longer required. Backstory: I've inherited the Cyrus IMAPd xapian-based search code from Greg Banks when he left Opera. One of the unfinished parts was removing expunged emails from the search database. We moved from having a single search database to supporting multiple

XAPIAN_FLUSH_THRESHOLD

2009 Jul 15

I'm playing around with a machine that has 2 GB of memory, indexing about 5 GB of data with an average of 2 MB per document. The documents are plain text. I notice that omindex's memory footprint gets bigger and bigger, then the machine starts to swap and it all slows down to a crawl. Regarding export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000. Am I right in saying that for my setup
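
Since the threshold is read from the environment of the indexing process, one way to experiment with it is to build a modified environment and pass it to the indexer. The sketch below uses a hypothetical helper, `indexing_env`, and the value 1000 in the test call is purely illustrative and not a recommendation from the thread.

```python
import os


def indexing_env(flush_threshold, base_env=None):
    """Return a copy of the environment with XAPIAN_FLUSH_THRESHOLD set.

    The threshold counts documents buffered before a flush, so a lower
    value trades indexing speed for a smaller memory footprint.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["XAPIAN_FLUSH_THRESHOLD"] = str(flush_threshold)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the indexer, which avoids changing the setting for the whole shell session.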

chert-update creates a db with some errors

2011 Jan 11

I've some problems converting a xapian db, created with core 1.1.3 (using chert), to the new chert format. I'm using xapian-chert-update, compiled from the core-1.2.4. The conversion seems to run without errors: #./xapian-core-1.2.4/bin/xapian-chert-update old new postlist: Reduced by 33.3333% 16K (48K -> 32K) record: Size unchanged (8K) termlist: doesn't exist position: Size

Re: [Xapian-commits] 8107: trunk/xapian-core/ trunk/xapian-core/backends/

2007 Apr 05

olly wrote: > Log message (7 lines): > backends/database.cc: Database::Internal can't call the > PostingIterator(PostingIterator::Internal*) ctor (at least under > g++ 3.3.5) because it isn't a friend (only class Database is). For the record, Mark just reported this to me under windows so it was a problem there too, but it does work under GCC 4.1. No idea which compiler is

Error while compacting: Bad position key

2018 Jul 12

Mike Hommey <mh at glandium.org> writes: > Hi, > > When running `notmuch compact` today, it stopped with the following > output: > > Compacting database... > compacting table postlist > Reduced by 25% 648656K (2498904K -> 1850248K) > compacting table docdata > Reduced by 15% 24K (152K -> 128K) > compacting table termlist > Reduced by

Xapian 1.3.5 snapshot performance and index size

2016 Apr 11

Olly Betts writes: > On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote: > > Some might notice the 50% index size increase. Excessive index size is > > already one relatively rare, but recurring complaint. Except if I did > > something wrong: I'm actually quite surprised by it. > > Did you try compacting the resulting databases? > >

Storing the documents text: data record or value ?

2018 Jan 03

Hi, Following the Recoll snippets generation performance problem caused by the new positions list storage scheme in Xapian 1.4, I am experimenting with generating snippets from the complete document text stored in the index. This increases the index size much less than I would have expected (around 10-15% apparently with my home directory data), which is good news obviously. I have tried

Xapian Index: 607GB = 219 million of unique documents

2011 Mar 31

It took approximately five days, with a single process using one CPU core and 6GB of memory, to build this giant 607GB single Xapian index, containing 219 million unique documents (web sites). So far I have not found any other implementation that would enable me to build such a single index containing over 200 million documents, while testing Lucene, Solr, MySQL, Hadoop and Oracle. Probably

Xapian Index 253 million documents = 704G

2011 May 13

I just built my largest single Xapian index, with 253 million unique documents, on a single server using a single hard disk, less than 8G RAM and a single 2.0 GHz processor. I do not see any decrease in search performance between my indexes of 100 million and 250 million documents, which indicates that Xapian scales well, and it looks like I can push it easily
