Displaying 20 results from an estimated 99 matches for "postlists".

2010 Jan 18
3
postlist: Tag containing meta information is corrupt.
Greetings, Using latest svn. I've noticed the following error when performing index merging: postlist: baseB blocksize=8K items=33962 lastblock=534 revision=1 levels=2 root=459 B-tree checked okay Tag containing meta information is corrupt. postlist table errors found: 1 I can still search on this index (I've only checked very small indexes), but merging is now a problem since I check
2011 Jul 19
1
xapian-compact ok, xapian-check failure
Greets, I've encountered the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed: xapian-compact -b64k -m src1 src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410
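The compact-then-check sequence described in this post could be automated roughly as follows. This is a minimal sketch, not the poster's code: the helper names `build_commands` and `merge_and_verify` are hypothetical, the command lines are copied from the snippet, and it assumes both tools report failure through a non-zero exit status.

```python
import subprocess


def build_commands(sources, dest, blocksize="64k"):
    """Return the argv lists for the merge and the follow-up check.

    The flags mirror the command quoted in the post:
    xapian-compact -b64k -m src1 src2.... tmp_dst
    """
    compact = ["xapian-compact", f"-b{blocksize}", "-m", *sources, dest]
    check = ["xapian-check", dest]
    return compact, check


def merge_and_verify(sources, dest, run=subprocess.run):
    """Run the merge, then the check; return the name of the tool that
    failed, or None if both exited with status 0.

    Assumption: a non-zero exit status is how each tool signals errors.
    `run` is injectable so the logic can be exercised without Xapian.
    """
    for cmd in build_commands(sources, dest):
        result = run(cmd)
        if result.returncode != 0:
            return cmd[0]
    return None
```

A caller would treat a return value of `"xapian-check"` as the case reported here: the merge succeeded but the merged database should not be put into service.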
2004 Aug 23
1
postlist chunking
Postlists are split up into chunks, so that skip_to can avoid reading all the postlist. Currently the chunk threshold is 2048, but this is checked before adding an entry, so the postlist chunk can actually grow a little larger. Something like 2060 at most. Unfortunately this isn't a good threshold with...
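The overshoot mentioned in this post follows from testing the threshold before an entry is appended rather than after. The sketch below is a toy model, not Xapian's implementation: `split_into_chunks` is a hypothetical helper, sizes are in bytes, and the 12-byte entry size in the usage note is an assumption chosen only to make the arithmetic visible.

```python
CHUNK_THRESHOLD = 2048  # the threshold quoted in the post


def split_into_chunks(entry_sizes, threshold=CHUNK_THRESHOLD):
    """Group encoded entry sizes into chunks and return each chunk's size.

    The threshold is tested before an entry is appended, so a chunk that
    is just under the threshold still accepts one more entry and can end
    up slightly larger than the threshold.
    """
    chunks = []
    current = 0
    for size in entry_sizes:
        if current >= threshold:
            # Chunk is already full: close it and start a new one.
            chunks.append(current)
            current = 0
        current += size
    if current:
        chunks.append(current)
    return chunks
```

With 400 entries of 12 bytes each, `split_into_chunks([12] * 400)` yields chunks of 2052, 2052 and 696 bytes. In this model a chunk can never exceed the threshold by as much as one full entry, which is consistent with the "2060 at most" figure if entries are at most about 12 bytes.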
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes: > On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote: > > The question which remains for me is if I should run xapian-compact > > after an initial indexing operation. I guess that this depends on the > > amount of expected updates and that there is no easy answer ? > > I think it's not obvious whether it's a good plan
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2017 Dec 29
2
notmuch: Xapian exception during database creation
Running notmuch from git on Debian testing[1] with the mail and database sitting on a ZFS filesystem, adding mail to a new database: > agrajag-testing ~/s/notmuch % ./notmuch new > Found 605510 total files (that's not much mail). > add_file: A Xapian exception occurred36m 37s remaining). > A Xapian exception occurred adding message: Unexpected end of posting list for
2017 Jul 31
2
Segmentation fault in matcher/queryoptimiser
For a couple of weeks we have been experiencing occasional segmentation faults within Xapian 1.5. We can't reproduce the crashes, but we have strong hints that they are due to memory corruption. We have narrowed down our root cause analysis to phrase searches on multi-databases that fail on reading the `hint` field in the `QueryOptimiser` class [1]. We'd appreciate any hints on how to fix this.
2017 Dec 29
0
notmuch: Xapian exception during database creation
On Fri, Dec 29, 2017 at 03:00:47PM +0000, David Edmondson wrote: > Running notmuch from git on Debian testing[1] with the mail and database > sitting on a ZFS filesystem, adding mail to a new database: > > > agrajag-testing ~/s/notmuch % ./notmuch new > > Found 605510 total files (that's not much mail). > > add_file: A Xapian exception occurred36m 37s remaining).
2010 Dec 18
1
Xapian index size 475GB = 170 million documents (URLs)
Xapians, I maintain two indexes for my search engines, each approximately the same size. I would like to share this knowledge with you, since many of you have never seen a Xapian index of this size. And of course you can search the index yourself at - http://myhealthcare.com/ - http://find1friend.com/ I need 2 x 100 million more documents in each index, and I hope it will
2014 Mar 13
3
Optimized VSEncoding
...rds). It is true that position lists don't need unpacking and repacking, but if encoding actually takes 150 times longer in practice, that's probably not something we want to do by default. It could be an option for users who want to squeeze out every last bit of search speed though. For postlists, we would need to decode and reencode chunks when documents are deleted or modified - there I think the extra encode time would also be too much for general use. But linux.postlist is rather large - we'll probably never encode such a big chunk of data in one go. Our postlist chunks are curren...
2019 Feb 03
0
Amount of writes during index creation
On Thu, Jan 31, 2019 at 08:44:44PM +0100, Jean-Francois Dockes wrote: > I have run a number of tests, with data mostly from a project gutenberg dvd > and other books, with relatively modest index sizes, from 1 to 24 GB. > > Quite curiously, in this zone, with all Xapian versions I tried, the ratio > from index size to the amount of writes is roughly proportional to the index >
2013 Jun 19
2
Compact databases and removing stale records at the same time
I'm trying to compact (or at least merge) multiple databases, while stripping search records which are no longer required. Backstory: I've inherited the Cyrus IMAPd xapian-based search code from Greg Banks when he left Opera. One of the unfinished parts was removing expunged emails from the search database. We moved from having a single search database to supporting multiple
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory. I am indexing about 5GB of data, averaging 2MB per document. The documents are plain text. I notice omindex's memory footprint gets bigger and bigger, then the machine starts to swap and it all slows down to a crawl. Regarding export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000. Am I right in saying that for my setup
2011 Jan 11
1
chert-update creates a db with some errors
I've some problems converting a xapian db, created with core 1.1.3 (using chert), to the new chert format. I'm using xapian-chert-update, compiled from the core-1.2.4. The conversion seems to run without errors: #./xapian-core-1.2.4/bin/xapian-chert-update old new postlist: Reduced by 33.3333% 16K (48K -> 32K) record: Size unchanged (8K) termlist: doesn't exist position: Size
2007 Apr 05
1
Re: [Xapian-commits] 8107: trunk/xapian-core/ trunk/xapian-core/backends/
olly wrote: > Log message (7 lines): > backends/database.cc: Database::Internal can't call the > PostingIterator(PostingIterator::Internal*) ctor (at least under > g++ 3.3.5) because it isn't a friend (only class Database is). For the record, Mark just reported this to me under windows so it was a problem there too, but it does work under GCC 4.1. No idea which compiler is
2018 Jul 12
1
Error while compacting: Bad position key
Mike Hommey <mh at glandium.org> writes: > Hi, > > When running `notmuch compact` today, it stopped with the following > output: > > Compacting database... > compacting table postlist > Reduced by 25% 648656K (2498904K -> 1850248K) > compacting table docdata > Reduced by 15% 24K (152K -> 128K) > compacting table termlist > Reduced by
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes: > On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote: > > Some might notice the 50% index size increase. Excessive index size is > > already one relatively rare, but recurring complaint. Except if I did > > something wrong: I'm actually quite surprised by it. > > Did you try compacting the resulting databases? > >
2018 Jan 03
2
Storing the documents text: data record or value ?
Hi, Following the Recoll snippets generation performance problem caused by the new positions list storage scheme in Xapian 1.4, I am experimenting with generating snippets from the complete document text stored in the index. This increases the index size much less than I would have expected (around 10-15% apparently with my home directory data), which is good news obviously. I have tried
2011 Mar 31
0
Xapian Index: 607GB = 219 million of unique documents
It took approximately five days, with a single process using one CPU core and 6GB of memory, to build this giant 607GB single Xapian index, containing 219 million unique documents (web sites). So far I have not found any other implementation that would enable me to build such a single index containing over 200 million documents, while testing Lucene, Solr, MySQL, Hadoop and Oracle. Probably
2011 May 13
0
Xapian Index 253 million documents = 704G
Xapian Index 253 million documents = 704G I just built my largest single Xapian index, with 253 million unique documents, on a single server using a single hard disk, less than 8G RAM and a single 2.0 GHz processor. I do not see any decrease in search performance between 100 million and 250 million documents, which indicates good scalability of Xapian, and it looks like I can push it easily