search for: postlist

Displaying 20 results from an estimated 99 matches for "postlist".

2010 Jan 18
3
postlist: Tag containing meta information is corrupt.
Greetings, Using latest svn. I've noticed the following error when performing index merging: postlist: baseB blocksize=8K items=33962 lastblock=534 revision=1 levels=2 root=459 B-tree checked okay Tag containing meta information is corrupt. postlist table errors found: 1 I can still search on this index (I've only checked very small indexes), but merging is now a problem since I check the retu...
2011 Jul 19
1
xapian-compact ok, xapian-check failure
...the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed: xapian-compact -b64k -m src1 src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410 lastblock=117541 revision=1 levels=2 root=3493 B-tree checked okay document id 68511: length 32400 doesn't match 27468 in the termlist table postlist table errors found: 1 Using "delve -d -r68511" I can identify the source index which I...
2004 Aug 23
1
postlist chunking
Postlists are split up into chunks, so that skip_to can avoid reading all the postlist. Currently the chunk threshold is 2048, but this is checked before adding an entry, so the postlist chunk can actually grow a little larger. Something like 2060 at most. Unfortunately this isn't a good threshold wit...
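The effect described in this snippet, where a threshold tested before appending lets a chunk overshoot, can be sketched in a few lines. This is an illustration only, not Xapian's code: the function name `chunk_sizes` and the fixed 12-byte entry size are assumptions chosen so the numbers land near the "2060 at most" figure mentioned above.

```python
def chunk_sizes(entry_sizes, threshold=2048):
    """Group encoded entry sizes (in bytes) into chunks.

    The threshold is tested BEFORE an entry is appended, so a chunk that
    is just under the threshold still accepts one more entry and can end
    up slightly larger than the threshold.
    """
    chunks = []
    current = 0
    for size in entry_sizes:
        # Start a new chunk only if the current one has already reached
        # the threshold; the incoming entry's size is not considered.
        if current >= threshold:
            chunks.append(current)
            current = 0
        current += size
    if current:
        chunks.append(current)
    return chunks


# Hypothetical workload: 400 entries of 12 bytes each.
sizes = chunk_sizes([12] * 400)
print(sizes)
```

With this check-before-add rule, the largest possible chunk is one byte under the threshold plus the largest single entry, which is why the real limit sits a little above 2048.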
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
...0 -rw-r--r-- 1 dockes dockes 24150016 Apr 12 10:47 docdata.glass -rw-r--r-- 1 dockes dockes 0 Apr 12 10:47 flintlock -rw-r--r-- 1 dockes dockes 130 Apr 12 10:47 iamglass -rw-r--r-- 1 dockes dockes 577527808 Apr 12 10:47 position.glass -rw-r--r-- 1 dockes dockes 120905728 Apr 12 10:47 postlist.glass -rw-r--r-- 1 dockes dockes 89677824 Apr 12 10:47 termlist.glass ************************* *******LIB***************** Tue Apr 12 10:48:04 CEST 2016 #define SEQ_START_POINT (-7) -rwxr-xr-x 1 root root 30728315 Apr 12 10:48 /usr/lib/libxapian-1.3.so.6 ************************* 449.64user 124....
2013 Jun 19
2
Compact databases and removing stale records at the same time
...'t delete things (or at least I can't see > > how). > > A lot of the reason why compact is fast is because it pretty much just > treats the contents of each posting list chunk as opaque data (if it > renumbers, it has to adjust the header of the first chunk from each > postlist, if I remember correctly). Yeah, fair enough! > In order to be able to delete documents as it went, it would have to > modify any postlist chunks which contained those documents. That's > possible, but adds complexity to the compaction code, and will probably > lose most of the s...
2017 Dec 29
2
notmuch: Xapian exception during database creation
...tems=0 firstunused=1 revision=1 levels=0 root=(faked) > void B-tree checked okay > docdata table structure checked OK > > termlist: > blocksize=8K items=0 firstunused=2 revision=1 levels=0 root=(faked) > void B-tree checked okay > termlist table structure checked OK > > postlist: > blocksize=8K items=2 firstunused=1 revision=1 levels=0 root=0 > B-tree checked okay > postlist table structure checked OK > > position: > blocksize=8K items=0 firstunused=1 revision=1 levels=0 root=(faked) > void B-tree checked okay > position table structure checked OK...
2017 Jul 31
2
Segmentation fault in matcher/queryoptimiser
...t fail on reading the `hint` field in the `QueryOptimiser` class [1]. We'd appreciate any hints on how to fix this. I've written up our findings and solution attempts below. Should we post this on trac? Our findings so far ================ In a core dump we see that calling the `open_nearby_postlist` function on the `hint` variable [2] falls off a cliff, resulting in a segfault: (gdb) bt 2 #0 0x000000000001eaa1 in ?? () #1 0x00007fa19d09231f in LocalSubMatch::open_post_list (this=0x13527d0, term=..., wqf=1, factor=1, need_positions=<optimized out>, in_synonym=<...
2017 Dec 29
0
notmuch: Xapian exception during database creation
...=(faked) > > void B-tree checked okay > > docdata table structure checked OK > > > > termlist: > > blocksize=8K items=0 firstunused=2 revision=1 levels=0 root=(faked) > > void B-tree checked okay > > termlist table structure checked OK > > > > postlist: > > blocksize=8K items=2 firstunused=1 revision=1 levels=0 root=0 > > B-tree checked okay > > postlist table structure checked OK > > > > position: > > blocksize=8K items=0 firstunused=1 revision=1 levels=0 root=(faked) > > void B-tree checked okay > >...
2010 Dec 18
1
Xapian index size 475GB = 170 million documents (URLs)
...: total 475G -rw-r--r-- 1 kevin kevin 28 2010-12-18 15:25 iamchert -rw-r--r-- 1 kevin kevin 13 2010-12-18 12:19 position.baseA -rw-r--r-- 1 kevin kevin 3.8M 2010-12-18 15:25 position.baseB -rw-r--r-- 1 kevin kevin 240G 2010-12-18 15:25 position.DB -rw-r--r-- 1 kevin kevin 13 2010-12-18 04:31 postlist.baseA -rw-r--r-- 1 kevin kevin 923K 2010-12-18 11:36 postlist.baseB -rw-r--r-- 1 kevin kevin 58G 2010-12-18 11:36 postlist.DB -rw-r--r-- 1 kevin kevin 13 2010-12-18 11:36 record.baseA -rw-r--r-- 1 kevin kevin 1.6M 2010-12-18 12:03 record.baseB -rw-r--r-- 1 kevin kevin 102G 2010-12-18 12:02 recor...
2014 Mar 13
3
Optimized VSEncoding
...gtong.cpp at qq.com>; Cc: "xapian-devel"<xapian-devel at lists.xapian.org>; Subject: Re: [Xapian-devel] Optimized VSEncoding On Wed, Mar 12, 2014 at 12:30:09PM +0800, Hurricane Tong wrote: > I optimized the code of VSEncoder, > and encode/decode the whole list, linux.postlist. > > Encoding time : > VSEncoder 103429ms > InterpolativeEncoder 684ms > > Decoding time: > VSDecoder 756ms > InterpolativeDecoder 925ms What were the sizes in each case? Is your code online somewhere? I'd be interested in taking a look. > ( compile with -o2,...
2019 Feb 03
0
Amount of writes during index creation
...trace-analyse Using strace means other processes are definitely excluded and you get to see which tables (and even which blocks) the I/O is, e.g. a small update to a small database gives: read 0 from tmp.db/record.DB read 0 from tmp.db/termlist.DB read 0 from tmp.db/position.DB read 0 from tmp.db/postlist.DB write 1 to tmp.db/postlist.DB write 1 to tmp.db/position.DB write 1 to tmp.db/termlist.DB write 1 to tmp.db/record.DB sync tmp.db/postlist.tmp sync tmp.db/postlist.DB read 1 from tmp.db/postlist.DB sync tmp.db/position.tmp sync tmp.db/position.DB read 1 from tmp.db/position.DB sync tmp.db/termli...
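A summary in the form quoted above can be tallied per table with a short script. This is a rough sketch, not the strace-analyse tool mentioned in the snippet: the function name `tally_io` and the regular expression are assumptions based only on the line shapes visible here ("read N from PATH", "write N to PATH", "sync PATH"), and the block number is ignored.

```python
import re
from collections import Counter

# Matches "read 0 from PATH", "write 1 to PATH" and "sync PATH".
# Assumption: the summary only ever uses these three shapes.
OP = re.compile(r"\b(?:(read|write) (\d+) (?:from|to)|(sync)) (\S+)")


def tally_io(summary):
    """Count operations per (operation, file) pair in a summary string."""
    counts = Counter()
    for match in OP.finditer(summary):
        operation = match.group(1) or match.group(3)
        path = match.group(4)
        counts[(operation, path)] += 1
    return counts


# Sample taken from the postlist lines of the snippet above.
sample = (
    "read 0 from tmp.db/postlist.DB write 1 to tmp.db/postlist.DB "
    "sync tmp.db/postlist.tmp sync tmp.db/postlist.DB "
    "read 1 from tmp.db/postlist.DB"
)
for (operation, path), n in sorted(tally_io(sample).items()):
    print(operation, path, n)
```

Using `finditer` rather than splitting on newlines means the tally works whether the operations arrive one per line or flattened onto a single line as they are here.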
2013 Jun 19
2
Compact databases and removing stale records at the same time
I'm trying to compact (or at least merge) multiple databases, while stripping search records which are no longer required. Backstory: I've inherited the Cyrus IMAPd xapian-based search code from Greg Banks when he left Opera. One of the unfinished parts was removing expunged emails from the search database. We moved from having a single search database to supporting multiple
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory, indexing about 5GB of data, averaging 2MB per document. The documents are plain text. I notice that omindex's memory footprint gets bigger and bigger, then the machine starts to swap and it all slows down to a crawl. Regarding export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000. Am I right in saying that for my setup
2011 Jan 11
1
chert-update creates a db with some errors
I've some problems converting a xapian db, created with core 1.1.3 (using chert), to the new chert format. I'm using xapian-chert-update, compiled from the core-1.2.4. The conversion seems to run without errors: #./xapian-core-1.2.4/bin/xapian-chert-update old new postlist: Reduced by 33.3333% 16K (48K -> 32K) record: Size unchanged (8K) termlist: doesn't exist position: Size unchanged (0K) spelling: Size unchanged (0K) synonym: Size unchanged (0K) But if I run xapian-check on the new database there're many errors reported: #./xapian-core-1.2.4/bin/xapian...
2007 Apr 05
1
Re: [Xapian-commits] 8107: trunk/xapian-core/ trunk/xapian-core/backends/
...Mark just reported this to me under windows so it was a problem there too, but it does work under GCC 4.1. No idea which compiler is "correct", but that hardly matters... > Can't seem to forward define Database::Internal to make > Database::Internal a friend so just use LeafPostList directly > as that seems less bad than pulling in the whole of database.h > or making PostingIterator::internal public. Only problem with this patch is that it looks like it leaks the LeafPostList if an exception is thrown by one of the other methods (such as delete_document()). Actually,...
2018 Jul 12
1
Error while compacting: Bad position key
Mike Hommey <mh at glandium.org> writes: > Hi, > > When running `notmuch compact` today, it stopped with the following > output: > > Compacting database... > compacting table postlist > Reduced by 25% 648656K (2498904K -> 1850248K) > compacting table docdata > Reduced by 15% 24K (152K -> 128K) > compacting table termlist > Reduced by 1% 27008K (2211800K -> 2184792K) > compacting table position > Error while compacting: Bad position ke...
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
...f > seconds (or even minutes for the extreme ones), and were generally > sub-second afterwards - 5.8 to 2.1 seconds is at the unimpressive end > of the improvements seen. One particular issue with "to be or not to > be" will be that we don't currently try to reuse the postlist or > positional data for "to" and "be", so it has to decode them twice. > > > As it is, and still hoping that more 1.3 optimization will improve the > > situation, I have to wonder if the price paid for faster phrase searches > > is not a bit too h...
2018 Jan 03
2
Storing the documents text: data record or value ?
Hi, Following the Recoll snippets generation performance problem caused by the new positions list storage scheme in Xapian 1.4, I am experimenting with generating snippets from the complete document text stored in the index. This increases the index size much less than I would have expected (around 10-15% apparently with my home directory data), which is good news obviously. I have tried
2011 Mar 31
0
Xapian Index: 607GB = 219 million of unique documents
.../ total 607G -rw-r--r-- 1 kevin kevin 28 2011-03-31 06:09 iamchert -rw-r--r-- 1 kevin kevin 14 2011-03-31 01:50 position.baseA -rw-r--r-- 1 kevin kevin 622K 2011-03-31 06:09 position.baseB -rw-r--r-- 1 kevin kevin 311G 2011-03-31 06:09 position.DB -rw-r--r-- 1 kevin kevin 14 2011-03-30 17:19 postlist.baseA -rw-r--r-- 1 kevin kevin 139K 2011-03-31 00:49 postlist.baseB -rw-r--r-- 1 kevin kevin 70G 2011-03-31 00:49 postlist.DB -rw-r--r-- 1 kevin kevin 14 2011-03-31 00:49 record.baseA -rw-r--r-- 1 kevin kevin 261K 2011-03-31 01:24 record.baseB -rw-r--r-- 1 kevin kevin 131G 2011-03-31 01:24 recor...
2011 May 13
0
Xapian Index 253 million documents = 704G
...6 total 704G -rw-r--r-- 1 kevin kevin 28 2011-05-13 08:30 iamchert -rw-r--r-- 1 kevin kevin 14 2011-05-13 03:28 position.baseA -rw-r--r-- 1 kevin kevin 718K 2011-05-13 08:30 position.baseB -rw-r--r-- 1 kevin kevin 359G 2011-05-13 08:30 position.DB -rw-r--r-- 1 kevin kevin 14 2011-05-12 17:22 postlist.baseA -rw-r--r-- 1 kevin kevin 167K 2011-05-13 02:26 postlist.baseB -rw-r--r-- 1 kevin kevin 84G 2011-05-13 02:26 postlist.DB -rw-r--r-- 1 kevin kevin 14 2011-05-13 02:26 record.baseA -rw-r--r-- 1 kevin kevin 301K 2011-05-13 03:02 record.baseB -rw-r--r-- 1 kevin kevin 151G 2011-05-13 03:02 recor...