Displaying 20 results from an estimated 81 matches for "termlist".

2018 Jul 12
1
Error while compacting: Bad position key
...n running `notmuch compact` today, it stopped with the following > output: > > Compacting database... > compacting table postlist > Reduced by 25% 648656K (2498904K -> 1850248K) > compacting table docdata > Reduced by 15% 24K (152K -> 128K) > compacting table termlist > Reduced by 1% 27008K (2211800K -> 2184792K) > compacting table position > Error while compacting: Bad position key I had not seen anything like this before, but when I run xapian-check from xapian 1.4.6-2, I see termlist: B-tree checked okay doclen not within bounds doclen not...
2020 Apr 07
2
crash after running notmuch new
Matt <mattator at gmail.com> writes: > thanks didn't know about xapian-check ! > the output > === > docdata: > blocksize=8K items=70 firstunused=3 revision=421 levels=0 root=2 > B-tree checked okay > docdata table structure checked OK > > termlist: > blocksize=8K items=186136 firstunused=62058 revision=421 levels=2 root=12260 > B-tree checked okay > termlist table structure checked OK > > postlist: > blocksize=8K items=2598971 firstunused=61412 revision=421 levels=2 root=49814 > xapian-check: DatabaseCorruptError: Db blo...
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
...-rw-r--r-- 1 dockes dockes 0 Apr 12 10:47 flintlock -rw-r--r-- 1 dockes dockes 130 Apr 12 10:47 iamglass -rw-r--r-- 1 dockes dockes 577527808 Apr 12 10:47 position.glass -rw-r--r-- 1 dockes dockes 120905728 Apr 12 10:47 postlist.glass -rw-r--r-- 1 dockes dockes 89677824 Apr 12 10:47 termlist.glass ************************* *******LIB***************** Tue Apr 12 10:48:04 CEST 2016 #define SEQ_START_POINT (-7) -rwxr-xr-x 1 root root 30728315 Apr 12 10:48 /usr/lib/libxapian-1.3.so.6 ************************* 449.64user 124.36system 4:48.82elapsed 198%CPU (0avgtext+0avgdata 1074832maxresi...
2010 Jan 30
2
Failure trying to update document.
Hi list. I have a specific document that does not handle updates sitting in the index. What can I do about that? 2010-01-30T13:58:07 Eval failure: Exception: No termlist for document 287376 at /usr/lib/perl5/Search/Xapian/Enquire.pm line 56. 2010-01-30T13:58:07 job failed. considering retry. is max_retries of 1000 >= failures of 1? 2010-01-30T13:58:07 job failed: Exception: No termlist for document 287376 at /usr/lib/perl5/Search/Xapian/Enquire.pm line...
2010 Dec 18
1
Xapian index size 475GB = 170 million documents (URLs)
...6 postlist.baseB -rw-r--r-- 1 kevin kevin 58G 2010-12-18 11:36 postlist.DB -rw-r--r-- 1 kevin kevin 13 2010-12-18 11:36 record.baseA -rw-r--r-- 1 kevin kevin 1.6M 2010-12-18 12:03 record.baseB -rw-r--r-- 1 kevin kevin 102G 2010-12-18 12:02 record.DB -rw-r--r-- 1 kevin kevin 13 2010-12-18 12:03 termlist.baseA -rw-r--r-- 1 kevin kevin 1.2M 2010-12-18 12:19 termlist.baseB -rw-r--r-- 1 kevin kevin 76G 2010-12-18 12:18 termlist.DB $ delve . number of documents = 169346678 average document length = 230970 document length lower bound = 1 document length upper bound = 3585385 highest document id ever u...
2009 Apr 12
2
Indexing speed benchmark - Xapian, Solr
I came across this benchmark between Xapian & Solr: http://www.anur.ag/blog/2009/03/xapian-and-solr/ According to the benchmark, a doc set that took Solr 34 min to index took Xapian 7 hours. Solr's index is also much smaller - 2.5GB to Xapian's 8.9GB. I'm new to Xapian. Just wondering if results like these are typical? Is indexing speed & size a known issue in Xapian? Or is
2019 Feb 03
0
Amount of writes during index creation
...elevant I/O using strace: xapian-maintainer-tools/profiling/strace-analyse Using strace means other processes are definitely excluded and you get to see which tables (and even which blocks) the I/O is, e.g. a small update to a small database gives: read 0 from tmp.db/record.DB read 0 from tmp.db/termlist.DB read 0 from tmp.db/position.DB read 0 from tmp.db/postlist.DB write 1 to tmp.db/postlist.DB write 1 to tmp.db/position.DB write 1 to tmp.db/termlist.DB write 1 to tmp.db/record.DB sync tmp.db/postlist.tmp sync tmp.db/postlist.DB read 1 from tmp.db/postlist.DB sync tmp.db/position.tmp sync tmp.db...
2018 Mar 29
2
bug: "no top level messages" crash on Zen email loops
...it shouldn't >> lead to a corrupted database > > There was a Xapian bug here, which I fixed on master last week and will > be fixed in 1.4.6. An honor. It's not every day you find a bug in a database software. ;) > If changes to a new database which didn't modify the termlist table were > committed, then a disk block which had been allocated to be the root > block in the termlist table was leaked (not used but not on the > freelist of blocks the table can recycle). This was largely harmless, > except that it was detected by Database::check() and caused an e...
2016 May 09
1
Given a document, how do you get its ID? (perl bindings)
I am writing an indexer that will crawl our web site. Following the recommendation here: https://trac.xapian.org/wiki/FAQ/UniqueIds I'm using the URL as the unique ID for each document. I see how to get a document from the xapian database if I know its URL, but what I need is also to be able to find out the URL from the document. Does this mean I need to store the URL in a value in
2018 Mar 19
2
bug: "no top level messages" crash on Zen email loops
Antoine Beaupré <anarcat at orangeseeds.org> writes: > On 2018-03-19 13:36:49, David Bremner wrote: >> >> I can't duplicate that part. > > That's very strange. I can reproduce this on my workstation here, but > taking the tarball I sent in the original message, I can't reproduce > anymore. So something changed! I suspect it's the
2018 Apr 29
1
Database corruption after clean rebuild
...6.2, build it, moved the xapian directory away, did a notmuch new and restored the tags from a dump. But the problem remains: ~$ xapian-check ~/Mail/.notmuch/xapian docdata: blocksize=8K items=10841 firstunused=75 revision=82 levels=1 root=2 B-tree checked okay docdata table structure checked OK termlist: blocksize=8K items=1893162 firstunused=368983 revision=82 levels=3 root=177608 xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0 this is very similar to the old database which I had moved away: ~$ xapian-check ~/Mail/.notmuch/xapian-2018-04-29-00-22/ docdata:...
2011 Mar 31
0
Xapian Index: 607GB = 219 million of unique documents
...9 postlist.baseB -rw-r--r-- 1 kevin kevin 70G 2011-03-31 00:49 postlist.DB -rw-r--r-- 1 kevin kevin 14 2011-03-31 00:49 record.baseA -rw-r--r-- 1 kevin kevin 261K 2011-03-31 01:24 record.baseB -rw-r--r-- 1 kevin kevin 131G 2011-03-31 01:24 record.DB -rw-r--r-- 1 kevin kevin 14 2011-03-31 01:24 termlist.baseA -rw-r--r-- 1 kevin kevin 192K 2011-03-31 01:50 termlist.baseB -rw-r--r-- 1 kevin kevin 96G 2011-03-31 01:50 termlist.DB $ delve . number of documents = 219344757 average document length = 28255.9 document length lower bound = 1 document length upper bound = 173153 highest document id ever u...
2011 May 13
0
Xapian Index 253 million documents = 704G
...6 postlist.baseB -rw-r--r-- 1 kevin kevin 84G 2011-05-13 02:26 postlist.DB -rw-r--r-- 1 kevin kevin 14 2011-05-13 02:26 record.baseA -rw-r--r-- 1 kevin kevin 301K 2011-05-13 03:02 record.baseB -rw-r--r-- 1 kevin kevin 151G 2011-05-13 03:02 record.DB -rw-r--r-- 1 kevin kevin 14 2011-05-13 03:02 termlist.baseA -rw-r--r-- 1 kevin kevin 224K 2011-05-13 03:28 termlist.baseB -rw-r--r-- 1 kevin kevin 112G 2011-05-13 03:28 termlist.DB Thanks, Kevin Duraj http://myhealthcare.com
2020 Apr 07
0
crash after running notmuch new
On Tue, Apr 07, 2020 at 05:21:47PM -0300, David Bremner wrote: > Matt <mattator at gmail.com> writes: [...] > > termlist: > > blocksize=8K items=186136 firstunused=62058 revision=421 levels=2 root=12260 > > B-tree checked okay > > termlist table structure checked OK > > > > postlist: > > blocksize=8K items=2598971 firstunused=61412 revision=421 levels=2 root=49814 > > xapian-...
2011 Jun 10
2
Just starting to experiment with php
...tlist.baseB -rwxrwxrwx 1 jwl jwl 367476736 2011-06-09 02:28 postlist.DB -rwxrwxrwx 1 jwl jwl 734 2011-06-09 02:28 record.baseA -rwxrwxrwx 1 jwl jwl 725 2011-06-09 02:28 record.baseB -rwxrwxrwx 1 jwl jwl 46923776 2011-06-09 02:28 record.DB -rwxrwxrwx 1 jwl jwl 3530 2011-06-09 02:28 termlist.baseA -rwxrwxrwx 1 jwl jwl 3439 2011-06-09 02:28 termlist.baseB -rwxrwxrwx 1 jwl jwl 230023168 2011-06-09 02:28 termlist.DB -rwxrwxrwx 1 jwl jwl 581 2010-07-18 00:36 value.baseA -rwxrwxrwx 1 jwl jwl 576 2010-07-18 00:36 value.baseB -rwxrwxrwx 1 jwl jwl 36864000 2010-07-18 00:36 va...
2018 Apr 07
3
Database corruption after clean rebuild
...tal files in 58s (341 files/sec.). >    Added 19605 new messages to the database. > > $ xapian-check .mail/.notmuch/xapian/ >    docdata: >    blocksize=8K items=63 firstunused=1 revision=2 levels=0 root=0 >    B-tree checked okay >    docdata table structure checked OK >    termlist: >    blocksize=8K items=43520 firstunused=8293 revision=2 levels=2 root=748 >    xapian-check: DatabaseError: 1 unused block(s) missing from the free > list, first is 0 OK, so probably not related to reference loops (although that patch is not very well tested). It's not clear how n...
2012 Nov 21
1
about index speed of xapian
...20K 11-21 17:22 postlist.baseB -rw-rw-r-- 1 warren warren 1.4G 11-21 17:24 postlist.DB -rw-rw-r-- 1 warren warren 2.0K 11-21 17:24 record.baseA -rw-rw-r-- 1 warren warren 1.8K 11-21 17:22 record.baseB -rw-rw-r-- 1 warren warren 121M 11-21 17:24 record.DB -rw-rw-r-- 1 warren warren 6.7K 11-21 17:24 termlist.baseA -rw-rw-r-- 1 warren warren 6.1K 11-21 17:22 termlist.baseB -rw-rw-r-- 1 warren warren 428M 11-21 17:24 termlist.DB too big! Is there any problem with my code, and is there any way to improve index speed? Thank you
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2011 Jul 19
1
xapian-compact ok, xapian-check failure
...src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410 lastblock=117541 revision=1 levels=2 root=3493 B-tree checked okay document id 68511: length 32400 doesn't match 27468 in the termlist table postlist table errors found: 1 Using "delve -d -r68511" I can identify the source index which I then filter out so automated batch merging can continue. However, since this was the first time I encountered this particular error, I decided to check the source index (id'd wi...
2011 Apr 01
0
Xapian-discuss Digest, Vol 83, Issue 1
...-r--r-- 1 kevin kevin 70G 2011-03-31 00:49 postlist.DB > -rw-r--r-- 1 kevin kevin 14 2011-03-31 00:49 record.baseA > -rw-r--r-- 1 kevin kevin 261K 2011-03-31 01:24 record.baseB > -rw-r--r-- 1 kevin kevin 131G 2011-03-31 01:24 record.DB > -rw-r--r-- 1 kevin kevin 14 2011-03-31 01:24 termlist.baseA > -rw-r--r-- 1 kevin kevin 192K 2011-03-31 01:50 termlist.baseB > -rw-r--r-- 1 kevin kevin 96G 2011-03-31 01:50 termlist.DB > > $ delve . > number of documents = 219344757 > average document length = 28255.9 > document length lower bound = 1 > document length upper b...