similar to: Compact databases and removing stale records at the same time

Displaying 20 results from an estimated 600 matches similar to: "Compact databases and removing stale records at the same time"

2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2019 Jan 31
4
Amount of writes during index creation
Olly Betts writes: > On Mon, Jan 21, 2019 at 03:25:01PM +0100, Jean-Francois Dockes wrote: > > I have had a problem report from a Recoll user about the amount of writes > > during index creation. > > > > https://opensourceprojects.eu/p/recoll1/tickets/67/ > > > > The issue is that the index is on SSD and that the amount of writes is > >
2006 Dec 06
1
Bug and patch for +terms with wildcards
In current Xapian SVN HEAD, there is a bug in the query parser concerned with the handling of wildcard terms with a "+" prefix. Specifically, a query such as "+foo* bar" will be parsed by the query parser into Xapian::Query("bar") if there are no terms in the database which start "foo". Instead, since the "+" term cannot be matched, I believe
2010 Jun 24
1
Quickest way to retrieve data for a large match set?
We're using the Perl binding to access Xapian in a simple search of image metadata (title and keywords). Due to the specification for the search engine, by default we have to sort the results using a function of the search rank, age (well, newness) and popularity (rated by sales of the image). As a result, we have to fetch the complete result set and then calculate a new ranking based on
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes: > On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote: > > The question which remains for me is if I should run xapian-compact > > after an initial indexing operation. I guess that this depends on the > > amount of expected updates and that there is no easy answer ? > > I think it's not obvious whether it's a good plan
2017 Dec 29
2
notmuch: Xapian exception during database creation
Running notmuch from git on Debian testing[1] with the mail and database sitting on a ZFS filesystem, adding mail to a new database: > agrajag-testing ~/s/notmuch % ./notmuch new > Found 605510 total files (that's not much mail). > add_file: A Xapian exception occurred36m 37s remaining). > A Xapian exception occurred adding message: Unexpected end of posting list for
2010 Jan 18
3
postlist: Tag containing meta information is corrupt.
Greetings, Using latest svn. I've noticed the following error when performing index merging: postlist: baseB blocksize=8K items=33962 lastblock=534 revision=1 levels=2 root=459 B-tree checked okay Tag containing meta information is corrupt. postlist table errors found: 1 I can still search on this index (I've only checked very small indexes), but merging is now a problem since I check
2017 Jul 31
2
Segmentation fault in matcher/queryoptimiser
For a couple of weeks we have been experiencing occasional segmentation faults within Xapian 1.5. We can't reproduce the crashes, but we have strong hints that they are due to memory corruption. We have narrowed down our root cause analysis to phrase searches on multi-databases that fail on reading the `hint` field in the `QueryOptimiser` class [1]. We'd appreciate any hints on how to fix this.
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory, indexing about 5 GB of data, an average of 2 MB per document. The documents are plain text. I notice that omindex's memory footprint gets bigger and bigger, then the machine starts to swap and it all slows down to a crawl. In regards to export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000. Am I right in saying that for my setup
2011 Jul 19
1
xapian-compact ok, xapian-check failure
Greets, I've encountered the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed: xapian-compact -b64k -m src1 src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410
2023 Aug 19
1
does Xapian::Enquire hold an MVCC revision?
Olly Betts <olly at survex.com> wrote: > On Fri, Aug 18, 2023 at 10:41:52AM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > While the match is running, get_mset(2000, 1000) needs to track > > > 3000 entries so this won't reduce your heap usage (at least not > > > peak usage). > > > > > > Is the heap
2010 Dec 18
1
Xapian index size 475GB = 170 million documents (URLs)
Xapians, I am maintaining two indexes for my search engines, each approximately the same size. I would like to share this knowledge with you, since many of you have never seen a Xapian index of this size. And of course you can search the indexes yourself at - http://myhealthcare.com/ - http://find1friend.com/ I need to add 2 x 100 million more documents to each index, and I hope it will
2016 Dec 13
2
Pull requests: CJK words and Snippet generator
On Tue, Oct 04, 2016 at 10:37:49AM +1100, Bron Gondwana wrote: > Robert is in Australia visiting the FastMail office to co-work with us for a > couple of months, and I'd love to get this Xapian integration work done > during this time. We're also looking to release Cyrus IMAPd version 3.0 some > time in the next few months, and it would be great to not depend on too many >
2017 Dec 15
2
Question about imap (expunge response)
(This is not necessarily about dovecot, but rather the IMAP protocol) At https://drive.google.com/open?id=1j3oa5jYeSdiPbgaihq02K-u_vHbZLJZQ is a fetchmail log from my session with the Polish email provider "Wirtualna Polska". As you can see, fetchmail logged "* 1 EXPUNGE" as a response to the "STORE" command. According to https://tools.ietf.org/html/rfc3501#section-7.4.1 EXPUNGE
2018 Jan 03
2
Storing the documents text: data record or value ?
Hi, Following the Recoll snippets generation performance problem caused by the new positions list storage scheme in Xapian 1.4, I am experimenting with generating snippets from the complete document text stored in the index. This increases the index size much less than I would have expected (around 10-15% apparently with my home directory data), which is good news obviously. I have tried
2018 Feb 27
1
modifying the DB while iterating is user error, right?
Hello, I noticed a problem with DatabaseCorruptError exceptions with public-inbox and I guess it's user error... The problem is public-inbox was calling replace_document to modify the DB while iterating through a PostingIterator. At first I thought it was a glass problem, but I've hit it with chert on my dataset, too. I have a standalone Perl script to reproduce the problem at
2019 Nov 14
4
JMAP: Re: http API for IMAP
On 14.11.19 at 14:03, Benny Pedersen via dovecot wrote: > Thomas Güttler via dovecot wrote on 2019-11-14 08:55: > >> Is there already an open source imap2jmap server? > > why do you say imap here ? > > https://www.cyrusimap.org/imap/developer/jmap.html > > cyrus already have it, we just wait for dovecot :) I used my favorite search engine (ecosia) and found
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes: > On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote: > > Some might notice the 50% index size increase. Excessive index size is > > already one relatively rare, but recurring complaint. Except if I did > > something wrong: I'm actually quite surprised by it. > > Did you try compacting the resulting databases? > >
2007 Apr 05
1
Re: [Xapian-commits] 8107: trunk/xapian-core/ trunk/xapian-core/backends/
olly wrote: > Log message (7 lines): > backends/database.cc: Database::Internal can't call the > PostingIterator(PostingIterator::Internal*) ctor (at least under > g++ 3.3.5) because it isn't a friend (only class Database is). For the record, Mark just reported this to me under windows so it was a problem there too, but it does work under GCC 4.1. No idea which compiler is
2019 Aug 26
2
Commit error with Xapian 1.4.11
A Recoll user gets the following message while indexing: "Attempted to delete or modify an entry in a non-existent posting list for #bannerholder" The exception happens during a commit call. Xapian version 1.4.11, Debian Buster A little more detail here: https://opensourceprojects.eu/p/recoll1/tickets/108/ I asked if this was reproducible, and to run the indexing in single-thread