similar to: Interesting xapian-compact observations

Displaying 20 results from an estimated 10000 matches similar to: "Interesting xapian-compact observations"

2010 Feb 02
1
Optimal usage of xapian-compact for merging
Greets, I've been wondering, what's the sane/optimal use of xapian-compact when merging many indexes with a view to maximum merging performance? The obvious: - only use -F on the final db. - use -m since I'm merging more than 3 dbs. Best strategy? a) loop: merge batches (of say 50, where the individual dbs are small) into a temp index, then merge the (larger) temp into the
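The batching idea in this post can be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual script: the function name `plan_batched_merge`, the `tmp_merge_` prefix and the two-level layout (all batches into temp indexes, then one final pass) are hypothetical. It only builds the command lines, using the `-m` and `-F` options mentioned above, and does not run anything.

```python
from typing import List


def plan_batched_merge(sources: List[str], dest: str,
                       batch_size: int = 50,
                       tmp_prefix: str = "tmp_merge_") -> List[List[str]]:
    """Build xapian-compact command lines for a two-level batched merge.

    Hypothetical helper: each batch of sources is merged into a temp
    index with -m, and -F is used only on the final pass, as the post
    suggests. Nothing is executed here.
    """
    if not sources:
        raise ValueError("need at least one source index")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")

    # Few enough sources: a single pass straight into the destination.
    if len(sources) <= batch_size:
        return [["xapian-compact", "-m", "-F", *sources, dest]]

    commands = []
    temps = []
    for start in range(0, len(sources), batch_size):
        batch = sources[start:start + batch_size]
        tmp = f"{tmp_prefix}{start // batch_size}"  # assumed temp naming
        commands.append(["xapian-compact", "-m", *batch, tmp])
        temps.append(tmp)

    # Final pass: merge all temp indexes, and only here ask for -F.
    commands.append(["xapian-compact", "-m", "-F", *temps, dest])
    return commands


if __name__ == "__main__":
    for cmd in plan_batched_merge([f"db{i}" for i in range(120)], "final"):
        print(" ".join(cmd[:4]), "...", cmd[-1])
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. Whether this layout is actually faster than a single large merge is exactly the open question of the thread.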
2007 Jan 12
1
Re: [Xapian-commits] 7603: trunk/xapian-core/ trunk/xapian-core/backends/flint/ trunk/xapian-core/backends/quartz/
On Tue, Jan 02, 2007 at 03:55:59PM +0000, richard wrote: > * backends/quartz/btree.cc,backends/flint/flint_io.h: Patches from > Charlie Hull to allow 2GB+ index files work when compiled using > Visual C++. I suspect that xapian-compact.cc (and quartzcompact.cc if you can be bothered) will also need fixing since they use off_t. You need to make sure that the stat() function called
2011 Jul 19
1
xapian-compact ok, xapian-check failure
Greets, I've encountered the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed: xapian-compact -b64k -m src1 src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410
2013 Jun 19
2
Compact databases and removing stale records at the same time
I'm trying to compact (or at least merge) multiple databases, while stripping search records which are no longer required. Backstory: I've inherited the Cyrus IMAPd xapian-based search code from Greg Banks when he left Opera. One of the unfinished parts was removing expunged emails from the search database. We moved from having a single search database to supporting multiple
2018 Jul 12
1
Error while compacting: Bad position key
Mike Hommey <mh at glandium.org> writes: > Hi, > > When running `notmuch compact` today, it stopped with the following > output: > > Compacting database... > compacting table postlist > Reduced by 25% 648656K (2498904K -> 1850248K) > compacting table docdata > Reduced by 15% 24K (152K -> 128K) > compacting table termlist > Reduced by
2006 May 09
1
xapian-compact fails to rename iamflint.tmp for win32/cygwin builds
Hello, Any call to xapian-compact compiled for Win32/Cygwin ultimately fails when it tries to rename/replace iamflint.tmp with the final version. The reason is you cannot rename an opened file with Win32/Cygwin libc. To correct this, just close the output ofstream (xapian-compact.cc(534)) like: """ ofstream output(dest.c_str()); if (!output.write(buf, input.gcount())) { cerr
2006 Dec 22
1
xapian + win XP + 2GB
hi, I have a problem with Xapian on Win XP. I compiled Xapian with VC 7.0 using win32.mak from Lemurconsulting. I want to use files 2GB+ but unfortunately this does not work. I know that configure in the Linux version changes something to allow using large files. Could anyone help me? regards, Grzegorz
2006 Jun 13
1
xapian-compact seg faulting & Re: Error msg xapian-compact: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation
I am fairly confident that these issues are related to killing the scriptindex process ungracefully, causing blocks that were queued for writing to disk to not get written. I mentioned sending you the file because you might see almost immediately what the situation is. Thanks > ----- Original Message ----- > From: oscaruser@programmer.net > To:
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2011 Jul 13
1
Feature request: Determining source index of xapian-compact DatabaseError exception
Greets, When merging lots of subindexes in batches like so: xapian-compact -m idx1 idx2... dstidx Errors such as: xapian-compact: DatabaseError: Error reading block 0: got end of file present a problem since it does not provide the offending path name (of the broken index) for easy identification/removal in automated/batch scenarios (the way DatabaseOpeningError:.... does, eg). The only way
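Until the error message names the offending path, one workaround for batch scripts is to check each source index on its own before merging. The sketch below assumes that `xapian-check` exits with a non-zero status when it finds a problem. The function names are hypothetical, and the checker is injectable so the logic can be exercised without Xapian installed.

```python
import subprocess
from typing import Callable, Iterable, List


def check_with_xapian_check(path: str) -> bool:
    """Return True if `xapian-check PATH` exits with status 0.

    Assumption: a non-zero exit status means the index is damaged.
    """
    result = subprocess.run(
        ["xapian-check", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_broken_indexes(
    paths: Iterable[str],
    check: Callable[[str], bool] = check_with_xapian_check,
) -> List[str]:
    """Return the paths whose check fails, preserving input order."""
    return [path for path in paths if not check(path)]


if __name__ == "__main__":
    # Stand-in checker for demonstration; a real run would use the default.
    def fake_check(path: str) -> bool:
        return path != "idx2"

    print(find_broken_indexes(["idx1", "idx2", "idx3"], check=fake_check))
```

The broken paths can then be dropped from the `xapian-compact -m idx1 idx2... dstidx` command line, at the cost of an extra read of every subindex.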
2016 Apr 10
2
Xapian 1.3.5 snapshot performance and index size
Hi, I ran some tests with Recoll to compare Xapian 1.2.22 and 1.3.5 performance. I mostly used two relatively small document sets (realistic/typical recoll data subsets). The first set is a 2.2 GB mbox folder, with approximately 56K messages in 275 files, producing approximately 64K documents (because of attachments). The second set is an 11 GB folder with 5300 PDF files in it (random PDFs
2007 Jan 12
1
xapian error
Just got this error when replacing (updating) a document in the xapian index (using php bindings): Fatal error: Uncaught exception 'Exception' with message 'DatabaseError: Error reading block 16908825: got end of file' Does anyone know what this means exactly? Alec
2010 Mar 29
0
Optimal usage of xapian-compact for merging
On Tue, March 23, 2010 19:46, Kevin Duraj wrote: > I am merging 300 indexes at once, it takes less than a day for merge > to happen for 100 million documents, during merging I notice very heavy IO. That IO sounds pretty normal. To help with IO load, we have a dedicated index store cluster, dedicated source data cluster, dedicated indexing cluster, etc. Sigh. Each time I think we have
2020 Oct 21
2
xapian-check sorted order error
Hi, We were running xapian-check on one of our Xapian indexes and it returns the following error: position: baseB blocksize=8K items=809896869 lastblock=2090419 revision=3161 levels=3 root=2084903 Failed to check B-tree: DatabaseError: Items not in sorted order The other tables verify without issue. It looks like our oldest backup of this database (a month old) has the same issue. Searching and
2007 Jan 30
1
Re: [Xapian-commits] 7603: trunk/xapian-core/trunk/xapian-core/backends/flint/ trunk/xapian-core/backends/quartz/
hi, I'm using Xapian on Windows with large files. My index directory is about 65 607 466 693 bytes: 2007-01-30 09:28 17 position_baseA 2007-01-30 10:06 17 position_baseB 2007-01-23 14:18 0 position_DB 2007-01-30 10:06 360 496 postlist_baseB 2007-01-30 10:06 23 623 852 032 postlist_DB 2007-01-30 10:06 88 432
2020 Aug 27
4
Xapian on Android?
Friends, I would like to hear from anyone who has experience deploying Xapian on Android. I'm new to Xapian, but I know it is used by a couple partners for offline projects on Linux and Windows. Our small nonprofit, WiderNet, provides off-line access to thousands of Web sites for people who lack Internet connectivity (www.widernet.org). Over 2,000 universities, schools, health care sites,
2010 Jan 14
1
Latest revision and backwards compatibility
Greetings, I've been wondering about the index format and backwards compatibility. We're using the dev version (for chert) and each svn up means that any indexes created prior to this revision cannot be read. Is this purely a cautious move to prevent errors, and, barring any obvious index format changes, can I safely force the current revision to read existing indexes? eg, by
2013 Apr 25
2
Converting MySQL database to Xapian
I am looking for some guidance on converting a large MySQL database to Xapian. The current structure is that the database is broken up into 160 "sub-databases". There are 50,000 or so records in each stub database. Each record has content that I am full-text indexing. The average size of the text is about 59k characters. The database is broken up into sub-databases because the MySQL
2018 Jul 10
2
Xapian 1.4.5 "Db block overwritten - are there multiple writers?" with Glass
On Mon, Jul 09, 2018 at 10:29:18AM +0100, Olly Betts wrote: > The attached patch reset this cursor each time commit() is called, and > that fixes my C++ reproducer, though I think this ought to work as-is > and the real bug is at a lower level. I've dug deeper and that was indeed the case. Here's a patch which addresses the root cause:
2011 Aug 09
3
What is the fastest way to fetch results which are sorted by timestamp?
What is the fastest way to fetch results which are sorted by timestamp? I want to use Xapian as my search engine: I use add_boolean_term(something) and add_value(0, sortable_serialise(get_timestamp())) on a doc, then search with enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0, True) to ensure that the results are sorted by the timestamp. This method is ok, but