similar to: Incremental updates and disk space ...

Displaying 20 results from an estimated 2000 matches similar to: "Incremental updates and disk space ..."

2010 Nov 01
1
floating-point issues with set_sort_by_relevance_then_value? (1.2.3, BM25 k1=0)
I am using BM25 with k1=0 and min_normlen=1 to get weights unaffected by document length and term frequency in the document (min_normlen=1 isn't necessary I guess) and am expecting single-term weights to be identical for all matches. I have added a document value to steer such general search queries and it works fine, except that for some search terms, I get results like:
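The expectation in this post, that k1=0 makes single-term weights independent of term frequency and document length, can be checked against the textbook BM25 formula. The sketch below is pure Python and only illustrative: the function name `bm25_term_weight`, its parameters, and the sample numbers are assumptions, and Xapian's own BM25Weight has further parameters and may differ in detail.

```python
import math

def bm25_term_weight(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 weight of one term in one document (not Xapian's exact code)."""
    if tf == 0:
        return 0.0
    # Inverse document frequency: depends only on collection statistics.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    # Document length normalisation factor.
    norm_len = 1.0 - b + b * doc_len / avg_doc_len
    # Term-frequency component; with k1 == 0 this reduces to tf / tf == 1.
    tf_part = tf * (k1 + 1.0) / (tf + k1 * norm_len)
    return idf * tf_part

# Hypothetical collection: 1000 documents, the term occurs in 50 of them.
short_doc = bm25_term_weight(tf=1, doc_len=20, avg_doc_len=100,
                             n_docs=1000, doc_freq=50, k1=0.0)
long_doc = bm25_term_weight(tf=7, doc_len=900, avg_doc_len=100,
                            n_docs=1000, doc_freq=50, k1=0.0)
print(short_doc, long_doc)
```

With k1=0 both calls reduce to the IDF alone, so every document matching the same single term gets the same weight in this model, and any secondary sort key such as a document value then decides the order.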
2010 Oct 28
1
hyphens in words + NEAR + 3 terms + AND_MAYBE => crash
Probably an uncaught malformed query - the following form of search queries causes a crash for me (core 1.2.3, Perl API, 64bit Debian Lenny, self-compiled): x-y NEAR test NEAR test The first term can be anything with a hyphen in it but word characters at the beginning and end ("3--3" will do). The other 2 terms can be anything. "test NEAR x-y NEAR test" will not cause a
2013 Feb 14
1
Go (golang) bindings for Xapian?
Hi, is anyone working on Xapian bindings for Go? SWIG supports Go since version 2.0 (http://www.swig.org/Doc2.0/Go.html), but there's some Go-specific code that needs to be written. Unfortunately, I have 0 experience both with SWIG and hacking on the Xapian bindings, so I probably cannot do this as a weekend project. It would come in very handy though. Regards, Marinos
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
This is my test case, what am I doing wrong? It seems that the API is used incorrectly, but I cannot find the problem... --- 8< --- #!/usr/bin/perl use Search::Xapian qw(:all); use strict; my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian", DB_CREATE_OR_OVERWRITE); my $indexer = Search::Xapian::TermGenerator->new();
2010 Aug 30
1
getdents() with 4KB buffer - seems slow (Maildir, large inbox)
Hi, I have a very large inbox (~146K mails) in Maildir format and dovecot seems to spend a lot of time rescanning the directory, especially when the server is loaded. I'm not sure whether this is triggered by Thunderbird or done regularly, but it takes longer when the server is loaded, so sometimes it seems that it is scanning continuously. Since it takes around 2000 getdents64() syscalls to
2015 Feb 03
2
Fwd: Waiting for Reply regarding "TestCases Failure"
---------- Forwarded message ---------- From: Saad Ahmed <ch.saad.ahmed at gmail.com> Date: 3 February 2015 at 21:10 Subject: Waiting for Reply regarding "TestCases Failure" To: Xapian Development <xapian-devel at lists.xapian.org> I have been waiting for reply regarding any further steps to take. Following are the outputs of commands that you asked me to run. All these
2017 Apr 03
3
errors on rebuild
On Sat, Mar 25, 2017 at 06:36:25PM -0500, Ryan Cross wrote: > After upgrades my stack is now: > > Python 2.7 > Django 1.8 > Haystack 2.6.0 > Xapian 1.4.3. (latest xapian haystack backend with some modifications) > > Using the same rebuild command as below but with --batch-size=50000 > > The issue has now become one of performance. I am indexing 2.2 million >
2010 Jan 14
1
Latest revision and backwards compatibility
Greetings, I've been wondering about the index format and backwards compatibility. We're using the dev version (for chert) and each svn up means that any indexes created prior to this revision cannot be read. Is this purely a cautious move to prevent errors, and, barring any obvious index format changes, can I safely force the current revision to read existing indexes? eg, by
2016 May 31
1
Need info on chert and flint
Hi, I am new to Xapian and read somewhere that Xapian used flint before version 1.1 and now uses chert. I am looking for the differences between flint and chert with respect to Xapian, and the advantages of chert. Thanks, Smriti
2013 Apr 26
1
remote backend
So, given what I've read in the documentation I would create a text file named document_database.txt that might have the following: remote 192.168.1.10:30000 chert /var/lib/xapian_database/segment1 remote 192.168.1.10:30000 chert /var/lib/xapian_database/segment2 remote 192.168.1.10:30000 chert /var/lib/xapian_database/segment3 etc. I would then in my PHP program open
2010 Oct 15
1
Chert backend
As the Chert backend is now the default Xapian backend for the 1.2 branch, I am thinking about switching from Flint to Chert. However, I am wondering how the two compare in performance and index size. My indexes are updated quite frequently, so this is my priority. Which is better in this case? Regards, PK
2017 Dec 08
2
xapian 1.4 performance issue
Olly Betts writes: > On Thu, Dec 07, 2017 at 10:29:09AM +0100, Jean-Francois Dockes wrote: > > Recoll builds snippets by partially reconstructing documents out of index > > contents. > > > [...] > > > > The specific operation which has become slow is opening many term position > > lists, each quite short. > > The difference will actually
2013 Jan 17
1
FASTER Search
I am suffering from slow search performance with Xapian. I am using Xapian to index about 150,000,000 documents, implemented in C++. Searching is not that fast: e.g. a query with about 20 terms needs 2 secs on average. For searching, I followed these steps: 1. construct a QueryParser for a given string 2. parse the query to get a Xapian::Query
2009 Jan 16
1
chert vs flint vs lucene
Hi, What's the main difference between chert and flint? What about vs Lucene? I am mainly asking about data structure (lexicon, posting list, document data), what's in memory, what's on disk, hash vs b-tree and reasons behind them. Any pointer is appreciated. Thanks! Crystal
2015 Feb 06
2
Fwd: Waiting for Reply regarding "TestCases Failure"
> Is that the complete output? Yes it is the complete output against "./runtest ./apitest --verbose topercent2" (after running make remove-cached-databases). I attached the snapshot of the output of commands but the size of the email got bigger than 40kb so I had to place the output as text. If I do not run "make remove-cached-databases" and run "./runtest ./apitest
2008 Apr 29
1
generic question ==>> mapping Longhurst biogeochemical ocean provinces in R
Hello all, I am a newbie to plotting maps in R. I am trying to plot over a world map a layer of Biogeochemical provinces (BGCP) by A.R. Longhurst. Unfortunately, each ocean region is quite irregular in shape (not a perfect square). In GIS this layer of ocean provinces would be a layer of polygons, which I am assuming cannot be plotted with R. I was wondering if anybody has encountered this
2019 Jul 09
2
Transitioning notmuch/Xapian from 32-bit to 64-bit system
Hi! Suppose you have a huge notmuch/Xapian database, built on a 32-bit system (well, actually on x86_64-pc-linux-gnu, but using a years old 32-bit notmuch binary; notmuch 0.9, Xapian 1.2.21 -- don't laugh), and suppose you're finally going to update that years old notmuch installation (release by release, forward-porting a bunch of patches). Naturally, I'd now do a native 64-bit
2010 Nov 15
4
Stopword addition and stemming
Hi, Two questions which I'm unsure about: Stemming: I've turned on stemming, etc, but how can I confirm that it's being used in searches? What should I look/search for? Stopwords: I'm trying out xapian on a regional dataset (searching data from a *.co.us TLD, eg) . I've noticed that searching for [bob co.us] results in *very* slow search times (tens of seconds), since it
2015 Sep 30
1
brass and chert / xapian port to Interix
Report by Eric Lindblad 30-09-2015 http://www.ericlindblad.blogspot.com The xapian-core-1.2.21 'ambiguous overload' error in the files /backends/brass/brass_check.cc and /backends/chert/chert_check.cc appears to result from a bug reported in gcc-3.3.4, which was fixed for 3.4.2 [Sept. 6, 2004] and 3.5. Bug 16854 - streams missing "long long" specializations on Tru64
2016 Apr 07
2
slowdown in notmuch perf suite with xapian 1.3.5
I hadn't noticed any interactive slowdown, but when I got around to running the notmuch performance suite, there seems to be some noticeable slowdown with the glass backend (default in Xapian 1.3.5) compared to chert (using Xapian 1.2.22). These tests are on an older i7 with 12G of RAM and an SSD. I'm reasonably confident they are CPU bound. One curious thing is the increase in system time