Displaying 20 results from an estimated 8000 matches similar to: "Is there a large variance in xapian searching?"
2014 Mar 01
2
Cleaning the index
Just curious: How does Xapian clean up postings/words from deleted documents? Does it just remove them whenever a posting node is COWed in the Btree? Or is there some kind of periodic reaper function?
Thanks!
Matt
2020 Nov 06
2
Process to Incorporate Functions from {parallely} into base R's {parallel} package
Hi all,
Henrik Bengtsson has done some fantastic work with {future} and, more importantly, greatly improved constructing and deconstructing a parallelized environment within R. It was with great joy that I saw Henrik slowly split off some functionality of {future} into {parallelly} package. Reading over the package's README, he states:
> The functions and features added to this package are
2020 Nov 07
2
Process to Incorporate Functions from {parallely} into base R's {parallel} package
FWIW, there are indeed a few low hanging bug fixes in 'parallelly'
that should be easy to incorporate into 'parallel' without adding
extra maintenance. For example, in parallel::makePSOCKcluster(), it
is not possible to disable SSH option '-l USER' so that it can be set
in ~/.ssh/config. The remote user name will be the user name of your
local machine and if you try to
2018 Sep 19
2
Couldn't detect type of database
I recently lost a hard drive and after successfully restoring
everything, I think, I'm getting "Error opening database `current.1':
DatabaseOpeningError: Couldn't detect type of database"
The directory current.1 contains the following files:
-rw-rw-r-- 1 jwl jwl 30064640 Aug 28 23:44 docdata.glass
-rw-rw-r-- 1 jwl jwl 151 Aug 28 23:44 iamglass
-rw-rw-r-- 1 jwl jwl
2016 Jul 06
2
Xapian 1.4.0 released
I have installed the new Xapian 1.4.0. During the installation I
haven't seen any problems; however, when I execute the commands quest and
delve I get different versions, and my Perl-based searches return
Exception: Couldn't detect type of database ... and what are these
glass things in the index directories? There is no new version of
Perl Search::Xapian.
$ quest -version
quest -
2016 Jun 25
2
Xapian 1.4.0 released
I'm delighted to announce the release of 1.4.0. You can download from:
http://xapian.org/download
This is a major milestone release, but the last development release (1.3.7)
was essentially a release candidate so the changes are fairly minor - the only
notable change is the update to Unicode 9.0.0.
That means a short thank you list for this release - thanks to Andy
Chilton!
As always, if
2017 Apr 03
3
errors on rebuild
On Sat, Mar 25, 2017 at 06:36:25PM -0500, Ryan Cross wrote:
> After upgrades my stack is now:
>
> Python 2.7
> Django 1.8
> Haystack 2.6.0
> Xapian 1.4.3. (latest xapian haystack backend with some modifications)
>
> Using the same rebuild command as below but with --batch-size=50000
>
> The issue has now become one of performance. I am indexing 2.2 million
>
2017 Mar 02
2
errors on rebuild
Hi Olly,
Thanks for the detailed response. I hadn’t realized there was a new xapian haystack backend. I’m going to try that but I have some upgrades to do first. Django 1.8, etc.
Thanks,
Ryan
> On Feb 28, 2017, at 3:40 PM, Olly Betts <olly at survex.com> wrote:
>
> On Mon, Feb 27, 2017 at 10:29:46AM -0800, Ryan Cross wrote:
>> I am trying to rebuild an index of 2+
2017 Feb 27
2
errors on rebuild
Hello,
I am trying to rebuild an index of 2+ million documents and have not been successful. I am running
Python 2.7
Django 1.7
Haystack 2.1.1
Xapian 1.2.21
The index rebuild command I’m using is: django-admin.py rebuild_index --noinput --batch-size=100000
The rebuild completes but an immediate xapian-check returns this error:
xapian-check ./archive_index
record:
baseB blocksize=8K
2019 Nov 21
2
How to make xapian run in hadoop
Hi all,
We use Xapian as the backend of our system. The amount of data to index keeps growing and local mode is hard to maintain, so we plan to move the index builder to Hadoop. While trying to make Xapian run on Hadoop we hit a problem: Xapian performs many seek operations when it writes the index files, but the seek() method in the Hadoop C API only supports reads, and we are blocked by
2015 Feb 03
2
Fwd: Waiting for Reply regarding "TestCases Failure"
---------- Forwarded message ----------
From: Saad Ahmed <ch.saad.ahmed at gmail.com>
Date: 3 February 2015 at 21:10
Subject: Waiting for Reply regarding "TestCases Failure"
To: Xapian Development <xapian-devel at lists.xapian.org>
I have been waiting for reply regarding any further steps to take.
Following are the outputs of commands that you asked me to run. All these
2018 Mar 30
2
sorting large msets
Hello, is there a way to optimize sorting by certain values
for queries which return a huge amount of results?
For example, I just want a simple query that gives me the 200
most recent emails out of millions. The elapsed time for
get_mset increases as the number of documents ($n * 2000)
increases.
I suppose I could store a pre-sorted set using SQLite or
similar. Thanks in advance for any
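A minimal sketch of the "pre-sorted set in SQLite" idea mentioned in this post, not the poster's actual setup. The table name `msg`, the column names, and the helper functions are hypothetical; the point is only that an index on the timestamp lets SQLite return the newest 200 rows without sorting the whole set.

```python
import sqlite3


def build_recent_index(rows):
    """Store (docid, timestamp) pairs in an in-memory SQLite table.

    Table and column names are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE msg (docid INTEGER PRIMARY KEY, ts INTEGER NOT NULL)"
    )
    # The index keeps rows ordered by timestamp, so a LIMIT query can
    # stop after reading the first few index entries.
    conn.execute("CREATE INDEX msg_ts ON msg (ts DESC)")
    conn.executemany("INSERT INTO msg (docid, ts) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def most_recent(conn, limit=200):
    """Return the docids of the `limit` newest messages, newest first."""
    cur = conn.execute(
        "SELECT docid FROM msg ORDER BY ts DESC, docid DESC LIMIT ?",
        (limit,),
    )
    return [row[0] for row in cur]


# Synthetic data: 10,000 messages whose timestamps grow with the docid.
conn = build_recent_index(
    (docid, 1_500_000_000 + docid) for docid in range(1, 10_001)
)
recent = most_recent(conn)
```

The returned docids could then be used to fetch the matching documents from the search index; how that lookup is done is outside this sketch.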
2020 Aug 21
2
MultiDatabase shard count limitations
Going back to the "prioritizing aggregated DBs" thread from
February 2020, I've got 390 Xapian shards for 130 public inboxes
I want to search against(*). There's more on the horizon (we're
expecting tens of thousands of public inboxes).
After bumping RLIMIT_NOFILE and running ->add_database a bunch,
the actual queries seem to be taking ~30s (not good :x).
Now I'm
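For readers unfamiliar with the RLIMIT_NOFILE step mentioned in this post, here is a hedged sketch of raising the per-process open-file limit from Python's standard `resource` module (Unix only). The helper name `raise_nofile_limit` and the value 4096 are assumptions for illustration; the post does not say which language or limit was used.

```python
import resource


def raise_nofile_limit(wanted):
    """Raise the soft RLIMIT_NOFILE towards `wanted`, capped by the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        # Already unlimited; nothing to raise.
        return soft
    # An unprivileged process may not exceed the hard limit.
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


# 4096 is an arbitrary example value, not a recommendation.
new_soft = raise_nofile_limit(4096)
```

Each open shard holds file descriptors, so the limit has to be raised before the databases are added.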
2016 Jul 12
3
Xapian 1.4.0 released
On Mon, Jul 11, 2016 at 02:02:56PM -0700, Kevin Duraj wrote:
> You are saying that when I search for "delve Xapian 1.4" on Google, a
> company worth of 491 Billion of Dollars and you saying that their top
> of the search result has nothing to do with Xapian.
>
> https://www.google.com/search?q=xapian+delve&ie=utf-8&oe=utf-8#q=delve+xapian+1.4
Well, I'm not
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote:
> > Some might notice the 50% index size increase. Excessive index size is
> > already one relatively rare, but recurring complaint. Except if I did
> > something wrong: I'm actually quite surprised by it.
>
> Did you try compacting the resulting databases?
>
>
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote:
> > The question which remains for me is if I should run xapian-compact
> > after an initial indexing operation. I guess that this depends on the
> > amount of expected updates and that there is no easy answer ?
>
> I think it's not obvious whether it's a good plan
2011 Jan 24
2
Memory leak
Hello,
There is a memory leak in Xapian 1.2.4.
We use a persistent connection in FastCGI processes. As soon
as we catch this exception, "dmalloc" recognizes memory leaks:
The revision being read has been discarded - you should
call Xapian::Database::reopen() and retry the operation
Below is the output of "dmalloc".
This happens only on the production system. On my
2011 Aug 09
3
what is the fastest way to fetch results which are sorted by timestamp ?
what is the fastest way to fetch results which are sorted by timestamp ?
I want to use Xapian as my search engine, calling add_boolean_term(something) and add_value(0,sortable_serialise(get_timestamp())) on each doc.
I search with enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0,True) to ensure that the results are sorted by the timestamp.
This method is OK, but
2020 Aug 27
4
Xapian on Android?
Friends,
I would like to hear from anyone who has experience deploying Xapian on Android. I'm new to Xapian, but I know it is used by a couple partners for offline projects on Linux and Windows.
Our small nonprofit, WiderNet, provides off-line access to thousands of Web sites for people who lack Internet connectivity (www.widernet.org). Over 2,000 universities, schools, health care sites,
2020 Oct 21
2
xapian-check sorted order error
Hi,
We were running xapian-check on one of our Xapian indexes and it
returns the following error:
position:
baseB blocksize=8K items=809896869 lastblock=2090419 revision=3161
levels=3 root=2084903
Failed to check B-tree: DatabaseError: Items not in sorted order
The other tables verify without issue. It looks like our oldest backup
of this database (a month old) has the same issue. Searching and