Displaying 20 results from an estimated 10000 matches similar to: "Xapian vs Lucene"
2009 Jan 16
1
chert vs flint vs lucene
Hi,
What's the main difference between chert and flint? How do they compare with Lucene?
I am mainly asking about data structure (lexicon, posting list, document
data), what's in memory, what's on disk, hash vs b-tree and reasons behind
them.
Any pointer is appreciated.
Thanks!
Crystal
2011 Feb 14
1
Idea: Backend for Lucene format indexes
Hi,
I'm interested in implementing the idea of using search indexes built by
Lucene. I have some idea of how to do that. I'm currently studying the index
formats of both. I have written a search application using Lucene and now
I'll do the same using Xapian. Then I'll check the details of those index
formats and work out how to convert a Lucene index to a Xapian one. Is there
somebody
2005 Nov 07
1
Re Phrase Tuning.
Thanks Olly,
I have given flint a go - and it is generally much quicker (once it has
loaded the cache - that process still takes minutes). Flint actually seems
to do some caching whereas quartz just seemed to hit the disc constantly.
Generally I've been running it on a machine with 5G of memory - but it has
to contend with other processes for resources, I have also run it on its
own on a 1G
2007 Jun 17
2
Flint failed to deliver indexing performance to Quartz.
I am proposing to remove Flint as the default database and make Quartz
the default again. The problem is not searching: the Flint database is
smaller and faster to search than the Quartz database, which is what the
developers were concerned with when measuring. What they neglected to
measure was performance when creating large indexes.
The truth is that Flint
2007 Feb 07
2
My new record: Indexing 20 million docs = 79m9.378s
Gentoo Linux 2.6
8 AMD Opteron 64-bit Processors
32GB Memory
--------------------------------------------------------------------------------
Environment:
------------------
XAPIAN_FLUSH_THRESHOLD=21000000
XAPIAN_FLUSH_THRESHOLD_LENGTH=16000000
XAPIAN_PREFER_FLINT=True
Indexing 20 million documents:
--stemmer=none
-------------------------------------------
real 79m9.378s
user 77m28.696s
2013 Jun 17
2
Backend for Lucene format indexes-How to get doclength
*Or do you mean that it's one number per document whereas the other stats
are per database, so it's harder to store it?*
Yes, I mean this. It's a huge amount of data. If I add a new doclength list
myself (containing all the document lengths in one list, as chert does),
I am concerned about:
1. This doclength list may be the bottleneck in this backend,
http://trac.xapian.org/ticket/326
2. Change too much
2013 Oct 30
2
Lucene 3.6.2 backend for xapian (#25)
[Replying to xapian-devel, as I think a wider audience would be useful]
On Mon, Oct 21, 2013 at 11:24:51PM +0800, jiangwen jiang wrote:
> yes, it's less efficient. A Lucene database has multiple segments, and each
> segment can be treated as an independent database. The same term may exist in >=
> 1 segments.
Sorry for taking a while to respond - I've been both busy and mulling
this
2013 Jun 16
3
Backend for Lucene format indexes-How to get doclength
Hi, all:
I have written a demo patch for a backend for Lucene format indexes. The Lucene
version is 3.6.2.
http://lucene.apache.org/core/3_6_2/fileformats.html
For now, this demo patch only supports the basic features of Lucene. Compound
files (.cfs/.cfe), term vectors (.tvx/.tvd/.tvf) and
deleted documents (.del) are not supported; the skip list in .fdx is not supported
either.
example/quest.cc is used to test this demo.
2013 Sep 02
2
Backend for Lucene format indexes-How to get doclength
On Mon, Sep 02, 2013 at 09:21:48AM +0800, jiangwen jiang wrote:
> TfIdfWeight and BM25(b=0) also need wdf_upper_bound, which does not exist in
> Lucene backends.
If you don't provide an implementation of wdf_upper_bound(), the default
is to use the collection frequency of the term, so provided that
information is available in the lucene files, the lack of
wdf_upper_bound information
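
The fallback described in the reply above is safe because a term's collection frequency is the sum of its wdf over all documents, so no single document's wdf can exceed it. A minimal sketch of that reasoning in plain Python follows; the posting data and the function name are hypothetical, and nothing here is a Xapian or Lucene API:

```python
# Hypothetical postings for one term: docid -> wdf (within-document frequency).
postings = {1: 1, 2: 2, 3: 3}


def collection_frequency(term_postings):
    """Total number of occurrences of the term across the whole collection."""
    return sum(term_postings.values())


# Collection frequency is a valid (if loose) upper bound on any single wdf.
cf = collection_frequency(postings)
tightest_bound = max(postings.values())

assert all(wdf <= cf for wdf in postings.values())
```

The bound is loose: here the tightest possible value is 3 while the collection frequency is 6, so a weighting scheme still works but may prune less aggressively.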
2013 Aug 25
2
Backend for Lucene format indexes-How to get doclength
On Tue, Aug 20, 2013 at 07:28:42PM +0800, jiangwen jiang wrote:
> I think norm(t, d) in Lucene can be used to calculate a number which is
> similar to doc length (see norm(t,d) in
> http://lucene.apache.org/core/3_5_0/api/all/org/apache/lucene/search/Similarity.html#formula_norm).
It sounds similar (especially if document and field boosts aren't in use),
though some places may rely on
2020 May 19
5
FTS-lucene errors : language not available for stemming
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs.
Errors:
May 19 05:05:16 indexer-worker(gessel at blackrosetech.com)<62971><aPAEI3zLw17A/QAA0J78UA:EF25M3zLw1779QAA0J78UA>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 19 05:05:16
2013 Aug 26
2
Backend for Lucene format indexes-How to get doclength
On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote:
> > For now, using weighting schemes which don't use document length is
> > probably the simplest answer.
>
> There's a tf-idf weighting scheme on svn master, is it suitable for the Lucene
> backend?
Yes - TfIdfWeight doesn't ever use the document length (at least with
the normalisations currently
2006 Aug 07
3
Omega is fast, but not THAT fast
>Search took -125.376129 seconds
I double checked with a handheld stopwatch, and at no
point did the hands spin backwards.
Known problem?
2004 Oct 28
1
Lucene ranking
Kevin Burton has posted about poor ranking in Lucene preferring
shorter documents over longer ones[1]. A similar search in Xapian
returns documents in the expected order:
Performing query `Xapian::Query(foo)'
3 results found
ID 3 99% [foo foo foo]
ID 2 94% [foo foo]
ID 1 80% [foo]
Anyone know what Lucene is doing here? Their FAQ doesn't mention what
weighting scheme they use, and I
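
To illustrate why a weighting scheme can rank the longer document first in the example above, here is a minimal sketch using a BM25-style term-frequency component. The formula choice and the parameter values (k1=1, b=0.5) are assumptions made for illustration, not something stated in the post, and the code uses plain Python rather than any Xapian or Lucene API:

```python
def bm25_like_score(wdf, doc_len, avg_len, k1=1.0, b=0.5):
    """Term-frequency part of a BM25-style weight.

    The IDF factor is omitted: with a single query term it is the same
    for every document, so it does not affect the ordering.
    """
    length_norm = k1 * ((1 - b) + b * doc_len / avg_len)
    return (k1 + 1) * wdf / (length_norm + wdf)


# The three documents from the example, keyed by document ID.
docs = {1: "foo", 2: "foo foo", 3: "foo foo foo"}

lengths = {doc_id: len(text.split()) for doc_id, text in docs.items()}
avg_len = sum(lengths.values()) / len(lengths)

scores = {
    doc_id: bm25_like_score(text.split().count("foo"), lengths[doc_id], avg_len)
    for doc_id, text in docs.items()
}

# Highest score first.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions the extra occurrences of the term outweigh the length penalty, so document 3 ranks above document 2, which ranks above document 1. A scheme that divides by document length more aggressively could reverse that order.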
2007 Jan 12
1
xapian error
Just got this error when replacing (updating) a document in the xapian
index (using php bindings):
Fatal error: Uncaught exception 'Exception' with message 'DatabaseError:
Error reading block 16908825: got end of file'
Does anyone know what this means exactly?
Alec
2007 Jul 17
1
BUG IN XAPIAN_FLUSH_THRESHOLD
There is a bug when setting XAPIAN_FLUSH_THRESHOLD=20000000
When trying to force Xapian to flush after 20 million
documents, Xapian ignores the setting and flushes after only 10,000
documents.
Data captured from delve at 60-second intervals with the following settings:
XAPIAN_FLUSH_THRESHOLD=20000000
perl -e ' while(1) { system("delve ."); sleep(60); } '
2019 Oct 22
5
It was twenty years ago today...
Xapian has turned 20!
Strictly speaking it was 20 years ago last month but I managed to miss
the true anniversary - the oldest commit in the Xapian repo is:
commit 8ced76ea128c8fb2792477e09b41fa989f2e572f
Author: Richard Boulton <richard at tartarus.org>
Date: Fri Sep 10 09:50:40 1999 +0000
Martins initial code, which didn't work for him but did for me.
Back then Richard,
2007 Jul 24
1
Xapian::DocNotFoundError on replace_document? (Called from Search::Xapian)
Hello,
I'm using Xapian 1.0.2 (flint) and matching Search::Xapian.
I'm getting:
terminate called after throwing an instance of
'Xapian::DocNotFoundError', which dumps core.
At first it was after adding my 2nd document (to an empty db, although
I don't know if that has any bearing) to the database with a
replace_document() call.
I shifted the first document off the
2016 Mar 26
3
PHP bindings fail on Ubuntu for xapian-bindings-1.2.21
On 26/03/16 18:04, Olly Betts wrote:
> On Sat, Mar 26, 2016 at 04:51:35PM -0500, Yannick Warnier wrote:
>> On an Ubuntu 15.10, following the docs at
>> https://trac.xapian.org/wiki/FAQ/PHP%20Bindings%20Package
>>
>> When
>>
>> running debuild -e PHP_VERSIONS=5 -us -uc
>>
>> I get (sorry for the French):
>>
>>
>> "
2016 May 31
1
Need info on chert and flint
Hi,
I am new to Xapian and read somewhere that Xapian was using flint before
the 1.1 release and is now using chert.
I am looking for the differences between flint and chert in
Xapian, and the advantages of chert.
thanks
Smriti