Displaying 20 results from an estimated 200 matches similar to: "BUG IN XAPIAN_FLUSH_THRESHOLD"
2009 Jul 15
2
XAPIAN_FLUSH_THRESHOLD
I'm playing around with a machine that has 2 GB of memory.
I'm indexing about 5 GB of data, averaging 2 MB per document.
The documents are plain text.
I notice that omindex's memory footprint gets bigger and bigger, then the
machine starts to swap and it all slows down to a crawl.
Regarding export XAPIAN_FLUSH_THRESHOLD, I know the default is 10000.
Am I right in saying that for my setup
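The figures in this post can be checked with a little arithmetic. The sketch below assumes that XAPIAN_FLUSH_THRESHOLD counts documents between automatic flushes, and that 5 GB and 2 MB are binary units. The function names and the "overhead factor" heuristic are hypothetical illustrations, not part of Xapian; actual memory use per buffered document depends on the terms generated and is not simply the raw text size.

```cpp
#include <cstdint>

// Rough number of documents in a corpus, from total size and average
// document size (both in bytes). Returns 0 if avg_doc_bytes is 0.
std::uint64_t estimated_doc_count(std::uint64_t corpus_bytes,
                                  std::uint64_t avg_doc_bytes) {
    return avg_doc_bytes == 0 ? 0 : corpus_bytes / avg_doc_bytes;
}

// True when the corpus holds fewer documents than the flush threshold,
// so a count-based automatic flush would not trigger before the run ends.
bool threshold_never_reached(std::uint64_t doc_count,
                             std::uint64_t flush_threshold) {
    return doc_count < flush_threshold;
}

// HYPOTHETICAL heuristic, not a Xapian formula: pick a threshold so that
// (documents buffered) * (average size) * (overhead factor) stays within
// a chosen memory budget. overhead_factor is a guess the caller supplies.
std::uint64_t suggested_threshold(std::uint64_t memory_budget_bytes,
                                  std::uint64_t avg_doc_bytes,
                                  double overhead_factor) {
    if (avg_doc_bytes == 0 || overhead_factor <= 0.0) return 1;
    const double per_doc =
        static_cast<double>(avg_doc_bytes) * overhead_factor;
    const auto n = static_cast<std::uint64_t>(
        static_cast<double>(memory_budget_bytes) / per_doc);
    return n == 0 ? 1 : n;  // always flush at least once per document
}
```

With the numbers above, estimated_doc_count(5ULL << 30, 2ULL << 20) gives 2560 documents, which is below the default threshold of 10000. Under the stated assumption, the whole run would be buffered without a count-based flush, which is consistent with the growing memory footprint described. A smaller threshold would have to be chosen by experiment.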
2007 Feb 07
2
My new record: Indexing 20 millions docs = 79m9.378s
Gentoo Linux 2.6
8 AMD Opteron 64-bit Processors
32GB Memory
--------------------------------------------------------------------------------
Environment:
------------------
XAPIAN_FLUSH_THRESHOLD=21000000
XAPIAN_FLUSH_THRESHOLD_LENGTH=16000000
XAPIAN_PREFER_FLINT=True
Indexing 20 million documents:
--stemmer=none
-------------------------------------------
real 79m9.378s
user 77m28.696s
2007 Jun 17
2
Flint failed to deliver indexing performance to Quartz.
I am proposing to remove Flint as the default database and make Quartz
the default again. The catch is not that the Flint database is
smaller and faster during searches than the Quartz database, which is
what the developers were concerned with when measuring, while neglecting
to measure performance when creating large indexes.
The truth is that Flint
2007 Oct 01
3
How to beat Google aka Xapian & Natural Language Processing.
Xapians!
If tomorrow the Xapian search engine achieved the same performance
and search results as Google, we would not be able to beat Google,
because we would have created only a copy of the search that already
exists in Google's search engine. However, there is a way to beat
anyone, and there is a way to beat Google successfully as well; just do
not give up. Some see it as implementing Ajax, or
2007 Jul 24
1
Xapian::DocNotFoundError on replace_document? (Called from Search::Xapian)
Hello,
I'm using Xapian 1.0.2 (flint) and matching Search::Xapian.
I'm getting:
terminate called after throwing an instance of
'Xapian::DocNotFoundError', which dumps core.
At first it was after adding my 2nd document (to an empty db, although
I don't know if that has any bearing) to the database with a
replace_document() call.
I shifted the first document off the
2007 Oct 16
1
Xapian 1.0.3_svn9466 - OK!
After a couple of days of hacking on my Fedora 6 server, I was finally able
to install the new version of Xapian, 1.0.3_svn9466, from trunk.
Steps
-----------
1. Removed all old Xapian files and libraries from the entire server.
2. Installed Xapian 1.0.3_svn9466.
3. libxapian.so.15 used to be in the /usr/local/lib64/ directory; however,
this time the library was in the /usr/local/lib/ directory.
4. cp
2007 Jul 09
7
Xapian pubmeet
Hi all,
A few of us have been discussing whether we should have a Xapian social
gathering of some kind. The current idea is meeting up in a pub in
London some time in autumn for drinks and food. However all of this
really depends on who might be able to come! It would be a chance to
meet other Xapian enthusiasts in an informal social setting and talk
about all things search-related (and
2012 Dec 29
3
omindex killed
I'm finding that omindex is consistently ending prematurely when
indexing certain files. The last output looks like this:
[Entering directory /compounds/Acetic_acid]
Indexing "/MATLAB/compounds/Acetic_acid/AACID_50T.TXT" as text/plain ...
added.
Indexing "/MATLAB/compounds/Acetic_acid/AACID_50T.pdf" as
application/pdf ... "pdftotext -enc UTF-8
2004 Oct 08
1
indexing performance
I'm having some trouble with my indexer, which is built on simpleindex.cc. The
problem is that the indexing process becomes very slow after we have indexed
2000k docs (though the indexer works quite well for the first 2000k docs). It
took almost three weeks to index 8 million docs. However, we need to index about
20 million docs. I had to stop the indexer because of its performance.
I think my question is
2012 Nov 21
1
about index speed of xapian
Hi,
I use Xapian to index a text file whose size is 268M. I take each line as a document, and each line has two fields, like 13445511 | 111115151. The record count is 10000000. XAPIAN_FLUSH_THRESHOLD is set to 1000000. It takes 1026544ms to index the file, which is much slower than Lucene. Lucene's speed is about 40000 records per second.
code:
try
{
Xapian::WritableDatabase
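To put the figures from this post on a common scale, the sketch below converts the reported elapsed time into records per second and splits one line of the described "a | b" format into its two fields. The poster's own code is cut off above, so this is not a reconstruction of it; the function names are hypothetical, and nothing here calls Xapian.

```cpp
#include <string>

// Throughput in records per second from a record count and an elapsed
// time in milliseconds. Returns 0.0 if no time has elapsed.
double records_per_second(unsigned long long records,
                          unsigned long long elapsed_ms) {
    if (elapsed_ms == 0) return 0.0;
    return static_cast<double>(records) * 1000.0 /
           static_cast<double>(elapsed_ms);
}

// Strip leading and trailing whitespace from a string.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split a line such as "13445511 | 111115151" at the first '|' into two
// trimmed fields. Returns false (leaving the outputs untouched) if the
// line has no '|' separator.
bool split_record(const std::string& line,
                  std::string& left, std::string& right) {
    const std::string::size_type bar = line.find('|');
    if (bar == std::string::npos) return false;
    left = trim(line.substr(0, bar));
    right = trim(line.substr(bar + 1));
    return true;
}
```

Calling records_per_second(10000000, 1026544) gives a little over 9,700 records per second, roughly a quarter of the 40000 records per second quoted for Lucene.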
2010 Dec 18
1
Xapian index size 475GB = 170 million documents (URLs)
Xapians,
I am maintaining two indexes for my search engines, each of
approximately the same size. I would like to share this
knowledge with you, since many of you have never seen a Xapian index of
this size. And of course you can search the index yourself at
- http://myhealthcare.com/
- http://find1friend.com/
I need to add 2 x 100 million more documents to each index, and I hope it
will
2010 Aug 23
2
NetBeans and Java Bindings
Hello,
I was wondering if anyone has succeeded in getting the Java bindings to work
with NetBeans, in order to make use of NetBeans's GUI developer. I've had no
luck so far. Does anyone know how to do that?
Many thanks.
2007 Jun 05
7
Chinese, Japanese, Korean Tokenizer.
Hi,
I am looking for a Chinese, Japanese and Korean tokenizer that can
be used to tokenize terms for CJK languages. I am not very familiar
with these languages; however, I think that they can contain one
or more words in one symbol, which makes it more difficult to tokenize
text into searchable terms.
Lucene has a CJK Tokenizer ... and I am looking around to see if there is
some open source that we
2016 Jul 12
3
Xapian 1.4.0 released
On Mon, Jul 11, 2016 at 02:02:56PM -0700, Kevin Duraj wrote:
> You are saying that when I search for "delve Xapian 1.4" on Google, a
> company worth of 491 Billion of Dollars and you saying that their top
> of the search result has nothing to do with Xapian.
>
> https://www.google.com/search?q=xapian+delve&ie=utf-8&oe=utf-8#q=delve+xapian+1.4
Well, I'm not
2009 Sep 30
2
C++ parser for doc.get_data() result.
Xapians!
Has anybody written, and would like to share, a routine that parses the
result from doc.get_data() into key/value pairs in C++?
Code:
Xapian::Document doc = i.get_document();
string data = doc.get_data();
mymap = parse_result(data);
As you know, the data string contains all the data within the document,
delimited by "=" sign and "\n" new line and needs to be parse
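A minimal sketch of such a routine follows, assuming the layout described in the post: one field per line, with the key and value separated by "=". The name parse_result is taken from the snippet above, but its signature and the std::map return type are assumptions. The sketch uses only the standard library and does not call Xapian, so the string would come from doc.get_data() in real use.

```cpp
#include <map>
#include <sstream>
#include <string>

// Parse "key=value" lines into a map. Each line is split at the FIRST
// '=' so that values may themselves contain '='. Lines without '=' are
// skipped, and a repeated key keeps the value from its last occurrence.
std::map<std::string, std::string> parse_result(const std::string& data) {
    std::map<std::string, std::string> fields;
    std::istringstream in(data);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // not a key=value line
        fields[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return fields;
}
```

Splitting at the first "=" rather than the last is a design choice: it keeps values such as URLs with query strings intact, at the cost of not supporting keys that contain "=".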
2012 Nov 14
4
xapian-replicate errors
Hi,
While trying to set up xapian replication (initially for backup
purposes), I'm encountering some errors.
Our "fresh" index starts replication, and ends up with an index size
that matches the replication master (4.5GB), but then throws:
"Getting update for fresh from fresh
xapian-replicate: NetworkError: Unable to fully synchronise: Database
changing too fast"
I
2007 Apr 09
1
Re: [Xapian-commits] 8157: trunk/xapian-core/ trunk/xapian-core/backends/flint/ trunk/xapian-core/backends/quartz/
olly wrote:
> Log message (6 lines):
> backends/flint/flint_database.cc: Delete the corresponding entry
> (if any) from doclens in delete_document(). Add assertion to
> add_document_() that the corresponding entry in doclens isn't
> already set, but in a non-debug build overwrite any existing
> entry as that's more likely to be correct.
>
2012 Aug 31
1
too slow when create index
I am creating an index for some files. In my program, a document is a line
in a file, and I create an index entry for every line in a file. Is there
any method to speed this up?
2008 Aug 21
2
How to speed up indexing ?
I'm new to Xapian & need some help, many thanks if anyone replies.
I did a release build from xapian-core-1.0.7 with VS2008 by using
Charlie Hull's makefiles.
I'm trying to test-index my dataset -- some 200'000 docs, each
document being (on average) 50 bytes long and having 6 words.
I tried (a) not to use stemmer, (b) commit_transaction() on every
50/100/etc. docs, (c) not
2007 Oct 11
2
Xapian 1.0.3 installation issues.
I installed Xapian 1.0.3 and the search would not execute when run as the
Apache user. I could run the search fine inside ssh. I rolled Xapian
back to the previous version, 1.0.2, and the search still does not work,
even when I put back the old index made by Xapian 1.0.2
... my search engine is out of work ...
Kevin Duraj
http://myhealthcare.com