similar to: Large database problem

Displaying 20 results from an estimated 20000 matches similar to: "Large database problem"

2018 Jul 02
2
Is there a large variance in xapian searching?
Dear Xapian developers, I have been using Xapian to index more than 13 million Q&A documents (similar to Quora). I will share some performance data on indexing and searching, and I would like some help improving search performance. My computer has 8 i7 CPUs at 3.4 GHz and 16 GB of memory, running Ubuntu 16.04. The dataset includes about 13M documents; each document is cut into 35 terms (Chinese
2008 Dec 03
1
Compiling latest svn revision
Greetings, Before I head off to bed I thought I'd fire off this email wrt compiling the latest svn revision. I finally resolved all the dependencies, ran bootstrap/configure, but make eventually fails with: /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crt1.o: In function `_start': (.text+0x18): undefined reference to `main' Xapian.o: In function `boot_Search__Xapian':
2012 Apr 19
1
Xapian::Database->close() for perl missing
I have a xapian-daemon, which can be queried via http. A background process generates one new index every hour, then removes the old symlink and creates a new one pointing to the current database. /path/to/index/20120419010000 /path/to/index/20120419020000 /path/to/index/20120419030000 /path/to/index/default => /path/to/index/20120419030000 So the daemon only checks the mtime of /path/to/index/default/iamchert
2011 Jun 10
2
Just starting to experiment with php
I took one of the examples and tried to run against my database ls -l /data1/mail/db/cur.1 total 1129624 -rw-r--r-- 1 jwl jwl 0 2011-06-09 02:27 flintlock -rw-r--r-- 1 jwl jwl 28 2011-06-09 02:27 iamchert -rwxrwxrwx 1 jwl jwl 7258 2011-06-09 02:27 position.baseA -rwxrwxrwx 1 jwl jwl 7046 2011-06-09 02:27 position.baseB -rwxrwxrwx 1 jwl jwl 474226688 2011-06-09 02:28
2008 Dec 02
1
NFSv4 and locking
Greetings, We use NFSv4 on our cluster and perform distributed indexing (well, we used to on our previous system which used a simple touch() locking mechanism). I'm having a spot of bother getting Xapian to obtain a lock (hangs on fcntl64()). I've read http://trac.xapian.org/wiki/XapianOverNFS and other list posts, and noted that a lock daemon should be running to allow locks
2008 Nov 21
1
Multiple databases vs Single large database
Hi I've decided to use xapian because my files table in my mysql database is going to grow very large, and it seems mysql isn't good at full text searching. I'm doing this with the php wrapper by the way. The way my system is set out, each user has their own set of files, and when doing a search it is going to be for a specific user's file (based on file name, title,
2010 Jan 18
3
postlist: Tag containing meta information is corrupt.
Greetings, Using latest svn. I've noticed the following error when performing index merging: postlist: baseB blocksize=8K items=33962 lastblock=534 revision=1 levels=2 root=459 B-tree checked okay Tag containing meta information is corrupt. postlist table errors found: 1 I can still search on this index (I've only checked very small indexes), but merging is now a problem since I check
2010 Jan 20
2
Error when creating trac bug ticket
Greets Just tried to create a bug ticket on trac.xapian.org and it croaked with the error: ----------- Trac detected an internal error: IntegrityError: columns ticket, name are not unique The action that triggered the error was: POST: /newticket ----------- Clicking on the Create button to report the error results in an invalid URL. What's the best way to proceed to report my bug? Thanks
2011 Jun 20
1
Revision: 15699: $tg->index_text ($text, $weight) fails with "No matching function for overloaded 'TermGenerator_index_text'"
Hi, I've been out of touch recently, so perhaps I've missed something (the last time I checked the svn pulse the Perl code was under search-xapian/ - looks like things have moved to swig). The latest trunk (revision 15699) has a problem with Perl: $tg->index_text ($text, $weight); It fails with "No matching function for overloaded 'TermGenerator_index_text'..." I
2011 Jul 19
1
xapian-compact ok, xapian-check failure
Greets, I've encountered the following while performing test merges (and writing code to handle errors, etc so things can be automated) and wondering about the best way to proceed: xapian-compact -b64k -m src1 src2.... tmp_dst -- works as expected, exit code 0. xapian-check tmp_dst -- produces the following error for the postlist: postlist: baseB blocksize=64K items=28175410
2011 Jun 21
3
Error after upgrading to latest xapian distro
I upgraded to the latest xapian version and I have started getting xapian.InvalidArgumentError: Term too long (> 245): XTEXT... This issue was not there in 1.0.16 but it is in the latest version. Any solutions? Thanks
2008 Dec 06
1
Obtaining actual match count if using set_collapse_key()
Greets, Is it possible to obtain the actual match count if you're using set_collapse_key()? ie, the total count *before* the collapsing occurs (without using get_mset()). Alternatively, will MSet::get_matches_estimated() return the true - pre-collapse - count, or will it also be affected by collapsing? Thanks Henry
2008 Nov 26
1
Trying to patch xapian perl add/remove_spelling
Greets, I'm giving a stab at patching the CPAN module to add the missing WritableDatabase::add_spelling and remove_spelling, but need a bit of guidance since I'm coming in cold, and pressed for time (aren't we all). I've modified XS/WritableDatabase.xs and added the two necessary functions, and also added the two basic tests in t/index.t. Compilation completes cleanly, but
2010 Feb 02
1
Optimal usage of xapian-compact for merging
Greets, I've been wondering, what's the sane/optimal use of xapian-compact when merging many indexes with a view to maximum merging performance? The obvious: - only use -F on the final db. - use -m since I'm merging more than 3 dbs. Best strategy? a) loop: merge batches (of say 50, where the individual db's are small) into a temp index, then merge the (larger) temp into the
2010 Jun 11
1
Interesting xapian-compact observations
Greets, I've had xapian-compact (without -F) sessions running for several days now on 10 'merge' machines and I've noticed that the compaction achieved can swing wildly: 18% 76% 10% 19% 39% 13% 69% 43% 19% 42% The average so far is about 35% (ie, 65% reduction in target index sizes, which is unexpected and very welcome). I'm curious about the large variance in
2011 Jul 13
1
Feature request: Determining source index of xapian-compact DatabaseError exception
Greets, When merging lots of subindexes in batches like so: xapian-compact -m idx1 idx2... dstidx Errors such as: xapian-compact: DatabaseError: Error reading block 0: got end of file present a problem since it does not provide the offending path name (of the broken index) for easy identification/removal in automated/batch scenarios (the way DatabaseOpeningError:.... does, eg). The only way
2011 Sep 30
1
Slow phrase performance
I've been getting excellent performance out of xapian, but searches on phrases of common terms such as [ "north america" ] or [ "art history" ] take a very long time to come up with results. Examples: ------------------------------ [ south africa ] -- 10379 results found in ~.2 sec [ white house ] -- 17988 results found in <1 sec Quoting either of
2010 Apr 16
2
best practices - combining sql database and xapian, size of database?
Newbie-alert: I'm just getting started on a new project involving a full text search requirement, and my initial investigation points to xapian being the way to go. Two questions: - eventually I'll most likely be indexing towards 50 million documents - is this reasonable to expect or attempt with xapian? - each of my documents come with a set of attributes. These are easily stored
2008 Nov 28
1
Lucene & Solr
Hi all, I've been asked to prepare a comparison of Lucene/Solr and Xapian and I'm trying to find some differences between the two. I'm not that familiar with Lucene myself but I expect there are lots of people who will have looked at both before ending up on this mailing lists. Can anyone help? I'm looking for both differences between the two systems and perhaps the reasons
2011 Aug 09
3
what is the fastest way to fetch results which are sorted by timestamp ?
What is the fastest way to fetch results which are sorted by timestamp? I want to use xapian as my search engine: I use add_boolean_term(something) and add_value(0, sortable_serialise(get_timestamp())) on a doc, then search through enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0, True) to ensure that the results are sorted by the timestamp. This method is OK, but