thr3ads.net - similar to: "Multiple databases vs Single large database"

Perl version of sortable_serialize missing?

2012 Jan 20

I attempted to use the sortable_serialize function from Perl, but it doesn't seem to exist. The only occurrence of the string "sortable" in the /usr/local/perl/5.10.1/Search/ tree is in the POD in Xapian.pm. What am I doing wrong? use Search::Xapian; ... $doc->add_value(4,sortable_serialize($recdate)); Undefined subroutine &main::sortable_serialize called

Are these numbers reasonable?

2007 Jan 19

I have only one box[1] running 3 sub-systems[2] in my setup. Are these numbers reasonable[3]? [1] - From dmesg (FreeBSD 6.1-RELEASE): AMD Sempron(tm) Processor 3000+ (1808.33-MHz K8-class CPU) real memory = 2080309248 (1983 MB) avail memory = 1997869056 (1905 MB) ad0: 76350MB <SAMSUNG SP0802N TK200-04> at ata0-master UDMA33 [2] The sub-systems are: 1 - A server giving addresses of

Xapian index size 475GB = 170 million documents (URLs)

2010 Dec 18

Xapians, I maintain two indexes for my search engines, each of approximately the same size. I would like to share this knowledge with you, since many of you have never seen a Xapian index of this size. And of course you can search the index yourself at - http://myhealthcare.com/ - http://find1friend.com/ I need to add 2 x 100 million more documents to each index, and I hope it will

stub-file and get_doccount

2015 Mar 11

Hello, I switched from one big index to a stub file with many indexes and am running into a problem. I have a tool that fetches a random document as follows: call get_doccount, pick a random id up to get_doccount, then call get_document with that id. After changing to the stub file, this fails. Is there a nice way to get a random document from a stub file? Best regards, Felix Ostmann

Warning from ExtUtils::MakeMaker

2012 Mar 22

Installation was OK and everything works well, but this warning is bad :-/ I spent 15 minutes trying to find the failure (there is no failure) ... $ perl Makefile.PL XAPIAN_CONFIG=/root/build/xapian-core/bin/xapian-config PREFIX=/root/build/Search-Xapian Checking if your kit is complete... Looks good 'XAPIAN_CONFIG' is not a known MakeMaker parameter name. Writing Makefile for Search::Xapian $ perl

understanding stemming and synonyms

2011 Sep 23

I am working with version 1.2.7 and want to use stemming and synonyms. I use the Perl bindings and have run into some problems. First of all: the Perl bindings don't allow the QueryParser a third argument when calling parse_query! So I cannot set a default prefix (which is perhaps the solution to my problem, but more on that later). I have a simple test case: 3 documents, each containing only one word:

Synonyms of Abbreviations

2012 Oct 04

Hello, I am looking for documentation or an example of how to use the synonym function. I tried db.add_synonym("omega","xapain"); and this works after adding the flag FLAG_AUTO_SYNONYMS. If I try db.add_synonym("omega","xapain is search engine "); it fails. Why? Can Xapian use synonyms for abbreviations like MBA => Master of business

Sticky results

2013 Feb 20

Hi there, I have a Xapian index whose results are sorted by a value, with (PHP bindings): $enquire->set_sort_by_value($sort_data_value); This is because I want the results returned in chronological order of publication date. However, I now need certain results to be 'sticky' at the top of the result set, regardless of their publication date. Obviously there are

Xapian::Database->close() for perl missing

2012 Apr 19

I have a Xapian daemon which can be queried via HTTP. Every hour, a background process generates a new index, then removes the symlink to the current database and creates a new one. /path/to/index/20120419010000 /path/to/index/20120419020000 /path/to/index/20120419030000 /path/to/index/default => /path/to/index/20120419030000 So the daemon only checks the mtime of /path/to/index/default/iamchert

Xapian 1.4.0 released

2016 Jul 25

Kevin writes: > Of course, I can fix it by myself and check every term's length, but > that will add more overhead to big data computing. How is the overhead different whether your code checks it or Xapian does? Best regards, Adam -- "Oh, we all like motorcycles, to some degree." Adam Sjøgren asjo

Amount of writes during index creation

2019 Jan 31

Olly Betts writes: > On Mon, Jan 21, 2019 at 03:25:01PM +0100, Jean-Francois Dockes wrote: > > I have had a problem report from a Recoll user about the amount of writes > > during index creation. > > > > https://opensourceprojects.eu/p/recoll1/tickets/67/ > > > > The issue is that the index is on SSD and that the amount of writes is > >

Optimal usage of xapian-compact for merging

2010 Feb 02

Greets, I've been wondering, what's the sane/optimal use of xapian-compact when merging many indexes with a view to maximum merging performance? The obvious: - only use -F on the final db. - use -m since I'm merging more than 3 dbs. Best strategy? a) loop: merge batches (of say 50, where the individual db's are small) into a temp index, then merge the (larger) temp into the

Compact databases and removing stale records at the same time

2013 Jun 19

On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum

Re: [Xapian-commits] 8351: trunk/xapian-core/ trunk/xapian-core/backends/flint/

2007 Apr 23

olly wrote: > SVN root: svn://svn.xapian.org/xapian > Changes by: olly > Revision: 8351 > Date: 2007-04-23 01:44:44 +0100 (Mon, 23 Apr 2007) > > Log message (2 lines): > backends/flint/flint_version.cc: Update the flint format version > since older flint versions can't read compressed tags. Am I correct in assuming that this means that when

xapian-tcpsrv need to reopen database?

2006 May 20

Hi, I'm adapting omega (in a Python way ;) to search across multiple remote databases. For now, I have only one xapian-tcpsrv running, but documents are being inserted at the same time, so (as I have read in other e-mails) xapian-tcpsrv throws the following message: Connection from 192.168.0.101, port 64161 Got exception DatabaseModifiedError: The revision being read has

Xapian 1.4.3 "Db block overwritten - are there multiple writers?"

2017 May 17

Hi, I have a user reporting the following error during recoll indexing: flush() failed: Db block overwritten - are there multiple writers? "flush() failed" is from recoll; the rest is, I think, the text of the Xapian exception. This is with Xapian 1.4.3 on Linux (I asked for more details, should be coming). I don't think that I've ever seen this error, and I also

Using a document id as metadata key and merges

2024 Dec 13

On Thu, Dec 12, 2024 at 09:51:44AM +0100, Jean-Francois Dockes wrote: > Following a discussion a few years ago, Recoll stores the documents text > contents in database metadata entries, with keys derived from document ids. > > More recently an index creation method using several temporary indexes > merged on completion was implemented. This is still a bit experimental. It >

Remote databases and daemons

2006 Mar 27

I've looked over the docs on remote backends, the protocol, and a bit of the C++ for doing distributed and remote searches. I've got a couple of questions: * The remote protocol is usable only as a Database, not as a WriteableDatabase -- is this correct? So, if I don't want my application to have a copy of the database on the same machine I'll need to write an indexer daemon on

Using a document id as metadata key and merges

2024 Dec 12

Hi, Following a discussion a few years ago, Recoll stores the documents' text contents in database metadata entries, with keys derived from document ids. More recently an index creation method using several temporary indexes merged on completion was implemented. This is still a bit experimental. It brings a significant speed increase in some cases. I just realised that the merge lost many

prioritizing aggregated DBs

2020 Feb 19

Olly Betts <olly at survex.com> wrote: > On Sat, Feb 08, 2020 at 06:04:42PM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote: > > > > Or would I fiddle with wdf_inc for all ->index_text and ->add_term > > > > calls on a per-DB basis? > > > >
