Displaying 20 results from an estimated 20000 matches similar to: "Multiple databases vs Single large database"
2012 Jan 20
2
Perl version of sortable_serialize missing?
I attempted to use the sortable_serialize function from Perl, but it
doesn't seem to exist. The only occurrence of the string "sortable" in
the /usr/local/perl/5.10.1/Search/ tree is in the pod in Xapian.pm.
What am I doing wrong?
use Search::Xapian;
...
$doc->add_value(4,sortable_serialize($recdate));
Undefined subroutine &main::sortable_serialize called
2007 Jan 19
3
Are these numbers reasonable?
I have only one box[1] running 3 sub-systems[2] in my system. Are these
numbers reasonable[3]?
[1] - From dmesg (FreeBSD 6.1-RELEASE):
AMD Sempron(tm) Processor 3000+ (1808.33-MHz K8-class CPU)
real memory = 2080309248 (1983 MB)
avail memory = 1997869056 (1905 MB)
ad0: 76350MB <SAMSUNG SP0802N TK200-04> at ata0-master UDMA33
[2] The sub-systems are:
1 - A server giving addresses of
2010 Dec 18
1
Xapian index size 475GB = 170 million documents (URLs)
Xapians,
I am maintaining two indexes for my search engines, each of
approximately the same size. I would like to share this
knowledge with you, since many of you have never seen a Xapian index of
this size. And of course you can search the index yourself at
- http://myhealthcare.com/
- http://find1friend.com/
I need 2 x 100 million more documents into each index, and I hope it
will
2015 Mar 11
2
stub-file and get_doccount
Hello,
I switched from one big index to a stub file with many indexes and am running
into a problem.
I have a tool to fetch a random document via:
get_doccount
random id up to get_doccount
get_document with that id
After changing to a stub file this fails. Is there a nice way to get a
random document from a stub file?
MfG
Felix Ostmann
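One likely cause can be sketched as follows: when several sub-databases are combined, as a stub file does, their document ids are interleaved, so the combined id space can contain gaps and get_doccount is no longer an upper bound on the valid ids. The model below is pure Python and does not call Xapian. The helper names (combined_docid, valid_combined_ids, pick_random_docid) are hypothetical, the interleaving formula is stated as an assumption about how multi-database ids are assigned, and each sub-database is assumed to use contiguous ids starting at 1.

```python
import random


def combined_docid(sub_docid, sub_index, n_subs):
    """Assumed interleaving: map a sub-database docid to the combined docid."""
    return (sub_docid - 1) * n_subs + sub_index + 1


def valid_combined_ids(sub_doccounts):
    """Return the set of combined docids that actually exist.

    sub_doccounts[i] is the number of documents in sub-database i, which is
    assumed (for this sketch) to use contiguous ids 1..count.
    """
    n_subs = len(sub_doccounts)
    ids = set()
    for sub_index, count in enumerate(sub_doccounts):
        for sub_docid in range(1, count + 1):
            ids.add(combined_docid(sub_docid, sub_index, n_subs))
    return ids


def pick_random_docid(valid_ids, last_docid, rng=random):
    """Draw ids up to the highest id in use and retry when hitting a gap."""
    while True:
        candidate = rng.randint(1, last_docid)
        if candidate in valid_ids:
            return candidate


# Three sub-databases of unequal size: 7 documents in total.
ids = valid_combined_ids([5, 1, 1])
doccount = len(ids)
last_docid = max(ids)
picked = pick_random_docid(ids, last_docid, random.Random(0))
```

In this model the document count is 7 but the highest id in use is 13, and ids 5 and 6 do not exist, so drawing "up to get_doccount" both misses documents and can hit gaps. With the real bindings, the analogous approach would be to draw up to get_lastdocid and retry when get_document reports a missing document; whether that is the best fix for this particular tool is not something the snippet confirms.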
2012 Mar 22
1
Warning from ExtUtils::MakeMaker
Installation was OK and everything works well, but this warning is bad :-/ I spent
15 min trying to find the failure (there is no failure) ...
$ perl Makefile.PL XAPIAN_CONFIG=/root/build/xapian-core/bin/xapian-config PREFIX=/root/build/Search-Xapian
Checking if your kit is complete...
Looks good
'XAPIAN_CONFIG' is not a known MakeMaker parameter name.
Writing Makefile for Search::Xapian
$ perl
2011 Sep 23
2
understanding stemming and synonyms
I am working with version 1.2.7 and want to use stemming and synonyms.
I use the Perl bindings and have run into some problems.
First of all: the Perl bindings don't allow the QueryParser a third
argument when calling parse_query! So I cannot set a default prefix
(which is perhaps the solution to my problem, but more on that later).
I have a simple test case:
3 documents, each containing only one word:
2012 Oct 04
1
Synonyms of Abbreviations
Hello,
I am looking for documentation or an example of how to use the synonym function.
I tried this
db.add_synonym("omega","xapain");
and this works by adding the flag FLAG_AUTO_SYNONYMS.
If I try to use
db.add_synonym("omega","xapain is search engine ");
it fails. Why? Can Xapian use synonyms for abbreviations like MBA => Master
of business
2013 Feb 20
1
Sticky results
Hi there,
I have a Xapian index whose results are sorted by a value, with (PHP bindings):
$enquire->set_sort_by_value($sort_data_value);
This is because I want the results returned in chronological order of publication date. However, I now have a need to have certain results be 'sticky' at the top of the resultset, regardless of their publication date. Obviously there are
2012 Apr 19
1
Xapian::Database->close() for perl missing
I have a Xapian daemon which can be queried via HTTP. Every hour a background
process generates a new index, then removes and recreates the symlink
to the current database.
/path/to/index/20120419010000
/path/to/index/20120419020000
/path/to/index/20120419030000
/path/to/index/default => /path/to/index/20120419030000
So the daemon only checks the mtime of /path/to/index/default/iamchert
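The rotation scheme described above can be sketched roughly like this. It is a minimal illustration and not the poster's code: switch_symlink and target_changed are hypothetical names, the symlink is replaced with a rename rather than removed and recreated (so readers never see it missing), and the daemon side compares the resolved link target instead of checking the mtime of a file inside the database. POSIX rename semantics are assumed.

```python
import os


def switch_symlink(link_path, new_target):
    """Point link_path at new_target without a window where the link is absent."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    # Renaming over the old symlink swaps it in a single step on POSIX systems.
    os.replace(tmp_link, link_path)


def target_changed(link_path, last_seen):
    """Return (changed, current_target) so a daemon knows when to reopen."""
    current = os.path.realpath(link_path)
    return current != last_seen, current
```

A daemon could call target_changed before each query and reopen its database handle only when the first element is true.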
2016 Jul 25
1
Xapian 1.4.0 released
Kevin writes:
> Of course, I can fix it by myself and check every terms length, but
> that will add more overhead to big data computing.
How is the overhead different whether your code checks it or Xapian does?
Best regards,
Adam
--
"Oh, we all like motorcycles, to some degree." Adam Sjøgren
asjo
2019 Jan 31
4
Amount of writes during index creation
Olly Betts writes:
> On Mon, Jan 21, 2019 at 03:25:01PM +0100, Jean-Francois Dockes wrote:
> > I have had a problem report from a Recoll user about the amount of writes
> > during index creation.
> >
> > https://opensourceprojects.eu/p/recoll1/tickets/67/
> >
> > The issue is that the index is on SSD and that the amount of writes is
> >
2010 Feb 02
1
Optimal usage of xapian-compact for merging
Greets,
I've been wondering, what's the sane/optimal use of xapian-compact when
merging many indexes with a view to maximum merging performance?
The obvious:
- only use -F on the final db.
- use -m since I'm merging more than 3 dbs.
Best strategy?
a) loop: merge batches (of say 50, where the individual db's are small)
into a temp index, then merge the (larger) temp into the
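The snippet is cut off before the options are fully listed, so the following is only one possible reading of a batched merge, not the poster's strategy. It builds xapian-compact command lines without running them: batches go into temporary indexes, -m is added when a step merges more than 3 inputs, and -F is used only on the final step. The function name plan_compaction, the temp_prefix naming, and the two-stage layout are assumptions made for illustration.

```python
def plan_compaction(sources, final_db, batch_size=50, temp_prefix="tmp-merge-"):
    """Return a list of argv lists for a two-stage batched merge (not executed)."""
    commands = []
    temps = []
    for start in range(0, len(sources), batch_size):
        batch = sources[start:start + batch_size]
        temp = f"{temp_prefix}{len(temps)}"
        cmd = ["xapian-compact"]
        if len(batch) > 3:
            cmd.append("-m")  # multipass only when merging more than 3 inputs
        commands.append(cmd + batch + [temp])
        temps.append(temp)
    final = ["xapian-compact", "-F"]  # fuller compaction on the final db only
    if len(temps) > 3:
        final.append("-m")
    commands.append(final + temps + [final_db])
    return commands


plan = plan_compaction([f"db{i}" for i in range(120)], "final")
```

Each entry in plan could be passed to subprocess.run once the plan has been reviewed. Whether batching actually beats a single multipass merge would have to be measured on the data in question.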
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote:
> On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote:
> > The advantage of compact - it runs approximately 8 times as fast (we
> > are CPU limited in each case - writing to tmpfs first, then rsyncing
> > to the destination) and it takes approximately 75% of the space of a
> > fresh database with maximum
2007 Apr 23
1
Re: [Xapian-commits] 8351: trunk/xapian-core/ trunk/xapian-core/backends/flint/
olly wrote:
> SVN root: svn://svn.xapian.org/xapian
> Changes by: olly
> Revision: 8351
> Date: 2007-04-23 01:44:44 +0100 (Mon, 23 Apr 2007)
>
> Log message (2 lines):
> backends/flint/flint_version.cc: Update the flint format version
> since older flint versions can't read compressed tags.
Am I correct in assuming that this means that when
2006 May 20
2
xapian-tcpsrv need to reopen database?
Hi, I'm adapting omega (in a Python way ;) to search across multiple
remote databases. For now, I have only one xapian-tcpsrv running, but I have
documents being inserted at the same time, so (as I have read in other
e-mails) xapian-tcpsrv throws the following message:
Connection from 192.168.0.101, port 64161
Got exception DatabaseModifiedError: The revision being read has
2017 May 17
2
Xapian 1.4.3 "Db block overwritten - are there multiple writers?"
Hi,
I have a user reporting the following error during recoll indexing:
flush() failed: Db block overwritten - are there multiple writers?
"flush() failed" is from recoll; the rest is, I think, the text of the Xapian
exception.
This is with Xapian 1.4.3 on Linux (I asked for more details, should be
coming).
I don't think that I've ever seen this error, and I also
2024 Dec 13
1
Using a document id as metadata key and merges
On Thu, Dec 12, 2024 at 09:51:44AM +0100, Jean-Francois Dockes wrote:
> Following a discussion a few years ago, Recoll stores the documents text
> contents in database metadata entries, with keys derived from document ids.
>
> More recently an index creation method using several temporary indexes
> merged on completion was implemented. This is still a bit experimental. It
>
2006 Mar 27
4
Remote databases and daemons
I've looked over the docs on remote backends, the protocol, and a bit
of the C++ for doing distributed and remote searches. I've got a
couple of questions:
* The remote protocol is usable only as a Database, not as a
WritableDatabase -- is this correct? So, if I don't want my
application to have a copy of the database on the same machine I'll
need to write an indexer daemon on
2024 Dec 12
1
Using a document id as metadata key and merges
Hi,
Following a discussion a few years ago, Recoll stores the documents text
contents in database metadata entries, with keys derived from document ids.
More recently an index creation method using several temporary indexes
merged on completion was implemented. This is still a bit experimental. It
brings a significant speed increase in some cases.
I just realised that the merge lost many
2020 Feb 19
2
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote:
> On Sat, Feb 08, 2020 at 06:04:42PM +0000, Eric Wong wrote:
> > Olly Betts <olly at survex.com> wrote:
> > > On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote:
> > > > Or would I fiddle with wdf_inc for all ->index_text and ->add_term
> > > > calls on a per-DB basis?
> > >
>