Hi all.
First, a note about remote database connections from Perl. We actually
found an easy way to work around the unwrapped Remote::open issue: we
use a stub file.
You might say that open_stub is also not wrapped, which is true.
HOWEVER, looking at the code, we realized that Database::open() opts
to use open_stub if the argument is a string pointing to a stub file
rather than a database directory. So instead of
Database::open('/data/ftsdirectory') you can do
Database::open('/stubfile.dat')
Pretty handy trick. It's not as nice as a proper remote database open
(since with that you can dynamically control via code which servers to
connect to), but still, it works.
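A stub file can also be written from code just before it is opened, which gets back some of that dynamic control. A minimal sketch in Python (for illustration only; the same few lines port directly to Perl), assuming the one-line `remote HOST:PORT` stub format used further down in this message. `write_tcp_stub` and the temporary path are made-up names, not part of Xapian:

```python
import os
import tempfile


def write_tcp_stub(path, host, port):
    """Write a one-line stub file pointing at a xapian-tcpsrv instance."""
    # Stub line format as used in this thread: "remote HOST:PORT"
    with open(path, "w") as fh:
        fh.write(f"remote {host}:{port}\n")
    return path


# Hypothetical usage: choose the server at run time, then pass the stub
# path to the database open call instead of a database directory.
stub_path = os.path.join(tempfile.mkdtemp(), "stub.dat")
write_tcp_stub(stub_path, "10.0.0.27", 33333)
```

The resulting path is then what gets handed to Database::open() in place of the database directory.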
So we then tested remote search... We faced several problems...
1) Only xapian-tcpsrv worked. We couldn't figure out how to use
xapian-progsrv. The problem was the stub file format.
This works:
remote 10.0.0.27:33333
But these don't work
remote ssh ftsuser@10.0.0.27 xapian-progsrv /data/fts/Database2/
or
remote ssh 10.0.0.27 xapian-progsrv /data/fts/Database2/
The error we get is
Error creating DB with stub file: Exception: Bad line 1 in stub
database file `/fts/stub.dat' at ...
Can anyone shed light on this one? ssh is configured properly: ftsuser
is allowed to ssh without a prompt (using proper key files), and the
database is in the right location. The error seems to come from the
parsing of the line.
What are we doing wrong? What's the right format for remote over
xapian-progsrv?
2) We tried remote search over tcpsrv, which we cannot really use
other than for testing until it supports parallel searches, which is
something xapian-progsrv does support as far as we understand.
Search speed was bad. A search for a single word (like gift) takes well
over a second for the first search, something that is fast when running
locally. Even when done on the same machine (with localhost) it's not
that fast.
Furthermore, fetching the documents also takes a long time, and fetching
the matching words is even worse. Even on localhost.
TCP overhead shouldn't be that bad, should it? Maybe it's tcpsrv
performance in general?
It probably doesn't help that search speed is slow in general in our
searches (this issue is being discussed in another thread on this
mailing list), but nonetheless, it's much slower than the regular slow
search.
Any tips, ideas, thoughts on these two issues? Did anyone manage to use
multiple remote databases effectively?
Best regards,
Ron
Answering myself for the first question, the right stub line format for
ssh is this:
remote :ssh ftsuser@10.0.0.27 xapian-progsrv /data/fts/Database2/
Notice the colon before the ssh. The colon has to be the first character after "remote ".
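To keep the colon from getting lost again, the stub line can be built by a small helper. A sketch in Python, assuming only the `remote :PROGRAM ARGS` format found above; `prog_stub_line` is a hypothetical name, not part of Xapian:

```python
def prog_stub_line(program, *args):
    """Build a stub line that launches PROGRAM (e.g. ssh) with ARGS."""
    # The colon directly after "remote " is what distinguishes a program
    # to run from a HOST:PORT pair.
    return "remote :" + " ".join([program, *args])


line = prog_stub_line(
    "ssh", "ftsuser@10.0.0.27", "xapian-progsrv", "/data/fts/Database2/"
)
print(line)
```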
Searches are considerably faster than over tcp. No idea why.
Getting documents and matching words is still very bad though.
If we have to get the data of 25 results (our internal document ID is
stored there for identification), it takes about 0.03-0.04 seconds per
document to fetch. That is almost a second just for document fetching,
AND this is for a second-time fetch. This is very slow. I read somewhere
that documents can somehow be fetched in batches. Is that right? If so,
does anyone know how, or whether, it can be done in Perl?
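The arithmetic behind that estimate, as a quick Python check that uses only the figures quoted above (25 results, 0.03-0.04 seconds each):

```python
results = 25
# Seconds per document, as measured above (second-time fetch).
per_doc_low, per_doc_high = 0.03, 0.04

total_low = results * per_doc_low
total_high = results * per_doc_high
print(f"{total_low:.2f}-{total_high:.2f} seconds for {results} documents")
```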
On top of that, getting the matching words takes an extra 2 seconds, and
sometimes much, much more, especially for first-time searches, where it
can take as long as 20 seconds for a specific document.
Any ideas?
Cheers,
Ron
On Sat, Oct 27, 2007 at 07:03:34PM +0200, Ron Kass wrote:
> So instead of
> Database::open('/data/ftsdirectory') you can do
> Database::open('/stubfile.dat')

Ah yes, of course you can. Sorry I didn't think of that.

> 2) We tried remote search over tcpsrv... which we can not really use
> besides for testing, until it supports parallel searches, which is
> something xapian-progsrv does support as far as we understand.

No, that's wrong. Does it say that somewhere in the documentation? If
so, let me know where and I'll fix it.

> search speed was bad. A search for a single word (like gift) takes well
> over a second for the first search. something that fast when running
> locally. Even when done on the same machine (with localhost) its not
> that fast.

Until we track down your uninitialised weights issue, I don't think
performance testing is going to give reliable results.

Cheers,
Olly