Hi all.

First, a note about remote database connections from Perl. We found an easy way to work around the unwrapped Remote::open issue: we use a stub file. You might say that open_stub is also not wrapped, which is true. However, looking at the code, we realized that Database::open() falls back to open_stub if the argument is a string pointing to a stub file rather than a database directory. So instead of

  Database::open('/data/ftsdirectory')

you can do

  Database::open('/stubfile.dat')

Pretty handy trick. It is not as nice as a proper remote database open (since with that you can control in code, dynamically, which servers to connect to), but still, it works.

So we then tested remote search, and we faced several problems.

1) Only xapian-tcpsrv worked. We couldn't figure out how to use xapian-progsrv. The problem was the stub file format. This works:

  remote 10.0.0.27:33333

But these don't work:

  remote ssh ftsuser@10.0.0.27 xapian-progsrv /data/fts/Database2/

or

  remote ssh 10.0.0.27 xapian-progsrv /data/fts/Database2/

The error we get is:

  Error creating DB with stub file: Exception: Bad line 1 in stub database file `/fts/stub.dat' at ...

Can anyone shed some light on this one? ssh is configured properly: ftsuser is allowed to ssh without a prompt (using proper key files), and the database is in the right location. The error seems to come from the parsing of the line. What are we doing wrong? What is the right format for remote over xapian-progsrv?

2) We tried remote search over tcpsrv, which we cannot really use for anything besides testing until it supports parallel searches, which is something xapian-progsrv does support as far as we understand.

Search speed was bad. A search for a single word (like gift) takes well over a second for the first search, something that is fast when running locally. Even when done on the same machine (with localhost) it's not that fast. Furthermore, fetching the documents also takes a long time, and fetching the matching words is even worse. Even on localhost.

TCP overhead shouldn't be that bad, should it? Maybe it's tcpsrv performance in general? It probably doesn't help that search speed is slow in general in our searches (this issue is being discussed in another thread on this mailing list), but nonetheless, it's much slower than the regular slow search.

Any tips, ideas, or thoughts on these two issues? Did anyone manage to use multiple remote databases effectively?

Best regards,
Ron
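One way to get back some of the dynamic control mentioned above is to generate the stub file at run time. The following is only a minimal sketch: write_stub_file and open_via_stub are hypothetical helper names, the line format is the "remote host:port" form reported as working above, and Search::Xapian::Database->new is assumed to be the Perl-side equivalent of the Database::open() call discussed in the message.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write one "remote host:port" line per server and
# return the path of the resulting stub file.
sub write_stub_file {
    my (@servers) = @_;
    my ($fh, $path) = tempfile('stubXXXXXX', SUFFIX => '.dat', TMPDIR => 1);
    print {$fh} "remote $_\n" for @servers;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Hypothetical helper: open the stub file the same way as a local
# database directory. It is not called below, because it needs
# Search::Xapian installed and a reachable xapian-tcpsrv.
sub open_via_stub {
    my ($path) = @_;
    require Search::Xapian;
    return Search::Xapian::Database->new($path);
}

my $stub = write_stub_file('10.0.0.27:33333');
print "Wrote stub file $stub\n";
```

Passing several servers to write_stub_file gives a stub file with one line per server. As far as I know Xapian then opens them as one combined database, but that is worth verifying against the version in use.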
Answering myself on the first question, the right stub line format for ssh is this:

  remote :ssh ftsuser@10.0.0.27 xapian-progsrv /data/fts/Database2/

Notice the colon before ssh. The colon has to be the first character after "remote ".

Searches are considerably faster than over tcp. No idea why.

Getting documents and matching words is still very bad though. If we have to get the data of 25 results (our internal document ID is stored there for identification), it takes about 0.03-0.04 seconds per document to fetch. That is almost a second for document fetching alone, and this is for a second-time fetch. This is very slow.

I read somewhere that documents can somehow be fetched in batches. Is that right? If so, does anyone know how, or whether, it can be done in Perl?

On top of that, getting the matching words takes an extra 2 seconds, and sometimes much more, especially for first-time searches, where it can take even 20 seconds for a specific document.

Any ideas?

Cheers,
Ron

> 1) Only the xapian-tcpsrv worked. We couldn't figure out how to use
> xapian-progsrv. The problem was the stub file format.
> This works:
> remote 10.0.0.27:33333
> But these don't work
> remote ssh ftsuser@10.0.0.27 xapian-progsrv /data/fts/Database2/
> or
> remote ssh 10.0.0.27 xapian-progsrv /data/fts/Database2/
>
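On the batch-fetching question: the Xapian C++ API has MSet::fetch(), which hints that the documents in the match set are about to be read so that a remote backend can request them together. The sketch below assumes the installed Search::Xapian wraps it as $mset->fetch, which is why the call is guarded with can(). top_document_data is a hypothetical helper name, and the in-memory database and "internal-id=..." data strings are illustrative stand-ins for the real remote database.

```perl
use strict;
use warnings;
use Search::Xapian;

# Hypothetical helper: return the stored data of the top $count matches
# for a single term, asking for the documents in one batch first.
sub top_document_data {
    my ($db, $term, $count) = @_;
    my $enquire = Search::Xapian::Enquire->new($db);
    $enquire->set_query(Search::Xapian::Query->new($term));
    my $mset = $enquire->get_mset(0, $count);

    # Assumption: this binding exposes MSet::fetch(). The guard skips the
    # hint on versions that do not provide it.
    $mset->fetch() if $mset->can('fetch');

    return map { $_->get_document->get_data } $mset->items;
}

# Stand-in for the remote database: an in-memory one with two documents.
my $db = Search::Xapian::WritableDatabase->new();
for my $id (101, 102) {
    my $doc = Search::Xapian::Document->new();
    $doc->set_data("internal-id=$id");
    $doc->add_term('gift');
    $db->add_document($doc);
}

print "$_\n" for top_document_data($db, 'gift', 25);
```

With a local or in-memory database the fetch call should make no measurable difference. Any gain would only show up against a remote backend, so it needs to be timed there before relying on it.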
On Sat, Oct 27, 2007 at 07:03:34PM +0200, Ron Kass wrote:
> So instead of
> Database::open('/data/ftsdirectory') you can do
> Database::open('/stubfile.dat')

Ah yes, of course you can. Sorry I didn't think of that.

> 2) We tried remote search over tcpsrv... which we can not really use
> besides for testing, until it supports parallel searches, which is
> something xapian-progsrv does support as far as we understand.

No, that's wrong. Does it say that somewhere in the documentation? If so, let me know where and I'll fix it.

> search speed was bad. A search for a single word (like gift) takes well
> over a second for the first search. something that is fast when running
> locally. Even when done on the same machine (with localhost) it's not
> that fast.

Until we track down your uninitialised weights issue, I don't think performance testing is going to give reliable results.

Cheers,
Olly