thr3ads.net - Xapian devel - [Xapian-devel] Up and running with ZeroMQ [Mar 2013]


Ankit Bhatnagar

2013-Mar-18 18:20 UTC

[Xapian-devel] Up and running with ZeroMQ

Hi,

As suggested earlier, I'm getting familiar with ZeroMQ and with how it
would be used in the present context.

What I have done so far:

(1) Studied the ZeroMQ API and become comfortable with its usage. I'm still
implementing sample programs and learning the basic functions.

(2) Installed and compiled Xapian and ZeroMQ, and set up my development
environment with both. Lots of work here already!

Though I have read and understood some of the code in xapian-core/net, I
require a bit of help as follows:

*(1) Protocol usage :* The backend functions in two modes - prog and tcp.
Prog is the mode where the entire program-space is spawned for every
request and tcp functions normally as TCP sockets do. But both of them
remain quite different from the standard HTTP framework/s. Any insight on
how ZeroMQ would suffice?

*(2) Object handling :* Within the scope of the project, object handling is
yet another important feature to take care of.
>>> What kind of objects are handled or not handled within Xapian?
>>> Is there the need for serialization/deserialization of objects
before processing takes place?

*(3) Idea Deliverables :* As most other ideas have a set of deliverables
defined, what are those for this idea? Kindly enlist a few, if possible.

-- 
*Wishes.*
*Ankit.*

Olly Betts

2013-Mar-22 02:24 UTC


[Xapian-devel] Up and running with ZeroMQ

On Mon, Mar 18, 2013 at 11:50:31PM +0530, Ankit Bhatnagar wrote:
> *(1) Protocol usage :* The backend functions in two modes - prog and tcp.
> Prog is the mode where the entire program-space is spawned for every
> request and tcp functions normally as TCP sockets do.

I don't think that's a helpful way to think about it.

I would say that currently Xapian's remote backend works by passing
messages back and forth across a file descriptor.  The difference
between the tcp and prog variants is simply in how the file descriptor
is created - for the tcp variant, a socket is opened, while for the prog
variant, a specified program is run in a subprocess, and its stdin and
stdout are connected to a pipe.
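
A minimal sketch of that point, in Python rather than Xapian's own C++: the message functions only ever see a file descriptor, and the two variants differ only in how that descriptor is obtained. The framing used here (one type byte plus a four-byte length) and all function names are assumptions made up for illustration, not Xapian's actual wire format or API, and a `socketpair` stands in for a real TCP connection. It assumes a POSIX system.

```python
import os
import socket
import struct
import subprocess
import sys

# Hypothetical framing for illustration only: 1-byte type, 4-byte length, payload.
HEADER = struct.Struct("!BI")


def send_message(fd, msg_type, payload):
    """Write one framed message to a raw file descriptor."""
    data = HEADER.pack(msg_type, len(payload)) + payload
    while data:
        written = os.write(fd, data)
        data = data[written:]


def read_exact(fd, n):
    """Read exactly n bytes from fd, or raise EOFError if the peer closes."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(fd):
    """Read one framed message and return (msg_type, payload)."""
    msg_type, length = HEADER.unpack(read_exact(fd, HEADER.size))
    return msg_type, read_exact(fd, length)


# Child program for the "prog"-style variant: copies its stdin to its stdout.
ECHO_CHILD = (
    "import os\n"
    "while True:\n"
    "    data = os.read(0, 4096)\n"
    "    if not data:\n"
    "        break\n"
    "    os.write(1, data)\n"
)


def roundtrip_over_socket():
    """'tcp'-style: the descriptors come from a socket (a socketpair here)."""
    client, server = socket.socketpair()
    try:
        send_message(client.fileno(), 1, b"hello")
        msg_type, payload = recv_message(server.fileno())
        send_message(server.fileno(), msg_type, payload)  # echo it back
        return recv_message(client.fileno())
    finally:
        client.close()
        server.close()


def roundtrip_over_prog():
    """'prog'-style: the descriptors are pipes to a subprocess's stdin/stdout."""
    child = subprocess.Popen(
        [sys.executable, "-c", ECHO_CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        send_message(child.stdin.fileno(), 1, b"hello")
        return recv_message(child.stdout.fileno())
    finally:
        child.stdin.close()  # child sees EOF and exits
        child.stdout.close()
        child.wait()
```

Both round-trip functions call the same `send_message` and `recv_message`; nothing above the descriptor needs to know which variant is in use.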

What the prog variant really provides is an easy way to do
things like tunnelling the remote protocol across an encrypted
connection - you just set the program to run to something like:

ssh remotehost xapian-progsrv /path/to/db

> But both of them remain quite different from the standard HTTP
> framework/s. Any insight on how ZeroMQ would suffice?

The remote backend protocol passes messages back and forth across
a file descriptor.  ZeroMQ allows you to pass messages back and forth,
so ideally you don't need to change the current messages and can just
replace the transport layer.  If the current "conversations" don't
fit the message passing supported by ZeroMQ, then the messages may
need changing to suit, but I would suggest starting on the basis that
you're just trying to change the layers below that.
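
One way to picture "just replace the transport layer" is a small interface that the message-handling code talks to, with the transport hidden behind it. The sketch below is an assumption-laden illustration in Python, not Xapian code: `Transport`, `QueueTransport`, `queue_pair`, `serve_one` and `demo` are hypothetical names, and an in-memory queue stands in for whatever a ZeroMQ-backed implementation would wrap.

```python
import queue
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical interface: whole messages in, whole messages out."""

    @abstractmethod
    def send(self, message: bytes) -> None:
        ...

    @abstractmethod
    def recv(self) -> bytes:
        ...


class QueueTransport(Transport):
    """In-memory stand-in for a message-oriented transport.

    A file-descriptor-based class or a ZeroMQ-based class could implement
    the same two methods without the calling code changing.
    """

    def __init__(self, outbox, inbox):
        self._outbox = outbox
        self._inbox = inbox

    def send(self, message):
        self._outbox.put(message)

    def recv(self):
        # Time out rather than block forever if the peer never replies.
        return self._inbox.get(timeout=5)


def queue_pair():
    """Return two connected transports, one for each end."""
    a_to_b, b_to_a = queue.Queue(), queue.Queue()
    return QueueTransport(a_to_b, b_to_a), QueueTransport(b_to_a, a_to_b)


def serve_one(transport, handler):
    """Server side: read one request and send back the handler's reply."""
    transport.send(handler(transport.recv()))


def demo():
    """One request/reply "conversation" that never touches transport details."""
    client, server = queue_pair()
    client.send(b"ping")
    serve_one(server, lambda request: b"pong:" + request)
    return client.recv()
```

If the existing conversations map onto this kind of send/receive exchange, the message contents can stay as they are and only the class behind `Transport` changes.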

We'd probably lose the "prog" variant - I guess people would have
to set up an SSH tunnel or VPN first, then run the remote backend
over it.

> *(2) Object handling :* Within the scope of the project, object handling is
> yet another important feature to take care of.
> 
> >>> What kind of objects are handled or not handled within Xapian?
> 
> >>> Is there the need for serialization/deserialization of objects before
> processing takes place?

This is all done already - the messages sent already contain serialised
objects in many cases.

> *(3) Idea Deliverables :* As most other ideas have a set of deliverables
> defined, what are those for this idea? Kindly enlist a few, if possible.

This idea is probably more of an experiment than most - if using ZeroMQ
works well, it would replace quite a lot of existing code (so less for
us to maintain) and give us IPv6 support, which the remote backend lacks
currently.

ZeroMQ claims to be faster than using TCP sockets, but even if that's a
true claim, it might not translate into a faster remote backend.  If
performance is worse, I don't think we'd want to make the switch.  If
they're pretty much the same, I think we'd want to look carefully at the
pros and cons of the two approaches.

So I think the deliverables are something like a version of Xapian where
the network layer used by the remote backend and by replication uses
ZeroMQ instead, plus a set of benchmark test results showing how it
compares with the existing code.
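
As a rough idea of what the benchmark side could look like, here is a minimal round-trip timing harness in Python. It is only a sketch under stated assumptions: it times fixed-size echo messages over a local `socketpair`, whereas a real comparison would run actual remote backend queries over both the existing code and the ZeroMQ version. The function names and the reported statistics are illustrative choices.

```python
import socket
import statistics
import threading
import time


def recv_exact(conn, n):
    """Receive exactly n bytes from a socket, or raise EOFError."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(conn, message_size, rounds):
    """Echo back a fixed number of fixed-size messages."""
    for _ in range(rounds):
        conn.sendall(recv_exact(conn, message_size))


def benchmark_round_trips(rounds=200, message_size=1024):
    """Time request/reply round trips and return summary statistics."""
    client, server = socket.socketpair()
    worker = threading.Thread(
        target=echo_server, args=(server, message_size, rounds)
    )
    worker.start()
    payload = b"x" * message_size
    timings = []
    try:
        for _ in range(rounds):
            start = time.perf_counter()
            client.sendall(payload)
            reply = recv_exact(client, message_size)
            timings.append(time.perf_counter() - start)
            if reply != payload:
                raise RuntimeError("echoed message does not match")
    finally:
        client.close()  # unblocks the server thread if we stopped early
        worker.join()
        server.close()
    return {
        "rounds": len(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }
```

Running the same harness against each transport, with the same message sizes and round counts, gives numbers that can be compared side by side.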

The next obvious step once you have benchmarks set up would be to
profile and see if you can identify hot spots which could be sped up.
There's not been much profiling work done on the remote backend or on
replication.  So that would make a good stretch goal for the project I
think.

Dan may have some further thoughts (this was his idea originally).

One issue I just noticed is that ZeroMQ is LGPL - we're trying to
get to a position where we can relicense Xapian as MIT/X, so adding
an LGPL dependency to the core library wouldn't be ideal.  This
isn't a show-stopper for the idea, but certainly it's a factor to
consider when deciding whether to merge at the end.

Cheers,
    Olly
