thr3ads.net - Eventmachine talk - [Eventmachine-talk] "packet" framing [Oct 2007]

Tony Arcieri

2007-Oct-12 16:45 UTC

[Eventmachine-talk] "packet" framing

I was looking through the protocol handlers and I didn't see one for
length-prefixed framing.

I'm thinking of something like in Erlang, which lets you specify a
{packet, N} framing mode.  This means the protocol is framed with an N-byte
big endian integer representing the length of the subsequent payload.  Erlang
then gives complete frames to the controlling process.

Does something like this exist?  If not I think it'd make a nice standard
facility.  It's relatively trivial and I'd be willing to write it.
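
For readers unfamiliar with the wire format described above, here is a minimal Ruby sketch of it. The names encode_frame, decode_frames and PREFIX_FORMAT are made up for illustration and are not part of EventMachine or Erlang; the sketch only assumes the prefix widths 1, 2 and 4.

```ruby
# Pack formats for 1-, 2- and 4-byte big-endian length prefixes.
PREFIX_FORMAT = { 1 => "C", 2 => "n", 4 => "N" }.freeze

# Prepend an N-byte big-endian length to the payload.
def encode_frame(payload, n = 4)
  [payload.bytesize].pack(PREFIX_FORMAT.fetch(n)) + payload.b
end

# Split a byte string into complete payloads plus any unfinished remainder.
def decode_frames(bytes, n = 4)
  bytes = bytes.b
  frames = []
  offset = 0
  while bytes.bytesize - offset >= n
    length = bytes.byteslice(offset, n).unpack1(PREFIX_FORMAT.fetch(n))
    break if bytes.bytesize - offset < n + length   # payload not fully arrived
    frames << bytes.byteslice(offset + n, length)
    offset += n + length
  end
  [frames, bytes.byteslice(offset..-1)]
end

wire = encode_frame("hello", 2) + encode_frame("world", 2)
# Only the first 10 bytes have "arrived": the second frame is incomplete.
frames, rest = decode_frames(wire[0, 10], 2)
```

The remainder returned by decode_frames is what a receiver would keep buffered until more bytes show up.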

--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

Francis Cianfrocca

2007-Oct-12 22:44 UTC

[Eventmachine-talk] "packet" framing

On 10/12/07, Tony Arcieri <tony at clickcaster.com> wrote:
>
> I was looking through the protocol handlers and I didn't see one for
> length-prefixed framing.
>
> I'm thinking of something like in Erlang, which lets you specify a
> {packet, N} framing mode.  This means the protocol is framed with an N-byte
> big endian integer representing the length of the subsequent payload.
> Erlang then gives complete frames to the controlling process.
>
> Does something like this exist?  If not I think it'd make a nice standard
> facility.  It's relatively trivial and I'd be willing to write it.


Are you thinking of an includable module that would go into lib/protocols?
And it would have methods like #send_packet and #receive_packet that would
automatically handle the framing?

Sounds like a nice idea. I have two questions: what standard protocol(s) use
something like this? AMQP does, but size octets aren't the only thing that
appears in its frame headers.

And second, the usual reason to add this to a protocol is for performance,
so that clients and servers can allocate all needed memory for a frame in
one go. Is that one of the benefits you want to capture? If so, it will
require some tinkering down in the guts of the reactor. We'd have to decide
how well worth doing it is, as memory-page handling in there is already
quite efficient.
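
As a rough illustration of the includable module discussed above, the following sketch shows one possible shape. The module name LengthPrefixedPackets and the LoopbackConnection stand-in are assumptions made for this example; only the send_data / receive_data callback names follow EventMachine's connection convention, and this is not the module that was eventually written.

```ruby
# Hypothetical mixin. It relies on the host class providing send_data and on
# something (normally the reactor) calling receive_data with raw bytes.
module LengthPrefixedPackets
  PREFIX_BYTES = 4

  def send_packet(payload)
    send_data([payload.bytesize].pack("N") + payload.b)
  end

  def receive_data(data)
    (@packet_buffer ||= String.new) << data.b
    while @packet_buffer.bytesize >= PREFIX_BYTES
      length = @packet_buffer.unpack1("N")
      break if @packet_buffer.bytesize < PREFIX_BYTES + length
      packet = @packet_buffer.byteslice(PREFIX_BYTES, length)
      @packet_buffer = @packet_buffer.byteslice(PREFIX_BYTES + length..-1)
      receive_packet(packet)   # user-defined callback, one call per frame
    end
  end
end

# Stand-in for a connection so the sketch runs without a reactor.
class LoopbackConnection
  include LengthPrefixedPackets
  attr_reader :packets

  def initialize
    @packets = []
  end

  def send_data(data)   # a real connection would write to the socket
    receive_data(data)
  end

  def receive_packet(packet)
    @packets << packet
  end
end

conn = LoopbackConnection.new
conn.send_packet("ping")
conn.send_packet("")    # zero-length frames are legal in this scheme
```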

Tony Arcieri

2007-Oct-13 18:09 UTC

[Eventmachine-talk] "packet" framing

On 10/12/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
>
> On 10/12/07, Tony Arcieri <tony at clickcaster.com> wrote:
> >
> > I was looking through the protocol handlers and I didn't see one for
> > length-prefixed framing.
> >
> > I'm thinking of something like in Erlang, which lets you specify a
> > {packet, N} framing mode.  This means the protocol is framed with an
> > N-byte big endian integer representing the length of the subsequent
> > payload.  Erlang then gives complete frames to the controlling process.
> >
> > Does something like this exist?  If not I think it'd make a nice
> > standard facility.  It's relatively trivial and I'd be willing to
> > write it.
>
> Are you thinking of an includable module that would go into
> lib/protocols? And it would have methods like #send_packet and
> #receive_packet that would automatically handle the framing?

Yes and yes

> Sounds like a nice idea. I have two questions: what standard protocol(s)
> use something like this? AMQP does, but size octets aren't the only
> thing that appears in its frame headers.

I don't know of any protocols which use this offhand.  Erlang seems to make
use of it for IPC.  My main reason for bringing this up is I'd like to use
it for framing my protocol.

> And second, the usual reason to add this to a protocol is for
> performance, so that clients and servers can allocate all needed memory
> for a frame in one go. Is that one of the benefits you want to capture?

Eventually.  I've lately been swayed by the arguments of Dan J. Bernstein
and others against CRLF framing (or rather, the problems of
escaping/buffering in general).  While there's performance to be gained,
there's also the simplicity of an idiot-proof framing.

> If so, it will require some tinkering down in the guts of the reactor.
> We'd have to decide how well worth doing it is, as memory-page handling
> in there is already quite efficient.

I'm not too concerned about performance for the time being.  I'm just
wondering if this would be useful as a standard feature.

By the way, my protocol/project should be seeing an open source release in
the very near future.  I'll be sure to post to the list when that happens.

--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

Francis Cianfrocca

2007-Oct-14 05:10 UTC

[Eventmachine-talk] "packet" framing

On 10/13/07, Tony Arcieri <tony at clickcaster.com> wrote:
>
> On 10/12/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> >
> > On 10/12/07, Tony Arcieri <tony at clickcaster.com> wrote:
> > >
> > > I was looking through the protocol handlers and I didn't see one for
> > > length-prefixed framing.
> > >
> > > I'm thinking of something like in Erlang, which lets you specify a
> > > {packet, N} framing mode.  This means the protocol is framed with an
> > > N-byte big endian integer representing the length of the subsequent
> > > payload.  Erlang then gives complete frames to the controlling process.
> > >
> > > Does something like this exist?  If not I think it'd make a nice
> > > standard facility.  It's relatively trivial and I'd be willing to
> > > write it.
> >
> > Are you thinking of an includable module that would go into
> > lib/protocols? And it would have methods like #send_packet and
> > #receive_packet that would automatically handle the framing?
>
> Yes and yes
>
> > Sounds like a nice idea. I have two questions: what standard protocol(s)
> > use something like this? AMQP does, but size octets aren't the only thing
> > that appears in its frame headers.
>
> I don't know of any protocols which use this offhand.  Erlang seems to
> make use of it for IPC.  My main reason for bringing this up is I'd like to
> use it for framing my protocol.
>
> > And second, the usual reason to add this to a protocol is for performance,
> > so that clients and servers can allocate all needed memory for a frame in
> > one go. Is that one of the benefits you want to capture?
>
> Eventually.  I've lately been swayed by the arguments of Dan J. Bernstein
> and others against CRLF framing (or rather, the problems of
> escaping/buffering in general).  While there's performance to be gained,
> there's also the simplicity of an idiot-proof framing.
>
> > If so, it will require some tinkering down in the guts of the reactor.
> > We'd have to decide how well worth doing it is, as memory-page handling in
> > there is already quite efficient.
>
> I'm not too concerned about performance for the time being.  I'm just
> wondering if this would be useful as a standard feature.

I think it's a cool idea. It's always seemed to me that designing new
network protocols is to be avoided in general because it's incredibly
difficult to do well, and there are so many standard ones out there anyway.
But I could be wrong. Clearly there's an important use case for EM in
dealing with locally-defined ad hoc protocols that would allow newly-written
Ruby programs to interface with legacy apps.

What if you were to have a sized-frame protocol such as you describe, and
*inside* of each frame there were lines and text or some other complex
protocol? You'd want to pass each received frame through another included
handler. I'm not sure how graceful it would be to *layer* protocol handlers
on top of each other with the current way we do things. Jeff Rose did a lot
of work with this idea (and Twisted supports it to some extent) but he never
got very far with it, and it hasn't really come up in practice too often.

> By the way, my protocol/project should be seeing an open source release
> in the very near future.  I'll be sure to post to the list when that
> happens.

Looking forward to that.  We also have the website rubyeventmachine.org to
fill up with FAQs and use cases, if I ever get the time to do that. (It
would be great if someone reading this list had some time to help with
that.)

Brian Candler

2007-Oct-14 05:50 UTC

[Eventmachine-talk] "packet" framing

On Sun, Oct 14, 2007 at 08:10:52AM -0400, Francis Cianfrocca wrote:
>    Sounds like a nice idea. I have two questions: what standard
>    protocol(s) use something like this?

Aside: DRb does. An evented DRb client/server might be an interesting tool.

Regards,

Brian.

Francis Cianfrocca

2007-Oct-14 11:04 UTC

[Eventmachine-talk] "packet" framing

On 10/14/07, Brian Candler <B.Candler at pobox.com> wrote:
>
> On Sun, Oct 14, 2007 at 08:10:52AM -0400, Francis Cianfrocca wrote:
> >    Sounds like a nice idea. I have two questions: what standard
> >    protocol(s) use something like this?
>
> Aside: DRb does. An evented DRb client/server might be an interesting
> tool.


It certainly would! What would it take to produce such a thing?

I would guess that the performance profile of DRb is dominated by
marshalling and unmarshalling rather than I/O. Anyone agree or disagree?

Bill Kelly

2007-Oct-14 13:52 UTC

[Eventmachine-talk] "packet" framing

From: Francis Cianfrocca
>
> On 10/14/07, Brian Candler <B.Candler at pobox.com> wrote:
> > On Sun, Oct 14, 2007 at 08:10:52AM -0400, Francis Cianfrocca wrote:
> > >    Sounds like a nice idea. I have two questions: what standard
> > >    protocol(s) use something like this?
> >
> > Aside: DRb does. An evented DRb client/server might be an interesting
> > tool.
>
> It certainly would! What would it take to produce such a thing?
>
> I would guess that the performance profile of DRb is dominated by
> marshalling and unmarshalling rather than I/O. Anyone agree or disagree?

For what it's worth, I'm currently implementing something similar,
which I'll release open source (non-GPL.)

I'm not using DRb, because I need Ruby <--> C++ communication, not
just Ruby <--> Ruby.

My protocol is focused on fast remote procedure call (over TCP or
UDP) between both Ruby and non-Ruby nodes.  However, the data types
understood by the protocol are deliberately Ruby-oriented.

The current types understood are:

// TransportLayerTypes
enum {
    TT_NIL = 0,     // some Ruby types must be preserved as distinct from integers
    TT_FALSE,
    TT_TRUE,
    TT_FLOAT,       // ruby might not care float vs. double, but C++ <--> C++ comm. could use it
    TT_DOUBLE,
    TT_BINARY_DATA, // data count will be represented as BERINT, followed by DATA OCTETS
    TT_UTF8_STRING, // encoded same as TT_BINARY_DATA
    TT_SYMBOL,      // encoded same as TT_BINARY_DATA (what are symbol encodings in ruby? latin1 ?)

    TT_RUBYOBJ,                 // This is TT_BINARY_DATA which is simply
                                // a Marshal.dump() of a ruby object instance.
                                // (Not parsable by pure C/C++ implementations; however
                                // a C-based network cache server could easily cache
                                // any transport layer object, including RUBYOBJ)

    TT_BERINT = 0x80,   // 0x80     - BERINT TYPE FLAG
                        // 0x40         - BERINT SIGN FLAG
                        // 0x20         - BERINT CONTINUATION FLAG
                        // 0x1F         - (0..31) : integer five least significant bits
                        //
                        // If CONT bit is set, then the five bits under the 0x1F mask
                        // become the low five bits of the continued BER coded data.

    BERINT_FIRST_BYTE_BIT_LIMIT = 7
};

Since my application is game-like, I may end up adding some fixed-point
types as well.

But anyway, I'm initially focused on a subset of DRb: Just remote
procedure calls where the arguments can be integers, floats, strings,
vectors and hashes.

However, my intention is that marshalled Ruby objects will also be
able to be exchanged between Ruby <--> Ruby nodes.  Given that, it
shouldn't be very hard to implement full DRb semantics on top of
this basic transport layer.

NOTE: For performance reasons, I'm probably going to head down a
slightly different path than DRb.  Because most of the time, I want
to avoid waiting for the result of one call before I can make the
next call.  I may end up using a transactional approach, where I
can begin a transaction, issue a whole list of commands, and if any
of the commands fails on the remote node, it can rollback to the
state at the beginning of the transaction, and return an error code.

My initial target is a Ruby node issuing commands to a remote C++
OpenGL scenegraph node.  But there will also be a number of separate
Ruby processes on localhost also communicating via this protocol.
(Probably no need to mention, since this is the Eventmachine list,
but yes, this will all be done using Eventmachine. :)

Anyway figured I'd mention where I'm headed with this, in case it
sounds useful to anyone.

Questions/comments/criticism welcome.

Regards,

Bill

Brian Candler

2007-Oct-15 01:11 UTC

[Eventmachine-talk] "packet" framing

On Sun, Oct 14, 2007 at 02:04:35PM -0400, Francis Cianfrocca wrote:
>
>    On 10/14/07, Brian Candler <B.Candler at pobox.com> wrote:
>
>      On Sun, Oct 14, 2007 at 08:10:52AM -0400, Francis Cianfrocca wrote:
>      >    Sounds like a nice idea. I have two questions: what standard
>      >    protocol(s) use something like this?
>      Aside: DRb does. An evented DRb client/server might be an
>      interesting tool.
>
>    It certainly would! What would it take to produce such a thing?

Not too much I think. I've not seen the DRb protocol documented, but it's
pretty easy to work out from looking at the source
(/usr/lib/ruby/1.8/drb/drb.rb). The guts are in class DRbMessage.

* dump() sends a 4-byte length followed by a Marshal.dump of that size.
  If the object isn't marshallable, then a proxy object is sent instead.

* an RPC request (send_request / recv_request) consists of:
  - dump(target object id)
  - dump(method name)
  - dump(num args)
  - num args x { dump arg[i] }
  - dump(callback block)

* an RPC response (send_response / recv_response) consists of:

  - dump(success)   # false for exception, true for complete
  - dump(result)    # exception or return value

Since the whole message isn't wrapped inside a (size, data) block, you're
forced to read it as a number of separate chunks. So if EventMachine
provided a receive_chunk function, you'd still need a little state machine
to keep track of where you were.
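
The "little state machine" can be sketched as follows, based only on the request layout listed above (object id, method name, argument count, arguments, block, each as a 4-byte length plus a Marshal dump). The names drb_style_dump and RequestReader are hypothetical; this is not DRb's own code and it has not been checked against a real DRb peer.

```ruby
# Writer side: each field is a 4-byte length followed by a Marshal dump.
def drb_style_dump(obj)
  body = Marshal.dump(obj)
  [body.bytesize].pack("N") + body
end

# Reader side: consumes whatever bytes have arrived and reports a request
# once every field of it has been read.
class RequestReader
  def initialize
    @buffer = String.new
    @fields = []   # fields decoded so far for the current request
  end

  # Returns an array of completed requests (possibly empty).
  def feed(data)
    @buffer << data.b
    done = []
    while (chunk = next_chunk)
      @fields << chunk.first
      # Layout: object id, method name, argc, argc arguments, block.
      if @fields.size >= 3 && @fields.size == 3 + @fields[2] + 1
        ref, name, _argc, *rest = @fields
        done << { ref: ref, method: name, args: rest[0...-1], block: rest.last }
        @fields = []
      end
    end
    done
  end

  private

  # Returns [object] when a whole (size, data) chunk is buffered, else nil.
  # The array wrapper keeps a marshalled nil or false from looking like "no data".
  def next_chunk
    return nil if @buffer.bytesize < 4
    size = @buffer.unpack1("N")
    return nil if @buffer.bytesize < 4 + size
    body = @buffer.byteslice(4, size)
    @buffer = @buffer.byteslice(4 + size..-1)
    [Marshal.load(body)]   # only safe with trusted peers
  end
end

wire = [nil, "add", 2, 40, 2, nil].map { |field| drb_style_dump(field) }.join
reader = RequestReader.new
# Worst case delivery: one byte at a time.
requests = wire.each_char.flat_map { |byte| reader.feed(byte) }
```

The state is just the partial buffer plus the list of fields seen so far; the argument count read as the third field decides when the request is complete.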
>    I would guess that the performance profile of DRb is dominated by
>    marshalling and unmarshalling rather than I/O. Anyone agree or
>    disagree?
Ruby's marshalling is extremely fast. But for each received request you have
to read 4 or more dumps, each of which requires read(4) followed by read(N).
For small requests, hopefully most of those reads will come out of local
buffers, but there's potentially a lot of context switching on the server
side if there are multiple concurrent requests.

For a client which only has one outstanding request it should be very fast
though.

Regards,

Brian.

Brian Candler

2007-Oct-15 02:19 UTC

[Eventmachine-talk] "packet" framing

On Sun, Oct 14, 2007 at 01:52:48PM -0700, Bill Kelly wrote:
> For what it's worth, I'm currently implementing something similar,
> which I'll release open source (non-GPL.)
>
> I'm not using DRb, because I need Ruby <--> C++ communication, not
> just Ruby <--> Ruby.
>
> My protocol is focused on fast remote procedure call (over TCP or
> UDP) between both Ruby and non-Ruby nodes.  However, the data types
> understood by the protocol are deliberately Ruby-oriented.

I had some thoughts along these lines a while ago, as I needed a lightweight
IPC interface between Ruby and Perl. Actually the Perl part of this has
evaporated now, so I haven't needed to develop this further.

But for what it's worth, I have some suggestions if you're looking to
implement a better-than-DRb for heterogeneous applications.

(1) Make the basic mandatory data type the "string". Allow the two ends to
negotiate additional encodings, such as binary integers, IEEE floats,
arrays, Ruby Marshal dumps, Perl Storable, PHP serialize etc; but if both
ends don't support anything else, fall back to string representation.

A bitmap representing known types would be a simple way to do this. Just AND
the sender and receiver capability bitmaps to get the acceptable common
ones. Then for any particular object the sender can choose their preferred
representation from among the common ones.
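
A small sketch of the bitmap negotiation just described. The bit positions and the helper names (CAP, bitmap, common_encodings, choose_encoding) are illustrative assumptions, not part of any defined protocol.

```ruby
# Illustrative bit positions only.
CAP = { string: 0, ieee_float: 1, int32: 2, ruby_marshal: 3, perl_storable: 4 }.freeze

# Build a capability bitmap from a list of encoding names.
def bitmap(*names)
  names.reduce(0) { |bits, name| bits | (1 << CAP.fetch(name)) }
end

# Encodings both peers understand: AND the two bitmaps, then list the set bits.
def common_encodings(mine, theirs)
  shared = mine & theirs
  CAP.select { |_name, bit| shared[bit] == 1 }.keys
end

# The sender picks the first of its preferences that is shared; string is the
# mandatory fallback.
def choose_encoding(preferences, mine, theirs)
  shared = common_encodings(mine, theirs)
  preferences.find { |name| shared.include?(name) } || :string
end

ruby_peer = bitmap(:string, :ieee_float, :int32, :ruby_marshal)
perl_peer = bitmap(:string, :int32, :perl_storable)
shared    = common_encodings(ruby_peer, perl_peer)
choice    = choose_encoding([:ruby_marshal, :int32], ruby_peer, perl_peer)
```

Here the Ruby peer would like to send a Marshal dump, but the Perl peer cannot read one, so the sender settles on the 32-bit integer encoding.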

[Keeping the list of mandatory types small would make the protocol very
useful in embedded devices]

(2) Like DRb, keep the low-level distinction between "success" and "fail"
results. This means you can have a method which *returns* an exception
object be distinct from a method which *raises* an exception during its
execution. This might be done by having "execution failure" be a basic type
of its own.

(3) Make "proxy object" be a basic type too (a la DRbUndumped (*)). This
allows the receiver to invoke methods on objects held at the caller. Making
this a basic type means that a Perl receiver could invoke a Ruby proxy
object without having to understand how to unmarshal a Ruby DRbObject.

(4) Include support for method calls to be sent in *both* directions down
the same socket; that is, have separate 'request' and 'response' message
types which can be distinguished on the wire, and allow "servers" to send
new requests back to "clients". This means that callbacks can be made back
down the same socket as the incoming request came on, which in turn means
that it will work properly through NAT firewalls - something which DRb is
miserable at.

[This does rely on the sender keeping the connection nailed up, however.
Perhaps each proxy object should also contain an "alternative callback
address" to be used in the event that the original connection has been
lost]

(5) Tag each request and response with an ID, so that you can send multiple
overlapping requests and responses down the same socket. This avoids what
DRb does, which is opening multiple transport-layer connections to the same
target if concurrent requests to the same target are made from separate
threads (see DRbConn)

A simple embedded client or server can always implement only simple
request-response ordering if it wishes.
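
To make point (5) concrete, here is a hedged sketch of the bookkeeping a caller might keep so that overlapping requests can share one socket. PendingRequests and its methods are hypothetical names; the sketch ignores ID wrap-around and timeouts.

```ruby
# Hypothetical caller-side table mapping request IDs to callbacks.
class PendingRequests
  def initialize
    @next_id = 0
    @callbacks = {}
  end

  # Register a callback and return the ID to put on the wire.
  def register(&on_response)
    id = (@next_id += 1)
    @callbacks[id] = on_response
    id
  end

  # Called when a response arrives; responses may come back in any order.
  # Returns false for an unknown or already-answered ID.
  def resolve(id, result)
    callback = @callbacks.delete(id)
    return false unless callback
    callback.call(result)
    true
  end

  def outstanding
    @callbacks.size
  end
end

results = {}
pending = PendingRequests.new
a = pending.register { |r| results[:a] = r }
b = pending.register { |r| results[:b] = r }
pending.resolve(b, "second request answered first")
pending.resolve(a, "first request answered second")
```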

(5a) Dividing large requests into chunks of, say, 64KB might make for better
interleaving in this case.

(6) Include support for running over Unix domain sockets (surprisingly
awkward with DRb)

(7) Include support for running over stdin/stdout, e.g. one process which
talks to another process that it started using popen().

Possibly also useful:

(8) Support for running over TLS (possibly upgrading the connection after
the initial handshake)

(9) Simple [possibly mutual] authentication at start of the connection

(10) Support for running over HTTP (and FastCGI), although that would
enforce calls to be unidirectional.

(11) Like DRb, you'll probably need some concept of a "head object" or
"default object", to which you can send requests if you don't know the
object ID of any other object on the peer. For simple servers you only want
to expose one object anyway. For more complex servers the application writer
can provide a directory object; I don't think you need to worry about this
within the protocol itself.

Regards,

Brian.

(*) If this doesn't mean anything to you, google for DRbTutorial.
Unfortunately the RubyGarden wiki appears to be permanently down these days,
but you may be able to find a cached copy.

Bill Kelly

2007-Oct-15 02:51 UTC

[Eventmachine-talk] "packet" framing

----- Original Message -----
From: "Brian Candler" <B.Candler at pobox.com>
To: <eventmachine-talk at rubyforge.org>
Sent: Monday, October 15, 2007 2:19 AM
Subject: Re: [Eventmachine-talk] "packet" framing

> On Sun, Oct 14, 2007 at 01:52:48PM -0700, Bill Kelly wrote:
>> For what it's worth, I'm currently implementing something similar,
>> which I'll release open source (non-GPL.)
>>
>> I'm not using DRb, because I need Ruby <--> C++ communication, not
>> just Ruby <--> Ruby.
>>
>> My protocol is focused on fast remote procedure call (over TCP or
>> UDP) between both Ruby and non-Ruby nodes.  However, the data types
>> understood by the protocol are deliberately Ruby-oriented.
>
> I had some thoughts along these lines a while ago, as I needed a lightweight
> IPC interface between Ruby and Perl. Actually the Perl part of this has
> evaporated now, so I haven't needed to develop this further.

Dude, thanks, I'm <3

(My context is quake2 100hz 3000 mile round trip UDP packets,)

> But for what it's worth, I have some suggestions if you're looking to
> implement a better-than-DRb for heterogenous applications.
>
> (1) Make the basic mandatory data type the "string". Allow the two ends to
> negotiate additional encodings, such as binary integers, IEEE floats,
> arrays, Ruby Marshal dumps, Perl Storable, PHP serialize etc; but if both
> ends don't support anything else, fallback to string representation.

Dude, I'm with that, but I felt that most RPC calls have
OpenGL{ ..... }

> A bitmap representing known types would be a simple way to do this. Just AND
> the sender and receiver capability bitmaps to get the acceptable common
> ones. Then for any particular object the sender can choose their preferred
> representation from among the common ones.
>
> [Keeping the list of mandatory types small would make the protocol very
> useful in embedded devices]

Dude, this is already netstrings :D

<3

I know mandatory types can be small if one goes for netstrings :D

That may be the approach

++good

Brian Candler

2007-Oct-15 04:45 UTC

[Eventmachine-talk] "packet" framing

On Mon, Oct 15, 2007 at 02:51:07AM -0700, Bill Kelly wrote:
> > (1) Make the basic mandatory data type the "string". Allow the two ends to
> > negotiate additional encodings, such as binary integers, IEEE floats,
> > arrays, Ruby Marshal dumps, Perl Storable, PHP serialize etc; but if both
> > ends don't support anything else, fallback to string representation.
>
> Dude, I'm with that, but I felt that most RPC calls have
> OpenGL{ ..... }

Well, of course what I wrote was based on what *I* would have found useful
:-)

I'm not familiar with the OpenGL API; do you just want to call methods
OpenGLfoo(x,y,z) where x,y,z are primitive types like integers? Or do you
need a more complex data representation (arrays of primitive types? hashes
of primitive types? more than that?)

> > A bitmap representing known types would be a simple way to do this. Just
> > AND the sender and receiver capability bitmaps to get the acceptable
> > common ones. Then for any particular object the sender can choose their
> > preferred representation from among the common ones.
> >
> > [Keeping the list of mandatory types small would make the protocol very
> > useful in embedded devices]
>
> Dude, this is already netstrings :D

Hmm, that wasn't quite what I was thinking of, and RPC is a different layer
anyway. (Aside: netstrings aren't particularly easy to parse for a binary
protocol. Personally I'd go for a 4-byte length in network order. It has a
limit of 4GB, but how many RPC calls do you make with >4GB of copy-by-value
arguments? :-)
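
For comparison with a fixed 4-byte prefix, here is a minimal netstring codec in Ruby. A netstring is the decimal length, a colon, the data, and a trailing comma; the function names below are made up for this sketch. It shows the extra work the aside refers to: the parser has to scan for the colon and validate ASCII digits before it knows how much data to expect.

```ruby
# Encode data as a netstring: "<decimal length>:<data>,"
def netstring_encode(data)
  "#{data.bytesize}:".b + data.b + ",".b
end

# Returns [payload, rest] for a complete netstring at the start of bytes,
# or nil if more input is needed. Raises ArgumentError on malformed input.
def netstring_decode(bytes)
  bytes = bytes.b
  colon = bytes.index(":")
  return nil unless colon                      # length not fully received yet
  digits = bytes.byteslice(0, colon)
  raise ArgumentError, "bad length" unless digits.match?(/\A\d+\z/)
  length = digits.to_i
  return nil if bytes.bytesize < colon + 1 + length + 1
  unless bytes.byteslice(colon + 1 + length, 1) == ","
    raise ArgumentError, "missing trailing comma"
  end
  [bytes.byteslice(colon + 1, length), bytes.byteslice(colon + length + 2..-1)]
end

encoded = netstring_encode("hello")
payload, rest = netstring_decode("5:hello,3:abc")
```

With a binary prefix the same step is a single 4-byte unpack, at the cost of a hard 4GB ceiling and a format that is not readable by eye.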

Anyway, what I was thinking of was something like this.

1. client opens TCP connection

 ------->  (4 byte magic) (4 byte capability bitmap)
 <-------  (4 byte magic) (4 byte capability bitmap)

2. both ends AND their capability bitmaps. If the TLS bit is set, they
both start TLS negotiation (in client or server mode depending on who
initiated the connection). From this point onwards the protocol is entirely
symmetrical.

3. when either side wants to make an RPC, they send a message something
like this:

 ------->  1 byte tag: "request"
           4 bytes: request ID
           4 bytes: object ID  (if zero: default object)
           4 bytes: number of arguments, first argument is method name
           number of arguments x
               1 byte: type
               4 bytes: length
               N bytes: data

The 'type' chosen for each value would correspond to one of the bit
positions in the capability bitmap.

[Note: it may be more efficient in some situations to round everything up to
4 byte boundaries, this could be a capability flag too]

4. the response at some point later would be

 <-------  1 byte tag: "response"
           4 bytes: request ID
           1 byte: type
           4 bytes: length
           N bytes: data

The basic types would be:

(0) String. Payload is sequence of N bytes of data. 8-bit clean.

(1) Remote execution error. The N bytes are a string containing info
    to help diagnosis (e.g. backtrace). Only used as a return value.

(2) Proxy object. The data payload contains 4 bytes of object ID, and
    the remainder is the object origin URI.

(3) nil (value is ignored)

(4) boolean (value is 0 for false, 1 for true)

(5) signed integer (value is big-endian binary)
    [maybe should have a fixed 4-byte version for ease of decoding?]

... extend this list as you see fit
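
On the receiving side, the response layout and basic types listed above could be decoded roughly as follows. Several details are assumptions made for this sketch because the message leaves them open: the numeric value of the response tag, a one-byte boolean, and the fixed 4-byte form of the signed integer. RemoteError and ProxyRef are hypothetical names.

```ruby
TAG_RESPONSE = 2   # assumed value; no numbers are assigned to tags above

TYPE_STRING, TYPE_ERROR, TYPE_PROXY, TYPE_NIL, TYPE_BOOL, TYPE_INT = 0, 1, 2, 3, 4, 5

class RemoteError < StandardError; end
ProxyRef = Struct.new(:object_id_on_peer, :origin_uri)

# 1-byte tag, 4-byte request ID, 1-byte type, 4-byte length, then the data.
def encode_response(request_id, type, data)
  [TAG_RESPONSE, request_id, type, data.bytesize].pack("CNCN") + data.b
end

# Turn a (type, data) pair back into a Ruby value.
def decode_value(type, data)
  case type
  when TYPE_STRING then data
  when TYPE_ERROR  then RemoteError.new(data)
  when TYPE_PROXY  then ProxyRef.new(data.unpack1("N"), data.byteslice(4..-1))
  when TYPE_NIL    then nil
  when TYPE_BOOL   then data.unpack1("C") == 1   # assumes a single byte
  when TYPE_INT    then data.unpack1("l>")       # assumes the fixed 4-byte variant
  else raise ArgumentError, "unknown type #{type}"
  end
end

# Parse one complete response message into [request_id, value].
def decode_response(bytes)
  tag, request_id, type, length = bytes.unpack("CNCN")
  raise ArgumentError, "not a response" unless tag == TAG_RESPONSE
  [request_id, decode_value(type, bytes.byteslice(10, length))]
end

int_reply    = decode_response(encode_response(7, TYPE_INT, [-5].pack("l>")))
string_reply = decode_response(encode_response(8, TYPE_STRING, "ok"))
```

An error reply comes back as a RemoteError instance rather than being raised, so the caller can decide whether to re-raise it.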

Then at the sender side you marshal using the 'best' representation
supported by the sender. A Ruby implementation might look something like
this:

CAP_STRING = 0
CAP_ERROR = 1
CAP_PROXY = 2
CAP_NIL = 3
CAP_BOOL = 4
CAP_LONG = 5
CAP_RUBY = 6

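class Object
  def to_tinyrpc(capability)
    if capability[CAP_RUBY] > 0
      # Parenthesised: a bare `rescue` modifier is not valid inside an array
      # literal. Proxy is the proxy-object class, not shown here.
      [CAP_RUBY, (Marshal.dump(self) rescue Marshal.dump(Proxy.new(self)))]
    else
      [CAP_STRING, to_s]  # best we can do
    end
  end
end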

class String
  def to_tinyrpc(capability)
    [CAP_STRING, self]  # always use the native String representation
  end
end

class NilClass
  def to_tinyrpc(capability)
    if capability[CAP_NIL] > 0  # prefer the native Nil representation
      [CAP_NIL, ""]
    elsif capability[CAP_RUBY] > 0
      [CAP_RUBY, Marshal.dump(nil)]
    else
      [CAP_STRING, ""]
    end
  end
end

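class Integer  # was Fixnum; Fixnum and Bignum were merged into Integer in Ruby 2.4
  def to_tinyrpc(capability)
    if capability[CAP_LONG] > 0 and self >= -0x80000000 and self <= 0x7fffffff
      [CAP_LONG, [self].pack("N")]  # 4 bytes, big-endian, two's complement
    else
      [CAP_STRING, to_s]
    end
  end
end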

You get the idea. The RPC caller would then be something like:

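  def method_missing(*args)
    # TAG_REQUEST and reqid are assumed to be defined elsewhere.
    # The tag is 1 byte ("C"); the two IDs and the count are 4 bytes each ("N").
    @socket.write [TAG_REQUEST, reqid, @ref, args.size].pack("CNNN")
    args.each do |arg|
      tag, data = arg.to_tinyrpc(@capability)
      @socket.write [tag, data.bytesize].pack("CN")  # 1-byte type, 4-byte length
      @socket.write data
    end
    # for synchronous RPC: start a thread, wait for the response, then
    # return the response to the caller
  end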

I'm not saying you shouldn't write a domain-specific solution for your
needs; I just think this approach would encourage RPC interoperability. Of
course there's always XMLRPC and SOAP, so perhaps nobody wants another
solution - but both these are expensive, and neither maps well to the domain
which DRb seeks to address. DRb, meanwhile, is very Ruby-specific.

Regards,

Brian.

Chuck Remes

2007-Oct-15 07:42 UTC

[Eventmachine-talk] "packet" framing

On Oct 15, 2007, at 4:51 AM, Bill Kelly wrote:
>
> ----- Original Message -----
> From: "Brian Candler" <B.Candler at pobox.com>
> To: <eventmachine-talk at rubyforge.org>
> Sent: Monday, October 15, 2007 2:19 AM
> Subject: Re: [Eventmachine-talk] "packet" framing
>
>> [snip]
>> A bitmap representing known types would be a simple way to do
>> this. Just AND
>> the sender and receiver capability bitmaps to get the acceptable
>> common
>> ones. Then for any particular object the sender can choose their
>> preferred
>> representation from among the common ones.
>>
>> [Keeping the list of mandatory types small would make the protocol
>> very
>> useful in embedded devices]
>
> Dude, this is already netstrings :D
>
> <3
>
> I know mandatory types can be small if one goes for netstrings :D
>
> That may be the approach
I promised a long time ago to provide a fast and robust netstrings
protocol for eventmachine. I actually began work on this project last
week. I should have something by this time next week. I'll be sure to
share it with the list as it progresses.

BTW, the ruby code will be a FSM using ragel. It generates super-ugly
code but you never need to really look at it. (This is similar to how
mongrel uses ragel for HTTP parsing.)

Give me the rest of the week....

And now, back to your regularly scheduled discussion about
reinventing RPC/RMI/RIPC/etc.

cr

Francis Cianfrocca

2007-Oct-15 08:20 UTC

[Eventmachine-talk] "packet" framing

On 10/15/07, Chuck Remes <cremes.devlist at mac.com> wrote:
>
> I promised a long time ago to provide a fast and robust netstrings
> protocol for eventmachine. I actually began work on this project last week.
> I should have something by this time next week. I'll be sure to share it
> with the list as it progresses.
>
> BTW, the ruby code will be a FSM using ragel. It generates super-ugly code
> but you never need to really look at it. (This is similar to how mongrel
> uses ragel for HTTP parsing.)
>
> Give me the rest of the week....
>

Looking forward to this, Chuck. Especially to see how easy ragel is to
integrate with EM.

Bill Kelly

2007-Oct-15 13:18 UTC

[Eventmachine-talk] "packet" framing

From: "Bill Kelly" <billk at cts.com>
> Dude,
> Dude,
> Dude,

Er.... pardon my redundancy. :D


From: "Brian Candler" <B.Candler at pobox.com>
>
> Hmm, that wasn't quite what I was thinking of, and RPC is a different layer
> anyway. (Aside: netstrings aren't particularly easy to parse for a binary
> protocol. Personally I'd go for a 4-byte length in network order. It has a
> limit of 4GB, but how many RPC calls do you make with >4GB of copy-by-value
> arguments? :-)
>
> Anyway, what I was thinking off was something like this.
>
> 1. client opens TCP connection
>
> ------->  (4 byte magic) (4 byte capability bitmap)
> <-------  (4 byte magic) (4 byte capability bitmap)
>
> 2. both ends AND their capability bitmaps. If the TLS bit is set, they
> both start TLS negotiation (in client or server mode depending on who
> initiated the connection). From this point onwards the protocol is entirely
> symmetrical.
>
> 3. when either side wants to make an RPC, they send a message something
> like this:
>
> ------->  1 byte tag: "request"
>           4 bytes: request ID
>           4 bytes: object ID  (if zero: default object)
>           4 bytes: number of arguments, first argument is method name
>           number of arguments x
>               1 byte: type
>               4 bytes: length
>               N bytes: data
>
> The 'type' chosen for each value would correspond to one of the bit
> positions in the capability bitmap.
>
> [Note: it may be more efficient in some situations to round everything up to
> 4 byte boundaries, this could be a capability flag too]
>
> 4. the response at some point later would be
>
> <-------  1 byte tag: "response"
>           4 bytes: request ID
>           1 byte: type
>           4 bytes: length
>           N bytes: data
>
> The basic types would be:
>
> (0) String. Payload is sequence of N bytes of data. 8-bit clean.
>
> (1) Remote execution error. The N bytes are a string containing info
>    to help diagnosis (e.g. backtrace). Only used as a return value.
>
> (2) Proxy object. The data payload contains 4 bytes of object ID, and
>    the remainder is the object origin URI.
>
> (3) nil (value is ignored)
>
> (4) boolean (value is 0 for false, 1 for true)
>
> (5) signed integer (value is big-endian binary)
>    [maybe should have a fixed 4-byte version for ease of decoding?]
>
> ... extend this list as you see fit
>
> Then at the sender side you marshal using the 'best' representation
> supported by the sender. A Ruby implementation might look something like
> this:
>
> CAP_STRING = 0
> CAP_ERROR = 1
> CAP_PROXY = 2
> CAP_NIL = 3
> CAP_BOOL = 4
> CAP_LONG = 5
> CAP_RUBY = 6
>
> class Object
>  def to_tinyrpc(capability)
>    if capability[CAP_RUBY] > 0
>      # the rescue modifier must be parenthesized inside an array literal
>      [CAP_RUBY, (Marshal.dump(self) rescue Marshal.dump(Proxy.new(self)))]
>    else
>      [CAP_STRING, to_s]  # best we can do
>    end
>  end
> end
>
> class String
>  def to_tinyrpc(capability)
>    [CAP_STRING, self]  # always use the native String representation
>  end
> end
>
> class NilClass
>  def to_tinyrpc(capability)
>    if capability[CAP_NIL] > 0  # prefer the native Nil representation
>      [CAP_NIL, ""]
>    elsif capability[CAP_RUBY] > 0
>      [CAP_RUBY, Marshal.dump(nil)]
>    else
>      [CAP_STRING, ""]
>    end
>  end
> end
>
> class Fixnum
>  def to_tinyrpc(capability)
>    if capability[CAP_LONG] > 0 and self >= -0x80000000 and self <= 0x7fffffff
>      [CAP_LONG, [self].pack("N")]
>    else
>      [CAP_STRING, to_s]
>    end
>  end
> end
>
> You get the idea. The RPC caller would then be something like:
>
>  def method_missing(*args)
>    # "C" packs the 1-byte tag/type; "N" packs each 4-byte big-endian field
>    @socket.write [TAG_REQUEST, reqid, @ref, args.size].pack("CNNN")
>    args.each do |arg|
>      tag, data = arg.to_tinyrpc(@capability)
>      @socket.write [tag, data.size].pack("CN")
>      @socket.write data
>    end
>    # for synchronous RPC: start a thread, wait for the response, then
>    # return the response to the caller
>  end
>
> I'm not saying you shouldn't write a domain-specific solution for your
> needs; I just think this approach would encourage RPC interoperability. Of
> course there's always XMLRPC and SOAP, so perhaps nobody wants another
> solution - but both these are expensive, and neither maps well to the domain
> which DRb seeks to address. DRb, meanwhile, is very Ruby-specific.

Hmm, interesting, thanks!

I hadn't thought about having capability flags at that protocol level.
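
For readers following along, the capability handshake Brian describes (each side sends a 4-byte magic and a 4-byte capability bitmap, then both ends AND the bitmaps) might be sketched roughly as follows. The magic value and the helper names `greeting` and `negotiate` are invented for illustration; only the CAP_* bit positions come from the quoted message.

```ruby
# Hypothetical magic value; the thread does not define one.
MAGIC = 0x54525043

CAP_NIL  = 3
CAP_LONG = 5
CAP_RUBY = 6

# Build the 8-byte greeting: 4-byte magic + 4-byte capability bitmap,
# both in network (big-endian) byte order.
def greeting(cap_bits)
  bitmap = cap_bits.inject(0) { |acc, bit| acc | (1 << bit) }
  [MAGIC, bitmap].pack("NN")
end

# Check the peer's magic, then AND the two bitmaps so that only the
# capabilities both ends advertise remain set.
def negotiate(our_greeting, peer_greeting)
  our_magic, ours    = our_greeting.unpack("NN")
  peer_magic, theirs = peer_greeting.unpack("NN")
  raise ArgumentError, "bad magic" unless peer_magic == our_magic
  ours & theirs
end

client = greeting([CAP_NIL, CAP_LONG, CAP_RUBY])
server = greeting([CAP_NIL, CAP_LONG])
shared = negotiate(client, server)
shared[CAP_LONG] # => 1, both sides support it
shared[CAP_RUBY] # => 0, only the client does
```

The result is a plain Integer, so the `capability[CAP_RUBY] > 0` tests in the quoted `to_tinyrpc` methods would work on it unchanged.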

I'll admit I'm biased toward Ruby, and so I've simply written C++ code
that can handle all the various Ruby types, including nil, true, false,
etc.

On my previous implementation, Ruby -> C++ RPC was as easy as:

class RubyContextCommBridge < BlankSlate
  def method_missing(*name_and_args)
    @remote.query(name_and_args)
  end
end

And I want the new one to be similarly trivial from the Ruby side.
(So I'm OK with hard-wiring the capabilities to fit Ruby's types.)
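
As a rough illustration of the request layout quoted above (1-byte tag, 4-byte request ID, 4-byte object ID, 4-byte argument count, then a 1-byte type, 4-byte length and raw bytes per argument), here is a round-trip sketch. `TAG_REQUEST`'s value and the helper names `encode_request` and `decode_request` are assumptions made for this example; neither is taken from anyone's actual code in the thread.

```ruby
TAG_REQUEST = 1 # hypothetical tag value; the thread does not assign one
CAP_STRING  = 0

# Encode a request: 1-byte tag, then request ID, object ID and argument
# count as 4-byte big-endian integers, then each argument as
# (1-byte type, 4-byte length, raw bytes).
def encode_request(reqid, object_id, args)
  msg = [TAG_REQUEST, reqid, object_id, args.size].pack("CNNN")
  args.each do |type, data|
    msg << [type, data.bytesize].pack("CN") << data.b
  end
  msg
end

# Decode the same layout back into [tag, reqid, object_id, args].
def decode_request(msg)
  msg = msg.b # index by byte, not by character
  tag, reqid, object_id, argc = msg.unpack("CNNN")
  offset = 13 # 1 + 4 + 4 + 4 header bytes
  args = Array.new(argc) do
    type, len = msg[offset, 5].unpack("CN")
    data = msg[offset + 5, len]
    offset += 5 + len
    [type, data]
  end
  [tag, reqid, object_id, args]
end

wire = encode_request(7, 0, [[CAP_STRING, "upcase"], [CAP_STRING, "hello"]])
tag, reqid, object_id, args = decode_request(wire)
```

By the thread's convention the first argument is the method name, so `args.first.last` here is the method to invoke on the default object (object ID zero).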

From Chuck Remes:
>
> And now, back to your regularly scheduled discussion about reinventing
> RPC/RMI/RIPC/etc.

LOL :)  Sorry for the noise :)

Well if it helps for context, here is a recent conversation about a game
protocol that is 10 years old this month:

[16:30] <R1CH> the bandwidth savings on the new usercmd scaling add up pretty quick
[16:31] <R1CH> 33MB on quakdev
[16:31] <R1CH> and most people only recently updated their clients
[16:42] <quadz> nifty
[16:43] <quadz> what sort of data is being scaled ?
[16:44] <R1CH> movement speeds
[16:44] <R1CH> the client can send anywhere between -300 to 300 for x/y/z velocity
[16:44] <R1CH> usually requiring 6 bytes
[16:44] <R1CH> however i steal some unused bits in the buttons byte to indicate if the x/y/z are multiples of 5, if so, only send the 1 byte and scale it on the server
[16:45] <R1CH> 1 byte might not seem like much
[16:45] <R1CH> but client packet rate is usually a lot higher than server
[16:45] <R1CH> so it adds up quickly
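
The scaling trick R1CH describes can be sketched in a few lines of Ruby. This is only an illustration of the idea, not R1Q2's code: the per-component flag, the big-endian byte order, and the names `encode_velocity` and `decode_velocity` are all assumptions.

```ruby
# Encode one velocity component in the range -300..300.
# Returns [scaled_flag, bytes]: a single signed byte when the value is a
# multiple of 5, otherwise a 2-byte big-endian signed integer.
def encode_velocity(v)
  if (v % 5).zero?
    [true, [v / 5].pack("c")]  # -60..60 fits in one signed byte
  else
    [false, [v].pack("s>")]    # full 16-bit value
  end
end

# The receiver multiplies back by 5 when the flag bit was set.
def decode_velocity(scaled, bytes)
  scaled ? bytes.unpack("c").first * 5 : bytes.unpack("s>").first
end

flag, bytes = encode_velocity(300)
decode_velocity(flag, bytes) # => 300, carried in one byte instead of two
```

In the real protocol the flag would live in spare bits of the buttons byte, so the saving is one byte per scaled component with no extra overhead.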

Protocol 35 netcode has saved 56693514 bytes.
Protocol 35 compression has saved 89628522 bytes.
Protocol 35 usercommand scaling has saved 26026608 bytes.
R1Q2 playerstate quantization optimization has saved 37328413 bytes.
R1Q2 entity quantization optimization has saved 163290336 bytes.
R1Q2 custom delta management has saved 22637949 bytes.
R1Q2 sv_func_entities_hack has saved 0 bytes. (disabled)
Total byte savings: 395605342 (377.28 MB)

So from that perspective my current RPC protocol reinvention seems
incredibly decadent and wasteful.  :)


Regards,

Bill

Thomas Ptacek

2007-Oct-15 13:26 UTC

[Eventmachine-talk] "packet" framing

Why are you BER-encoding integers? Isn't that painful?

>     TT_BERINT = 0x80,   // 0x80     - BERINT TYPE FLAG
>                         // 0x40         - BERINT SIGN FLAG
>                         // 0x20         - BERINT CONTINUATION FLAG
>                         // 0x1F         - (0..31) : integer five least significant bits
>                         //
>                         // If CONT bit is set, then the five bits under the 0x1F mask
>                         // become the low five bits of the continued BER coded data.
>
>     BERINT_FIRST_BYTE_BIT_LIMIT = 7
> };
--
---
Thomas H. Ptacek // matasano security
read us on the web: http://www.matasano.com/log

Bill Kelly

2007-Oct-15 14:11 UTC

[Eventmachine-talk] "packet" framing

From: "Thomas Ptacek" <tqbf at matasano.com>
>
> Why are you BER-encoding integers? Isn't that painful?

In my initial draft, I had separate types for different-sized
integers: 8 bits; 16 bits; 32 bits; 64 bits...

BER encoding seemed to simplify things.  Now there's only one
way of encoding an integer, and this even applied to data
types like strings that are prefixed with an octet count
value.  (However, it occurs to me such values are inherently
unsigned, so I'm wasting a sign bit in those cases.  Hmm...)
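
For comparison, Ruby's own `pack` has a "w" directive for BER-compressed integers (7 data bits per byte, high bit set on every byte but the last). It is not the same layout as the TT_BERINT format discussed here, which carries type, sign and continuation flags in the first byte, but it shows the one-encoding-for-all-sizes idea and the unsigned octet-count case. The helpers `ber_string` and `read_ber_string` are invented for this sketch.

```ruby
# One encoding regardless of magnitude: small values take one byte,
# larger ones grow as needed.
small = [5].pack("w")    # 1 byte
large = [300].pack("w")  # 2 bytes
large.unpack("w").first  # => 300

# A string prefixed with a BER-compressed octet count.
def ber_string(s)
  [s.bytesize].pack("w") + s.b
end

# Read the count, work out how many bytes the count itself occupied,
# then slice out exactly that many payload bytes.
def read_ber_string(buf)
  len, = buf.unpack("w")
  header = [len].pack("w").bytesize
  buf.byteslice(header, len)
end

read_ber_string(ber_string("hello")) # => "hello"
```

The "w" directive only accepts non-negative integers, which matches the observation that octet counts never need a sign bit.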



Regards,

Bill

Tony Arcieri

2007-Oct-24 23:51 UTC

[Eventmachine-talk] "packet" framing

I did a length prefix framing implementation in the latest release of
DistribuStream and after benchmarking it, discovered that it's about 3%
slower than LineAndTextProtocol.  I don't see any immediate deficiencies in
my implementation and am unsure why this is the case, as I expected the two
to have at the very least similar performance characteristics.

That said, I've gone ahead with length prefix framing as it at least has the
potential to eliminate certain buffering concerns as well as performing
faster once properly optimized.


--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

Tony Arcieri

2007-Oct-24 23:55 UTC

[Eventmachine-talk] "packet" framing

I'm also wondering why the hacked multibyte character support in
LineAndTextProtocol is still in 0.9.0 (I thought it was breaking certain
tests)?  It might also be good to experiment with Bill's suggestions
regarding using simple string concatenation versus trying to reassemble the
input buffer in an array... simple benchmarks of string concatenation versus
assembling an array then executing a #join show that string concatenation is
faster.
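
A comparison along those lines can be reproduced with the standard Benchmark library. The sketch below is not the benchmark referred to above; the chunk size, chunk count and method names are arbitrary choices for illustration, and results will vary with the Ruby version and workload.

```ruby
require "benchmark"

# 10,000 chunks of 64 bytes, standing in for data arriving from a socket.
CHUNKS = Array.new(10_000) { "x" * 64 }

# Append each chunk directly onto one growing string.
def concat_chunks(chunks)
  buf = String.new
  chunks.each { |c| buf << c }
  buf
end

# Collect chunks in an array, then join once at the end.
def join_chunks(chunks)
  parts = []
  chunks.each { |c| parts << c }
  parts.join
end

Benchmark.bm(8) do |bm|
  bm.report("concat") { 100.times { concat_chunks(CHUNKS) } }
  bm.report("join")   { 100.times { join_chunks(CHUNKS) } }
end
```

Both methods build the same 640,000-byte string, so any difference in the reported times comes from the assembly strategy alone.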

I'd also be willing to contribute the length prefix packet processor I wrote
under the Ruby license, as I'd certainly like to see it get better
optimized, particularly if it can be incorporated deeper into the Reactor in
some way that mitigates some of the string processing it has to do
presently.


--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

Francis Cianfrocca

2007-Oct-25 05:22 UTC

[Eventmachine-talk] "packet" framing

On 10/25/07, Tony Arcieri <tony at clickcaster.com> wrote:
>
> I'm also wondering why the hacked multibyte character support in
> LineAndTextProtocol is still in 0.9.0 (I thought it was breaking certain
> tests)?  It might also be good to experiment with Bill's suggestions
> regarding using simple string concatenation versus trying to reassemble the
> input buffer in an array... simple benchmarks of string concatenation versus
> assembling an array then executing a #join show that string concatenation is
> faster.

I thought we got rid of that multibyte stuff. At any rate, to my disgust I
found a while ago that LineAndTextProtocol had problems with the Stomp
protocol (which in effect needs to change the line delimiter in midstream)
and I didn't want to disturb LineAndTextProtocol, so I wrote LineText2. Have
a look and see if you can use it.

> I'd also be willing to contribute the length prefix packet processor I wrote
> under the Ruby license, as I'd certainly like to see it get better
> optimized, particularly if it can be incorporated deeper into the Reactor in
> some way that mitigates some of the string processing it has to do
> presently.

How did you package the processor? Send it to me privately and I'll have a
look. If the style and the integration make sense to me, we can add it to
the distro. Would need a dual license (just clone the text in other files in
the distro).

Tony Arcieri

2007-Oct-25 13:38 UTC

[Eventmachine-talk] "packet" framing

On 10/25/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> How did you package the processor? Send it to me privately and I'll have a
> look.
>

It's a subclass of EM::Connection.  You can check it out here:

http://distribustream.rubyforge.org/svn/trunk/lib/pdtp/common/length_prefix_protocol.rb

Tested with RSpec (I assume you'd want Test::Unit):

http://distribustream.rubyforge.org/svn/trunk/spec/common/length_prefix_protocol_spec.rb

It supports frames with 2-byte or 4-byte NBO (network byte order) prefixes,
and allows you to switch between them on the fly if you so desire.
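
For readers without access to the linked file, the general technique can be sketched as a standalone class with no EventMachine dependency. This is not the DistribuStream implementation; the class name `LengthPrefixFramer` and its methods are invented for illustration. In an EM::Connection subclass the same buffering logic would sit inside `receive_data`.

```ruby
class LengthPrefixFramer
  FORMATS = { 2 => "n", 4 => "N" }.freeze # 16- and 32-bit big-endian

  attr_reader :prefix_size

  def initialize(prefix_size = 2)
    @buffer = String.new # binary receive buffer
    self.prefix_size = prefix_size
  end

  # May be changed between frames to switch prefix width on the fly.
  def prefix_size=(size)
    raise ArgumentError, "prefix must be 2 or 4 bytes" unless FORMATS.key?(size)
    @prefix_size = size
  end

  # Wrap a payload in a length prefix for sending.
  def frame(payload)
    if payload.bytesize >= 1 << (8 * @prefix_size)
      raise ArgumentError, "payload too large for prefix"
    end
    [payload.bytesize].pack(FORMATS[@prefix_size]) + payload.b
  end

  # Feed raw bytes as they arrive; yields each complete payload.
  def receive_data(data)
    @buffer << data.b
    loop do
      break if @buffer.bytesize < @prefix_size
      length = @buffer.unpack(FORMATS[@prefix_size]).first
      consumed = @prefix_size + length
      break if @buffer.bytesize < consumed
      payload = @buffer.byteslice(@prefix_size, length)
      @buffer = @buffer.byteslice(consumed, @buffer.bytesize - consumed)
      yield payload
    end
  end
end

framer = LengthPrefixFramer.new(2)
wire = framer.frame("hello") + framer.frame("world")
frames = []
framer.receive_data(wire[0, 3]) { |f| frames << f }  # partial frame, nothing yet
framer.receive_data(wire[3..-1]) { |f| frames << f } # both frames complete
```

Because incomplete data simply stays in the buffer until the rest arrives, the caller never sees a partial frame, however TCP happens to split the stream.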

> If the style and the integration make sense to me, we can add it to the
> distro. Would need a dual license (just clone the text in other files in
> the distro).
>

That's fine.

--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202
