I was looking through the protocol handlers and I didn't see one for length-prefixed framing.

I'm thinking of something like in Erlang, which lets you specify a {packet, N} framing mode. This means the protocol is framed with an N-byte big-endian integer representing the length of the subsequent payload. Erlang then gives complete frames to the controlling process.

Does something like this exist? If not I think it'd make a nice standard facility. It's relatively trivial and I'd be willing to write it.

-- 
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202
On 10/12/07, Tony Arcieri <tony at clickcaster.com> wrote:
> I was looking through the protocol handlers and I didn't see one for
> length-prefixed framing.
>
> I'm thinking of something like in Erlang, which lets you specify a
> {packet, N} framing mode. This means the protocol is framed with an N-byte
> big endian integer representing the length of the subsequent payload.
> Erlang then gives complete frames to the controlling process.
>
> Does something like this exist? If not I think it'd make a nice standard
> facility. It's relatively trivial and I'd be willing to write it.

Are you thinking of an includable module that would go into lib/protocols? And it would have methods like #send_packet and #receive_packet that would automatically handle the framing?

Sounds like a nice idea. I have two questions. First, what standard protocol(s) use something like this? AMQP does, but size octets aren't the only thing that appears in its frame headers.

Second, the usual reason to add this to a protocol is performance: clients and servers can allocate all the memory needed for a frame in one go. Is that one of the benefits you want to capture? If so, it will require some tinkering down in the guts of the reactor. We'd have to decide whether it is worth doing, as memory-page handling in there is already quite efficient.
On 10/12/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> Are you thinking of an includable module that would go into
> lib/protocols? And it would have methods like #send_packet and
> #receive_packet that would automatically handle the framing?

Yes and yes.

> Sounds like a nice idea. I have two questions: what standard protocol(s) use
> something like this? AMQP does, but size octets aren't the only thing that
> appears in its frame headers.

I don't know of any protocols which use this offhand. Erlang seems to make use of it for IPC. My main reason for bringing this up is I'd like to use it for framing my protocol.

> And second, the usual reason to add this to a protocol is for performance,
> so that clients and servers can allocate all needed memory for a frame in
> one go. Is that one of the benefits you want to capture?

Eventually. I've lately been swayed by the arguments of Dan J. Bernstein and others against CRLF framing (or rather, the problems of escaping/buffering in general). While there's performance to be gained, there's also the simplicity of an idiot-proof framing.

> If so, it will require some tinkering down in the guts of the reactor. We'd
> have to decide how well worth doing it is, as memory-page handling in there
> is already quite efficient.

I'm not too concerned about performance for the time being. I'm just wondering if this would be useful as a standard feature.

By the way, my protocol/project should be seeing an open source release in the very near future. I'll be sure to post to the list when that happens.

-- 
Tony Arcieri
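The includable module discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing EventMachine facility: the module name PacketFraming and the packet_header_size hook are made up for the example, and the including class is assumed to supply send_data (as an EventMachine connection does) and a receive_packet callback.

```ruby
# Sketch of {packet, N}-style framing: every frame is an N-byte big-endian
# length followed by exactly that many payload bytes.
# PacketFraming and packet_header_size are hypothetical names.
module PacketFraming
  # pack/unpack directives for 1-, 2- and 4-byte unsigned big-endian lengths
  FORMATS = { 1 => "C", 2 => "n", 4 => "N" }.freeze

  # Header width in bytes (Erlang's N). Override in the including class.
  def packet_header_size
    4
  end

  # Prefix the payload with its length and pass the result to send_data,
  # which the including class must provide.
  def send_packet(payload)
    payload = payload.b
    if payload.bytesize >= 256**packet_header_size
      raise ArgumentError, "payload too large for header"
    end
    header = [payload.bytesize].pack(FORMATS.fetch(packet_header_size))
    send_data(header + payload)
  end

  # Accepts arbitrary chunks of the byte stream, buffers them, and calls
  # receive_packet once for every complete frame.
  def receive_data(data)
    (@packet_buffer ||= "".b) << data.b
    n = packet_header_size
    loop do
      break if @packet_buffer.bytesize < n
      length = @packet_buffer.unpack1(FORMATS.fetch(n))
      break if @packet_buffer.bytesize < n + length
      frame = @packet_buffer.byteslice(n, length)
      @packet_buffer = @packet_buffer.byteslice((n + length)..-1)
      receive_packet(frame)
    end
  end
end
```

A handler would include the module and define receive_packet(frame). Because the buffering lives in receive_data, frames are delivered whole no matter how the transport splits the stream.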
On 10/13/07, Tony Arcieri <tony at clickcaster.com> wrote:
> [snip]
> I'm not too concerned about performance for the time being. I'm just
> wondering if this would be useful as a standard feature.

I think it's a cool idea. It's always seemed to me that designing new network protocols is to be avoided in general, because it's incredibly difficult to do well and there are so many standard ones out there anyway. But I could be wrong. Clearly there's an important use case for EM in dealing with locally-defined ad hoc protocols that would allow newly-written Ruby programs to interface with legacy apps.

What if you were to have a sized-frame protocol such as you describe, and *inside* of each frame there were lines and text or some other complex protocol? You'd want to pass each received frame through another included handler. I'm not sure how graceful it would be to *layer* protocol handlers on top of each other with the current way we do things. Jeff Rose did a lot of work with this idea (and Twisted supports it to some extent), but he never got very far with it, and it hasn't really come up in practice too often.

> By the way, my protocol/project should be seeing an open source release in
> the very near future. I'll be sure to post to the list when that happens.

Looking forward to that. We also have the website rubyeventmachine.org to fill up with FAQs and use cases, if I ever get the time to do that. (It would be great if someone reading this list had some time to help with that.)
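One way to picture the layering question is a chain of small decoder objects, each handing its completed units to the next. The sketch below is plain Ruby and says nothing about how EventMachine handlers would actually be stacked. FrameLayer and LineLayer are hypothetical names, and the 4-byte length prefix and newline-delimited inner protocol are assumptions chosen for the example.

```ruby
# Each layer takes input via feed and passes completed units downstream.

# Outer layer: frames with an assumed 4-byte big-endian length prefix.
class FrameLayer
  def initialize(&downstream)
    @buf = "".b
    @downstream = downstream
  end

  def feed(data)
    @buf << data.b
    while @buf.bytesize >= 4
      len = @buf.unpack1("N")
      break if @buf.bytesize < 4 + len
      frame = @buf.byteslice(4, len)
      @buf = @buf.byteslice((4 + len)..-1)
      @downstream.call(frame)
    end
  end
end

# Inner layer: newline-delimited text inside each complete frame.
class LineLayer
  def initialize(&downstream)
    @downstream = downstream
  end

  def feed(frame)
    frame.each_line { |line| @downstream.call(line.chomp) }
  end
end

# Wiring: bytes -> frames -> lines.
lines = []
line_layer  = LineLayer.new { |line| lines << line }
frame_layer = FrameLayer.new { |frame| line_layer.feed(frame) }
frame_layer.feed([9].pack("N") + "GET /\nok\n")
```

The inner layer never sees a partial frame, so it needs no buffering of its own in this arrangement.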
On Sun, Oct 14, 2007 at 08:10:52AM -0400, Francis Cianfrocca wrote:
> Sounds like a nice idea. I have two questions: what standard
> protocol(s) use something like this?

Aside: DRb does. An evented DRb client/server might be an interesting tool.

Regards,

Brian.
On 10/14/07, Brian Candler <B.Candler at pobox.com> wrote:
> On Sun, Oct 14, 2007 at 08:10:52AM -0400, Francis Cianfrocca wrote:
> > Sounds like a nice idea. I have two questions: what standard
> > protocol(s) use something like this?
>
> Aside: DRb does. An evented DRb client/server might be an interesting
> tool.

It certainly would! What would it take to produce such a thing?

I would guess that the performance profile of DRb is dominated by marshalling and unmarshalling rather than I/O. Anyone agree or disagree?
From: Francis Cianfrocca
> On 10/14/07, Brian Candler <B.Candler at pobox.com> wrote:
> > Aside: DRb does. An evented DRb client/server might be an interesting tool.
>
> It certainly would! What would it take to produce such a thing?
>
> I would guess that the performance profile of DRb is dominated by
> marshalling and unmarshalling rather than I/O. Anyone agree or disagree?

For what it's worth, I'm currently implementing something similar, which I'll release open source (non-GPL.)

I'm not using DRb, because I need Ruby <--> C++ communication, not just Ruby <--> Ruby.

My protocol is focused on fast remote procedure call (over TCP or UDP) between both Ruby and non-Ruby nodes. However, the data types understood by the protocol are deliberately Ruby-oriented. The current types understood are:

    // TransportLayerTypes
    enum {
        TT_NIL = 0,        // some Ruby types must be preserved as distinct from integers
        TT_FALSE,
        TT_TRUE,
        TT_FLOAT,          // ruby might not care float vs. double, but C++ <--> C++ comm. could use it
        TT_DOUBLE,
        TT_BINARY_DATA,    // data count will be represented as BERINT, followed by DATA OCTETS
        TT_UTF8_STRING,    // encoded same as TT_BINARY_DATA
        TT_SYMBOL,         // encoded same as TT_BINARY_DATA (what are symbol encodings in ruby? latin1 ?)
        TT_RUBYOBJ,        // This is TT_BINARY_DATA which is simply
                           // a Marshal.dump() of a ruby object instance.
                           // (Not parsable by pure C/C++ implementations; however
                           // a C-based network cache server could easily cache
                           // any transport layer object, including RUBYOBJ)
        TT_BERINT = 0x80,  // 0x80 - BERINT TYPE FLAG
                           // 0x40 - BERINT SIGN FLAG
                           // 0x20 - BERINT CONTINUATION FLAG
                           // 0x1F - (0..31) : integer five least significant bits
                           //
                           // If CONT bit is set, then the five bits under the 0x1F mask
                           // become the low five bits of the continued BER coded data.
        BERINT_FIRST_BYTE_BIT_LIMIT = 7
    };

Since my application is game-like, I may end up adding some fixed-point types as well.

But anyway, I'm initially focused on a subset of DRb: just remote procedure calls where the arguments can be integers, floats, strings, vectors and hashes. However, my intention is that marshalled Ruby objects will also be able to be exchanged between Ruby <--> Ruby nodes. Given that, it shouldn't be very hard to implement full DRb semantics on top of this basic transport layer.

NOTE: For performance reasons, I'm probably going to head down a slightly different path than DRb, because most of the time I want to avoid waiting for the result of one call before I can make the next call. I may end up using a transactional approach, where I can begin a transaction, issue a whole list of commands, and if any of the commands fails on the remote node, it can roll back to the state at the beginning of the transaction and return an error code.

My initial target is a Ruby node issuing commands to a remote C++ OpenGL scenegraph node. But there will also be a number of separate Ruby processes on localhost also communicating via this protocol.

(Probably no need to mention, since this is the Eventmachine list, but yes, this will all be done using Eventmachine. :)

Anyway, figured I'd mention where I'm headed with this, in case it sounds useful to anyone. Questions/comments/criticism welcome.

Regards,

Bill
On Sun, Oct 14, 2007 at 02:04:35PM -0400, Francis Cianfrocca wrote:
> On 10/14/07, Brian Candler <B.Candler at pobox.com> wrote:
> > Aside: DRb does. An evented DRb client/server might be an
> > interesting tool.
>
> It certainly would! What would it take to produce such a thing?

Not too much I think. I've not seen the DRb protocol documented, but it's pretty easy to work out from looking at the source (/usr/lib/ruby/1.8/drb/drb.rb). The guts are in class DRbMessage.

* dump() sends a 4-byte length followed by a Marshal.dump of that size. If the object isn't marshallable, then a proxy object is sent instead.

* an RPC request (send_request / recv_request) consists of:
  - dump(target object id)
  - dump(method name)
  - dump(num args)
  - num args x { dump arg[i] }
  - dump(callback block)

* an RPC response (send_response / recv_response) consists of:
  - dump(success)   # false for exception, true for complete
  - dump(result)    # exception or return value

Since the whole message isn't wrapped inside a (size, data) block, you're forced to read it as a number of separate chunks. So if EventMachine provided a receive_chunk function, you'd still need a little state machine to keep track of where you were.

> I would guess that the performance profile of DRb is dominated by
> marshalling and unmarshalling rather than I/O. Anyone agree or
> disagree?

Ruby's marshalling is extremely fast. But for each received request you have to read 4 or more dumps, each of which requires read(4) followed by read(N). For small requests, hopefully most of those reads will come out of local buffers, but there's potentially a lot of context switching on the server side if there are multiple concurrent requests. For a client which only has one outstanding request it should be very fast though.

Regards,

Brian.
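The "little state machine" mentioned above might look something like this. It is a sketch written only from the request layout listed in the message, not from the drb library itself. RequestParser and drb_style_dump are hypothetical names, and the real DRbMessage also handles proxy objects and size limits that are left out here.

```ruby
# One "dump" as described above: 4-byte big-endian length + Marshal data.
def drb_style_dump(obj)
  data = Marshal.dump(obj)
  [data.bytesize].pack("N") + data
end

# Collects the dumps of one request: target id, method name, argument
# count, each argument, then the block. Calls the callback per request.
class RequestParser
  Request = Struct.new(:ref, :method_name, :args, :block)

  def initialize(&on_request)
    @buf = "".b
    @on_request = on_request
    reset
  end

  # Feed any chunk of the stream; complete dumps drive the state machine.
  def feed(data)
    @buf << data.b
    loop do
      break if @buf.bytesize < 4
      size = @buf.unpack1("N")
      break if @buf.bytesize < 4 + size
      # NOTE: Marshal.load must only be used on data from trusted peers.
      value = Marshal.load(@buf.byteslice(4, size))
      @buf = @buf.byteslice((4 + size)..-1)
      advance(value)
    end
  end

  private

  def reset
    @state = :ref
    @argc = 0
    @req = Request.new(nil, nil, [], nil)
  end

  def advance(value)
    case @state
    when :ref
      @req.ref = value
      @state = :method
    when :method
      @req.method_name = value
      @state = :argc
    when :argc
      @argc = value
      @state = @argc.zero? ? :block : :args
    when :args
      @req.args << value
      @state = :block if @req.args.size == @argc
    when :block
      @req.block = value
      @on_request.call(@req)
      reset
    end
  end
end
```

The state is needed because nothing on the wire says how many dumps make up a request until the argument count has been read.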
On Sun, Oct 14, 2007 at 01:52:48PM -0700, Bill Kelly wrote:
> For what it's worth, I'm currently implementing something similar,
> which I'll release open source (non-GPL.)
>
> I'm not using DRb, because I need Ruby <--> C++ communication, not
> just Ruby <--> Ruby.
>
> My protocol is focused on fast remote procedure call (over TCP or
> UDP) between both Ruby and non-Ruby nodes. However, the data types
> understood by the protocol are deliberately Ruby-oriented.

I had some thoughts along these lines a while ago, as I needed a lightweight IPC interface between Ruby and Perl. Actually the Perl part of this has evaporated now, so I haven't needed to develop this further.

But for what it's worth, I have some suggestions if you're looking to implement a better-than-DRb for heterogeneous applications.

(1) Make the basic mandatory data type the "string". Allow the two ends to negotiate additional encodings, such as binary integers, IEEE floats, arrays, Ruby Marshal dumps, Perl Storable, PHP serialize etc; but if both ends don't support anything else, fall back to string representation.

A bitmap representing known types would be a simple way to do this. Just AND the sender and receiver capability bitmaps to get the acceptable common ones. Then for any particular object the sender can choose their preferred representation from among the common ones.

[Keeping the list of mandatory types small would make the protocol very useful in embedded devices]

(2) Like DRb, keep the low-level distinction between "success" and "fail" results. This means you can have a method which *returns* an exception object be distinct from a method which *raises* an exception during its execution. This might be done by having "execution failure" be a basic type of its own.

(3) Make "proxy object" be a basic type too (a la DRbUndumped (*)). This allows the receiver to invoke methods on objects held at the caller. Making this a basic type means that a Perl receiver could invoke a Ruby proxy object without having to understand how to unmarshal a Ruby DRbObject.

(4) Include support for method calls to be sent in *both* directions down the same socket; that is, have separate 'request' and 'response' message types which can be distinguished on the wire, and allow "servers" to send new requests back to "clients".

This means that callbacks can be made back down the same socket as the incoming request came on, which in turn means that it will work properly through NAT firewalls - something which DRb is miserable at.

[This does rely on the sender keeping the connection nailed up, however. Perhaps each proxy object should also contain an "alternative callback address" to be used in the event that the original connection has been lost]

(5) Tag each request and response with an ID, so that you can send multiple overlapping requests and responses down the same socket.

This avoids what DRb does, which is opening multiple transport-layer connections to the same target if concurrent requests to the same target are made from separate threads (see DRbConn).

A simple embedded client or server can always implement only simple request-response ordering if it wishes.

(5a) Dividing large requests into chunks of, say, 64KB might make for better interleaving in this case.

(6) Include support for running over Unix domain sockets (surprisingly awkward with DRb)

(7) Include support for running over stdin/stdout, e.g. one process which talks to another process that it started using popen().

Possibly also useful:

(8) Support for running over TLS (possibly upgrading the connection after the initial handshake)

(9) Simple [possibly mutual] authentication at start of the connection

(10) Support for running over HTTP (and FastCGI), although that would enforce calls to be unidirectional.

(11) Like DRb, you'll probably need some concept of a "head object" or "default object", to which you can send requests if you don't know the object ID of any other object on the peer. For simple servers you only want to expose one object anyway. For more complex servers the application writer can provide a directory object; I don't think you need to worry about this within the protocol itself.

Regards,

Brian.

(*) If this doesn't mean anything to you, google for DRbTutorial. Unfortunately the RubyGarden wiki appears to be permanently down these days, but you may be able to find a cached copy.
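Two of the suggestions above, the capability bitmap in (1) and the request IDs in (5), are easy to sketch in isolation. The following is only an illustration: common_capabilities, PendingCalls and the choice of bit 0 for the mandatory string type are assumptions made for the example, not part of any existing library.

```ruby
# Suggestion (1): AND the two capability bitmaps. The string type is
# mandatory, so its bit (assumed here to be bit 0) is always kept.
MANDATORY_STRING_BIT = 1 << 0

def common_capabilities(mine, theirs)
  (mine & theirs) | MANDATORY_STRING_BIT
end

# Suggestion (5): tag each request with an ID so that several calls can be
# outstanding on one socket and responses may arrive in any order.
class PendingCalls
  def initialize
    @next_id = 0
    @callbacks = {}
  end

  # Remember the callback and return the ID to send with the request.
  # IDs wrap at 32 bits to fit a 4-byte field.
  def register(&callback)
    @next_id = (@next_id + 1) & 0xffff_ffff
    @callbacks[@next_id] = callback
    @next_id
  end

  # Deliver a response to whoever is waiting for it.
  # Returns false for an unknown or already answered ID.
  def resolve(id, result)
    callback = @callbacks.delete(id)
    return false unless callback
    callback.call(result)
    true
  end
end
```

A client would call register when writing a request and resolve when a response with that ID is read back, which is what lets one connection carry overlapping calls.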
----- Original Message -----
From: "Brian Candler" <B.Candler at pobox.com>
To: <eventmachine-talk at rubyforge.org>
Sent: Monday, October 15, 2007 2:19 AM
Subject: Re: [Eventmachine-talk] "packet" framing

> On Sun, Oct 14, 2007 at 01:52:48PM -0700, Bill Kelly wrote:
>> [snip]
>
> I had some thoughts along these lines a while ago, as I needed a lightweight
> IPC interface between Ruby and Perl. Actually the Perl part of this has
> evaporated now, so I haven't needed to develop this further.

Dude, thanks, I'm <3

(My context is quake2 100hz 3000 mile round trip UDP packets,)

> But for what it's worth, I have some suggestions if you're looking to
> implement a better-than-DRb for heterogenous applications.
>
> (1) Make the basic mandatory data type the "string". Allow the two ends to
> negotiate additional encodings, such as binary integers, IEEE floats,
> arrays, Ruby Marshal dumps, Perl Storable, PHP serialize etc; but if both
> ends don't support anything else, fallback to string representation.

Dude, I'm with that, but I felt that most RPC calls have OpenGL{ ..... }

> A bitmap representing known types would be a simple way to do this. Just AND
> the sender and receiver capability bitmaps to get the acceptable common
> ones. Then for any particular object the sender can choose their preferred
> representation from among the common ones.
>
> [Keeping the list of mandatory types small would make the protocol very
> useful in embedded devices]

Dude, this is already netstrings :D

<3

I know mandatory types can be small if one goes for netstrings :D

That may be the approach

++good
On Mon, Oct 15, 2007 at 02:51:07AM -0700, Bill Kelly wrote:
> > (1) Make the basic mandatory data type the "string". Allow the two ends to
> > negotiate additional encodings, such as binary integers, IEEE floats,
> > arrays, Ruby Marshal dumps, Perl Storable, PHP serialize etc; but if both
> > ends don't support anything else, fallback to string representation.
>
> Dude, I'm with that, but I felt that most RPC calls have
> OpenGL{ ..... }

Well, of course what I wrote was based on what *I* would have found useful :-)

I'm not familiar with the OpenGL API; do you just want to call methods OpenGLfoo(x,y,z) where x,y,z are primitive types like integers? Or do you need a more complex data representation (arrays of primitive types? hashes of primitive types? more than that?)

> > A bitmap representing known types would be a simple way to do this. Just AND
> > the sender and receiver capability bitmaps to get the acceptable common
> > ones. Then for any particular object the sender can choose their preferred
> > representation from among the common ones.
> >
> > [Keeping the list of mandatory types small would make the protocol very
> > useful in embedded devices]
>
> Dude, this is already netstrings :D

Hmm, that wasn't quite what I was thinking of, and RPC is a different layer anyway. (Aside: netstrings aren't particularly easy to parse for a binary protocol. Personally I'd go for a 4-byte length in network order. It has a limit of 4GB, but how many RPC calls do you make with >4GB of copy-by-value arguments? :-)

Anyway, what I was thinking of was something like this.

1. client opens TCP connection

    -------> (4 byte magic) (4 byte capability bitmap)
    <------- (4 byte magic) (4 byte capability bitmap)

2. both ends AND their capability bitmaps. If the TLS bit is set, they both start TLS negotiation (in client or server mode depending on who initiated the connection). From this point onwards the protocol is entirely symmetrical.

3. when either side wants to make an RPC, they send a message something like this:

    -------> 1 byte tag: "request"
             4 bytes: request ID
             4 bytes: object ID (if zero: default object)
             4 bytes: number of arguments, first argument is method name
             number of arguments x
                 1 byte: type
                 4 bytes: length
                 N bytes: data

The 'type' chosen for each value would correspond to one of the bit positions in the capability bitmap.

[Note: it may be more efficient in some situations to round everything up to 4 byte boundaries, this could be a capability flag too]

4. the response at some point later would be

    <------- 1 byte tag: "response"
             4 bytes: request ID
             1 byte: type
             4 bytes: length
             N bytes: data

The basic types would be:

(0) String. Payload is sequence of N bytes of data. 8-bit clean.

(1) Remote execution error. The N bytes are a string containing info to help diagnosis (e.g. backtrace). Only used as a return value.

(2) Proxy object. The data payload contains 4 bytes of object ID, and the remainder is the object origin URI.

(3) nil (value is ignored)

(4) boolean (value is 0 for false, 1 for true)

(5) signed integer (value is big-endian binary) [maybe should have a fixed 4-byte version for ease of decoding?]

... extend this list as you see fit

Then at the sender side you marshal using the 'best' representation supported by the sender. A Ruby implementation might look something like this:

    CAP_STRING = 0
    CAP_ERROR  = 1
    CAP_PROXY  = 2
    CAP_NIL    = 3
    CAP_BOOL   = 4
    CAP_LONG   = 5
    CAP_RUBY   = 6

    class Object
      def to_tinyrpc(capability)
        if capability[CAP_RUBY] > 0
          # the rescue modifier needs its own parentheses inside an array literal
          [CAP_RUBY, (Marshal.dump(self) rescue Marshal.dump(Proxy.new(self)))]
        else
          [CAP_STRING, to_s]   # best we can do
        end
      end
    end

    class String
      def to_tinyrpc(capability)
        [CAP_STRING, self]     # always use the native String representation
      end
    end

    class NilClass
      def to_tinyrpc(capability)
        if capability[CAP_NIL] > 0   # prefer the native Nil representation
          [CAP_NIL, ""]
        elsif capability[CAP_RUBY] > 0
          [CAP_RUBY, Marshal.dump(nil)]
        else
          [CAP_STRING, ""]
        end
      end
    end

    class Integer   # Fixnum on the Ruby 1.8 of the time
      def to_tinyrpc(capability)
        if capability[CAP_LONG] > 0 and self >= -0x80000000 and self <= 0x7fffffff
          [CAP_LONG, [self].pack("N")]   # 4 bytes, big-endian two's complement
        else
          [CAP_STRING, to_s]
        end
      end
    end

You get the idea. The RPC caller would then be something like:

    def method_missing(*args)
      # 1-byte tag, then 4-byte request ID, object ID and argument count
      @socket.write [TAG_REQUEST, reqid, @ref, args.size].pack("CNNN")
      args.each do |arg|
        tag, data = arg.to_tinyrpc(@capability)
        @socket.write [tag, data.bytesize].pack("CN")   # 1-byte type, 4-byte length
        @socket.write data
      end
      # for synchronous RPC: start a thread, wait for the response, then
      # return the response to the caller
    end

I'm not saying you shouldn't write a domain-specific solution for your needs; I just think this approach would encourage RPC interoperability. Of course there's always XMLRPC and SOAP, so perhaps nobody wants another solution - but both these are expensive, and neither maps well to the domain which DRb seeks to address. DRb, meanwhile, is very Ruby-specific.

Regards,

Brian.
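The response message in step 4 can be sketched the same way. What follows is a guess at how the layout might be encoded and decoded, not code from the proposal: the value of TAG_RESPONSE, the names encode_response and decode_response, and the one-byte payload for booleans are all assumptions, since the message does not pin those details down.

```ruby
TAG_RESPONSE = 2   # hypothetical value; the proposal assigns no tag numbers

# Type numbers follow the "basic types" list above.
TYPE_STRING = 0
TYPE_ERROR  = 1
TYPE_NIL    = 3
TYPE_BOOL   = 4

HEADER_SIZE = 10   # 1 (tag) + 4 (request ID) + 1 (type) + 4 (length)

# Build: 1-byte tag, 4-byte request ID, 1-byte type, 4-byte length, data.
def encode_response(request_id, type, data = "")
  data = data.b
  [TAG_RESPONSE, request_id, type, data.bytesize].pack("CNCN") + data
end

# Returns [request_id, type, value, rest_of_buffer], or nil if the buffer
# does not yet hold a complete response.
def decode_response(buffer)
  buffer = buffer.b
  return nil if buffer.bytesize < HEADER_SIZE
  tag, request_id, type, length = buffer.unpack("CNCN")
  raise ArgumentError, "not a response message" unless tag == TAG_RESPONSE
  return nil if buffer.bytesize < HEADER_SIZE + length
  data = buffer.byteslice(HEADER_SIZE, length)
  value =
    case type
    when TYPE_STRING, TYPE_ERROR then data   # error text is left as a string
    when TYPE_NIL                then nil    # payload is ignored
    when TYPE_BOOL               then data.unpack1("C") == 1
    else raise ArgumentError, "unsupported type #{type}"
    end
  [request_id, type, value, buffer.byteslice((HEADER_SIZE + length)..-1)]
end
```

Returning the type alongside the value keeps the distinction between a normal result and a remote execution error visible to the caller.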
On Oct 15, 2007, at 4:51 AM, Bill Kelly wrote:
> ----- Original Message -----
> From: "Brian Candler" <B.Candler at pobox.com>
> To: <eventmachine-talk at rubyforge.org>
> Sent: Monday, October 15, 2007 2:19 AM
> Subject: Re: [Eventmachine-talk] "packet" framing
>
>> [snip]
>> A bitmap representing known types would be a simple way to do this.
>> Just AND the sender and receiver capability bitmaps to get the
>> acceptable common ones. Then for any particular object the sender can
>> choose their preferred representation from among the common ones.
>>
>> [Keeping the list of mandatory types small would make the protocol
>> very useful in embedded devices]
>
> Dude, this is already netstrings :D
>
> <3
>
> I know mandatory types can be small if one goes for netstrings :D
>
> That may be the approach

I promised a long time ago to provide a fast and robust netstrings protocol for eventmachine. I actually began work on this project last week. I should have something by this time next week. I'll be sure to share it with the list as it progresses.

BTW, the ruby code will be a FSM using ragel. It generates super-ugly code but you never need to really look at it. (This is similar to how mongrel uses ragel for HTTP parsing.)

Give me the rest of the week....

And now, back to your regularly scheduled discussion about reinventing RPC/RMI/RIPC/etc.

cr
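For readers who have not met netstrings: a netstring is the payload's length in decimal ASCII, a colon, the payload bytes, and a trailing comma, so "hello world!" travels as "12:hello world!,". The sketch below is a plain-Ruby illustration of that format only. It is not the ragel-based implementation promised above, and netstring_encode and netstring_decode are hypothetical names.

```ruby
# Encode one netstring: "<decimal length>:<payload>,"
def netstring_encode(payload)
  payload = payload.b
  "#{payload.bytesize}:".b + payload + ",".b
end

# Decode one netstring from the front of the buffer.
# Returns [payload, rest_of_buffer], or nil if more bytes are needed.
# Raises ArgumentError on malformed input.
def netstring_decode(buffer)
  buffer = buffer.b
  colon = buffer.index(":")
  if colon.nil?
    # Only digits may appear before the colon arrives.
    raise ArgumentError, "malformed length" unless buffer.match?(/\A\d*\z/)
    return nil
  end
  digits = buffer.byteslice(0, colon)
  # No leading zeros are allowed, except for the empty netstring "0:,".
  raise ArgumentError, "malformed length" unless digits.match?(/\A(0|[1-9]\d*)\z/)
  length = digits.to_i
  total = colon + 1 + length + 1          # digits, colon, payload, comma
  return nil if buffer.bytesize < total
  raise ArgumentError, "missing trailing comma" unless buffer.getbyte(total - 1) == 44
  [buffer.byteslice(colon + 1, length), buffer.byteslice(total..-1)]
end
```

A production decoder would also cap the number of length digits it accepts, so that a hostile peer cannot announce an enormous payload.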
On 10/15/07, Chuck Remes <cremes.devlist at mac.com> wrote:
> I promised a long time ago to provide a fast and robust netstrings
> protocol for eventmachine. I actually began work on this project last week.
> I should have something by this time next week. I'll be sure to share it
> with the list as it progresses.
>
> BTW, the ruby code will be a FSM using ragel. It generates super-ugly code
> but you never need to really look at it. (This is similar to how mongrel
> uses ragel for HTTP parsing.)
>
> Give me the rest of the week....

Looking forward to this, Chuck. Especially to see how easy ragel is to integrate with EM.
From: "Bill Kelly" <billk at cts.com>
> Dude,
> Dude,
> Dude,

Er.... pardon my redundancy. :D

From: "Brian Candler" <B.Candler at pobox.com>
>
> Hmm, that wasn't quite what I was thinking of, and RPC is a different layer
> anyway. (Aside: netstrings aren't particularly easy to parse for a binary
> protocol. Personally I'd go for a 4-byte length in network order. It has a
> limit of 4GB, but how many RPC calls do you make with >4GB of copy-by-value
> arguments? :-)
>
> Anyway, what I was thinking of was something like this.
>
> 1. client opens TCP connection
>
>    -------> (4 byte magic) (4 byte capability bitmap)
>    <------- (4 byte magic) (4 byte capability bitmap)
>
> 2. both ends AND their capability bitmaps. If the TLS bit is set, they
> both start TLS negotiation (in client or server mode depending on who
> initiated the connection). From this point onwards the protocol is entirely
> symmetrical.
>
> 3. when either side wants to make an RPC, they send a message something
> like this:
>
>    -------> 1 byte tag: "request"
>             4 bytes: request ID
>             4 bytes: object ID (if zero: default object)
>             4 bytes: number of arguments, first argument is method name
>             number of arguments x
>                 1 byte: type
>                 4 bytes: length
>                 N bytes: data
>
> The 'type' chosen for each value would correspond to one of the bit
> positions in the capability bitmap.
>
> [Note: it may be more efficient in some situations to round everything up to
> 4 byte boundaries, this could be a capability flag too]
>
> 4. the response at some point later would be
>
>    <------- 1 byte tag: "response"
>             4 bytes: request ID
>             1 byte: type
>             4 bytes: length
>             N bytes: data
>
> The basic types would be:
>
> (0) String. Payload is sequence of N bytes of data. 8-bit clean.
>
> (1) Remote execution error. The N bytes are a string containing info
>     to help diagnosis (e.g. backtrace). Only used as a return value.
>
> (2) Proxy object. The data payload contains 4 bytes of object ID, and
>     the remainder is the object origin URI.
>
> (3) nil (value is ignored)
>
> (4) boolean (value is 0 for false, 1 for true)
>
> (5) signed integer (value is big-endian binary)
>     [maybe should have a fixed 4-byte version for ease of decoding?]
>
> ... extend this list as you see fit
>
> Then at the sender side you marshal using the 'best' representation
> supported by the sender. A Ruby implementation might look something like
> this:
>
>   CAP_STRING = 0
>   CAP_ERROR = 1
>   CAP_PROXY = 2
>   CAP_NIL = 3
>   CAP_BOOL = 4
>   CAP_LONG = 5
>   CAP_RUBY = 6
>
>   class Object
>     def to_tinyrpc(capability)
>       if capability[CAP_RUBY] > 0
>         [CAP_RUBY, Marshal.dump(self) rescue Marshal.dump(Proxy.new(self))]
>       else
>         [CAP_STRING, to_s] # best we can do
>       end
>     end
>   end
>
>   class String
>     def to_tinyrpc(capability)
>       [CAP_STRING, self] # always use the native String representation
>     end
>   end
>
>   class NilClass
>     def to_tinyrpc(capability)
>       if capability[CAP_NIL] > 0 # prefer the native Nil representation
>         [CAP_NIL, ""]
>       elsif capability[CAP_RUBY] > 0
>         [CAP_RUBY, Marshal.dump(nil)]
>       else
>         [CAP_STRING, ""]
>       end
>     end
>   end
>
>   class Fixnum
>     def to_tinyrpc(capability)
>       if capability[CAP_LONG] > 0 and self >= -0x80000000 and self <= 0x7fffffff
>         [CAP_LONG, [self].pack("N")]
>       else
>         [CAP_STRING, to_s]
>       end
>     end
>   end
>
> You get the idea. The RPC caller would then be something like:
>
>   def method_missing(*args)
>     @socket.write [TAG_REQUEST, reqid, @ref, args.size].pack("NNNN")
>     args.each do |arg|
>       tag, data = arg.to_tinyrpc(@capability)
>       @socket.write [tag, data.size].pack("NN")
>       @socket.write data
>     end
>     # for synchronous RPC: start a thread, wait for the response, then
>     # return the response to the caller
>   end
>
> I'm not saying you shouldn't write a domain-specific solution for your
> needs; I just think this approach would encourage RPC interoperability. Of
> course there's always XMLRPC and SOAP, so perhaps nobody wants another
> solution - but both these are expensive, and neither maps well to the domain
> which DRb seeks to address. DRb, meanwhile, is very Ruby-specific.

Hmm, interesting, thanks! I hadn't thought about having capability flags at that protocol level.

I'll admit I'm biased toward Ruby, and so I've simply written C++ code that can handle all the various Ruby types, including nil, true, false, etc.

On my previous implementation, Ruby -> C++ RPC was as easy as:

  class RubyContextCommBridge < BlankSlate
    def method_missing(*name_and_args)
      @remote.query(name_and_args)
    end
  end

And I want the new one to be similarly trivial from the Ruby side. (So I'm OK with hard-wiring the capabilities to fit Ruby's types.)

From Chuck Remes:
>
> And now, back to your regularly scheduled discussion about reinventing
> RPC/RMI/RIPC/etc.

LOL :) Sorry for the noise :)

Well if it helps for context, here is a recent conversation about a game protocol that is 10 years old this month:

[16:30] <R1CH> the bandwidth savings on the new usercmd scaling add up pretty quick
[16:31] <R1CH> 33MB on quakdev
[16:31] <R1CH> and most people only recently updated their clients
[16:42] <quadz> nifty
[16:43] <quadz> what sort of data is being scaled ?
[16:44] <R1CH> movement speeds
[16:44] <R1CH> the client can send anywhere between -300 to 300 for x/y/z velocity
[16:44] <R1CH> usually requiring 6 bytes
[16:44] <R1CH> however i steal some unused bits in the buttons byte to indicate if the x/y/z are multiples of 5, if so, only send the 1 byte and scale it on the server
[16:45] <R1CH> 1 byte might not seem like much
[16:45] <R1CH> but client packet rate is usually a lot higher than server
[16:45] <R1CH> so it adds up quickly

Protocol 35 netcode has saved 56693514 bytes.
Protocol 35 compression has saved 89628522 bytes.
Protocol 35 usercommand scaling has saved 26026608 bytes.
R1Q2 playerstate quantization optimization has saved 37328413 bytes.
R1Q2 entity quantization optimization has saved 163290336 bytes.
R1Q2 custom delta management has saved 22637949 bytes.
R1Q2 sv_func_entities_hack has saved 0 bytes. (disabled)
Total byte savings: 395605342 (377.28 MB)

So from that perspective my current RPC protocol reinvention seems incredibly decadent and wasteful. :)

Regards,

Bill
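For readers following the "request" layout quoted earlier in this message, here is a minimal Ruby sketch of how one such message could be packed and unpacked. It is an illustration under stated assumptions, not code from anyone in the thread: the value of TAG_REQUEST and the helper names encode_request and decode_request are made up for the example. It packs the tag and each type as a single byte ("C"), which is what the written layout describes; the quoted draft packs them as 4-byte values ("N").

```ruby
# Sketch of the quoted "request" layout. TAG_REQUEST's value and both
# helper names are assumptions made for this example.
TAG_REQUEST = 1

# Encode: 1-byte tag, 4-byte request ID, 4-byte object ID, 4-byte argument
# count, then (1-byte type, 4-byte length, N bytes of data) per argument.
# All multi-byte integers are big-endian (network order).
def encode_request(reqid, obj_ref, typed_args)
  msg = [TAG_REQUEST, reqid, obj_ref, typed_args.size].pack("CNNN")
  typed_args.each do |type, data|
    msg << [type, data.bytesize].pack("CN") << data.b
  end
  msg
end

# Decode one complete request message back into its fields.
def decode_request(msg)
  tag, reqid, obj_ref, argc = msg.unpack("CNNN")
  offset = 13 # 1 + 4 + 4 + 4 header bytes
  args = Array.new(argc) do
    type, len = msg.byteslice(offset, 5).unpack("CN")
    data = msg.byteslice(offset + 5, len)
    offset += 5 + len
    [type, data]
  end
  { tag: tag, reqid: reqid, obj_ref: obj_ref, args: args }
end

# Type 0 is String and type 5 is the signed integer in the quoted list.
wire = encode_request(7, 0, [[0, "ping"], [5, [42].pack("N")]])
p decode_request(wire)
```

The decoder assumes the whole message is already in memory; on a real socket it would sit behind some framing layer that delivers complete messages.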
Why are you BER-encoding integers? Isn't that painful?

> TT_BERINT = 0x80, // 0x80 - BERINT TYPE FLAG
>                   // 0x40 - BERINT SIGN FLAG
>                   // 0x20 - BERINT CONTINUATION FLAG
>                   // 0x1F - (0..31) : integer five least significant bits
>                   //
>                   // If CONT bit is set, then the five bits under the 0x1F mask
>                   // become the low five bits of the continued BER coded data.
>
> BERINT_FIRST_BYTE_BIT_LIMIT = 7
> };

-- 
---
Thomas H. Ptacek // matasano security
read us on the web: http://www.matasano.com/log
From: "Thomas Ptacek" <tqbf at matasano.com>
>
> Why are you BER-encoding integers? Isn't that painful?

In my initial draft, I had separate types for different-sized
integers: 8 bits; 16 bits; 32 bits; 64 bits...

BER encoding seemed to simplify things. Now there's only one
way of encoding an integer, and this even applied to data
types like strings that are prefixed with an octet count
value. (However, it occurs to me such values are inherently
unsigned, so I'm wasting a sign bit in those cases. Hmm...)

Regards,

Bill
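As a point of comparison for the "only one way of encoding an integer" idea, here is a small Ruby sketch using the BER-compressed integer directive ("w") of Array#pack. This is the generic unsigned BER form, not Bill's custom layout with its type, sign and continuation flags in the first byte; the helper names are hypothetical. It also illustrates the closing remark about octet counts: the generic form has no sign bit at all, so a length prefix wastes nothing.

```ruby
# Array#pack("w") emits a BER-compressed integer: 7 bits per byte, most
# significant group first, with the high bit set on every byte but the last.
# The helper names below are hypothetical.
def ber_encode(n)
  [n].pack("w") # raises ArgumentError for negative numbers
end

def ber_decode(bytes)
  bytes.unpack("w").first
end

# A string prefixed with its BER-encoded octet count.
def ber_prefixed(str)
  ber_encode(str.bytesize) + str.b
end

p ber_encode(5).bytes       # => [5]
p ber_encode(300).bytes     # => [130, 44]
p ber_prefixed("abc").bytes # => [3, 97, 98, 99]
```

Small values cost one byte and larger ones grow as needed, which is the simplification over having separate 8, 16, 32 and 64-bit integer types.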
I did a length prefix framing implementation in the latest release of DistribuStream and after benchmarking it, discovered that it's about 3% slower than LineAndTextProtocol. I don't see any immediate deficiencies in my implementation and am unsure why this is the case, as I expected the two to have at the very least similar performance characteristics.

That said, I've gone ahead with length prefix framing as it at least has the potential to eliminate certain buffering concerns as well as performing faster once properly optimized.

-- 
Tony Arcieri
I'm also wondering why the hacked multibyte character support in LineAndTextProtocol is still in 0.9.0 (I thought it was breaking certain tests)? It might also be good to experiment with Bill's suggestions regarding using simple string concatenation versus trying to reassemble the input buffer in an array... simple benchmarks of string concatenation versus assembling an array then executing a #join show that string concatenation is faster.

I'd also be willing to contribute the length prefix packet processor I wrote under the Ruby license, as I'd certainly like to see it get better optimized, particularly if it can be incorporated deeper into the Reactor in some way that mitigates some of the string processing it has to do presently.

-- 
Tony Arcieri
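The comparison mentioned above (appending to one string versus collecting chunks in an array and calling #join) can be reproduced with a short script along these lines. This is only a sketch: the chunk size, chunk count and method names are assumptions, and the timings will vary with the Ruby version and workload, so no result is claimed here.

```ruby
require 'benchmark'

# Hypothetical incoming data: 10,000 chunks of 64 bytes each.
chunks = Array.new(10_000) { "x" * 64 }

# Strategy 1: append each chunk to one growing string.
def concat_chunks(chunks)
  buffer = +""
  chunks.each { |chunk| buffer << chunk }
  buffer
end

# Strategy 2: collect the chunks in an array and join once at the end.
def join_chunks(chunks)
  parts = []
  chunks.each { |chunk| parts << chunk }
  parts.join
end

# Print user, system and real time for 100 runs of each strategy.
Benchmark.bm(8) do |bm|
  bm.report("concat:") { 100.times { concat_chunks(chunks) } }
  bm.report("join:")   { 100.times { join_chunks(chunks) } }
end
```

Both methods build the same string, so any difference in the report comes from the buffering strategy alone.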
On 10/25/07, Tony Arcieri <tony at clickcaster.com> wrote:
>
> I'm also wondering why the hacked multibyte character support in
> LineAndTextProtocol is still in 0.9.0 (I thought it was breaking certain
> tests)? It might also be good to experiment with Bill's suggestions
> regarding using simple string concatenation versus trying to reassemble the
> input buffer in an array... simple benchmarks of string concatenation versus
> assembling an array then executing a #join show that string concatenation is
> faster.

I thought we got rid of that multibyte stuff. At any rate, to my disgust I found a while ago that LineAndTextProtocol had problems with the Stomp protocol (which in effect needs to change the line delimiter in midstream) and I didn't want to disturb LineAndTextProtocol, so I wrote LineText2. Have a look and see if you can use it.

> I'd also be willing to contribute the length prefix packet processor I wrote
> under the Ruby license, as I'd certainly like to see it get better
> optimized, particularly if it can be incorporated deeper into the Reactor in
> some way that mitigates some of the string processing it has to do
> presently.

How did you package the processor? Send it to me privately and I'll have a look. If the style and the integration make sense to me, we can add it to the distro. Would need a dual license (just clone the text in other files in the distro).
On 10/25/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> How did you package the processor? Send it to me privately and I'll have a
> look.

It's a subclass of EM::Connection. You can check it out here:

http://distribustream.rubyforge.org/svn/trunk/lib/pdtp/common/length_prefix_protocol.rb

Tested with RSpec (I assume you'd want Test::Unit):

http://distribustream.rubyforge.org/svn/trunk/spec/common/length_prefix_protocol_spec.rb

It supports frames with 2-byte or 4-byte NBO prefixes, and allows you to switch between them on the fly if you so desire.

> If the style and the integration make sense to me, we can add it to the
> distro. Would need a dual license (just clone the text in other files in the
> distro).

That's fine.

-- 
Tony Arcieri
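For readers who want to see what length-prefixed framing with a 2-byte or 4-byte network-byte-order prefix involves, here is a minimal pure-Ruby sketch. It is not the DistribuStream code linked above: the class name FrameBuffer and its methods are hypothetical, and it deliberately has no EventMachine dependency so it can be run on its own.

```ruby
# Hypothetical frame buffer for length-prefixed framing. Not the
# DistribuStream implementation; names and structure are assumptions.
class FrameBuffer
  PACK_FORMATS = { 2 => "n", 4 => "N" }.freeze # big-endian 16/32-bit

  attr_reader :prefix_size

  def initialize(prefix_size = 2)
    @buffer = String.new # binary accumulator for partial input
    self.prefix_size = prefix_size
  end

  # May be changed between calls to switch prefix widths.
  def prefix_size=(size)
    raise ArgumentError, "prefix must be 2 or 4 bytes" unless PACK_FORMATS.key?(size)
    @prefix_size = size
  end

  # Wrap a payload in a frame: length prefix followed by the payload bytes.
  def frame(payload)
    if payload.bytesize >= 256**@prefix_size
      raise ArgumentError, "payload too large for #{@prefix_size}-byte prefix"
    end
    [payload.bytesize].pack(PACK_FORMATS[@prefix_size]) + payload.b
  end

  # Feed raw bytes as they arrive; returns every frame completed so far and
  # keeps any trailing partial frame for the next call.
  def feed(data)
    @buffer << data.b
    frames = []
    loop do
      break if @buffer.bytesize < @prefix_size
      length = @buffer.unpack(PACK_FORMATS[@prefix_size]).first
      break if @buffer.bytesize < @prefix_size + length
      frames << @buffer.byteslice(@prefix_size, length)
      @buffer = @buffer.byteslice(@prefix_size + length, @buffer.bytesize)
    end
    frames
  end
end

fb = FrameBuffer.new(2)
wire = fb.frame("hello") + fb.frame("world")
p fb.feed(wire[0, 3])   # partial frame, nothing complete yet
p fb.feed(wire[3..-1])  # both frames are now complete
```

Inside an EM::Connection subclass, receive_data could pass its argument to feed and hand each returned frame to a handler, and send_data could be given the result of frame. In this sketch the prefix width only changes between calls to feed, which is a simplification compared with switching in the middle of a burst of frames.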