Mark Van De Vyver

2007-May-08 17:36 UTC

[Eventmachine-talk] receive_data and define_method

Hi,
I have a use case where there is some handshake between a server and
client at the start of a connection.  Because data is sent and
received I understand that this handshake should be conducted in the
'connection_completed' method.

This handshake involves receive_data, so in the receive_data method I
need to test whether the handshake has finished or not.  This imposes
some (small?) cost every time receive_data is called.

My thought was to define receive_data dynamically at the instance
level using Ruby's define_method: the handshake version would be
defined at the end of the 'post_init' method and the regular
'receive_data' method defined at the end of the 'connection_completed'
method.
My understanding is that the code to use is

self.class.send(:define_method, :receive_data) { |data|
  receive_handshake_data(data) }
and
self.class.send(:define_method, :receive_data) { |data|
  receive_regular_data(data) }

where receive_*_data methods are defined in the module/class.

Is this discouraged?  i.e. Is it the case that receive_data will never
be called before connection_completed is called? (That is my
understanding/recollection)

I can see that there might be an issue with the server sending data
between these two calls, where there are several connections,
especially if I defined receive_data at the class level.
I thought that I protected against this by defining receive_data as an
instance method and by doing this at the end of the post_init method
and at the end of the connection_completed method.
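
The class-level versus instance-level distinction above can be illustrated with a minimal sketch in plain Ruby. It does not use EventMachine; `FakeConnection` and the `use_*_handler` helpers are hypothetical names introduced only for this illustration. Calling `define_method` on `self.class` replaces the method for every instance of the class, whereas `define_singleton_method` affects only the receiver:

```ruby
# Hypothetical stand-in for a connection handler; not an EventMachine class.
class FakeConnection
  def receive_handshake_data(data)
    "handshake: #{data}"
  end

  def receive_regular_data(data)
    "regular: #{data}"
  end

  # Per-instance definition: only this object gets the new receive_data.
  def use_handshake_handler
    define_singleton_method(:receive_data) { |data| receive_handshake_data(data) }
  end

  def use_regular_handler
    define_singleton_method(:receive_data) { |data| receive_regular_data(data) }
  end
end

a = FakeConnection.new
b = FakeConnection.new
a.use_handshake_handler
b.use_handshake_handler
a.use_regular_handler # a has finished its handshake; b has not

puts a.receive_data("x")
puts b.receive_data("x")
```

Had the swap been done with `self.class.send(:define_method, ...)`, the call on `a` would also have switched `b` to the regular handler, which is the hazard with several simultaneous connections.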

Or should I just bear the cost of checking whether the handshake is
done in each call to receive_data?

Appreciate any suggestions.

Regards
Mark

Francis Cianfrocca

2007-May-14 05:19 UTC

[Eventmachine-talk] receive_data and define_method

On 5/8/07, Mark Van De Vyver <mvyver at gmail.com> wrote:
>
> Hi,
> I have a use case where there is some handshake between a server and
> client at the start of a connection.  Because data is sent and
> received I understand that this handshake should be conducted in the
> 'connection_completed' method.
>
> This handshake involves receive_data, so in the receive_data method I
> need to test whether the handshake has finished or not.  This imposes
> some (small?) cost every time receive_data is called.
>
> My thought was to define receive_data dynamically at the instance
> level using Ruby's define_method: the handshake version would be
> defined at the end of the 'post_init' method and the regular
> 'receive_data' method defined at the end of the 'connection_completed'
> method.
> My understanding is that the code to use is
>
> self.class.send(:define_method, :receive_data) { |data|
>   receive_handshake_data(data) }
> and
> self.class.send(:define_method, :receive_data) { |data|
>   receive_regular_data(data) }
>
> where receive_*_data methods are defined in the module/class.
>
> Is this discouraged?  i.e. Is it the case that receive_data will never
> be called before connection_completed is called? (That is my
> understanding/recollection)
>
> I can see that there might be an issue with the server sending data
> between these two calls, where there are several connections,
> especially if I defined receive_data at the class level.
> I thought that I protected against this by defining receive_data as an
> instance method and by doing this at the end of the post_init method
> and at the end of the connection_completed method.
>
> Or should I just bear the cost of checking whether the handshake is
> done in each call to receive_data?
>
> Appreciate any suggestions.

The shortest answer I'd be inclined to give you would be to eat the cost of
the handshake-complete check in receive_data. It doesn't cost very much to
set an instance variable in your protocol-handler class or module when the
handshake completes, and then to take a true/false branch on each subsequent
pass through the handler. If your application is running on an actual
network, the cost of doing this will be lost in the background noise.
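
The flag-and-branch approach described above might look like the following minimal sketch. The module name `HandshakeHandler`, the `StubConnection` class, the `log` accessor and the single-chunk "OK" acknowledgement are all assumptions made for illustration. In a real program the module would be handed to EventMachine as the connection handler; here it is mixed into a plain class so the sketch runs without the eventmachine gem:

```ruby
# Hypothetical protocol handler using an instance-variable flag.
module HandshakeHandler
  attr_reader :log

  def post_init
    @handshake_done = false
    @log = []
  end

  # One true/false branch per call: the whole cost of the check.
  def receive_data(data)
    if @handshake_done
      receive_regular_data(data)
    else
      receive_handshake_data(data)
    end
  end

  def receive_handshake_data(data)
    @log << [:handshake, data]
    # Assumption: the peer acknowledges the handshake with a single "OK".
    @handshake_done = true if data.strip == "OK"
  end

  def receive_regular_data(data)
    @log << [:regular, data]
  end
end

# Stand-in for a connection object, so no event loop is needed.
class StubConnection
  include HandshakeHandler

  def initialize
    post_init
  end
end

conn = StubConnection.new
conn.receive_data("OK\n")
conn.receive_data("payload")
p conn.log
```

A real handshake would also have to cope with the acknowledgement arriving split across several receive_data calls, which this sketch ignores.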

But I did want to touch on some of the questions you asked as a matter of
general interest. One of the key design points for EM is to respond to
incoming network events in a deterministic, dependable and documented way.

The connection_completed event (or method, from the perspective of user
code) is ONLY fired in classes that are handling TCP client connections. EM
uses nonblocking TCP connects, which are an incredible performance booster,
and connection_completed is used to tell you that a connection to a remote
server has successfully completed. There are several reasons not to do this
in post_init. One of them is that post_init is reasonably expected to
execute almost immediately after you instantiate EventMachine::Connection or
a subclass thereof, and it can easily take dozens of milliseconds to
complete a connection on an internet link. And if the connection fails
entirely, the delay can be thousands of milliseconds! (In the latter case,
EM is *guaranteed* to fire BOTH #post_init and #unbind, but NOT
#connection_completed, which is how you can detect that a connection failure
occurred.)
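
As a minimal sketch of the failure-detection idea above: record in connection_completed that the connect succeeded, then inspect that record in unbind. The names `ConnectTracker`, `StubClient` and `outcome` are hypothetical, and the callbacks are invoked by hand here in the order described above rather than by a running event loop:

```ruby
# Hypothetical client-side handler that distinguishes a failed connect
# from a normal close.
module ConnectTracker
  attr_reader :outcome

  def post_init
    @connected = false
  end

  def connection_completed
    @connected = true
  end

  def unbind
    # If connection_completed never fired, the connect itself failed.
    @outcome = @connected ? :closed_after_connect : :connect_failed
  end
end

# Stand-in for a connection object, so no event loop is needed.
class StubClient
  include ConnectTracker
end

# Successful connect: post_init, connection_completed, then unbind.
ok = StubClient.new
ok.post_init
ok.connection_completed
ok.unbind

# Failed connect: post_init and unbind only.
failed = StubClient.new
failed.post_init
failed.unbind

p ok.outcome
p failed.outcome
```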

#receive_data will NEVER be called before #connection_completed. (Although
in some protocols, SMTP for one, it can easily be called immediately
thereafter.) For network geeks: look in the C++ source code for some
commentary on how this is achieved. It differs interestingly from OS to OS.

As an aside, you can easily use the same class to handle a server or a
client. In this case, you can switch in any distinct behaviors of the
client-side in the connection_completed handler, which will never be called
on the server side.
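
A rough sketch of sharing one handler between both sides, assuming hypothetical names (`SharedHandler`, `StubPeer`, the "HELLO" greeting) and a stubbed `send_data` in place of the real connection method. The handler starts out with server-side behaviour and switches to client-side behaviour only when connection_completed fires:

```ruby
# Hypothetical handler usable on either end of a connection.
module SharedHandler
  def post_init
    @role = :server # default; connection_completed never fires on a server
  end

  def connection_completed
    @role = :client
    send_data("HELLO\n") # hypothetical client-side opening message
  end

  def receive_data(data)
    send_data("#{@role} got #{data}")
  end
end

# Stand-in for a connection object; records what would have been sent.
class StubPeer
  include SharedHandler
  attr_reader :sent

  def initialize
    @sent = []
    post_init
  end

  def send_data(data)
    @sent << data
  end
end

server_side = StubPeer.new
client_side = StubPeer.new
client_side.connection_completed # only the client side sees this event

server_side.receive_data("ping")
client_side.receive_data("pong")
p server_side.sent
p client_side.sent
```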

Mark Van De Vyver

2007-May-15 18:35 UTC

[Eventmachine-talk] receive_data and define_method

Hi Francis,
Thanks for your patient explanations.  Yes it works to switch in the
receive_data method.

On reflection, and after tinkering, I realize I was expecting
connection_completed to behave in a special way.  I don't think this
would be useful, but I suppose I needed a
'post_connect_receive_data(data)' or 'initial_receive_data(data)'
method that is called only if it is defined by the user's connection
class. This method would take the place of the _first_ call to receive
data.  On reflection this would start to get messy.  Testing for a
connection and branching in receive_data is definitely cleaner.  As
you reminded, dealing with network transmissions means this type of
test/branch is never going to be the costly part of the application :)

Regards
Mark

On 5/16/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> Is it working out ok to just switch the handshake-completed detection in
> receive_data?
>

Francis Cianfrocca

2007-May-16 01:48 UTC

[Eventmachine-talk] receive_data and define_method

On 5/8/07, Mark Van De Vyver <mvyver at gmail.com> wrote:
> On reflection, and after tinkering, I realize I was expecting
> connection_completed to behave in a special way. I don't think this
> would be useful, but I suppose I needed a
> 'post_connect_receive_data(data)' or 'initial_receive_data(data)'
> method that is called only if it is defined by the user's connection
> class. This method would take the place of the _first_ call to receive
> data. On reflection this would start to get messy. Testing for a
> connection and branching in receive_data is definitely cleaner. As
> you reminded, dealing with network transmissions means this type of
> test/branch is never going to be the costly part of the application :)

Mark, I took the liberty of quoting this excerpt from your private email to
me and am answering here so that others may read it and add their comments.

I wonder if you're thinking of a feature that the Microsoft async libraries
have, where you can specify that the first chunk of incoming data can be
added to the notification of an accepted server-side connection. If I recall
correctly, IIS uses this as a performance enhancer because (again, if I
recall correctly) it avoids a kernel-crossing.

If that's what you had in mind, I don't think I'd be in favor of something
similar in EM because it would pollute the API (you'd have to add data
handling code to both post_init and receive_data). It might avoid a kernel
crossing, but I doubt the performance benefit would be life-changing.

Mark Van De Vyver

2007-May-16 16:16 UTC

[Eventmachine-talk] receive_data and define_method

> On 5/8/07, Mark Van De Vyver <mvyver at gmail.com> wrote:
> > On reflection, and after tinkering, I realize I was expecting
> > connection_completed to behave in a special way.  I don't think this
> > would be useful, but I suppose I needed a
> > 'post_connect_receive_data(data)' or 'initial_receive_data(data)'
> > method that is called only if it is defined by the user's connection
> > class. This method would take the place of the _first_ call to receive
> > data.  On reflection this would start to get messy.  Testing for a
> > connection and branching in receive_data is definitely cleaner.  As
> > you reminded, dealing with network transmissions means this type of
> > test/branch is never going to be the costly part of the application :)
>
> Mark, I took the liberty of quoting this excerpt from your private email to
> me and am answering here so that others may read it and add their comments.

I should have indicated the list was cc'd. Sorry.

> I wonder if you're thinking of a feature that the Microsoft async libraries
> have, where you can specify that the first chunk of incoming data can be
> added to the notification of an accepted server-side connection. If I recall
> correctly, IIS uses this as a performance enhancer because (again, if I
> recall correctly) it avoids a kernel-crossing.
>
> If that's what you had in mind, I don't think I'd be in favor of something
> similar in EM because it would pollute the API (you'd have to add data
> handling code to both post_init and receive_data). It might avoid a kernel
> crossing, but I doubt the performance benefit would be life-changing.

My reasoning wasn't so sophisticated :)
I agree though, when I thought it through, this separate definition of
receive_data didn't offer any benefit in terms of performance or
parsimonious code.

Regards
Mark
