Mark Van De Vyver
2007-May-08 17:36 UTC
[Eventmachine-talk] receive_data and define_method
Hi,

I have a use case where there is a handshake between a server and a client at the start of a connection. Because data is sent and received, I understand that this handshake should be conducted in the 'connection_completed' method.

This handshake involves receive_data, so in the receive_data method I need to test whether the handshake has finished or not. This imposes some (small?) cost every time receive_data is called.

My thought was to define receive_data dynamically at the instance level using Ruby's define_method: the handshake version would be defined at the end of the 'post_init' method and the regular 'receive_data' method at the end of the 'connection_completed' method. My understanding is that the code to use is

    self.class.send(:define_method, :receive_data) { |data| receive_handshake_data(data) }

and

    self.class.send(:define_method, :receive_data) { |data| receive_regular_data(data) }

where the receive_*_data methods are defined in the module/class.

Is this discouraged? That is, is it the case that receive_data will never be called before connection_completed is called? (That is my understanding/recollection.)

I can see that there might be an issue with the server sending data between these two calls, where there are several connections, especially if I defined receive_data at the class level. I thought that I protected against this by defining receive_data as an instance method, and by doing this at the end of the post_init method and at the end of the connection_completed method.

Or should I just bear the cost of checking whether the handshake is done in each call to receive_data?

Appreciate any suggestions.

Regards
Mark
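A note for readers trying this idea: `define_method` takes the method body as a block (or a Proc as a second argument), and calling it on `self.class` rebinds the method for every instance of that class, not just the current connection. The following is a minimal plain-Ruby sketch, not taken from the thread. The `Handler` and `SharedHandler` classes and the `handshake_finished!` method are hypothetical stand-ins with no EventMachine dependency, and `define_singleton_method` (Ruby 1.9 and later) is shown as one possible way to keep the swap per object.

```ruby
# Hypothetical stand-in for a connection class; no EventMachine required.
class Handler
  attr_reader :log

  def initialize
    @log = []
    # Start in handshake mode for this object only.
    define_singleton_method(:receive_data) { |data| receive_handshake_data(data) }
  end

  # Swap in the regular handler for this object only.
  def handshake_finished!
    define_singleton_method(:receive_data) { |data| receive_regular_data(data) }
  end

  def receive_handshake_data(data)
    @log << [:handshake, data]
  end

  def receive_regular_data(data)
    @log << [:regular, data]
  end
end

a = Handler.new
b = Handler.new
a.handshake_finished!
a.receive_data("payload")   # a has switched to the regular handler
b.receive_data("hello")     # b is still in handshake mode

# By contrast, define_method on the class rebinds the method for every
# instance that has no singleton override of its own.
class SharedHandler
  def receive_data(data); [:handshake, data]; end
end

x = SharedHandler.new
y = SharedHandler.new
x.class.send(:define_method, :receive_data) { |data| [:regular, data] }
# y.receive_data now runs the new body too, although only x asked for the swap.
```

The per-object swap works, but it trades one boolean test per call for a singleton method per connection, so it is not obviously cheaper than a flag.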
Francis Cianfrocca
2007-May-14 05:19 UTC
[Eventmachine-talk] receive_data and define_method
On 5/8/07, Mark Van De Vyver <mvyver at gmail.com> wrote:
> Hi,
> I have a use case where there is a handshake between a server and a client at the start of a connection. Because data is sent and received, I understand that this handshake should be conducted in the 'connection_completed' method.
>
> This handshake involves receive_data, so in the receive_data method I need to test whether the handshake has finished or not. This imposes some (small?) cost every time receive_data is called.
>
> My thought was to define receive_data dynamically at the instance level using Ruby's define_method: the handshake version would be defined at the end of the 'post_init' method and the regular 'receive_data' method at the end of the 'connection_completed' method.
> My understanding is that the code to use is
>
>     self.class.send(:define_method, :receive_data) { |data| receive_handshake_data(data) }
> and
>     self.class.send(:define_method, :receive_data) { |data| receive_regular_data(data) }
>
> where the receive_*_data methods are defined in the module/class.
>
> Is this discouraged? That is, is it the case that receive_data will never be called before connection_completed is called? (That is my understanding/recollection.)
>
> I can see that there might be an issue with the server sending data between these two calls, where there are several connections, especially if I defined receive_data at the class level.
> I thought that I protected against this by defining receive_data as an instance method, and by doing this at the end of the post_init method and at the end of the connection_completed method.
>
> Or should I just bear the cost of checking whether the handshake is done in each call to receive_data?
>
> Appreciate any suggestions.

The shortest answer I'd be inclined to give you would be to eat the cost of the handshake-complete check in receive_data.
It doesn't cost very much to set an instance variable in your protocol-handler class or module when the handshake completes, and then to take a true/false branch on each subsequent pass through the handler. If your application is running on an actual network, the cost of doing this will be lost in the background noise.

But I did want to touch on some of the questions you asked as a matter of general interest. One of the key design points for EM is to respond to incoming network events in a deterministic, dependable and documented way. The connection_completed event (or method, from the perspective of user code) is ONLY fired in classes that are handling TCP client connections. EM uses nonblocking TCP connects, which are an incredible performance booster, and connection_completed is used to tell you that a connection to a remote server has successfully completed.

There are several reasons not to do this in post_init. One of them is that post_init is reasonably expected to execute almost immediately after you instantiate EventMachine::Connection or a subclass thereof, and it can easily take dozens of milliseconds to complete a connection on an internet link. And if the connection fails entirely, the delay can be thousands of milliseconds! (In the latter case, EM is *guaranteed* to fire BOTH #post_init and #unbind, but NOT #connection_completed, which is how you can detect that a connection failure occurred.)

#receive_data will NEVER be called before #connection_completed. (Although in some protocols, SMTP for one, it can easily be called immediately thereafter.) For network geeks: look in the C++ source code for some commentary on how this is achieved. It differs interestingly from OS to OS.

As an aside, you can easily use the same class to handle a server or a client. In this case, you can switch in any distinct behaviors of the client side in the connection_completed handler, which will never be called on the server side.
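The flag-and-branch approach and the callback ordering described above can be sketched as follows. This is a simulation under stated assumptions, not EventMachine itself: `FakeConnection`, `HandshakeClient`, the `HELLO`/`OK` exchange, and the `outbox`/`events` fields are all invented for illustration, and the callbacks are invoked by hand in the order the message describes. In a real program the module would instead be handed to `EventMachine.connect`.

```ruby
# Hypothetical stand-in for EventMachine::Connection so the sketch runs
# without the gem; it records outgoing data instead of writing to a socket.
class FakeConnection
  attr_reader :outbox, :events

  def initialize
    @outbox = []
    @events = []
    post_init
  end

  def send_data(data)
    @outbox << data
  end

  # Default no-op callbacks, overridden by the handler module.
  def post_init; end
  def connection_completed; end
  def receive_data(data); end
  def unbind; end
end

# Handler module in the style EM expects; the HELLO/OK exchange is made up.
module HandshakeClient
  def post_init
    @connected = false
    @handshake_done = false
  end

  def connection_completed
    @connected = true
    send_data("HELLO\n")        # start the handshake once connected
  end

  def receive_data(data)
    if @handshake_done          # one cheap branch per call
      @events << [:data, data]
    elsif data == "OK\n"
      @handshake_done = true
      @events << [:handshake_ok]
    end
  end

  def unbind
    # unbind with no earlier connection_completed means the connect failed
    @events << (@connected ? :closed : :connect_failed)
  end
end

client_class = Class.new(FakeConnection) { include HandshakeClient }

# Successful connect: post_init, connection_completed, then receive_data.
ok = client_class.new
ok.connection_completed
ok.receive_data("OK\n")
ok.receive_data("payload")
ok.unbind

# Failed connect: post_init and unbind fire, connection_completed does not.
failed = client_class.new
failed.unbind
```

A real handler would also need to buffer partial handshake data, since TCP may split or coalesce chunks; the sketch assumes each chunk arrives whole.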
Mark Van De Vyver
2007-May-15 18:35 UTC
[Eventmachine-talk] receive_data and define_method
Hi Francis,

Thanks for your patient explanations. Yes, it works to switch in the receive_data method.

On reflection, and after tinkering, I realize I was expecting connection_completed to behave in a special way. I don't think this would be useful, but I suppose I needed a 'post_connect_receive_data(data)' or 'initial_receive_data(data)' method that is called only if it is defined by the user's connection class. This method would take the place of the _first_ call to receive_data. On reflection this would start to get messy. Testing for a connection and branching in receive_data is definitely cleaner. As you reminded me, dealing with network transmissions means this type of test/branch is never going to be the costly part of the application :)

Regards
Mark

On 5/16/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> Is it working out ok to just switch the handshake-completed detection in
> receive_data?
Francis Cianfrocca
2007-May-16 01:48 UTC
[Eventmachine-talk] receive_data and define_method
On 5/8/07, Mark Van De Vyver <mvyver at gmail.com> wrote:
> On reflection, and after tinkering, I realize I was expecting connection_completed to behave in a special way. I don't think this would be useful, but I suppose I needed a 'post_connect_receive_data(data)' or 'initial_receive_data(data)' method that is called only if it is defined by the user's connection class. This method would take the place of the _first_ call to receive data. On reflection this would start to get messy. Testing for a connection and branching in receive_data is definitely cleaner. As you reminded, dealing with network transmissions means this type of test/branch is never going to be the costly part of the application :)

Mark, I took the liberty of quoting this excerpt from your private email to me and am answering here so that others may read it and add their comments.

I wonder if you're thinking of a feature that the Microsoft async libraries have, where you can specify that the first chunk of incoming data can be added to the notification of an accepted server-side connection. If I recall correctly, IIS uses this as a performance enhancer because (again, if I recall correctly) it avoids a kernel crossing.

If that's what you had in mind, I don't think I'd be in favor of something similar in EM because it would pollute the API (you'd have to add data handling code to both post_init and receive_data). It might avoid a kernel crossing, but I doubt the performance benefit would be life-changing.
Mark Van De Vyver
2007-May-16 16:16 UTC
[Eventmachine-talk] receive_data and define_method
> On 5/8/07, Mark Van De Vyver <mvyver at gmail.com> wrote:
>> On reflection, and after tinkering, I realize I was expecting connection_completed to behave in a special way. I don't think this would be useful, but I suppose I needed a 'post_connect_receive_data(data)' or 'initial_receive_data(data)' method that is called only if it is defined by the user's connection class. This method would take the place of the _first_ call to receive data. On reflection this would start to get messy. Testing for a connection and branching in receive_data is definitely cleaner. As you reminded, dealing with network transmissions means this type of test/branch is never going to be the costly part of the application :)
>
> Mark, I took the liberty of quoting this excerpt from your private email to me and am answering here so that others may read it and add their comments.

I should have indicated the list was cc'd. Sorry.

> I wonder if you're thinking of a feature that the Microsoft async libraries have, where you can specify that the first chunk of incoming data can be added to the notification of an accepted server-side connection. If I recall correctly, IIS uses this as a performance enhancer because (again, if I recall correctly) it avoids a kernel-crossing.
>
> If that's what you had in mind, I don't think I'd be in favor of something similar in EM because it would pollute the API (you'd have to add data handling code to both post_init and receive_data). It might avoid a kernel crossing, but I doubt the performance benefit would be life-changing.

My reasoning wasn't so sophisticated :) I agree though: when I thought it through, this separate definition of receive_data didn't offer any benefit in terms of performance or parsimonious code.

Regards
Mark