In answer to "how to make post_init more useful", the following was my original suggestion. I think it boils down to the fact that we wish some local variables could be made visible to instances at creation time (and therefore to all of their methods), through some means or other. One way would be to let users "tweak" the mixed-in class that will be used for each connection, thus: one code block operates on the "server mixed-in class" itself, and then a second ("normal") block operates on each connection as it is created. Kind of like this:

  file_name_you_want = 'b'

  block_run_once_against_new_class = Proc.new { |new_generic_mixed_in_class_itself, connection_info_hash|
    # this one operates on the class itself, so it could define methods or set class variables
    def new_generic_mixed_in_class_itself.my_identifier_name
      # define an identification method, or whatever else is needed
      return file_name_you_want
    end
  }

  # now we pass in that block, and also supply the per-client block
  EM.start_server('localhost', 8081, block_run_once_against_new_class) { |conn|
    # this block deals only with clients, not with server initialization itself
    # (that was accomplished previously)
  }

This would work well for my type of scenario, because I have one server on one port per file, so I can instruct the server (once) which file it is associated with and have that information available for each client as it attaches. This way makes more sense to me, as it has an initialization phase and then a connection phase. It also wouldn't break pre-existing ordering, and might (maybe) lead to a small speed improvement from not having to assign things over and over. It might not be the most efficient answer, however, since it would prevent you from caching class definitions (should we ever want to, as an optimization), but it would make sense in one's mind for how things work. And despite being clearer than what exists, it is still a little trippy to have to define two code blocks, pass one of them in, and operate on a Class object itself in that one block. Just my $0.02 -- I might code it up sometime.

Now, a few questions I still have about EM:

Francis said "EM has its own internal buffers for each inbound connection." Am I to understand that while one socket's incoming data is being processed, EM buffers the incoming data for every other open socket, and then processes the next socket's buffered data once control returns to it? Just double checking that things run sensibly :)

Another question: how large are EM's send buffers? I.e. if I do a send_data 'a'*1000000, will it basically block everything else, or does it buffer strings of arbitrary length?

That's about it! Thanks for everything.
-Roger
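For illustration only, here is a rough, untested sketch of how the two-block idea above might be approximated today with a small wrapper around EM.start_server. The start_configured_server helper is not part of EM (it is made up for this sketch), and define_method is used instead of def so the local variable stays visible through the closure:

  require 'eventmachine'

  # Hypothetical wrapper (not part of EM) sketching the two-block idea:
  # a setup block runs once against a fresh handler class, then the normal
  # per-connection block runs for each client that attaches.
  def start_configured_server(host, port, setup_block = nil, &per_conn_block)
    handler = Class.new(EM::Connection)        # a fresh handler class per listener
    setup_block.call(handler, :host => host, :port => port) if setup_block
    EM.start_server(host, port, handler, &per_conn_block)
  end

  file_name_you_want = 'b'

  setup = Proc.new do |handler_class, connection_info|
    # define_method keeps access to surrounding locals via the closure,
    # which a plain `def` would not
    handler_class.send(:define_method, :my_identifier_name) { file_name_you_want }
  end

  EM.run do
    start_configured_server('localhost', 8081, setup) do |conn|
      # per-client work; conn.my_identifier_name is available here
    end
  end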
Francis Cianfrocca
2007-Oct-30 01:27 UTC
[Eventmachine-talk] post init, and some EM questions
On 10/30/07, Roger Pack <rogerpack2005 at gmail.com> wrote:

> In answer to "how to make post_init more useful", the following was my
> original suggestion. I think it boils down to the fact that we wish some
> local variables could be made visible to instances at creation time (and
> therefore to all of their methods), through some means or other.
> [...]
> This would work well for my type of scenario, because I have one server
> on one port per file, so I can instruct the server (once) which file it
> is associated with and have that information available for each client
> as it attaches.

I'm not seeing what you need to do that can't be accomplished by using a Class for your connection handler rather than a mixed-in module. For example:

  class A < EM::Connection
  end

  class B < EM::Connection
  end

  EM.run {
    start_server( addressA, portA, A ) { |conn| ... }
    start_server( addressB, portB, B ) { |conn| ... }
  }

> Now, a few questions I still have about EM:
> Francis said "EM has its own internal buffers for each inbound
> connection." Am I to understand that while one socket's incoming data is
> being processed, EM buffers the incoming data for every other open
> socket, and then processes the next socket's buffered data once control
> returns to it? Just double checking that things run sensibly :)

The EM reactor loops around to each connected socket. For each one that has data available to read, "some or all" of the data are read out and passed to user-written event handlers. The reactor blocks until the user code returns, at which point it dispatches data from the next readable socket in the sequence. While this is happening, of course, the kernel is still pulling data off the network as it arrives. On any given pass through the loop, the reactor pulls only some (or all) of what is available in the kernel's inbound buffers for any given connection.

I put weasel quotes around "some or all" because the reactor tries to be sophisticated about how it reads inbound data. It will coalesce multiple small data packets when possible. With connections that are receiving a lot of data, it will sometimes avoid reading everything on each pass, to keep from "starving" the other connections. There are some other optimizations as well.
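As a concrete illustration of the "some or all" point: since receive_data may deliver anywhere from a single byte to several coalesced packets per call, handlers generally do their own framing. A minimal, untested sketch (the class name, port, and echo behaviour are made up for the example):

  require 'eventmachine'

  # A line-oriented handler that copes with receive_data delivering
  # "some or all" of the inbound bytes on each call.
  class LineEchoServer < EM::Connection
    def post_init
      @buffer = ''            # accumulate partial reads here
    end

    def receive_data(data)
      @buffer << data
      # pull out every complete line; anything after the last newline
      # stays in @buffer until more data arrives
      while (line = @buffer.slice!(/\A[^\n]*\n/))
        send_data("echo: #{line}")
      end
    end
  end

  EM.run do
    EM.start_server('127.0.0.1', 8082, LineEchoServer)
  end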
> Another question: how large are EM's send buffers? I.e. if I do a
> send_data 'a'*1000000, will it basically block everything else, or does
> it buffer strings of arbitrary length?

There's no built-in limit on the outbound buffer size. In general, outbound data is buffered in the process's userland memory and scheduled out to the network as outbound kernel buffers become available. To manage this process better, you can read the size of the outbound (userland) data buffer for any connection and schedule your data writes using EM#next_tick. This technique works spectacularly well with servers that generate a lot of output, but unfortunately it's underdocumented.
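For what it's worth, here is a rough, untested sketch of that throttling pattern, assuming EM::Connection#get_outbound_data_size is available for reading the buffered amount; the chunk size, high-water mark, port, and payload are all invented for illustration:

  require 'eventmachine'

  # Streams a large payload without letting the userland outbound buffer
  # grow without bound: write a chunk, then re-schedule via next_tick,
  # skipping the write whenever the buffered amount is above a threshold.
  class BigSender < EM::Connection
    CHUNK     = 16 * 1024       # bytes per write (arbitrary)
    HIGH_MARK = 256 * 1024      # back off above this much buffered data (arbitrary)

    def post_init
      @payload = 'a' * 1_000_000
      @offset  = 0
      EM.next_tick { stream_chunk }
    end

    def stream_chunk
      if get_outbound_data_size < HIGH_MARK
        send_data(@payload[@offset, CHUNK])
        @offset += CHUNK
      end
      if @offset < @payload.size
        EM.next_tick { stream_chunk }   # come back on a later reactor pass
      else
        close_connection_after_writing
      end
    end
  end

  EM.run do
    EM.start_server('127.0.0.1', 8083, BigSender)
  end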
>> Hmm. It would be useful for me because I have the
>>   class FileServer < EM::Connection
>>   end
>> class, and then arbitrarily, later, I want to start new servers to handle
>> new files, on request. So a "setup" phase for each server class instance
>> is to tell it which file it is serving; then, as clients connect, it
>> should process those clients appropriately.

> I'm still not sure I'm getting this. Would you like to start up multiple
> servers, one for each of the files you want to serve? So you'd allocate a
> different IP address/port combination for each of these several servers?

Correct. It's kind of for load testing: "what if 5000 of these servers were running simultaneously -- would it work?", that type of thing.

Regardless of this, some means of passing variables to instances before things like post_init are called would still be useful (even in traditional use).

Just some thoughts.
-Roger
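Purely as an illustration of the scenario described above (one listener per file, each knowing which file it serves), here is a rough, untested sketch that gives each listener its own handler subclass; the ports, file names, and dump-the-file-on-connect behaviour are all made up:

  require 'eventmachine'

  class FileServer < EM::Connection
    # each listener gets its own subclass, so the class itself can carry
    # the name of the file that listener serves
    class << self
      attr_accessor :file_name
    end

    def post_init
      send_data(File.read(self.class.file_name))   # toy behaviour: dump the file
      close_connection_after_writing
    end
  end

  EM.run do
    files = { 9001 => 'a.txt', 9002 => 'b.txt' }    # made-up port/file pairs
    files.each do |port, name|
      handler = Class.new(FileServer)
      handler.file_name = name
      EM.start_server('127.0.0.1', port, handler)
    end
  end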
Francis Cianfrocca
2007-Oct-30 19:05 UTC
[Eventmachine-talk] post init, and some EM questions
On 10/30/07, Roger Pack <rogerpack2005 at gmail.com> wrote:

> [...]
> Correct. It's kind of for load testing: "what if 5000 of these servers
> were running simultaneously -- would it work?", that type of thing.
>
> Regardless of this, some means of passing variables to instances before
> things like post_init are called would still be useful (even in
> traditional use).
> Just some thoughts.
> -Roger

Make sure you do this with epoll under Linux 2.6. Otherwise you won't manage to get 5000 servers running simultaneously. (You'll top out somewhere around 1020.)
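For reference, a minimal sketch of the epoll setup being suggested; the descriptor limit of 20,000 is an arbitrary example, the set_descriptor_table_size call is guarded in case the installed EM version lacks it, and the operating-system file-descriptor limit (ulimit -n) may also need raising:

  require 'eventmachine'

  # Ask EM to use epoll instead of select, and raise EM's descriptor
  # ceiling well past select's ~1024 limit. Both must happen before EM.run.
  EM.epoll
  EM.set_descriptor_table_size(20_000) if EM.respond_to?(:set_descriptor_table_size)

  EM.run do
    # start the listeners here, e.g. one per file as sketched earlier
  end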