A few random questions for a newbie like me:

Is there a way to start a server on 'the next open port' instead of specifying one?

Does send_file send the entire file at once? i.e. is sending a 1G file unadvised?

Does eventmachine buffer all input to sockets, until Ruby gets the chance to process it, eventually? I.e. say it takes one incoming chunk of data 5s to process--will incoming data on other ports just fill network queues, or be accepted and buffered, during that time? That might be nice...

Do servers have a large backlog? Does that matter?

Is there a way to flush a socket?

More questions:
I would like to submit patches to help the documentation, add useful functions/features, etc. How could I best contribute? Where should I also post suggestions for new features I don't write?

And finally--why is eventmachine so cool?

Anyway that's it for now. Happy to be here :)

--
-Roger Pack
I like belief.
http://www.google.com/search?q=free+bible
On 10/15/07, Roger Pack <rogerpack2005 at gmail.com> wrote:
> A few random questions for a newbie like me:
>
> Is there a way to start a server on 'the next open port' instead of specifying one?

No. What does "the next open port" mean in this context?

> Does send_file send the entire file at once? i.e. is sending a 1G file unadvised?

#send_file was written by Kirk Haines to speed up HTTP servers. It does NOT send the entire file at once, but schedules it out piece by piece, carefully monitoring the size of the outbound kernel buffers as it goes. You can probably send terabytes through it.

> Does eventmachine buffer all input to sockets, until Ruby gets the chance to process it, eventually? I.e. say it takes one incoming chunk of data 5s to process--will incoming data on other ports just fill network queues, or be accepted and buffered, during that time? That might be nice...

EM has its own internal buffers for each inbound connection. They're somewhat larger than the kernel buffers, and when they fill up, the kernel starts applying backpressure to the remote peers. So the short answer to your question is yes, but the slightly longer answer is: why does your processing take 5 seconds? That's worth reducing or breaking up if you can. Otherwise, in a server with a lot of connections, you'll starve the other ones.

> Do servers have a large backlog? Does that matter?

I assume you mean an accept backlog. It's 50 or 100 connections iirc, not sure which. Higher than the default on most platforms.

> Is there a way to flush a socket?

Everything in EM is nonblocking. When you call #send_data, it returns immediately, and the data will be sent when the system gets to it. Why do you need an explicit flush? (Which usually means nothing more than copying data to the kernel buffers.) If necessary, that can be simulated now, but if you really needed it, we should add some syntactic sugar to make it easier.

> More questions:
> I would like to submit patches to help the documentation, add useful functions/features, etc. How could I best contribute? Where should I also post suggestions for new features I don't write?

Post feature requests to this list. It's a lot more actively monitored than the lists at rubyforge. Docs: We need FAQs and use cases. If you look at Twisted's documentation, they have large lists of questions like "How do I do xyz" where xyz is some generally useful thing. Basically recipe lists. EM has gotten so full of functionality that it's pretty hard to learn all of it, so shortcuts to specific requirements would be a huge help.

> And finally--why is eventmachine so cool?

Because it makes network programming brain-dead easy for Ruby programmers, while *simultaneously* providing extremely good performance and scalability. It eliminates the need for threaded programming in most cases. And it supports a large number of standard internet protocols out of the box, with more on the way.

> Anyway that's it for now. Happy to be here :)

Glad to have you.
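On the "next open port" question: independent of EventMachine, the usual operating-system answer is to bind to port 0 and let the kernel pick a free ephemeral port, then read the chosen port back. The following is a minimal sketch using only Ruby's standard socket library. It is not EM's API, and whether EM's start_server accepts port 0 is not confirmed anywhere in this thread.

```ruby
require 'socket'

# Port 0 asks the kernel for any free (ephemeral) port.
server = TCPServer.new('127.0.0.1', 0)

# The port actually chosen has to be read back from the bound socket;
# TCPServer#addr returns [family, port, hostname, ip].
chosen_port = server.addr[1]
puts "listening on port #{chosen_port}"

server.close
```

The catch is that clients still need some way to learn the chosen port, so this mostly suits tests and tools that start both ends themselves.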
Thank you for the quick response! Answer and questions below:

> > Is there a way to start a server on 'the next open port' instead of specifying one?
>
> No. What does "the next open port" mean in this context?

The equivalent of what establishing a client does--just uses any random port. Usefulness being 'I don't want to choose a bad port!' [which, BTW I'm not even sure how to handle in EM--I should figure it out and write it up, as you suggested.]

> > Do servers have a large backlog? Does that matter?
>
> I assume you mean an accept backlog. It's 50 or 100 connections iirc, not sure which. Higher than the default on most platforms.

I believe mongrel itself has the backlog at 1024. Might be worth considering.

> > Is there a way to flush a socket?
>
> Everything in EM is nonblocking. When you call #send_data, it returns immediately, and the data will be sent when the system gets to it. Why do you need an explicit flush? (Which usually means nothing more than copying data to the kernel buffers.) If necessary, that can be simulated now, but if you really needed it, we should add some syntactic sugar to make it easier.

If I understand correctly, either the TCP or IP layer caches data (in an attempt to create 'full' IP packets, instead of partial ones). After a certain timeout it goes ahead and sends the partial packet. Flushing it forces it to send any (currently) partial packets immediately. What might be nice is if EM would just always flush after a send_data so that developers never had to worry about it, ever :) I would totally dig that. Or had a send_data_and_flush_immediately command or what not. Though I do understand your confusion, as it would seem that this is useless to the user, per se, as EM (currently) doesn't call any functions when data writing is done (see below for a suggestion on that).

> Post feature requests to this list. It's a lot more actively monitored than the lists at rubyforge. Docs: We need FAQs and use cases.

I will start a FAQ on the wiki, and add these questions to it :) Is that a good place for it? I think I will also make a list of all the 'user definable' functions and the order each is called (and when), and put it there.

Also I have edited the default 0.9.0 files slightly to make them more verbose and comprehensive. Should I just submit patches?

> Glad to have you.

Glad to be here.

In terms of suggestions, I do have a few. The first I have is to execute any "associated connection block" (meaning

    EventMachine::start_server("127.0.0.1", port, EchoServer) { |conn|
      # this block right here
    }

BEFORE post_init (or is there a different callback that is called after the block, but only once at init that I don't know of?)

as it then allows for more old school style code like

    module EchoServer
      attr_accessor :your_special_number

      def post_init
        print "my number is #{@your_special_number}"
      end
    end

    25.times do |n|
      EventMachine::start_server("127.0.0.1", port + n, EchoServer) { |conn|
        conn.your_special_number = n
      }
    end

Which allows post_init to be more of an initializer for the instance, as it can then do things with 'passed in parameters' (from the block). Sorry if that didn't make much sense, but anyway it was quite useful for me. It would be a trivial code fix, I could do it and commit it.

Another suggestion might be reusable class instances--say you have something that for some reason has a high churn and so is generating tons of instances--it might save memory or what not to be able to just reuse those (of course, this might be a very bad idea, too, in the case of people redefining modules for each instance or what not).

A question: is there a function "what is _my_ port (host port, host name)?"

suggestion: I created some get_peername_ip and get_peername_host helper functions. Is there a reason they don't exist? Should I commit them?

suggestion: the addition of a post_write function might be nice, as well (or does it exist?) i.e. if I wanted to write the numbers from 1-10000000 onto the wire for some reason, I couldn't do that all in one loop (or could I?--even if I could I might not want to as it might starve the others). If I wanted to, a post_write function would be nice (maybe pass it the total number of bytes ever sent or something). Then I could send it piece-wise.

Another suggestion might be the creation of a function send_partial (the equivalent of send in Python) -- maybe it can report how much it was able to send 'instantaneously'. I wouldn't actually need it for anything, but maybe somebody somewhere would like such a feature :)

Well that's about it for now!
EM rox!
-Roger
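The send_partial idea has a close analogue in Ruby's standard library: IO#write_nonblock hands the kernel as much as it will take right now and returns the number of bytes accepted. The sketch below runs on a plain Unix socket pair, outside EventMachine. The send_partial name is hypothetical and is not an EM method; the 8 MB payload is just an assumption chosen to be larger than a typical socket buffer.

```ruby
require 'socket'

# Hypothetical helper: try to send without blocking and report how many
# bytes the kernel accepted immediately (0 if the buffer is already full).
def send_partial(sock, data)
  result = sock.write_nonblock(data, exception: false)
  result == :wait_writable ? 0 : result
end

reader, writer = UNIXSocket.pair

# Large enough that the kernel buffer usually cannot take it in one go.
payload = "x" * (8 * 1024 * 1024)

sent = send_partial(writer, payload)
puts "kernel accepted #{sent} of #{payload.bytesize} bytes immediately"

[reader, writer].each(&:close)
```

A caller would keep the unsent remainder (`payload.byteslice(sent..)`) and retry once the socket becomes writable again, which is the bookkeeping EM's #send_data hides.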
I second the "who am I" queries. I've wanted such information recently and just had to do a very hacky counting scheme. A block that's run at Connection creation, as Roger suggested, seems like the right way to go.

And yes, some sort of helper for:

    port, address = Socket.unpack_sockaddr_in(EM.get_peername)

because that's just ugly. Maybe EM::get_connection_info?

Also, I'd like to put up some information on the website, mainly a few quick tutorials on the simple stuff I've done so far with EM. As I'm a huge fan of Trac, what would it take to install and run it on the site? Then we have a wiki, code browser, and bug tracker all in one.

Jason
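The unpacking step can be tried on plain sockets. This sketch uses only Ruby's standard socket library, with a loopback server and client standing in for an EM connection (an assumption made so the example runs without EventMachine). The standard-library method is Socket.unpack_sockaddr_in, and getpeername/getsockname return the packed sockaddr strings it expects.

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a port
listen_port = server.addr[1]
client = TCPSocket.new('127.0.0.1', listen_port)
accepted = server.accept

# "Who is my peer": remote end as seen from the accepted socket.
peer_port, peer_ip = Socket.unpack_sockaddr_in(accepted.getpeername)

# "Who am I": local end of the same connection.
my_port, my_ip = Socket.unpack_sockaddr_in(accepted.getsockname)

puts "peer #{peer_ip}:#{peer_port}, me #{my_ip}:#{my_port}"

[accepted, client, server].each(&:close)
```

Note the return order: port first, then address.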
There's a small wiki at
http://rubyforge.org/plugins/usemodwiki/include/?group_id=1555
Maybe that might be worth considering.

> Also, I'd like to put up some information on the website, mainly a few quick tutorials on the simple stuff I've done so far with EM. As I'm a huge fan of Trac, what would it take to install and run it on the site? Then we have a wiki, code browser, and bug tracker all in one.
>
> Jason
On 10/15/07, Roger Pack <rogerpack2005 at gmail.com> wrote:
> The equivalent of what establishing a client does--just uses any random port. Usefulness being 'I don't want to choose a bad port!' [which, BTW I'm not even sure how to handle in EM--I should figure it out and write it up, as you suggested.]

Not really useful, because you have to then somehow tell your clients which ephemeral port you picked! In regard to "bad ports": you can't pick an address/port combination that is already in use, because EM will complain loudly. If you're concerned about not picking a port that is already "standard" for some usage or other, then look in /etc/services to see if the number you want is already in use.

> > > Do servers have a large backlog? Does that matter?
> >
> > I assume you mean an accept backlog. It's 50 or 100 connections iirc, not sure which. Higher than the default on most platforms.
>
> I believe mongrel itself has the backlog at 1024. Might be worth considering.

Setting a large backlog actually consumes a lot of kernel resources. I don't see that it's worth doing unless socket acceptance proceeds slowly for some reason. (Remember, mongrel is multithreaded in Ruby, so it won't be as fast as EM.) I could be convinced otherwise.

> > > Is there a way to flush a socket?
> >
> > Everything in EM is nonblocking. When you call #send_data, it returns immediately, and the data will be sent when the system gets to it. Why do you need an explicit flush? (Which usually means nothing more than copying data to the kernel buffers.) If necessary, that can be simulated now, but if you really needed it, we should add some syntactic sugar to make it easier.
>
> If I understand correctly, either the TCP or IP layer caches data (in an attempt to create 'full' IP packets, instead of partial ones). After a certain timeout it goes ahead and sends the partial packet. Flushing it forces it to send any (currently) partial packets immediately. What might be nice is if EM would just always flush after a send_data so that developers never had to worry about it, ever :) I would totally dig that. Or had a send_data_and_flush_immediately command or what not. Though I do understand your confusion, as it would seem that this is useless to the user, per se, as EM (currently) doesn't call any functions when data writing is done (see below for a suggestion on that).

It sounds like you're talking about disabling the "slow-start" or Nagle algorithm as it applies to TCP. Usually I always turn that off when I write a network app, so I checked in EM just to see. Lo and behold, it does *not* disable Nagling. That's worth a try, so thanks for pointing it out.

> > Post feature requests to this list. It's a lot more actively monitored than the lists at rubyforge. Docs: We need FAQs and use cases.
>
> I will start a FAQ on the wiki, and add these questions to it :) Is that a good place for it? I think I will also make a list of all the 'user definable' functions and the order each is called (and when), and put it there.

There's a web site at http://rubyeventmachine.org which has nothing but an under-construction page at the moment. That would be the ideal place to post new material. There's no Trac instance up there yet, but I've wanted to do that for a while now. I'll post here when that's done. It would be great to have interested folks help get this effort off the ground. I'd put up your FAQ on the Rubyforge wiki, and expect that it will migrate to the actual EM site, hopefully soon. Just make sure you post announcements here as you add content.

> Also I have edited the default 0.9.0 files slightly to make them more verbose and comprehensive. Should I just submit patches?

Please do.

> In terms of suggestions, I do have a few. The first I have is to execute any "associated connection block" (meaning
>
>   EventMachine::start_server("127.0.0.1", port, EchoServer) { |conn|
>     # this block right here
>   }
>
> BEFORE post_init (or is there a different callback that is called after the block, but only once at init that I don't know of?)
>
> as it then allows for more old school style code like
>
>   module EchoServer
>     attr_accessor :your_special_number
>
>     def post_init
>       print "my number is #{@your_special_number}"
>     end
>   end
>
>   25.times do |n|
>     EventMachine::start_server("127.0.0.1", port + n, EchoServer) { |conn|
>       conn.your_special_number = n
>     }
>   end
>
> Which allows post_init to be more of an initializer for the instance, as it can then do things with 'passed in parameters' (from the block). Sorry if that didn't make much sense, but anyway it was quite useful for me. It would be a trivial code fix, I could do it and commit it.

There already is a specific and defined protocol for the associated block, and we shouldn't change it because there is probably a lot of code that depends on it. Take the following code:

    class X < EM::Connection
      def initialize *args
        super
        STDERR.puts "initializer"
      end

      def post_init
        STDERR.puts "post init"
      end
    end

    EM.run {
      EM.connect( some_host, some_port, X ) {|conn|
        STDERR.puts "block"
      }
    }

When a connection is accepted, the subclassed constructor (initialize) is called FIRST. #post_init is called INSIDE of the base class constructor. (You MUST remember to call #super if you override #initialize.) Obviously you can execute your own code before and/or after the super call, but keep in mind that #post_init will execute and return before #super returns. Hope that was clear.

The block that you pass to #connect will be called AFTER both initialize and post_init have been called and returned.
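The call order described here can be modeled in plain Ruby. The sketch below does not use EventMachine: BaseConnection and connect are hypothetical stand-ins that only imitate the sequence stated in this message (post_init runs inside the base constructor, the block runs last), so the order can be observed without a reactor or a network.

```ruby
# Stand-in for EM::Connection (assumption): the base constructor
# calls post_init itself, as described in the message.
class BaseConnection
  def initialize(*args)
    post_init
  end

  def post_init; end
end

class X < BaseConnection
  attr_reader :log

  def initialize(*args)
    @log = ["before super"]
    super                    # post_init runs in here
    @log << "after super"
  end

  def post_init
    @log << "post_init"
  end
end

# Stand-in for EM.connect (assumption): build the handler, then
# yield it to the associated block.
def connect(handler_class, *args)
  conn = handler_class.new(*args)
  yield conn if block_given?
  conn
end

conn = connect(X) { |c| c.log << "block" }
puts conn.log.inspect
```

Under this model, anything the block sets on the connection is not yet visible inside post_init, which is the limitation behind the original suggestion.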
> Another suggestion might be reusable class instances--say you have something that for some reason has a high churn and so is generating tons of instances--it might save memory or what not to be able to just reuse those (of course, this might be a very bad idea, too, in the case of people redefining modules for each instance or what not).

Hmm. That's not such a bad idea. User code would need to provide a "reinitialize" method of some kind.

> A question: is there a function "what is _my_ port (host port, host name)?"
>
> suggestion: I created some get_peername_ip and get_peername_host helper functions. Is there a reason they don't exist? Should I commit them?

I found that to be something of a pain because some connection types (like Unix-domain sockets and UDP sockets) don't implement these things the same way as TCP clients do. What's the best approach here? Provide helper functions for all the different connection types and just throw exceptions if you use one that doesn't make sense?

> suggestion: the addition of a post_write function might be nice, as well (or does it exist?) i.e. if I wanted to write the numbers from 1-10000000 onto the wire for some reason, I couldn't do that all in one loop (or could I?--even if I could I might not want to as it might starve the others). If I wanted to, a post_write function would be nice (maybe pass it the total number of bytes ever sent or something). Then I could send it piece-wise.

There is a complete mechanism for doing this, but it could really use a FAQ entry. Look at the rdoc for the #next_tick function. Also, read the SPAWNED_PROCESSES document.

> Another suggestion might be the creation of a function send_partial (the equivalent of send in Python) -- maybe it can report how much it was able to send 'instantaneously'. I wouldn't actually need it for anything, but maybe somebody somewhere would like such a feature :)

I can see the point of this in a program that needs to explicitly manage the size of the outbound network buffers, but EM wraps all this up. I'm open to adding advanced functions that might expose something like this, but in general it's better not to rely on the fancy stuff unless really necessary.
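On the Nagle point raised earlier in this message: on a plain TCP socket, the algorithm is switched off with the TCP_NODELAY socket option. This is a minimal sketch with Ruby's standard socket library on a loopback connection. It is not EventMachine's API, and how or whether EM exposes this option is not stated in the thread.

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a port
client = TCPSocket.new('127.0.0.1', server.addr[1])

# Disable Nagle: small writes are sent right away instead of being
# held back to be coalesced into fuller segments.
client.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)

# Read the option back to confirm it took effect (non-zero means on).
nodelay = client.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int
puts "TCP_NODELAY is #{nodelay.zero? ? 'off' : 'on'}"

[client, server].each(&:close)
```

This trades fewer, fuller packets for lower latency on small writes, so it suits chatty request/response protocols more than bulk transfer.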
On 10/15/07, Jason Roelofs <jameskilton at gmail.com> wrote:
> I second the "who am I" queries. I've wanted such information recently and just had to do a very hacky counting scheme. A block that's run at Connection creation, as Roger suggested, seems like the right way to go.

    EM.run {
      EM.connect( host, port, handler ) {|conn|
        # will be called after #post_init
      }
    }

> And yes, some sort of helper for:
>
>   port, address = Socket.unpack_sockaddr_in(EM.get_peername)
>
> because that's just ugly. Maybe EM::get_connection_info?

Whatever we do here has to make sense for TCP clients, unix-domain sockets, UDP sockets, even process IDs and file handles. Any suggestions on a good API that would behave reasonably for any connection type? Maybe we return a hash with only the values filled in that make sense?

> Also, I'd like to put up some information on the website, mainly a few quick tutorials on the simple stuff I've done so far with EM. As I'm a huge fan of Trac, what would it take to install and run it on the site? Then we have a wiki, code browser, and bug tracker all in one.

Are you a good Trac admin? Want to set it up on rubyeventmachine.org?
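One way to picture the "hash with only the values that make sense" idea is a helper over plain Ruby sockets. This is a sketch under assumptions: connection_info is a hypothetical name, it operates on standard-library sockets rather than EM connections, and it covers only IP and Unix-domain sockets (not process IDs or file handles).

```ruby
require 'socket'

# Hypothetical helper: describe both ends of a socket, filling in only
# the keys that are meaningful for that socket's address family.
def connection_info(sock)
  info = {}
  { local: sock.local_address, peer: sock.remote_address }.each do |side, addr|
    if addr.ip?
      info[:"#{side}_ip"] = addr.ip_address
      info[:"#{side}_port"] = addr.ip_port
    elsif addr.unix?
      info[:"#{side}_path"] = addr.unix_path
    end
  end
  info
end

# A TCP connection over loopback (port 0 lets the OS pick the port).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
client = TCPSocket.new('127.0.0.1', port)
tcp_info = connection_info(client)

# An unnamed Unix-domain socket pair: no IPs or ports apply here.
left, right = UNIXSocket.pair
unix_info = connection_info(left)

puts tcp_info.inspect
puts unix_info.inspect

[client, server, left, right].each(&:close)
```

Callers then test for a key instead of rescuing an exception, and the :local_* keys double as an answer to the "what is my port" question.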
> It would be a trivial code fix, I could do it and commit it.

The repercussions are not so trivial. This "fix" would doubtless break existing code that depends on the order of those two calls.

Mark Z.

_____

From: eventmachine-talk-bounces at rubyforge.org [mailto:eventmachine-talk-bounces at rubyforge.org] On Behalf Of Roger Pack
Sent: Monday, October 15, 2007 1:41 PM
To: eventmachine-talk at rubyforge.org
Subject: Re: [Eventmachine-talk] EM random questions

In terms of suggestions, I do have a few. The first I have is to execute any "associated connection block" (meaning

    EventMachine::start_server("127.0.0.1", port, EchoServer) { |conn|
      # this block right here
    }

BEFORE post_init (or is there a different callback that is called after the block, but only once at init that I don't know of?)

as it then allows for more old school style code like

    module EchoServer
      attr_accessor :your_special_number

      def post_init
        print "my number is #{@your_special_number}"
      end
    end

    25.times do |n|
      EventMachine::start_server("127.0.0.1", port + n, EchoServer) { |conn|
        conn.your_special_number = n
      }
    end

Which allows post_init to be more of an initializer for the instance, as it can then do things with 'passed in parameters' (from the block). Sorry if that didn't make much sense, but anyway it was quite useful for me. It would be a trivial code fix, I could do it and commit it.
Comments below:> > The equivalent of what establishing a client does--just uses any random > > port. Usefulness being ''I don''t want to choose a bad port!'' [which, BTW I''m > > not even sure how to handle in EM--I should figure it out and write it up, > > as you suggested.] > > >Not really useful, because you have to then somehow tell your clients which> ephemeral port you picked! In regard to "bad ports": you can''t pick an > adress/port combination that is already in use, because EM will complain > loudly. If you''re concerned about not picking a port that is already > "standard" for some usage or other, then look in /etc/services to see if the > number you want is already in use. >Python does this for server acceptors, and it''s not truly useful but more convenient than anything. Just a thought for a future version :) Do servers have a large backlog? Does that matter?> > > > > > I assume you mean an accept backlog. It''s 50 or 100 connections iirc, > > > not sure which. Higher than the default on most platforms. > > > > > > > I believe mongrel itself has the backlog at 1024. Might be worth > > considering. > > > Setting a large backlog actually consumes a lot of kernel resources. I > don''t see that it''s worth doing unless socket acceptance proceeds slowly for > some reason. (Remember, mongrel is multithreaded in Ruby, so it won''t be as > fast as EM.) I could be convinced otherwise. >It''s probably fine Is there a way to flush a socket?> > > > > > It sounds like you''re talking about disabling the "slow-start" or > Nagle algorithm as it applies to TCP. Usually I always turn that off when I > write a network app, so I checked in EM just to see. Lo and behold, it does > *not* disable Nagling. That''s worth a try, so thanks for pointing it out. >I had no idea it is called a nagle algorithm :) Yeah turning it off would work or what not. I know back in the day when I did C socket stuff we''d need to flush to ensure that what I sent was actually put on the wire. 
slow-start is impossible to turn off AFAIK :) Post feature requests to this list. It''s a lot more actively monitored than> > > the lists at rubyforge. Docs: We need FAQs and use cases. > > > > > > > > > I will start a FAQ on the wiki, and add these questions to it :) Is > > that good place for it? I think I will also make a list of all the ''user > > definable'' functions and the order each is called (and when), and put it > > there. > > > There''s a web site at http://rubyeventmachine.org which has nothing but an > under-construction page at the moment. That would be the ideal place to post > new material. There''s no TRAC instance up there yet, but I''ve wanted to do > that for a while now. I''ll post here when that''s done. It would be great to > have interested folks help getting this effort off the ground. I''d put up > your FAQ on the Rubyforge Wiki, and expect that it will migrate to the > actual EM site, hopefully soon. Just make sure you post announcements here > as you add content. > > > Also I have edited the default 0.9.0 files slightly to make them more > > verbose and comprehensive. Should I just submit patches? > > > Please do. > > > > > > > > In terms of suggestions, I do have a few. The first I have is to > > execute any "associated connection block" (meaning > > > > EventMachine::start_server(" 127.0.0.1", port, EchoServer) { |conn| > > # this block right here > > } > > BEFORE post_init (or is there a different callback that is called after > > the block, but only once at init that I don''t know of?) 
> > > > as it then allows for more old school style code like > > module EchoServer > > attr_accessor :your_special_number > > def post_init > > print "my number is #{@your_special_number}" > > end > > end > > > > 25.times do { |n| > > EventMachine::start_server("127.0.0.1", port+n, EchoServer) { |conn| > > conn.your_special_number = n > > } > > > > Which allows post_init to be more of an initializer for the instance, as > > it can then do things with ''passed in parameters'' (from the block). Sorry > > if that didn''t make much sense, but anyway it was quite useful for me. > > It would be a trivial code fix, I could do it and commit it. > > > There already is a specific and defined protocol for the associated block, > and we shouldn''t change it because there is probably a lot of code that > depends on it. Take the following code: > class X < EM::Connection > def initialize *args > super > STDERR.puts "initializer" > end > def post_init > STDERR.puts "post init" > end > end > > EM.run { > EM.connect ( some_host, some_port, X ) {|conn| > STDERR.puts "block" > } > } > > When a connection is accepted, the subclassed constructor (initialize) is > called FIRST. #post_init is called INSIDE of the base class constructor. > (You MUST remember to call #super if you override #initialize.) Obviously > you can execute your own code before and/or after the super call, but keep > in mind the #post_init will execute and return before #super returns. Hope > that was clear. > > The block that you pass to #connect will be called AFTER both initialize > and post_init have been called and returned. >Right so the main justification to not changing it is that it might break legacy code, n''est pas? 
I think that what I was referring to is that we don''t have direct access to the initialize call, but in retrospect I guess it shouldn''t matter too much what order things go in--you just have to accomodate for it--for example if I want post_init to be called after the block, I can just name my own function called "post_block" and call it at the end of my block :) Thanks :) A question: is there a function "what is _my_ port (my port, my hostname or> > ip)?" > >Any answer to that one? suggestion: I created some get_peername_ip and get_peername_host helper> > functions. Is there a reason they don''t exist? Should I commit them? > > I found that to be something of a pain because some connection types (like > Unix-domain sockets and UDP sockets) don''t implement these things the same > way as TCP clients do. What''s the best approach here? Provide helper > functions for all the different connection types and just throw exceptions > if you use one that doesn''t make sense? >A hash sounds like a pretty good idea, without thinking about it too hard.> There is a complete mechanism for doing this, but it could really use aFAQ entry. Look at the rdoc for the #next_tick function. Also, read the SPAWNED_PROCESSES document. Thanks! Another suggestion might be the creation of a function send_partial (the> > equivalent of send in Python) -- maybe it can report how much it was able to > > send ''instantaneously''. I wouldn''t actually need it for anything, but maybe > > somebody somewhere would like such a feature :) > > I can see the point of this in a program that needs to explicitly manage > the size of the outbound network buffers, but EM wraps all this up. I''m open > to adding advanced functions that might expose something like this, but in > general it''s better not to rely on the fancy stuff unless really necessary. >I wouldn''t use it myself, but maybe someday it would indeed be useful. Sounds good. When I figure out the tick stuff I''ll post it. Go EM. 
-Roger Pack
I like belief.
http://www.google.com/search?q=free+bible
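The call order Francis describes (subclass constructor first, #post_init inside the base-class constructor, the block last) can be sketched in plain Ruby. This is only a simulation under stated assumptions: FakeConnection and fake_connect are hypothetical stand-ins written for this note, not EventMachine's real classes or source.

```ruby
# Hypothetical stand-in for EM::Connection: the base-class constructor
# is what invokes post_init, as described in the thread.
class FakeConnection
  def initialize(*args)
    post_init
  end

  def post_init; end
end

class X < FakeConnection
  attr_reader :events

  def initialize(*args)
    @events = []               # runs before super, so post_init can use it
    super                      # post_init executes and returns in here
    @events << "initializer"   # runs after super has returned
  end

  def post_init
    @events << "post init"
  end
end

# Hypothetical stand-in for EM.connect / start_server: build the
# connection first, then hand the finished object to the block.
def fake_connect(klass, *args)
  conn = klass.new(*args)
  yield conn if block_given?
  conn
end

conn = fake_connect(X) { |c| c.events << "block" }
puts conn.events.inspect
```

Because X calls super before recording "initializer", "post init" is recorded first; the block always comes last.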
On 10/16/07, Roger Pack <rogerpack2005 at gmail.com> wrote:
> Comments below:
>
<snip>
> > > suggestion: I created some get_peername_ip and get_peername_host
> > > helper functions. Is there a reason they don't exist? Should I
> > > commit them?
> >
> > I found that to be something of a pain because some connection types
> > (like Unix-domain sockets and UDP sockets) don't implement these
> > things the same way as TCP clients do. What's the best approach here?
> > Provide helper functions for all the different connection types and
> > just throw exceptions if you use one that doesn't make sense?
>
> A hash sounds like a pretty good idea, without thinking about it too
> hard.

How about using arrayfields?
http://www.codeforpeople.com/lib/ruby/arrayfields/arrayfields-4.5.0/README

benefit:
"arrays with keyword access require much less memory when compared to
hashes/objects and yet still provide fast lookup and preserve data order."

HTH
Mark

<snip>
> _______________________________________________
> Eventmachine-talk mailing list
> Eventmachine-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/eventmachine-talk
On 10/15/07, Roger Pack <rogerpack2005 at gmail.com> wrote:
> > > Is there a way to flush a socket?
> >
> > It sounds like you're talking about disabling the "slow-start" or
> > Nagle algorithm as it applies to TCP. Usually I always turn that off
> > when I write a network app, so I checked in EM just to see. Lo and
> > behold, it does *not* disable Nagling. That's worth a try, so thanks
> > for pointing it out.
>
> I had no idea it is called a nagle algorithm :) Yeah turning it off
> would work or what not. I know back in the day when I did C socket
> stuff we'd need to flush to ensure that what I sent was actually put on
> the wire. slow-start is impossible to turn off AFAIK :)

You turn it off with the sockopt TCP_NODELAY. I was so surprised to find
it missing in EM that I looked a little harder. It's in ext/ed.cpp, line
874 of the HEAD revision. "Flushing" a socket or disabling slow-start
really doesn't necessarily put data on the wire any faster. All it does
is ensure that it's in the outbound kernel buffers. The network-hardware
driver determines when the data actually gets transmitted, and afaik
there's no generally portable way to control this. Nor would you want to.

> > > EventMachine::start_server("127.0.0.1", port, EchoServer) { |conn|
> > >   # this block right here
> > > }
> > > BEFORE post_init (or is there a different callback that is called
> > > after the block, but only once at init that I don't know of?)
> > >
> > > as it then allows for more old school style code like
> > > module EchoServer
> > >   attr_accessor :your_special_number
> > >   def post_init
> > >     print "my number is #{@your_special_number}"
> > >   end
> > > end
> > >
> > > 25.times { |n|
> > >   EventMachine::start_server("127.0.0.1", port+n, EchoServer) { |conn|
> > >     conn.your_special_number = n
> > >   }
> > > }

Yes, I'm really seeing that it's inconvenient not to have any access to
the #initialize method of an accepted connection, because by definition
EM calls it. That's precisely the reason that #start_server was given the
ability to process a block which receives the newly-accepted connection
as a parameter. It does the job in the sense that there's nothing you
can't do, but it's somewhat clunky and un-Rubyesque. And you have to
remember that #initialize and #post_init are all finished by the time the
block gets called. I'd be delighted if someone came up with a better
syntax.

Using #post_init as an initializer is a bit of a challenge because EM
calls it without parameters. On the whole it's probably better to define
your own initializer method and call it (with parameters) from the block
that's passed to start_server.

We probably can't change the invocation order of #post_init without
breaking existing code.

> > > A question: is there a function "what is _my_ port (my port, my
> > > hostname or ip)?"
>
> Any answer to that one?

I'm being stupid :-), I don't understand the question.
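A minimal sketch of the workaround suggested above: define your own initializer method and call it, with parameters, from the block. The names FakeConnection, NumberedEcho, configure and fake_accept are all hypothetical, made up for this illustration; fake_accept only imitates the documented behaviour that the block runs after #initialize and #post_init have finished, and is not the real start_server.

```ruby
# Hypothetical stand-in for an accepted connection's base class.
class FakeConnection
  def initialize
    post_init   # called with no arguments, before any block runs
  end

  def post_init; end
end

class NumberedEcho < FakeConnection
  attr_reader :number, :log

  def post_init
    # The per-connection parameter is not available yet at this point.
    @log = ["post_init saw #{@number.inspect}"]
  end

  # Our own "initializer", invoked explicitly from the block.
  def configure(number)
    @number = number
    @log << "configure got #{number}"
  end
end

# Hypothetical stand-in for start_server accepting one connection.
def fake_accept(klass)
  conn = klass.new
  yield conn if block_given?
  conn
end

conns = 3.times.map { |n| fake_accept(NumberedEcho) { |c| c.configure(n) } }
conns.each { |c| puts c.log.join("; ") }
```

Anything that depends on the passed-in value belongs in configure (or whatever you name it), not in post_init.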
Thanks for the replies! More as follows.

> You turn it off with the sockopt TCP_NODELAY. I was so surprised to
> find it missing in EM that I looked a little harder. It's in
> ext/ed.cpp, line 874 of the HEAD revision. "Flushing" a socket or
> disabling slow-start really doesn't necessarily put data on the wire
> any faster. All it does is ensure that it's in the outbound kernel
> buffers. The network-hardware driver determines when the data actually
> gets transmitted, and afaik there's no generally portable way to
> control this. Nor would you want to.

Ahh good. It appears (from this: "So to sum it all up, if you are having
trouble and need to flush the socket, setting the TCP_NODELAY option will
usually solve the problem.") that with Nagle disabled, flushing is
unnecessary. Nice. Thanks. Slow start is a little different :)

> It does the job in the sense that there's nothing you can't do, but
> it's somewhat clunky and un-Rubyesque. And you have to remember that
> #initialize and #post_init are all finished by the time the block gets
> called. I'd be delighted if someone came up with a better syntax.

Me too--you really have to wrap your mind around it to understand 'there
is no real available initialize function--not even the block.'

The thing I found odd to understand is that since there's no 'instance'
when you start a server you can't do

EM.start_server 'localhost', 8080 { |server| server.your_special_number_is 25}
  {|conn| print "got connection to server"}

(wants two blocks in my mind)

If the block were executed first, then post_init would be like the client
block.

Unfortunately using the block as an initializer is no longer an option,
though that option would have made a lot of sense. Something to think
about I guess.

> Using #post_init as an initializer is a bit of a challenge because EM
> calls it without parameters.
> On the whole it's probably better to define your own initializer
> method and call it (with parameters) from the block that's passed to
> start_server.

Another possibility would be to pass in an optional hash to the
definition, a la

EM.start_server('localhost', 8080, {:your_special_server_number => 25}) {|conn| print "got connection!" }

And just chuck it into each connection or the class definition or what
not. Another option would be to pass parameters which will be passed to
the initialization function each time (a hash, or an array, or...just a
variable number of them) and teach people how to call super correctly, or
tell them 'your initialize will be called with your hostname and port
first, and then your args' :)

> We probably can't change the invocation order of #post_init without
> breaking existing code.
>
> > > A question: is there a function "what is _my_ port (my port, my
> > > hostname or ip)?"
> >
> > Any answer to that one?
>
> I'm being stupid :-), I don't understand the question.

Maybe I'm being stupid. Most likely.

a = TCPSocket.new('google.com', 80)
a.addr

:)
Thanks for the help.
-Roger
On 10/16/07, Roger Pack <rogerpack2005 at gmail.com> wrote:
> > It does the job in the sense that there's nothing you can't do, but
> > it's somewhat clunky and un-Rubyesque. And you have to remember that
> > #initialize and #post_init are all finished by the time the block
> > gets called. I'd be delighted if someone came up with a better syntax.
>
> Me too--you really have to wrap your mind around it to understand
> 'there is no real available initialize function--not even the block.'
>
> The thing I found odd to understand is that since there's no 'instance'
> when you start a server you can't do
> EM.start_server 'localhost', 8080 { |server| server.your_special_number_is 25}
>   {|conn| print "got connection to server"} (wants two blocks in my mind)
>
> If the block were executed first, then post_init would be like the
> client block.
>
> Unfortunately using the block as an initializer is no longer an option,
> which option would have made a lot of sense. Something to think about I
> guess.

Nothing wrong with using the block as an initializer, that's what it's
there for. You can be sure that the block will be called after the
base-class constructor returns. However, it really is worth thinking
about how to define a relationship between an acceptor and the
connections it accepts. I'm guessing the best approach will be to define
an EM::Acceptor that wraps EM#start_server and allow it to pass
parameters to the constructor of an EM::Connection.

> > > > A question: is there a function "what is _my_ port (my port, my
> > > > hostname or ip)?"
> > >
> > > Any answer to that one?
> >
> > I'm being stupid :-), I don't understand the question.
>
> Maybe I'm being stupid. Most likely.
> a = TCPSocket.new('google.com', 80)
> a.addr

Oh I get it, you need a wrapper for getsockname(2) as well as
getpeername(2). Good idea, hadn't thought of that.
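For readers unfamiliar with the two calls named above, here is a small sketch using only Ruby's standard socket library (plain sockets, not EventMachine's API). getsockname reports the local end of a connection and getpeername the remote end; the endpoint helper and its hash layout are assumptions made for this example, loosely following the "return a hash" idea from earlier in the thread.

```ruby
require 'socket'

# Listen on an OS-chosen port so the example never collides with a real service.
server      = TCPServer.new("127.0.0.1", 0)
server_port = server.addr[1]

client   = TCPSocket.new("127.0.0.1", server_port)
accepted = server.accept

# Hypothetical helper: turn a packed sockaddr into a small hash.
def endpoint(packed_sockaddr)
  port, ip = Socket.unpack_sockaddr_in(packed_sockaddr)
  { ip: ip, port: port }
end

local       = endpoint(client.getsockname)    # "what is MY port/ip?"
peer        = endpoint(client.getpeername)    # the server's end, seen by the client
server_view = endpoint(accepted.getpeername)  # the client's end, seen by the server

puts "client local end: #{local.inspect}"
puts "client peer end:  #{peer.inspect}"
puts "server sees peer: #{server_view.inspect}"

[client, accepted, server].each(&:close)
```

The client's local endpoint and the server's view of its peer are the same address, which is what makes a getsockname wrapper useful for answering "what is my port?".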
Thanks! Some thoughts below...

> > Unfortunately using the block as an initializer is no longer an
> > option, which option would have made a lot of sense. Something to
> > think about I guess.
>
> Nothing wrong with using the block as an initializer, that's what it's
> there for. You can be sure that the block will be called after the
> base-class constructor returns. However, it really is worth thinking
> about how to define a relationship between an acceptor and the
> connections it accepts. I'm guessing the best approach will be to
> define an EM::Acceptor that wraps EM#start_server and allow it to pass
> parameters to the constructor of an EM::Connection.

One related thing that I'd love to see would be the ability to have one
code block operate on the 'server mixed-in class' itself and then one on
each connection, as they come in. (Just wishing). Kind of like this:

code_block_run_once_against_new_class = Proc.new { |new_generic_mixed_in_class|
  def new_generic_mixed_in_class.my_name
    return 'blue'
  end
}
EM.start_server('localhost', 8081, code_block_run_once_against_new_class) { |conn|
  # stuff run over and over dealing only with clients, not server
  # initialization itself, accomplished in the block parameter
}

That would make it tie in with the old-school style program, for
me--i.e. make it feel more intuitive. The old school way is to have a
(server) class instance you can play with to 'initialize' it (i.e. say
"you're serving file X" or what not), then subsequently you work on each
connection as it comes in. In the way described we would have something
similar--a mixed-in class definition you can play with to initialize,
then a clear separation of 'work done per connection.' That would make
sense to at least me.
It would also alleviate whatever concerns I might have that are 'But I'm
running server initialization over and over once per client--I just want
to run that code once!' so it satisfies the efficiency lean (though it
may not be extremely more efficient, it would seem more elegant). Some
way to differentiate, anyway. I'm not sure exactly how EM works with its
mixed-in class creation--I am under the assumption it only does it once,
and that it does it once per start_server call in the above suggestion
(new anonymous class per start_server or what have you).

> Oh I get it, you need a wrapper for getsockname(2) as well as
> getpeername(2). Good idea, hadn't thought of that.

Or perhaps "get_local_connection_info" returning a hash of data or what
not, as per the recent discussion (whatever was decided). If I could make
a suggestion about those, it would be that if we do create
"get_peer_connection_info_hash" and "get_local_connection_info_hash" or
derivatives then we could optimize it, too. We could have a single object
created and saved per class instance on the first request, then always
pass out the same object when the call to get_local_connection_info_hash
is made, as I assume that it won't change. Just thinking ahead to
premature optimizations :)

Another question I had was that you mentioned that when sending a file it
will carefully maintain full send buffers--am I correct in then assuming
that say I did send_data("a" * 1000000) it would also carefully maintain
full send buffers (from Ruby strings)? The answer is obvious, but just
double checking.

Also another question: does EM keep running and queueing data during a GC
collection? Making it resilient to GC collection would be convenient :)

And finally a suggestion. It would be nice to (optionally) have EM re-use
string objects for receive_data data (if possible). Then if your protocol
just writes its input to disk, you can save on memory a lot (hopefully
like almost 100%). I know that Ruby String.whatever! functions can edit
strings themselves, so it might be possible. Anyway just a random
suggestion. A follow-up on that would be the option 'don't pass me a
string for that receive_data--just write it straight to disk to this file
or what not' but that might not be too important.

Well that's about it for today.
Thanks for reading!
-Roger
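The buffer-reuse idea above can be illustrated with plain Ruby IO, which already lets a caller supply the string to be filled. This is only a sketch of the general technique, using a pipe and a temp file as stand-ins; it is not an EventMachine feature, and nothing here says how EM's receive_data is implemented.

```ruby
require 'tempfile'

# A pipe stands in for a network connection delivering 10,000 bytes.
reader, writer = IO.pipe
writer.write("x" * 10_000)
writer.close

buffer    = String.new        # one String object, reused for every chunk
buffer_id = buffer.object_id
total     = 0

Tempfile.create("received") do |out|
  begin
    loop do
      # readpartial(maxlen, outbuf) overwrites outbuf in place instead
      # of allocating a new String for each chunk.
      reader.readpartial(4096, buffer)
      out.write(buffer)
      total += buffer.bytesize
    end
  rescue EOFError
    # The writer end is closed and everything has been read.
  end
end
reader.close

puts "wrote #{total} bytes using a single buffer object"
```

The trade-off is that the receiver must finish with each chunk before the next read, because the buffer's contents are overwritten.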
On Mon, Oct 15, 2007 at 05:44:57PM -0400, Francis Cianfrocca wrote:
> On 10/15/07, Roger Pack <rogerpack2005 at gmail.com> wrote:
> > The equivalent of what establishing a client does--just uses any
> > random port. Usefulness being 'I don't want to choose a bad port!'
> > [which, BTW I'm not even sure how to handle in EM--I should figure it
> > out and write it up, as you suggested.]
>
> Not really useful, because you have to then somehow tell your clients
> which ephemeral port you picked!

I find this very useful for unit tests of servers, where the client and
the server are on the same machine, and therefore the test framework can
tell the client which port to talk to.

With normal Ruby, you just do something like:

require 'socket'
require 'test/unit'

class MyTest < Test::Unit::TestCase
  def setup
    @server = TCPServer.new("127.0.0.1", nil)
    @port = @server.addr[1]   # gives the dynamically chosen port
  end

  def teardown
    @server.close
  end
end

I've not tried this with EM though.

Brian.
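To round out Brian's point, here is a self-contained sketch of the full round trip with plain Ruby sockets: the server binds to port 0 so the operating system picks a free port, and the test code then reads that port back and hands it to the client. It uses only the standard library and makes no claim about how to do the same inside EventMachine.

```ruby
require 'socket'

# Port 0 asks the OS for any free port.
server = TCPServer.new("127.0.0.1", 0)
port   = server.addr[1]   # the port that was actually assigned

# Tiny one-shot echo server running in a background thread.
echo_thread = Thread.new do
  conn = server.accept
  line = conn.gets
  conn.write(line)
  conn.close
end

# The "client" is told the port directly, as a test framework would do.
sock = TCPSocket.new("127.0.0.1", port)
sock.write("hello\n")
reply = sock.gets
sock.close

echo_thread.join
server.close

puts "server on port #{port} echoed #{reply.inspect}"
```

Since client and server share a process, nothing has to be published: the port number is just a local variable.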
<snip>
> > > I found that to be something of a pain because some connection
> > > types (like Unix-domain sockets and UDP sockets) don't implement
> > > these things the same way as TCP clients do. What's the best
> > > approach here? Provide helper functions for all the different
> > > connection types and just throw exceptions if you use one that
> > > doesn't make sense?
> >
> > A hash sounds like a pretty good idea, without thinking about it too
> > hard.
>
> How about using arrayfields?
> http://www.codeforpeople.com/lib/ruby/arrayfields/arrayfields-4.5.0/README

scrap that suggestion - performance results posted on the sequel mail
list suggest that Hash is still the way to go.

Apologies for the noise.

Mark

<snip>
> You turn it off with the sockopt TCP_NODELAY. I was so surprised to
> find it missing in EM that I looked a little harder. It's in
> ext/ed.cpp, line 874 of the HEAD revision. "Flushing" a socket or
> disabling slow-start really doesn't necessarily put data on the wire
> any faster. All it does is ensure that it's in the outbound kernel
> buffers. The network-hardware driver determines when the data actually
> gets transmitted, and afaik there's no generally portable way to
> control this. Nor would you want to.

Question: do we set, or need to set, TCP_NODELAY on sockets that we
create, as in connect()?
Just wondering. Thanks!
-Roger
On Jan 4, 2008 1:17 PM, Roger Pack <rogerpack2005 at gmail.com> wrote:
> > You turn it off with the sockopt TCP_NODELAY. I was so surprised to
> > find it missing in EM that I looked a little harder. It's in
> > ext/ed.cpp, line 874 of the HEAD revision. "Flushing" a socket or
> > disabling slow-start really doesn't necessarily put data on the wire
> > any faster. All it does is ensure that it's in the outbound kernel
> > buffers. The network-hardware driver determines when the data
> > actually gets transmitted, and afaik there's no generally portable
> > way to control this. Nor would you want to.
>
> Question: do we set or need to TCP_NODELAY on sockets that we create,
> as in connect () ?

When you say "we," do you mean EM's reactor core? And by connect(), do
you mean the stdlib connect (connect(2)), or Ruby's?

I guess strictly speaking, not setting TCP_NODELAY is a matter of
opinion. In my opinion, it hasn't been useful for at least ten years not
to set it.
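For reference, this is roughly what setting the option looks like on an ordinary Ruby socket created by the connecting side. It is a sketch with the standard socket library only; it does not show EventMachine's internals, where (per the thread) the option is handled in the C++ core.

```ruby
require 'socket'

# Local listener on an OS-chosen port, just so there is something to connect to.
server = TCPServer.new("127.0.0.1", 0)
sock   = TCPSocket.new("127.0.0.1", server.addr[1])

# Read the current value, turn Nagle's algorithm off, then read it back.
before = sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int
sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)
after  = sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int

puts "TCP_NODELAY before: #{before}, after: #{after}"

sock.close
server.close
```

A nonzero value after the call means small writes are no longer held back for coalescing. As noted earlier in the thread, this only affects when data is handed to the kernel for sending, not when the hardware actually transmits it.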
> > Question: do we set or need to TCP_NODELAY on sockets that we create,
> > as in connect () ?
>
> When you say "we," do you mean EM's reactor core? And by connect(), do
> you mean the stdlib connect (connect(2)), or Ruby's?

Yeah within the core (connect(2)). em.cpp (?) or ed.cpp, one of the two.

> I guess strictly speaking, not setting TCP_NODELAY is a matter of
> opinion. In my opinion, it hasn't been useful for at least ten years
> not to set it.

I agree.

> Think about it for a moment. If s==1, then the process by definition
> isn't heavily loaded so it doesn't need optimizing. This might make it
> go faster in a benchmark but there's no benefit in the real world. :-)

That's true. I guess it would just free it up for other processes.
Thanks again!
-Roger
> > > Question: do we set or need to TCP_NODELAY on sockets that we
> > > create, as in connect () ?

I was hinting that it may need to be set for connect(2) sockets, as well
as for accept sockets (is it?).

Also http://eventmachine.rubyforge.org/wiki/wiki.pl?CodeSnippets is where
I've put some code examples. Feel free to contribute there or elsewhere.
Take care.
-Roger

--
Wherefore if ye shall press forward, feasting upon the word of Christ,
and endure to the end, behold thus saith the Father: ye shall have
eternal life. - 2 Nephi 31:20