Hello All,

There hasn't been a solution to the massively-concurrent-server problem in ruby until EM, as you all know. When I was looking to develop a backend for a chat application a few years ago, there was simply no way to do it in ruby. Is there any reason this would be a problem with EM? Has anyone pushed the boundaries of event-based IO in ruby with EM?

On a related note, I was wondering if there has been any work done with EM and COMET? Perhaps someone has extended Mongrel or written a simple version of HTTP for EM? The current group working on cometd (http://cometd.com) are batshit insane and I'm hoping someone will just implement the damn thing already. Jan Kneschke of lighttpd is working on an implementation of bayeux (the cometd protocol) for lighttpd called "mod_mailbox," but he has yet to even fix the load balancer in lighttpd, let alone rewrite the "io subsystem" (which he claims would be necessary for mod_mailbox).

So anyway, COMET, or even just massive concurrency in general, is an interesting (and frustrating) topic and I was just wondering what everyone's experiences with EM were.

Thanks in advance,
Alan
On 5/7/07, Norbauer Alan <altano at gmail.com> wrote:
> On a related note, I was wondering if there has been any work done
> with EM and COMET? Perhaps someone has extended Mongrel or written a
> simple version of HTTP for EM? The current group working on cometd

I have a version of mongrel that runs inside an eventmachine event loop. It's being released as part of a ruby based clustering proxy for web apps that I'm releasing as soon as some promised scripts for Merb and Rails are sent to me.

http://swiftiply.swiftcore.org/benchmarks.html

I've been using the Mongrel HTTP parser from within EM for quite a while, and Francis also has an http parser that I am testing currently that may well be a viable choice to build things on, as well.

> (http://cometd.com) are batshit insane and I'm hoping someone will
> just implement the damn thing already. Jan Kneschke of lighttpd is
> working on an implementation of bayeux (the cometd protocol) for
> lighttpd called "mod_mailbox," but he has yet to even fix the load
> balancer in lighttpd let alone rewrite the "io subsystem" (which he
> claims would be necessary for mod_mailbox).
>
> So anyway, COMET, or even just massive-concurrency in general, is an
> interesting (and frustrating) topic and I was just wondering what
> were everyone's experiences with EM.

The only major limitation is that Ruby has a sharp file descriptor limit -- 1024. I've driven EM to that limit many times in testing, with no real problems except that there is a measurable (but not profound) decline in performance because of select() (I surmise, anyway) after concurrencies start climbing into the triple digits.

Kirk Haines
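For readers who have not seen a select()-driven loop, here is a minimal sketch of the general pattern, written with nothing but Ruby's standard library. It is not EventMachine code (EM's loop lives in its C++ extension), and the method name `run_echo_reactor` and its `max_events` parameter are made up for illustration. It shows why the cost Kirk surmises would grow with concurrency: the whole descriptor list is handed to select() on every pass.

```ruby
require "socket"

# Single-threaded echo loop built on IO.select (stdlib only).
# Hypothetical helper for illustration; not part of EventMachine.
def run_echo_reactor(server, max_events: nil)
  clients = []
  handled = 0
  loop do
    # select() receives the full descriptor list on every pass, so the
    # per-iteration work grows with the number of open connections.
    readable, = IO.select([server] + clients)
    readable.each do |io|
      if io.equal?(server)
        clients << server.accept          # new connection
      else
        begin
          data = io.read_nonblock(4096)   # data is ready, so this will not block
          io.write(data)                  # echo it back (may block on a slow peer)
        rescue EOFError, Errno::ECONNRESET
          clients.delete(io)              # peer went away
          io.close
        end
      end
      handled += 1
    end
    # max_events exists only so the sketch can be stopped in a test.
    break if max_events && handled >= max_events
  end
ensure
  clients.each { |c| c.close unless c.closed? }
end
```

Passing a listening `TCPServer` and leaving `max_events` at its default would keep the loop serving until the process is interrupted.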
What Kirk said, and additionally:

FD limits: I'm hopeful that we can blow through the 1024-limit on Linux using epoll. There is a tentative implementation of EM with epoll instead of select but it's not ready for prime time yet. You need superuser privs to use ulimit to bump up the max number of descriptors your process can use. Ruby itself will never be able to see more than 1024 per process no matter what, as Kirk said, but EM sockets don't use the Ruby wrappers. They're just kernel-level IO descriptors. So this just might work. Stay tuned.

Another approach to massive-scalability is multiple processes. If epoll doesn't work out for us, I expect one of us will write a pattern for multi-process scalability. On some kernels, you can open an acceptor socket, fork a few times, and the kernel will distribute the incoming connections more or less fairly across the fork children. I know that by casual experimentation, but I don't yet know if it's a reliable, defined behavior.

Comet/Bayeux: Zane Shelby and I spent a fair bit of time with this protocol back in January. It's not hard at all to fit it into the EM framework. The problem is Bayeux itself. It looks like it was designed by people who are not real network experts, and the documentation is execrable. Not at all ready for prime time, in my opinion. We reached out to the Comet guys and heard nothing but crickets chirping. When you say they're "batshit insane" I don't know if you mean that in a good way or a bad way :-). If you have a relationship with the Comet people (required so we can get clarifications on their crappy protocol), then let me know and we'll consider re-opening the effort.

Chat in general: it always amazes me that so many people undertake to do a chat server as their first effort. I would have guessed that there were any number of competent implementations out there. Evidently not, which means EM would do well to have one, especially one that can support multiple protocols.
One of the things on my personal to-do list is an EM-based XMPP server (because I need to make an authorizing proxy for Jabber). I had to write an event-driven SAX2 parser because neither REXML nor Ruby-libxml had the features needed to make the protocol event-driven. It's working pretty well, actually. XMPP is next.

Kirk mentioned that EM needs a web presence beyond Rubyforge. www.rubyeventmachine.com is now registered and DNS'ed. When there is some content up there, I'll announce it here and on the Ruby ML.
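The multi-process idea mentioned above (open an acceptor socket, fork a few times, let each child accept) can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: a Unix-like platform where `fork` is available, a hypothetical helper named `prefork`, and workers that exit after a fixed number of connections so the sketch terminates. How evenly connections are spread across the children is up to the kernel and is not shown or guaranteed here.

```ruby
require "socket"

# Pre-fork sketch: one listening socket, several forked workers, each
# blocking in accept() on the descriptor inherited from the parent.
# Hypothetical helper; assumes a platform that supports fork.
def prefork(server, workers:, requests_per_worker:)
  workers.times.map do
    fork do
      requests_per_worker.times do
        client = server.accept           # the kernel decides which child wakes up
        client.write("handled by #{Process.pid}\n")
        client.close
      end
      exit!(0)                           # leave the child without running at_exit hooks
    end
  end                                    # returns the list of child pids to the parent
end
```

The parent would typically keep the returned pids and reap the children with `Process.wait`; a long-running server would loop in each worker instead of stopping after `requests_per_worker` connections.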
response inline below:

On May 7, 2007, at 11:27 AM, Francis Cianfrocca wrote:
> What Kirk said, and additionally:
>
> FD limits: I'm hopeful that we can blow through the 1024-limit on
> Linux using epoll. There is a tentative implementation of EM with
> epoll instead of select but it's not ready for prime time yet. You
> need superuser privs to use ulimit to bump up the max number of
> descriptors your process can use. Ruby itself will never be able to
> see more than 1024 per process no matter what, as Kirk said, but EM
> sockets don't use the Ruby wrappers. They're just kernel-level IO
> descriptors. So this just might work. Stay tuned.

Yes, epoll/kqueue are what I was thinking of. I was under the impression that this is what EventMachine was built on, probably because of the way I was directed at the project, but I guess I was mistaken. I'm very glad to hear that it is possible.

> Another approach to massive-scalability is multiple processes. If
> epoll doesn't work out for us, I expect one of us will write a
> pattern for multi-process scalability. On some kernels, you can
> open an acceptor socket, fork a few times, and the kernel will
> distribute the incoming connections more or less fairly across the
> fork children. I know that by casual experimentation, but I don't
> yet know if it's a reliable, defined behavior.

This is definitely reliable behavior for *nix variants, or at least that's what I was taught in college. To go that route just for massive concurrency is just a hack that is much uglier than using epoll/kqueue, though.

> Comet/Bayeux: Zane Shelby and I spent a fair bit of time with this
> protocol back in January. It's not hard at all to fit it into the
> EM framework. The problem is Bayeux itself. It looks like it was
> designed by people who are not real network experts, and the
> documentation is execrable. Not at all ready for prime time, in my
> opinion. We reached out to the Comet guys and heard nothing but
> crickets chirping.
> When you say they're "batshit insane" I don't
> know if you mean that in a good way or a bad way :-). If you have a
> relationship with the Comet people (required so we can get
> clarifications on their crappy protocol), then let me know and
> we'll consider re-opening the effort.

No, I most definitely meant it in a bad way. I don't know if you looked at the earlier documentation or what, but it just gets worse and worse. Even just attempting to read their docs made me feel like I was at work reading MSDN. *shudder* The only reason it is interesting is that it is the only attempt by anyone at a standard when it comes to this stuff, which is why I brought it up. Plus I don't have the time to write any of this stuff, so I'll take what I can get.

> Chat in general: it always amazes me that so many people undertake
> to do a chat server as their first effort. I would have guessed
> that there were any number of competent implementations out there.
> Evidently not, which means EM would do well to have one, especially
> one that can support multiple protocols.

Precisely: one would think! It wasn't my first effort in network programming, but definitely still a bit over my head. Unfortunately, over the course of my project I came to realize that not only was I not taking the correct approach, but that the correct approach was not possible in ruby because of the fdset size limits, and so the server was written in python.

> One of the things on my personal to-do list is an EM-based XMPP
> server (because I need to make an authorizing proxy for Jabber). I
> had to write an event-driven SAX2 parser because neither REXML nor
> Ruby-libxml had the features needed to make the protocol
> event-driven. It's working pretty well, actually. XMPP is next.

Very cool. Looking forward to EM becoming more popular. Ruby definitely needs it. Thanks very much for your hard work.

-alan

> Kirk mentioned that EM needs a web presence beyond Rubyforge.
> www.rubyeventmachine.com is now registered and DNS'ed. When there
> is some content up there, I'll announce it here and on the Ruby ML.
> _______________________________________________
> Eventmachine-talk mailing list
> Eventmachine-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/eventmachine-talk
On 5/7/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> FD limits: I'm hopeful that we can blow through the 1024-limit on Linux
> using epoll. There is a tentative implementation of EM with epoll instead of
> select but it's not ready for prime time yet. You need superuser privs to
> use ulimit to bump up the max number of descriptors your process can use.
> Ruby itself will never be able to see more than 1024 per process no matter
> what, as Kirk said, but EM sockets don't use the Ruby wrappers. They're just
> kernel-level IO descriptors. So this just might work. Stay tuned.

So Francis, technically I don't think it's a limitation of Ruby, is it? I mean, if an FD limit is there, programs written in any language can't bypass that limit unless they have a dynamic forking implementation.

I might be completely wrong!
On May 9, 2007, at 11:22 AM, hemant wrote:
> So Francis, technically I don't think it's a limitation of Ruby, is it?
> I mean, if an FD limit is there, programs written in any language can't
> bypass that limit unless they have a dynamic forking implementation.
>
> I might be completely wrong!

Most OSes have a low default open file descriptor limit in the operating system itself. This default can be changed with administrator privileges.

There is also a separate limitation, which is a limit of 1024 on the number of file descriptors you pass to the select() call, which is the only system call for doing non-blocking io available in ruby. There is no way around this limit in ruby yet because ruby currently doesn't have a way of getting at any of the system calls that don't experience this limitation, such as poll, epoll, etc. Ideally it would be best to get at the epoll/kqueue system calls, because not only do they not experience hard limits, but they don't experience soft limits either. select/poll experience O(n) performance where n is the # of file descriptors, while kqueue/epoll are O(1). So this isn't a ruby-specific problem; rather, ruby doesn't have access to the better system functions while other languages do.

If my information is outdated or just flat out incorrect, someone feel free to correct me :)

-alan
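To make the first of those two limits concrete, here is a small sketch that reads and adjusts the per-process descriptor limit (the value `ulimit -n` reports) from Ruby's standard library. The helper names `descriptor_limits` and `raise_soft_limit` are invented for this example. It says nothing about the second limit, the fixed size of the descriptor set handed to select(), which is a property of that call and not something this code changes. Some platforms apply additional caps, so treat the raising step as a best-effort illustration.

```ruby
# Report the per-process open-file limit as seen by this Ruby process.
# This is the OS-level limit, separate from select()'s fixed-size fd_set.
def descriptor_limits
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { soft: soft, hard: hard }
end

# Raise the soft limit toward a target without exceeding the hard limit.
# Going beyond the hard limit normally requires elevated privileges.
# Hypothetical helper for illustration.
def raise_soft_limit(target)
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  new_soft = [target, hard].min
  Process.setrlimit(Process::RLIMIT_NOFILE, new_soft, hard) if new_soft > soft
  Process.getrlimit(Process::RLIMIT_NOFILE).first   # the soft limit now in effect
end
```

Calling `descriptor_limits` before and after `raise_soft_limit` shows whether the change took effect for the current process and the children it forks later.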
On 5/10/07, Norbauer Alan <alan at norbauer.com> wrote:
> Most OSes have a low default open file descriptor limit in the
> operating system itself. This default can be changed with
> administrator privileges.
>
> There is also a separate limitation, which is a limit of 1024 on the
> number of file descriptors you pass to the select() call, which is
> the only system call for doing non-blocking io available in ruby.
> There is no way around this limit in ruby yet because ruby currently
> doesn't have a way at getting at any of the system calls that don't
> experience this limitation, such as poll, epoll, etc. Ideally it
> would be best to get at epoll/kqueue system calls, because not only
> do they not experience hard limits, but they don't experience soft
> limits either. select/poll experience O(n) performance where n is
> the # of file descriptors, while kqueue/epoll are O(1). So this
> isn't a ruby-specific problem, but rather that ruby doesn't have
> access to the better system functions while other languages do.
>
> If my information is outdated or just flat out incorrect, someone
> feel free to correct me :)

Basically right, but one of the reasons EM is based on a C++ extension to Ruby is to get access to all of the system-level IO features Ruby doesn't expose. In fact, when EM was first written (April of 2006), Ruby was missing several key nbio features that EM just worked around. (Matz approved some improvements and they made it into the core distro in late May of 2006, as you can see from the Ruby-core archives.)

epoll support is clearly going to be a requirement for us. Kqueue, I'm not so sure, unless a lot of Mac and BSD folks pop their heads up and ask for it. IOCP would require some serious rewriting of the EM core. I spent a day working on it at one point and then decided life was too short. It can be done if enough people need it.
To your comment about select being the only way, etc: the real problem here is to integrate the handling of IO events with Ruby's thread scheduler, which is based on timer interrupts that call select. This kills performance to an awe-inspiring degree, but this is just as true in Ruby programs without EM.
On 5/14/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> On 5/10/07, Norbauer Alan <alan at norbauer.com> wrote:
> > So this isn't a ruby-specific problem, but rather that ruby doesn't
> > have access to the better system functions while other languages do.
>
> Basically right, but one of the reasons EM is based on a C++ extension to
> Ruby is to get access to all of the system-level IO features Ruby doesn't
> expose.

Alan, thanks a lot.