I'm glad to announce the initial public release of DistribuStream:

http://distribustream.org/

DistribuStream is a peercasting system, facilitating streaming media over a managed peer-to-peer network. You can think of it as being similar to BitTorrent, but employing progressive downloads so media can be consumed in on-demand streams (and it's not written in Python! :). It was developed by ClickCaster, Inc. <http://www.clickcaster.com/> through the University of Colorado Computer Science Program. The whole project is written in Ruby and uses EventMachine as the basis of both the client and server. It's available as a gem (gem install distribustream) and you can find a basic writeup on how to use it here:

http://www.clickcaster.com/items/distribustream-0-1-0-released

This release primarily targets developers, as it has some pretty major issues that need to get resolved before it's generally usable. The most pressing is the client implementation, which currently stores the entire file in RAM before writing it out to an IO object. This needs to be changed to a disk buffer, and / or combined with the protocol's ability to "unprovide" certain file segments from the peer network, so the client can store a manageable moving window of data.

In terms of implementation details that would be nice to get resolved, the present implementation uses a hodgepodge of both EM and threaded components. While the underlying protocol (the Peer Distributed Transfer Protocol, see pdtp.org) is implemented using EventMachine, all of the peer-to-peer file transfers utilize the regular threaded Mongrel (i.e. not Kirk's Mongrel) and the regular (and abominable) Net::HTTP client.

I would very much like to eliminate all of the threaded components and run everything inside the Reactor loop with the EM HttpClient and either Kirk's Mongrel or Francis' em_httpserver. In earlier implementations where we tried this we ran into inexplicable stalling inside of the Reactor itself, and because of that the team decided to go with the hybrid Reactor / threaded approach. Now that the project is open source I'd like to start looking into this again, and would greatly appreciate whatever help anyone can offer.

I'll be announcing it to ruby-talk in a few days or so, but thought I'd announce it here first and get some feedback.

-- 
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com

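The "manageable moving window" mentioned in the announcement could look roughly like the sketch below. This is only an illustration under stated assumptions, not DistribuStream's client code: the class name WindowBuffer, its methods, and the byte limit are all hypothetical. It holds out-of-order chunks up to a fixed number of bytes, writes them to the IO object as soon as they become contiguous, and refuses chunks that would overflow the window so the caller can request them again later.

```ruby
# Hypothetical sketch of a bounded "moving window" chunk buffer.
# None of these names come from DistribuStream itself.
class WindowBuffer
  attr_reader :next_offset, :pending_bytes

  def initialize(io, max_pending_bytes)
    @io = io
    @max_pending_bytes = max_pending_bytes
    @pending = {}        # offset => chunk received ahead of the write position
    @pending_bytes = 0
    @next_offset = 0     # first byte not yet written to @io
  end

  # Accepts a chunk starting at the given byte offset.
  # Returns false when the chunk would overflow the window.
  def add(offset, data)
    # Already written or already held: nothing to do.
    return true if offset < @next_offset || @pending.key?(offset)

    # The chunk at the write position is always accepted, since it is
    # flushed immediately and lets the window move forward.
    if offset != @next_offset && @pending_bytes + data.bytesize > @max_pending_bytes
      return false
    end

    @pending[offset] = data
    @pending_bytes += data.bytesize
    flush
    true
  end

  private

  # Writes every chunk that is now contiguous with the write position.
  def flush
    while (data = @pending.delete(@next_offset))
      @io.write(data)
      @pending_bytes -= data.bytesize
      @next_offset += data.bytesize
    end
  end
end
```

With a File as the IO object this keeps memory use bounded by max_pending_bytes rather than by the size of the whole file. The sketch assumes chunks never overlap.
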
On 10/18/07, Tony Arcieri <tony at clickcaster.com> wrote:
> I would very much like to eliminate all of the threaded components and
> running everything inside the Reactor loop with the EM HttpClient and either
> Kirk's Mongrel or Francis' em_httpserver. In earlier implementations where
> we tried this we ran into inexplicable stalling inside of the Reactor
> itself, and because of that the team decided to go with the hybrid Reactor /
> threaded approach.

Do you have any general information about the reactor stalling you observed? It would be great to figure out where that was coming from.

We observed the stalling while running LineAndTextProtocol and HttpClient in the same Reactor. I can see if I can revert the codebase to when we were experiencing the problem (months and months ago), but I think the better approach is to start removing some of the threaded components in our current codebase, and if the problem comes up again we can have a look then.

We were also trying to use Kirk's evented Mongrel at one point; however, it looks like due to the way the Mongrel API works we can't use it properly for what we want to do. Mongrel has a method you call to synchronously consume the request body (for example, in a POST or PUT request), but in the evented Mongrel this is stubbed out and the whole request has been consumed (including the body) before we even get access to the header information. We're using the header information to authenticate file transfers, so it's somewhat pointless and wasteful if we can't get access to that information before the entire transfer has occurred.

Francis, does your HTTP server let you check the header of a POST or PUT request before the body is consumed?

-- 
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

On 10/19/07, Tony Arcieri <tony at clickcaster.com> wrote:
> We were also trying to use Kirk's evented Mongrel at one point, however it
> looks like due to the way the Mongrel API works we can't use it properly for
> what we want to do. Mongrel has a method you call to synchronously consume
> the request body (for example, in a POST or PUT request), but in the evented
> Mongrel this is stubbed out and the whole request has been consumed
> (including the body) before we even get access to the header information.
> We're using the header information to authenticate file transfers, so it's
> somewhat pointless and wasteful if we can't get access to that information
> before the entire transfer has occurred.

I may be able to fix that.

> Francis, does your HTTP server let you check the header of a POST or
> PUT request before the body is consumed?

As written, no. I have modified it so that as the request body is read, a method is called that receives the POST/PUT data, but the method to handle the request itself isn't called until the state machine reaches the DispatchState, and it doesn't reach that until it has finished reading the request.

I have, however, been hacking the heck out of this thing, trying different things with it, and I don't see any reason why one couldn't change it to call a method after the header has been parsed, but before the body has been parsed. Should be pretty trivial.

Kirk Haines

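The change discussed above (call a method once the header has been parsed, before the body is read) can be sketched in plain Ruby. This is a hypothetical illustration, not the API of eventmachine_httpserver or of either Mongrel: the class HeaderFirstRequest, its callbacks, and the state names :headers, :body, :rejected and :done are all assumptions made for the example. Data is fed in through receive_data, the way a reactor would hand fragments to a connection object.

```ruby
# Hypothetical incremental request reader. It invokes on_headers as soon
# as the header block is complete, so a transfer can be rejected before
# any of the body is consumed.
class HeaderFirstRequest
  attr_reader :state, :headers, :body_bytes

  # on_headers:  called once with a Hash of lower-cased header names;
  #              returning false rejects the request.
  # on_body:     called with each body fragment as it arrives.
  # on_complete: called once the full body has been read.
  def initialize(on_headers:, on_body:, on_complete:)
    @on_headers = on_headers
    @on_body = on_body
    @on_complete = on_complete
    @buffer = +""
    @state = :headers
    @headers = {}
    @body_bytes = 0
    @content_length = 0
  end

  def receive_data(data)
    case @state
    when :headers then read_headers(data)
    when :body    then read_body(data)
    end
    # In :rejected and :done, further data is ignored.
  end

  private

  def read_headers(data)
    @buffer << data
    head, separator, rest = @buffer.partition("\r\n\r\n")
    return if separator.empty? # header block not complete yet

    lines = head.split("\r\n")
    lines.shift # request line, unused in this sketch
    lines.each do |line|
      name, value = line.split(":", 2)
      next unless value
      @headers[name.strip.downcase] = value.strip
    end
    @content_length = @headers.fetch("content-length", "0").to_i

    if @on_headers.call(@headers) == false
      @state = :rejected
      return
    end

    @state = :body
    read_body(rest)
  end

  def read_body(data)
    chunk = data.byteslice(0, @content_length - @body_bytes)
    unless chunk.empty?
      @body_bytes += chunk.bytesize
      @on_body.call(chunk)
    end
    return unless @body_bytes >= @content_length

    @state = :done
    @on_complete.call
  end
end
```

A server built this way could check an authentication header in on_headers and close the connection on rejection, instead of reading the whole PUT body first. It handles only Content-Length bodies, not chunked transfer encoding.
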
On 10/19/07, Tony Arcieri <tony at clickcaster.com> wrote:
> Francis, does your HTTP server let you check the header of a POST or PUT
> request before the body is consumed?

No, but it wouldn't be a stretch to add it.

Was the stalling easily repeatable? Could you reproduce it now if you had to?

The stalling would've been easily repeatable several months ago, but unfortunately we weren't sure about how the code was to be released at that time.

One way or another I'd like to remove the threaded components. If I run into the same problem again when I do so I'll let you know.

-- 
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

On 10/19/07, Tony Arcieri <tony at clickcaster.com> wrote:
> It would've been easily repeatable several months ago, but unfortunately
> we weren't sure about how the code was to be released at that time.
>
> One way or another I'd like to remove the threaded components. If I run
> into the same problem again when I do so I'll let you know.

Cool, thanks.

Just switched over to a version which uses HttpClient... I noticed a marginal speedup over net/http.

I'm not sure what issues they were running into before, but I'm now rid of one source of countless threads.

I'll work on the HTTP servers next... not sure what road to go down in that regard yet.

-- 
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
720-227-0129 ext. 202

On 10/26/07, Tony Arcieri <tony at clickcaster.com> wrote:
> Just switched over to a version which uses HttpClient... I noticed a
> marginal speedup over net/http

HttpClient is functional and useful, but a bit of a throwaway, not designed very carefully. I'll try to hack the chunked-transfer support into it, but it could use a rethink. Anyone up for it?

> I'm not sure what issues they were running into before, but I'm now rid of
> one source of countless threads
>
> I'll work on the HTTP servers next... not sure what road to go down in
> that regard yet

Look at eventmachine_httpserver: this is actually a well-designed and powerful piece of code.

On 10/26/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> HttpClient is functional and useful, but a bit of a throwaway, not
> designed very carefully. I'll try to hack the chunked-transfer support into
> it, but it could use a rethink. Anyone up for it?

#send_request is the biggest problem (and the method I override). The main thing that's needed, IMO, is objects which represent requests and responses. #send_request is pretty imperative and difficult to extend, so in DistribuStream I ended up overriding the entire thing, when really all I wanted to do was patch in a few additional types of headers.

> Look at eventmachine_httpserver: this is actually a well-designed and
> powerful piece of code.

I'll certainly take a look at it, but since all the existing code uses Mongrel the simplest way to go would be Kirk's evented_mongrel. However, if I can't find a way to consume a PUT request's body separately from the header with evented_mongrel, I'll probably end up rewriting the HTTP server components with eventmachine_httpserver. It looks like it could use objects which represent requests and responses as well, though.

Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com

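To make the "objects which represent requests" idea concrete, here is a minimal sketch of a request object that serializes itself, so that adding a header means passing one more Hash entry rather than overriding a whole method. It is an assumption-laden illustration: SketchRequest and its arguments are hypothetical and are not part of EventMachine's HttpClient or of DistribuStream.

```ruby
# Hypothetical request value object; not EventMachine's HttpClient API.
class SketchRequest
  attr_reader :verb, :path, :headers, :body

  def initialize(verb, path, host:, headers: {}, body: nil)
    @verb = verb.to_s.upcase
    @path = path
    @body = body
    # Caller-supplied headers are merged in, so new header types need no
    # changes to the serialization code below.
    @headers = { "Host" => host }.merge(headers)
    @headers["Content-Length"] = body.bytesize.to_s if body
  end

  # Serializes the request line, headers and optional body as HTTP/1.1.
  def to_s
    lines = ["#{@verb} #{@path} HTTP/1.1"]
    @headers.each { |name, value| lines << "#{name}: #{value}" }
    lines.join("\r\n") + "\r\n\r\n" + @body.to_s
  end
end
```

Inside an EventMachine connection the resulting string could be handed to send_data; the object itself has no dependency on the reactor, which also makes it easy to test in isolation.
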
On 10/27/07, Tony Arcieri <tony at clickcaster.com> wrote:
> #send_request is the biggest problem (and the method I overwrite). The
> main thing that's needed, IMO, are objects which represent requests and
> responses. #send_request is pretty imperative and difficult to extend, so
> in DistribuStream I ended up overriding the entire thing, when really all I
> wanted to do was patch in a few additional types of headers.

Yup, that's a good analysis of the problem. Would you consider contributing your rewrite of #send_request?

> I'll certainly take a look at it, but since all the existing code uses
> Mongrel the simplest way to go would be Kirk's evented_mongrel. However if
> I can't find a way to consume a PUT request's body seperately of the header
> with evented_mongrel I'll probably end up rewriting the HTTP server
> components with eventmachine_httpserver. It looks like it could use objects
> which represent requests and responses as well, though.

em_httpserver has distinct request and response objects. They're not too much like the ones on Ruby's CGI unfortunately, because they were designed with eventing in mind.