thr3ads.net - Eventmachine talk - [Eventmachine-talk] DistribuStream released [Oct 2007]

Tony Arcieri

2007-Oct-18 15:57 UTC

[Eventmachine-talk] DistribuStream released

I'm glad to announce the first public release of DistribuStream:

http://distribustream.org/

DistribuStream is a peercasting system, facilitating streaming media over a
managed peer-to-peer network. You can think of it as being similar to
BitTorrent, but employing progressive downloads so media can be consumed in
on-demand streams (and it's not written in Python! :). It was developed by
ClickCaster, Inc. <http://www.clickcaster.com/> through the University of
Colorado Computer Science Program. The whole project is written in Ruby and
uses EventMachine as the basis of both the client and server. It's available
as a gem (gem install distribustream) and you can find a basic writeup on how
to use it here:

http://www.clickcaster.com/items/distribustream-0-1-0-released

This release targets primarily developers as it has some pretty major issues
that need to get resolved before it's generally usable. The most pressing
is the client implementation, which currently stores the entire file in RAM
before writing it out to an IO object. This needs to be changed to a disk
buffer, and / or combined with the protocol's ability to "unprovide" certain
file segments from the peer network, so the client can store a manageable
moving window of data.
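
To make the disk-buffer idea concrete, here is a minimal sketch of one way it might look. It is not DistribuStream's code: the class name DiskBackedBuffer and its methods are hypothetical, and it assumes segments arrive as (offset, data) pairs, possibly out of order. Segments are parked in a Tempfile and only a small index is held in RAM; data is copied to the output IO as soon as it becomes contiguous.

```ruby
require 'tempfile'
require 'stringio'

# Hypothetical helper, not part of DistribuStream: spools received segments
# to disk and streams them to an output IO in order.
class DiskBackedBuffer
  def initialize(output)
    @output   = output                 # any IO-like object responding to #write
    @spool    = Tempfile.new('segments')
    @spool.binmode
    @pending  = {}                     # stream offset => [spool position, length]
    @next_off = 0                      # next stream offset the consumer needs
  end

  # Store a segment that may arrive out of order, then flush what is ready.
  def write_segment(offset, data)
    # Ignore segments that were already consumed or already spooled.
    return if offset < @next_off || @pending.key?(offset)

    @spool.seek(0, IO::SEEK_END)
    @pending[offset] = [@spool.pos, data.bytesize]
    @spool.write(data)
    flush_contiguous
  end

  # Number of segments still waiting for an earlier gap to be filled.
  def pending_segments
    @pending.size
  end

  # Close and delete the spool file.
  def close
    @spool.close!
  end

  private

  # Copy every segment that directly follows the data already written out.
  def flush_contiguous
    while (entry = @pending.delete(@next_off))
      pos, len = entry
      @spool.seek(pos)
      @output.write(@spool.read(len))
      @next_off += len
    end
  end
end

out = StringIO.new
buffer = DiskBackedBuffer.new(out)
buffer.write_segment(5, 'world')   # arrives early, parked on disk
buffer.write_segment(0, 'hello')   # fills the gap, both segments are flushed
buffer.close
```

This sketch never reclaims spool space, so it bounds RAM but not disk usage. A true moving window would also drop (and "unprovide") segments once they have been consumed.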

In terms of implementation details that would be nice to get resolved, the
present implementation uses a hodgepodge of both EM and threaded
components. While the underlying protocol (the Peer Distributed Transfer
Protocol, see pdtp.org) is implemented using EventMachine, all of the
peer-to-peer file transfers utilize the regular threaded Mongrel (i.e. not
Kirk's Mongrel) and the regular (and abominable) Net::HTTP client.

I would very much like to eliminate all of the threaded components and
run everything inside the Reactor loop with the EM HttpClient and either
Kirk's Mongrel or Francis' em_httpserver. In earlier implementations where
we tried this we ran into inexplicable stalling inside of the Reactor
itself, and because of that the team decided to go with the hybrid Reactor /
threaded approach. Now that the project is open source I'd like to start
looking into this again, and would greatly appreciate whatever help anyone
can offer.

I'll be announcing it to ruby-talk in a few days or so, but thought I'd
announce it here first and get some feedback.

--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com

Francis Cianfrocca

2007-Oct-19 01:32 UTC

Do you have any general information about the reactor stalling you observed?
It would be great to figure out where that was coming from.

Tony Arcieri

2007-Oct-19 10:19 UTC

We observed it while running LineAndTextProtocol and HttpClient in the same
Reactor. I can see if I can revert the codebase to when we were
experiencing the problem (months and months ago), but I think the better
approach is to start removing some of the threaded components in our current
codebase and if the problem comes up again we can have a look then.

We were also trying to use Kirk's evented Mongrel at one point, however it
looks like due to the way the Mongrel API works we can't use it properly for
what we want to do. Mongrel has a method you call to synchronously consume
the request body (for example, in a POST or PUT request), but in the evented
Mongrel this is stubbed out and the whole request has been consumed
(including the body) before we even get access to the header information.
We're using the header information to authenticate file transfers, so it's
somewhat pointless and wasteful if we can't get access to that information
before the entire transfer has occurred.

Francis, does your HTTP server let you check the header of a POST or PUT
request before the body is consumed?


Kirk Haines

2007-Oct-19 10:30 UTC

On 10/19/07, Tony Arcieri <tony at clickcaster.com> wrote:
> We were also trying to use Kirk's evented Mongrel at one point, however it
> looks like due to the way the Mongrel API works we can't use it properly for
> what we want to do.  Mongrel has a method you call to synchronously consume
> the request body (for example, in a POST or PUT request), but in the evented
> Mongrel this is stubbed out and the whole request has been consumed
> (including the body) before we even get access to the header information.
> We're using the header information to authenticate file transfers, so it's
> somewhat pointless and wasteful if we can't get access to that information
> before the entire transfer has occurred.

I may be able to fix that.

> Francis, does your HTTP server let you check the header of a POST or
> PUT request before the body is consumed?

As written, no.  I have modified it so that as the request body is
read, a method is called that receives the POST/PUT data, but the
method to handle the request itself isn't called until the state
machine reaches the DispatchState, and it doesn't reach that until it
has finished reading the request.

I have, however, been hacking the heck out of this thing, trying
different things with it, and I don't see any reason why one couldn't
change it to call a method after the header has been parsed, but
before the body has been parsed.  Should be pretty trivial.
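
As an illustration of the hook being discussed, here is a minimal pure-Ruby sketch of an incremental request reader that reports the headers as soon as they are complete and before any body bytes are accepted. It is not the em_httpserver or Mongrel code: the class HeaderFirstParser, its callbacks, and the state names are all hypothetical, and it assumes a Content-Length body rather than chunked transfer.

```ruby
# Hypothetical sketch, not the em_httpserver state machine: reports headers
# to the caller before the request body is consumed.
class HeaderFirstParser
  attr_reader :state, :headers, :request_line

  # on_headers: called once with (request_line, headers); return false to reject.
  # on_body:    called with each piece of body data as it arrives.
  def initialize(on_headers:, on_body:)
    @on_headers = on_headers
    @on_body    = on_body
    @buffer     = String.new
    @state      = :headers
    @remaining  = 0
  end

  # Feed raw data as it arrives from the socket; returns the current state.
  def receive_data(data)
    case @state
    when :headers then read_headers(data)
    when :body    then read_body(data)
    end
    @state
  end

  private

  def read_headers(data)
    @buffer << data
    head, separator, rest = @buffer.partition("\r\n\r\n")
    return if separator.empty?          # header block is not complete yet

    lines = head.split("\r\n")
    @request_line = lines.shift
    @headers = lines.each_with_object({}) do |line, parsed|
      name, value = line.split(':', 2)
      parsed[name.strip.downcase] = value.to_s.strip
    end
    @remaining = @headers.fetch('content-length', '0').to_i

    # The caller sees the headers here, before any body data is handed over.
    if @on_headers.call(@request_line, @headers)
      @state = @remaining.zero? ? :done : :body
      read_body(rest) if @state == :body && !rest.empty?
    else
      @state = :rejected                # the caller can now drop the connection
    end
  end

  def read_body(data)
    piece = data.byteslice(0, @remaining)
    @remaining -= piece.bytesize
    @on_body.call(piece)
    @state = :done if @remaining.zero?
  end
end
```

With a hook like this, a server could check an authentication header and close the connection on :rejected without reading the upload at all.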


Kirk Haines

Francis Cianfrocca

2007-Oct-19 12:06 UTC

On 10/19/07, Tony Arcieri <tony at clickcaster.com> wrote:
> Francis, does your HTTP server let you check the header of a POST or PUT
> request before the body is consumed?

No, but it wouldn't be a stretch to add it.

Was the stalling easily repeatable? Could you reproduce it now if you had
to?

Tony Arcieri

2007-Oct-19 12:21 UTC

It would've been easily repeatable several months ago, but unfortunately we
weren't sure about how the code was to be released at that time.

One way or another I'd like to remove the threaded components.  If I run
into the same problem again when I do so I'll let you know.


Francis Cianfrocca

2007-Oct-19 12:30 UTC

Cool, thanks.

Tony Arcieri

2007-Oct-26 16:38 UTC

Just switched over to a version which uses HttpClient... I noticed a
marginal speedup over net/http

I'm not sure what issues they were running into before, but I'm now rid of
one source of countless threads

I'll work on the HTTP servers next... not sure what road to go down in that
regard yet


Francis Cianfrocca

2007-Oct-26 17:55 UTC

On 10/26/07, Tony Arcieri <tony at clickcaster.com> wrote:
> Just switched over to a version which uses HttpClient... I noticed a
> marginal speedup over net/http

HttpClient is functional and useful, but a bit of a throwaway, not designed
very carefully. I'll try to hack the chunked-transfer support into it, but
it could use a rethink. Anyone up for it?

> I'm not sure what issues they were running into before, but I'm now rid of
> one source of countless threads
>
> I'll work on the HTTP servers next... not sure what road to go down in
> that regard yet

Look at eventmachine_httpserver: this is actually a well-designed and
powerful piece of code.

Tony Arcieri

2007-Oct-27 02:39 UTC

On 10/26/07, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> HttpClient is functional and useful, but a bit of a throwaway, not
> designed very carefully. I'll try to hack the chunked-transfer support into
> it, but it could use a rethink. Anyone up for it?

#send_request is the biggest problem (and the method I override).  The main
thing that's needed, IMO, is objects which represent requests and
responses.  #send_request is pretty imperative and difficult to extend, so
in DistribuStream I ended up overriding the entire thing, when really all I
wanted to do was patch in a few additional types of headers.
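
To illustrate what a request object could buy here, the following is a rough sketch only. It is not the EventMachine HttpClient API: the HttpRequest class and its methods are hypothetical, and the Range header stands in for whichever additional headers a caller might need. The point is that adding a header becomes a one-line call instead of an override of the whole send routine.

```ruby
# Hypothetical request object; not part of EventMachine's HttpClient.
class HttpRequest
  attr_reader :verb, :path, :headers
  attr_accessor :body

  def initialize(verb, path, host:, body: nil)
    @verb    = verb.to_s.upcase
    @path    = path
    @body    = body
    @headers = { 'Host' => host }
  end

  # Add or replace a header without touching the serialisation logic.
  def []=(name, value)
    @headers[name] = value.to_s
  end

  # Render the request as the text that would be written to the socket.
  def to_s
    headers = @headers.dup
    headers['Content-Length'] = @body.bytesize.to_s if @body
    lines = ["#{@verb} #{@path} HTTP/1.1"]
    headers.each { |name, value| lines << "#{name}: #{value}" }
    lines.join("\r\n") + "\r\n\r\n" + @body.to_s
  end
end

request = HttpRequest.new(:get, '/chunk', host: 'example.org')
request['Range'] = 'bytes=0-1023'   # illustrative extra header
wire_text = request.to_s            # what a client would send
```

A matching response object could wrap the parsed status line, headers, and body in the same way.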

> Look at eventmachine_httpserver: this is actually a well-designed and
> powerful piece of code.

I'll certainly take a look at it, but since all the existing code uses
Mongrel the simplest way to go would be Kirk's evented_mongrel.  However if
I can't find a way to consume a PUT request's body separately from the header
with evented_mongrel I'll probably end up rewriting the HTTP server
components with eventmachine_httpserver.  It looks like it could use objects
which represent requests and responses as well, though.


Francis Cianfrocca

2007-Oct-27 05:01 UTC

On 10/27/07, Tony Arcieri <tony at clickcaster.com> wrote:
> #send_request is the biggest problem (and the method I overwrite).  The
> main thing that's needed, IMO, are objects which represent requests and
> responses.  #send_request is pretty imperative and difficult to extend, so
> in DistribuStream I ended up overriding the entire thing, when really all I
> wanted to do was patch in a few additional types of headers.

Yup, that's a good analysis of the problem. Would you consider contributing
your rewrite of #send_request?

> I'll certainly take a look at it, but since all the existing code uses
> Mongrel the simplest way to go would be Kirk's evented_mongrel.  However if
> I can't find a way to consume a PUT request's body separately of the header
> with evented_mongrel I'll probably end up rewriting the HTTP server
> components with eventmachine_httpserver.  It looks like it could use objects
> which represent requests and responses as well, though.

em_httpserver has distinct request and response objects. They're not too
much like the ones in Ruby's CGI unfortunately, because they were designed
with eventing in mind.
