Sorry this is going to be a bit vague, but I've noticed something rather odd going on with EventMachine on Linux.

I've written something that uses EventMachine to proxy HTTP to other processes. On OS X it works great, adding only about 20% extra latency to the connection when proxying versus connecting to the original backend process directly, so a 5ms connection might now take 6ms. On a Linux deployment, however, that 5ms was going up to 150ms! I noticed that when I removed EventMachine.epoll, everything corrected itself and it worked as quickly as on OS X. So, for now, I've removed EventMachine.epoll and all is well.

The first question, therefore, is: does anyone have any experience or knowledge of why using epoll could make an EventMachine daemon work more slowly? Note that I am using "defer" after reading all the data in at the EventMachine end, if that makes a difference. Perhaps epoll clashes with threading in some way I don't understand.

To add more detail to the point, I did quite a lot of testing by writing a small client (connecting to the EventMachine server) and found the connection is just as fast with epoll, and writing and flushing the data is just as fast with epoll, BUT on doing a "sock.read", it takes 150ms before the EventMachine server sends back the data. Perhaps this helps.

This is all on a pretty standard Linux machine with a reasonably new install: kernel 2.6.9-55.0.6.ELsmp, Ruby 1.8.6, the latest EventMachine (0.10.0), CentOS 4.6, etc.

The next step I'll take when I have some time is to write an absolute bare-bones EventMachine daemon and see if that also has the same effect. Then I'll try it on some different Linux boxes, but for now I thought I'd ask in case this is a known problem/issue for some reason. Thanks!

Regards,
Peter Cooper
http://www.rubyinside.com/
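[For reference, a minimal sketch of the general structure being described here: an EventMachine server that buffers the request and hands a blocking backend call to EventMachine.defer, replying from the callback. This is not the actual proxy code from the thread; the module name, ports, and the crude end-of-headers check are all illustrative assumptions.]

  require 'rubygems'
  require 'socket'
  require 'eventmachine'

  module ProxyHandler
    BACKEND_HOST = '127.0.0.1'   # hypothetical backend location
    BACKEND_PORT = 3000

    def receive_data(data)
      (@buffer ||= '') << data
      return unless @buffer.include?("\r\n\r\n")   # crude "whole request received" check

      request = @buffer
      work = lambda do
        # blocking I/O runs on one of EM.defer's pool threads, not the reactor
        sock = TCPSocket.new(BACKEND_HOST, BACKEND_PORT)
        sock.write(request)
        response = sock.read
        sock.close
        response
      end
      done = lambda do |response|
        send_data(response)                        # back on the reactor thread
        close_connection_after_writing
      end
      EventMachine.defer(work, done)
    end
  end

  EventMachine.epoll                               # the call under suspicion
  EventMachine.run do
    EventMachine.start_server('0.0.0.0', 8080, ProxyHandler)
  end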
On Feb 1, 2008 12:22 PM, Peter Cooper <pcooper at gmail.com> wrote:
> I've written something that uses EventMachine to proxy HTTP to other processes.
> On OS X, it works great, adding only about 20% extra latency into the
> connection when proxying versus connecting to the original backend process
> directly, so a 5ms connection might now take 6ms. On a Linux deployment,
> however, that 5ms was going up to 150ms! I noticed that when I removed
> EventMachine.epoll, everything corrected itself and it worked as quickly as
> in OS X. So, for now I've removed EventMachine.epoll, and all is well.

That's curious, Peter. I run Swiftiply and Analogger with epoll on several different Linux machines, and I am not seeing any latency issues with either. And yeah, your 20% benchmark there is about what I expect. Proxy speeds should be commensurate with the best dedicated software proxies in common use, like HAProxy or nginx's proxy support.

You have me curious, though, so I am going to run some tests now and see if I can detect any differences. In some tests with Analogger a few weeks ago, switching from select to epoll actually seemed to slightly reduce latencies.

Kirk Haines
[Sorry if this gets posted as a new thread. I erroneously chose the digest version, then realized I couldn't respond properly if I did that. I'm now getting each mail so it won't happen again!]

Thanks for your reply Kirk! I've done a little digging around.

I tried my app on a totally separate Ubuntu virtual machine (rather than a real-life CentOS server) and got the same result (super slow responses), so I decided to go through commenting stuff out to see where the issue was.

It turns out if I put:

  send_data "<some valid HTTP response here>"
  close_connection_after_writing

OUTSIDE of the lambda called by EventMachine.defer, then it was fast on both epoll and non-epoll (though non-epoll was still marginally faster).

But if that code was *even the first thing* in the lambda (with nothing else in there at all - just to see), it immediately tripled latency. I wondered then why it only tripled the latency and wasn't 10x, as I was experiencing before. After more experimenting, it turned out that using Logger (within those lambdas) to log stuff was adding the rest of the latency.

Fast (in all modes):

  def process
    send_data "whatever"
    close_connection_after_writing
  end

Super slow (only with epoll activated):

  def process
    before = lambda {
      send_data "whatever"
      close_connection_after_writing
    }

    EventMachine.defer(before, lambda { })
  end

ULTRA slow (only with epoll activated):

  def process
    before = lambda {
      LOG.info "whatever"
      send_data "whatever"
      close_connection_after_writing
    }

    EventMachine.defer(before, lambda { })
  end

(I know that's not how you use the before and after callbacks, but this was just for testing. My real app does it the right way ;-))

So, for some reason, using epoll is making the lambdas used by EventMachine.defer extremely slow for any form of IO. But doing IO outside of them is fast. And doing IO within them is fast as long as you're not using epoll. Perhaps that will ring some bells? :)

Regards,
Peter Cooper
http://www.rubyinside.com/
On Feb 1, 2008 1:15 PM, Peter Cooper <pcooper at gmail.com> wrote:
> So, for some reason, using epoll is making the lambdas used by
> EventMachine.defer extremely slow for any form of IO. But doing IO outside
> of them is fast. And doing IO within them is fast as long as you're not
> using epoll. Perhaps that will ring some bells? :)

Sounds like some nasty interaction between the Ruby green threads scheduler and EM's epoll implementation.

--
Tony Arcieri
ClickCaster, Inc.
tony at clickcaster.com
Francis Cianfrocca
2008-Feb-01 12:23 UTC
[Eventmachine-talk] epoll increasing latency big time
On Feb 1, 2008 3:15 PM, Peter Cooper <pcooper at gmail.com> wrote:
> [Sorry if this gets posted as a new thread. I erroneously chose the digest
> version, then realized I couldn't respond properly if I did that. I'm now
> getting each mail so it won't happen again!]
>
> Thanks for your reply Kirk! I've done a little digging around.
>
> I tried my app on a totally separate Ubuntu virtual machine (rather than a
> real-life CentOS server) and got the same result (super slow responses), so
> I decided to go through commenting stuff out to see where the issue was.
>
> It turns out if I put:
>
>   send_data "<some valid HTTP response here>"
>   close_connection_after_writing
>
> OUTSIDE of the lambda called by EventMachine.defer, then it was fast on
> both epoll and non-epoll (though non-epoll was still marginally faster).
>
> But if that code was *even the first thing* in the lambda (with nothing
> else in there at all - just to see), it immediately tripled latency. I
> wondered then why it only tripled the latency and wasn't 10x, as I was
> experiencing before. After more experimenting, it turned out that using
> Logger (within those lambdas) to log stuff was adding the rest of the latency.
>
> Fast (in all modes):
>
>   def process
>     send_data "whatever"
>     close_connection_after_writing
>   end
>
> Super slow (only with epoll activated):
>
>   def process
>     before = lambda {
>       send_data "whatever"
>       close_connection_after_writing
>     }
>
>     EventMachine.defer(before, lambda { })
>   end
>
> ULTRA slow (only with epoll activated):
>
>   def process
>     before = lambda {
>       LOG.info "whatever"
>       send_data "whatever"
>       close_connection_after_writing
>     }
>
>     EventMachine.defer(before, lambda { })
>   end
>
> (I know that's not how you use the before and after callbacks, but this
> was just for testing. My real app does it the right way ;-))
>
> So, for some reason, using epoll is making the lambdas used by
> EventMachine.defer extremely slow for any form of IO. But doing IO outside
> of them is fast. And doing IO within them is fast as long as you're not
> using epoll. Perhaps that will ring some bells? :)

It definitely rings a bell. EM#defer uses a pool of Ruby threads, and I wouldn't be surprised if epoll introduces latency as it interacts with the thread scheduler. Do you have the ability to use Ruby 1.9? You might find that it goes a lot faster with the current HEAD revision. If it turns out that I'm right, then it's possible to tune the interaction with Ruby threads and make this go faster.

What would be more interesting to me is to see if you can avoid using EM#defer. It's there for cases when you just plain and simply can't avoid a blocking call (as in a call to a database library, although it would be possible and interesting to develop nonblocking versions of the standard DBMS clients). In all other cases, there should be a more event-oriented approach. If you like, you can describe what you're doing in more detail (either on-list or off-list) and I might be able to make a suggestion.
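[For comparison, a hedged sketch of the event-oriented shape Francis is suggesting: the backend connection is opened through the reactor itself, so no threads or EM#defer are involved. The class and module names, host, ports, and the crude end-of-headers check are illustrative assumptions, not code from this thread.]

  require 'rubygems'
  require 'eventmachine'

  class BackendClient < EventMachine::Connection
    def initialize(front_end, request)
      @front_end = front_end
      @request   = request
    end

    def post_init
      send_data(@request)                  # forward the buffered request to the backend
    end

    def receive_data(data)
      @front_end.send_data(data)           # stream the backend's response straight back
    end

    def unbind
      @front_end.close_connection_after_writing
    end
  end

  module FrontEnd
    def receive_data(data)
      (@buffer ||= '') << data
      return if @backend || !@buffer.include?("\r\n\r\n")   # crude end-of-headers check
      @backend = EventMachine.connect('127.0.0.1', 3000, BackendClient, self, @buffer)
    end
  end

  EventMachine.run do
    EventMachine.start_server('0.0.0.0', 8080, FrontEnd)
  end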
On Feb 1, 2008 8:23 PM, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> It definitely rings a bell. EM#defer uses a pool of Ruby threads, and I
> wouldn't be surprised if epoll introduces latency as it interacts with the
> thread scheduler. Do you have the ability to use Ruby 1.9? You might find
> that it goes a lot faster with the current HEAD revision.

I'll give it a try in Ruby 1.9; Ruby 1.9 isn't apt for the main program I'm writing, but it might reveal some things about this particular issue.

> What would be more interesting to me is to see if you can avoid using
> EM#defer. It's there for cases when you just plain and simply can't avoid a
> blocking call (as in a call to a database library, although it would be
> possible and interesting to develop nonblocking versions of the standard
> DBMS clients).

I can avoid it, as I'm just doing a socket connection, request and response, which are currently coded in a way that blocks, but it wouldn't be impossible to rewrite it to use non-blocking reads. I did a trial of just running the existing code but without EventMachine.defer, and performance improved somewhat in epoll mode, although it was still not quite as fast as without epoll; this gap was also there when I made a bare-bones EventMachine HTTP server (I got 3000 req/s with epoll off, 2800 req/s with epoll on).

Given the positive results on Linux with epoll *off*, I think my quick fix is to just not use it (especially as I can get 3000 req/s already, far beyond what's needed!) but look into a more permanent solution for the next minor revision of my software.

> In all other cases, there should be a more event-oriented approach. If you
> like, you can describe what you're doing in more detail (either on-list or
> off-list) and I might be able to make a suggestion.

The project that uses this code launches later today (this is the final bug found in testing for this revision - how fun it is doing multi-architecture testing!) so you can take a look. It'll be announced on Ruby Inside sometime over the weekend, all being well, but if I remember I will send you a link :) If all goes well, this might become one of the more deployed EM-based apps... (or not!)

Thanks for all of the pointers, much appreciated!

Regards,
Peter Cooper
http://www.rubyinside.com/
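[For anyone wanting to reproduce figures like those, a crude way of measuring requests per second from plain Ruby, in the spirit of the small test client mentioned earlier in the thread. The host, port, request, and request count are illustrative assumptions.]

  require 'socket'

  REQUESTS = 1000
  start = Time.now

  REQUESTS.times do
    sock = TCPSocket.new('127.0.0.1', 8080)      # assumed address of the EM server
    sock.write("GET / HTTP/1.0\r\n\r\n")
    sock.read                                    # the step that showed the extra ~150ms with epoll + defer
    sock.close
  end

  elapsed = Time.now - start
  puts "%.0f req/s, %.2f ms per request" % [REQUESTS / elapsed, elapsed / REQUESTS * 1000]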
Francis Cianfrocca
2008-Feb-01 18:10 UTC
[Eventmachine-talk] epoll increasing latency big time
On Feb 1, 2008 3:43 PM, Peter Cooper <pcooper at gmail.com> wrote:
> On Feb 1, 2008 8:23 PM, Francis Cianfrocca <garbagecat10 at gmail.com> wrote:
> > It definitely rings a bell. EM#defer uses a pool of Ruby threads, and I
> > wouldn't be surprised if epoll introduces latency as it interacts with the
> > thread scheduler. Do you have the ability to use Ruby 1.9? You might
> > find that it goes a lot faster with the current HEAD revision.
>
> I'll give it a try in Ruby 1.9; Ruby 1.9 isn't apt for the main
> program I'm writing, but it might reveal some things about this
> particular issue.
>
> > What would be more interesting to me is to see if you can avoid using
> > EM#defer. It's there for cases when you just plain and simply can't avoid a
> > blocking call (as in a call to a database library, although it would be
> > possible and interesting to develop nonblocking versions of the standard
> > DBMS clients).
>
> I can avoid it, as I'm just doing a socket connection, request and
> response, which are currently coded in a way that blocks, but it wouldn't be
> impossible to rewrite it to use non-blocking reads. I did a trial of just
> running the existing code but without EventMachine.defer, and performance
> improved somewhat in epoll mode, although it was still not quite as fast as
> without epoll; this gap was also there when I made a bare-bones EventMachine
> HTTP server (I got 3000 req/s with epoll off, 2800 req/s with epoll on).
>
> Given the positive results on Linux with epoll *off*, I think my quick fix
> is to just not use it (especially as I can get 3000 req/s already, far
> beyond what's needed!) but look into a more permanent solution for the next
> minor revision of my software.

You will want to preserve the option to use epoll if your scalability requirement goes up, since EM without epoll can only handle at most 1024 simultaneous connections. With epoll, that limit goes way up.

I don't believe in optimizing past the point where it makes economic sense :-). So if you're good where you are at 3000 reqs/second and your code is otherwise stable, then I'd say you're ready to launch! I'm looking forward to hearing about your product.
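[As a closing note on the scalability point, a minimal sketch of keeping the epoll option available. The handler, port, and descriptor count are illustrative only, and the set_descriptor_table_size call is guarded because not every EventMachine build of this era may provide it.]

  require 'rubygems'
  require 'eventmachine'

  module OkHandler
    def receive_data(_data)
      send_data("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
      close_connection_after_writing
    end
  end

  # Without epoll, EM falls back to select(), which caps the reactor at
  # roughly 1024 file descriptors; epoll removes that ceiling.
  EventMachine.epoll
  if EventMachine.respond_to?(:set_descriptor_table_size)
    EventMachine.set_descriptor_table_size(16384)   # illustrative value
  end

  EventMachine.run do
    EventMachine.start_server('0.0.0.0', 8080, OkHandler)
  end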