Interested in some feedback on this (does it sound right?), or maybe this
might be of interest to others.

We are launching a new Facebook app in a couple weeks and we did some load
testing over the weekend on our Unicorn web cluster. The servers are 8-way
Xeons with 24GB RAM. Our app ended up being primarily CPU bound. So far the
sweet spot for the number of unicorns seems to be around 40. This seemed to
yield the most requests per second without overloading the server or hitting
memory bandwidth issues. The backlog is at the somaxconn default of 128; I'm
still not sure if we will bump that up or not.

Increasing the number of unicorns beyond a certain point resulted in a
noticeable drop in the requests per second the server could handle. I'm
pretty sure the cause is the box running out of memory bandwidth. The load
average and resource usage in general (except for memory) would keep going
down, but so did the requests per second. At 80 unicorns the requests per
second dropped by more than half. I'm going to disable hyperthreading and
rerun some of the tests to see what impact that has.

Chris
snacktime <snacktime at gmail.com> wrote:
> Interested in some feedback on this (does it sound right?), or maybe
> this might be of interest to others.

Hi Chris,

I think you meant to post this to the mongrel-unicorn at rubyforge.org
list, not mongrel-users at rubyforge.org :>

> We are launching a new Facebook app in a couple weeks and we did some
> load testing over the weekend on our Unicorn web cluster. The servers
> are 8-way Xeons with 24GB RAM. Our app ended up being primarily CPU
> bound. So far the sweet spot for the number of unicorns seems to be
> around 40. This seemed to yield the most requests per second without
> overloading the server or hitting memory bandwidth issues. The
> backlog is at the somaxconn default of 128; I'm still not sure if we
> will bump that up or not.

The default backlog we try to specify is actually 1024 (same as
Mongrel). But it's always a murky value anyway, as it's
kernel/sysctl-dependent. With Unix domain sockets, some folks use
crazy values like 2048 to look better on synthetic benchmarks :)

> Increasing the number of unicorns beyond a
> certain point resulted in a noticeable drop in the requests per second
> the server could handle. I'm pretty sure the cause is the box
> running out of memory bandwidth. The load average and resource usage
> in general (except for memory) would keep going down, but so did the
> requests per second. At 80 unicorns the requests per second dropped
> by more than half. I'm going to disable hyperthreading and rerun some
> of the tests to see what impact that has.

That's "8-way Xeon" _before_ hyperthreading, right? Which family of
Xeons are you using, the Pentium4-based crap or the awesome new ones?

How much memory is each Unicorn worker using for your app?

40 workers for 8 physical cores sounds reasonable. Depending on the
app, I think the reasonable range is anywhere from 2-8 workers per
physical core: more if you're (unfortunately) limited by external
network calls, but since you claim to be CPU bound, fewer.

Do you have actual performance numbers you're able to share?
Mean/median request times/rates would be very useful. If your requests
run very quickly, you may be limited by contention on the accept()
syscall on the listen socket, too.

I assume you're using nginx as the proxy; is this with Unix domain
sockets or TCP sockets? Unix domain sockets should give a small
performance advantage over TCP if it's all on the same box. With TCP,
you should also check that you have enough local ports available if
you're hitting extremely high (and probably unrealistic :) request
rates.

-- 
Eric Wong
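For concreteness, a minimal unicorn config along the lines discussed above
might look like the sketch below; the socket path, worker count, and backlog
value are illustrative assumptions, not Chris's actual settings.

    # config/unicorn.rb -- a minimal sketch, not anyone's production config.
    worker_processes 40      # roughly 5 workers per physical core, per the thread

    # Unix domain socket for an nginx on the same box. If :backlog is
    # omitted, Unicorn asks for 1024; the kernel may still cap the
    # effective value (net.core.somaxconn on Linux, 128 by default).
    listen "/tmp/unicorn.sock", :backlog => 1024

    timeout 30
    preload_app true

Note that :backlog only changes what Unicorn asks the kernel for; raising
net.core.somaxconn is a separate sysctl change.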
What was the request rate and total bandwidth flowing at your peak? How far
is that from your theoretical potential on the box?

On 21 Jun 2010, at 16:58, snacktime wrote:

> Interested in some feedback on this (does it sound right?), or maybe
> this might be of interest to others.
>
> We are launching a new Facebook app in a couple weeks and we did some
> load testing over the weekend on our Unicorn web cluster. The servers
> are 8-way Xeons with 24GB RAM. Our app ended up being primarily CPU
> bound. So far the sweet spot for the number of unicorns seems to be
> around 40. This seemed to yield the most requests per second without
> overloading the server or hitting memory bandwidth issues. The
> backlog is at the somaxconn default of 128; I'm still not sure if we
> will bump that up or not. Increasing the number of unicorns beyond a
> certain point resulted in a noticeable drop in the requests per second
> the server could handle. I'm pretty sure the cause is the box
> running out of memory bandwidth. The load average and resource usage
> in general (except for memory) would keep going down, but so did the
> requests per second. At 80 unicorns the requests per second dropped
> by more than half. I'm going to disable hyperthreading and rerun some
> of the tests to see what impact that has.
>
> Chris
On Jun 21, 2010, at 5:16 PM, Eric Wong wrote:

>> overloading the server or hitting memory bandwidth issues. The
>> backlog is at the somaxconn default of 128; I'm still not sure if we
>> will bump that up or not.
>
> The default backlog we try to specify is actually 1024 (same as
> Mongrel). But it's always a murky value anyway, as it's
> kernel/sysctl-dependent. With Unix domain sockets, some folks use
> crazy values like 2048 to look better on synthetic benchmarks :)

Somewhat related -- I've been meaning to discuss the finer points of
backlog tuning.

I've been experimenting with the multi-server socket+TCP megaunicorn
configuration from your CDT:
http://rubyforge.org/pipermail/mongrel-unicorn/2009-September/000033.html

Which I think is what this sentence from TUNING is talking about?

  "Setting a very low value for the :backlog parameter in 'listen'
  directives can allow failover to happen more quickly if your
  cluster is configured for it."

Our app can catch a batch of requests which will be slow (1-3s), and
these can pool on one individual server in our load-balanced EC2
cluster -- exactly the case for the multi-server failover setup.

I've put this into production under a healthy load (5000+ RPM) and it
appears to work really well! It produces very high requests/s rates at
significantly higher concurrency than without, and serves zero 502
errors (part of the goal).

I currently have the unix socket set to a backlog of 64, then failing
over to a TCP listener using backlog 1024 (so that things are queued
rather than 502'd).

I can imagine there might be a case for keeping the TCP backlog low as
well and serving errors when overloaded, rather than getting caught in
an unrecoverable back-queue tarpit.

I'm currently failing over to a dedicated "backup" instance, so that I
could measure exactly how much traffic is being offloaded. This means
my benchmarks w/o failover are 1 server, but with failover it's
actually 2 servers. We're reconfiguring to something more like the
original diagram, at which point I'll do some cluster-wide stress tests
and share data/scripts/process.

BTW, this configuration needs a cool name!

-jamie
http://jamiedubs.com
http://fffff.at
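A minimal sketch of the listener half of that setup, using the backlog
numbers Jamie mentions; the socket path and port are assumptions for
illustration, not his actual configuration.

    # Each app server's unicorn listens on both a Unix socket (short
    # backlog, preferred by the local nginx) and a TCP port (longer
    # backlog, used as the spillover/backup upstream).
    listen "/tmp/unicorn.sock", :backlog => 64
    listen "0.0.0.0:8080",      :backlog => 1024

Roughly, the local nginx lists the Unix socket first in its upstream block
and the TCP listeners (on a backup box, or on the other app servers) as
fallback servers, so a request only spills over to TCP once the short
socket backlog is full.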
Jamie Wilkinson <jamie at tramchase.com> wrote:
> On Jun 21, 2010, at 5:16 PM, Eric Wong wrote:
> >> overloading the server or hitting memory bandwidth issues. The
> >> backlog is at the somaxconn default of 128; I'm still not sure if
> >> we will bump that up or not.
> >
> > The default backlog we try to specify is actually 1024 (same as
> > Mongrel). But it's always a murky value anyway, as it's
> > kernel/sysctl-dependent. With Unix domain sockets, some folks use
> > crazy values like 2048 to look better on synthetic benchmarks :)
>
> Somewhat related -- I've been meaning to discuss the finer points of
> backlog tuning.
>
> I've been experimenting with the multi-server socket+TCP megaunicorn
> configuration from your CDT:
> http://rubyforge.org/pipermail/mongrel-unicorn/2009-September/000033.html
>
> Which I think is what this sentence from TUNING is talking about?
>
>   "Setting a very low value for the :backlog parameter in 'listen'
>   directives can allow failover to happen more quickly if your
>   cluster is configured for it."

Yes.

<snip>

Thanks for sharing, and good to hear this is working well for you.

I'm still unlikely to have the chance to test this anywhere soon, but
maybe more folks can give it a try now that we've had one successful
report. More reports (success or not) would definitely be good to hear.

> BTW, this configuration needs a cool name!

Since you're the first person brave enough to try (or at least report
about it), you shall have the honor of naming it :)

-- 
Eric Wong
On Mon, Jun 21, 2010 at 5:16 PM, Eric Wong <normalperson at yhbt.net> wrote:
> snacktime <snacktime at gmail.com> wrote:
>> Interested in some feedback on this (does it sound right?), or maybe
>> this might be of interest to others.
>
> Hi Chris,
>
> I think you meant to post this to the mongrel-unicorn at rubyforge.org
> list, not mongrel-users at rubyforge.org :>

Yes, not sure how that got mixed up...

> That's "8-way Xeon" _before_ hyperthreading, right? Which family of
> Xeons are you using, the Pentium4-based crap or the awesome new ones?

Two quad-core Nehalems in each server.

> How much memory is each Unicorn worker using for your app?

Undoubtedly this is lower than it will be under a real load, but under
our load tests they stabilize at around 160MB.

> Do you have actual performance numbers you're able to share?
> Mean/median request times/rates would be very useful. If your requests
> run very quickly, you may be limited by contention on the accept()
> syscall on the listen socket, too.

I had two different types of requests to test that I ran in varying
combinations. One takes on average 600ms, and the other 40ms. 98% of
our requests will be the faster one. Deviations were really low.

> I assume you're using nginx as the proxy; is this with Unix domain
> sockets or TCP sockets? Unix domain sockets should give a small
> performance advantage over TCP if it's all on the same box.

Yes, nginx with domain sockets.

Chris
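A back-of-envelope sketch from the numbers quoted above; it deliberately
ignores copy-on-write sharing, GC, hyperthreading, and the memory-bandwidth
ceiling Chris suspects is the real limit, so treat it as rough bounds only.

    # Rough arithmetic from the figures in this thread; purely illustrative.
    workers = 40
    rss_mb  = 160                            # ~160MB per worker under test load
    puts "approx. worker RSS total: #{workers * rss_mb} MB of 24576 MB"

    # Weighted mean service time: 98% of requests at 40ms, 2% at 600ms.
    mean_s = 0.98 * 0.040 + 0.02 * 0.600     # ~0.0512 s
    puts "ceiling if all 40 workers stay busy: ~#{(workers / mean_s).round} req/s"
    puts "ceiling if truly bound by 16 HW threads: ~#{(16 / mean_s).round} req/s"

That puts worker memory around 6.4GB of the 24GB, which is consistent with
the report that memory itself was not the bottleneck.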
>> Somewhat related -- I've been meaning to discuss the finer points of
>> backlog tuning.
>>
>> I've been experimenting with the multi-server socket+TCP megaunicorn
>> configuration from your CDT:
>> http://rubyforge.org/pipermail/mongrel-unicorn/2009-September/000033.html

So I'm in the position of launching a web app in a couple of weeks that
is pretty much guaranteed to get huge traffic. I'm working with ops
people who are very good, but this is not how they would normally set
up load balancing and scale out. I'm having a meeting with our network
ops lead tomorrow to talk about this.

I like the idea of this approach; it seems like it gives you more
fine-grained control over how much load you put on individual servers,
as well as how individual requests are handled. But I'm not too keen on
using something like this at scale when we simply don't have the chance
to test it out at a smaller scale. I have yet to see anyone with this
setup running at scale. That of course doesn't mean it's not a great
idea, only that I doubt our ops guys are going to want to be the first.
They are already overworked as it is :)

So assuming we will scale out the 'normal' way by not having a short
backlog, any info on how to manage that? Should we control the backlog
queue in nginx (not sure exactly how I would do that) or via the listen
backlog? I was looking around last night and couldn't find a way to
actually poll the listen backlog queue size.

Also, any ideas on how you would practically manage this type of load
balancing setup? It seems like you would have some type of 'reserve'
cluster for requests that hit the listen backlog, and when you start
seeing too much traffic going to the reserve, you add more servers to
your main pool. How else would you manage the configuration for
something like this when you are working with 100-200 servers? You
can't be changing the nginx configs every time you add servers; that's
just not practical.

Chris
>> Somewhat related -- I've been meaning to discuss the finer points of
>> backlog tuning.
>>
>> I've been experimenting with the multi-server socket+TCP megaunicorn
>> configuration from your CDT:
>> http://rubyforge.org/pipermail/mongrel-unicorn/2009-September/000033.html

On Jun 22, 2010, at 11:03 AM, snacktime wrote:
> It seems like you would have some type of 'reserve'
> cluster for requests that hit the listen backlog, and when you start
> seeing too much traffic going to the reserve, you add more servers to
> your main pool. How else would you manage the configuration for
> something like this when you are working with 100-200 servers? You
> can't be changing the nginx configs every time you add servers; that's
> just not practical.

We are using chef for machine configuration, which makes these kinds of
numbers doable: http://wiki.opscode.com/display/chef/Home

I would love to see an nginx module for distributed configuration
management.

Right now we are running 6 frontend machines, 4 in use and 2 in reserve
like you described. We are doing about 5000rpm with this, almost all
dynamic. 10-30% of requests might be 'slow' (1+s) depending on usage
patterns.

To measure health I am using munin to watch system load, nginx requests
and nginx errors. In this configuration, 502 Bad Gateways from the
frontend nginx indicate a busy unicorn socket and thus a handoff of the
request to the backups. Then we measure the Rails production.log for
request counts and speed on each server, as well as using NewRelic RPM.

monit also emails us when 502s show up. In theory monit could
automatically spin up another backup server, provision it using chef,
then reprovision the rest of the cluster to start handing over traffic.
Alternately, the new server could just act as a backup for the one
overloaded machine, which could make isolating performance issues
easier.

-jamie
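As a rough illustration of that 502 health signal, a tiny script along
these lines could count upstream 502s for munin or monit to graph or alert
on; the log path and the combined log format are assumptions here, not
Jamie's actual setup.

    #!/usr/bin/env ruby
    # Count HTTP 502 responses in an nginx access log (combined format).
    # A sketch only: adjust the path and field position to your own logs.
    log = ARGV[0] || "/var/log/nginx/access.log"

    count = File.foreach(log).count do |line|
      # In the default combined format the status code is the 9th field.
      line.split(" ")[8] == "502"
    end

    puts "502_count.value #{count}"   # munin-style plugin output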
On Jun 21, 2010, at 9:53 PM, Eric Wong wrote:
> Thanks for sharing, and good to hear this is working well for you.
>
> I'm still unlikely to have the chance to test this anywhere soon, but
> maybe more folks can give it a try now that we've had one successful
> report. More reports (success or not) would definitely be good
> to hear.
>
>> BTW, this configuration needs a cool name!
>
> Since you're the first person brave enough to try (or at least report
> about it), you shall have the honor of naming it :)

The all-knowing WikiAnswers says "a group of unicorns is a blessing" :)
http://wiki.answers.com/Q/What_is_a_group_of_Unicorns_called

Some great fan art out there:
http://www.elfwood.com/~ara-tun/Unicorn-Herd.2537340.html

But my coworkers and I are voting "pegacorn":
http://images.elfwood.com/art/m/i/michelle16/pegacorn.jpg

-jamie
snacktime <snacktime at gmail.com> wrote:
> >> Somewhat related -- I've been meaning to discuss the finer points of
> >> backlog tuning.
> >>
> >> I've been experimenting with the multi-server socket+TCP megaunicorn
> >> configuration from your CDT:
> >> http://rubyforge.org/pipermail/mongrel-unicorn/2009-September/000033.html
>
> So I'm in the position of launching a web app in a couple of weeks
> that is pretty much guaranteed to get huge traffic. I'm working with
> ops people who are very good, but this is not how they would normally
> set up load balancing and scale out. I'm having a meeting with our
> network ops lead tomorrow to talk about this. I like the idea of this
> approach; it seems like it gives you more fine-grained control over
> how much load you put on individual servers as well as how individual
> requests are handled. But I'm not too keen on using something like
> this at scale when we simply don't have the chance to test it out at a
> smaller scale. I have yet to see anyone with this setup running at
> scale. That of course doesn't mean it's not a great idea, only that I
> doubt our ops guys are going to want to be the first. They are
> already overworked as it is :)

No worries. Don't ever feel obligated to try something you're not
comfortable with. Heck, it took months before anybody besides myself
was comfortable with Unicorn.

> So assuming we will scale out the 'normal' way by not having a short
> backlog, any info on how to manage that? Should we control the
> backlog queue in nginx (not sure exactly how I would do that) or via
> the listen backlog? I was looking around last night and couldn't find
> a way to actually poll the listen backlog queue size.

nginx lets you specify backlog=num with the "listen" directive, much
like Unicorn does (Unicorn steals most configuration parameter
names/options from nginx):

http://wiki.nginx.org/NginxHttpCoreModule#listen

If you use Linux, you can poll the current listen queue using Raindrops
(http://raindrops.bogomips.org/), the ss(8) utility, or by parsing
/proc/net/tcp and/or /proc/net/unix. Unfortunately, checking the listen
queue for Unix domain sockets is expensive: Raindrops and ss(8) both
need to parse /proc/net/unix because that info isn't available via
netlink.

> Also, any ideas on how you would practically manage this type of load
> balancing setup? It seems like you would have some type of 'reserve'
> cluster for requests that hit the listen backlog, and when you start
> seeing too much traffic going to the reserve, you add more servers to
> your main pool. How else would you manage the configuration for
> something like this when you are working with 100-200 servers? You
> can't be changing the nginx configs every time you add servers; that's
> just not practical.

I've never tried this setup, so what Jamie said :)

One extra note: 100-200 hosts in an upstream {} block makes a very long
nginx config file. You could use ERB or something else to template it,
but based on a previous reading of the nginx source code, you can also
set up a round-robin DNS entry for all the servers. nginx only does DNS
lookups for upstreams at load time. For round-robin DNS entries, nginx
adds an entry for every IP address a name resolves to, so just specify
the one DNS name in the upstream block instead of the list of IPs.
Just remember to HUP the nginxes (or if you're forgetful, make an
occasional cronjob to HUP them) when you make DNS changes and
add/remove a box.

-- 
Eric Wong
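For reference, polling listen queues with Raindrops on Linux looks roughly
like the sketch below; the listener addresses are illustrative assumptions.

    require "raindrops"

    # TCP listeners: stats come via netlink on Linux, so this is cheap.
    Raindrops::Linux.tcp_listener_stats(["127.0.0.1:8080"]).each do |addr, stats|
      puts "#{addr}: active=#{stats.active} queued=#{stats.queued}"
    end

    # Unix domain sockets: Raindrops has to scan /proc/net/unix, which is
    # the more expensive case mentioned above.
    Raindrops::Linux.unix_listener_stats(["/tmp/unicorn.sock"]).each do |path, stats|
      puts "#{path}: active=#{stats.active} queued=#{stats.queued}"
    end

For a quick manual check on TCP listeners, ss(8) in listening mode also
shows the current accept queue alongside the configured backlog.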