We're using unicornctl restart with the default before/after hook behavior, which is to reap old Unicorn workers via SIGQUIT after the new one has finished booting.

Unfortunately, while the new workers are forking and begin processing requests, we're still seeing significant spikes in our haproxy request queue. It seems as if after we restart, the unwarmed workers get swamped by the incoming requests. As far as I can tell, the momentary loss of capacity we experience translates fairly quickly into a thundering herd.

We've experimented with rolling restarts at the server level, but these do not resolve the problem.

I'm curious if we could do a more granular, application-level rolling restart, perhaps using TTOU instead of QUIT to progressively dial down the old workers one at a time, forking new ones to replace them incrementally. Has anyone tried anything like that before? Or are there any other suggestions (short of "add more capacity")?

-- Tony Arcieri
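For illustration, a minimal sketch of what the TTOU-based rolling restart described above might look like as a restart script. This is not from the thread; the pid-file paths, worker count, and sleep intervals are assumptions.

    # rolling_restart.rb -- hypothetical sketch, not a drop-in script
    PID     = "/var/run/unicorn.pid"
    OLD_PID = "#{PID}.oldbin"   # unicorn renames the old master's pid file on USR2
    WORKERS = 8

    Process.kill(:USR2, File.read(PID).to_i)   # re-exec a new master (and its workers)
    sleep 1 until File.exist?(OLD_PID)         # wait for the re-exec to begin

    old_master = File.read(OLD_PID).to_i
    WORKERS.times do
      Process.kill(:TTOU, old_master)          # retire one old worker...
      sleep 5                                  # ...and give the new workers time to warm up
    end
    Process.kill(:QUIT, old_master)            # old master has no workers left; shut it down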
I remember seeing a gist of a cap script a year or so ago that did something like you're suggesting with TTOU. I know unicorn supports TTOU, but I've never personally done anything different than just using QUIT.

- Alex Sharp
Tony Arcieri <tony.arcieri at gmail.com> wrote:
> We're using unicornctl restart with the default before/after hook
> behavior, which is to reap old Unicorn workers via SIGQUIT after the
> new one has finished booting.
>
> Unfortunately, while the new workers are forking and begin processing
> requests, we're still seeing significant spikes in our haproxy request
> queue. It seems as if after we restart, the unwarmed workers get
> swamped by the incoming requests. As far as I can tell, the momentary
> loss of capacity we experience translates fairly quickly into a
> thundering herd.
>
> We've experimented with rolling restarts at the server level but these
> do not resolve the problem.

So it's one haproxy -> multiple nginx/unicorn servers?

Do you mark the server down or lower the weight in haproxy when deploying the Ruby app? (Perhaps using monitor-uri in haproxy.)

If the server is down to haproxy, but still up, you can send some warmup requests to the server before enabling the monitor-uri for haproxy.
> Unfortunately, while the new workers are forking and begin processing
> requests, we're still seeing significant spikes in our haproxy request
> queue. It seems as if after we restart, the unwarmed workers get
> swamped by the incoming requests.

Perhaps it's possible to warm up the workers in the unicorn after_fork block?

Cheers,
Lawrence
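A minimal sketch of what warming up in after_fork might look like; the specific calls here (an ActiveRecord reconnect and a throwaway query) are assumptions and would need to match whatever the app actually initializes lazily.

    # config/unicorn.rb -- hypothetical after_fork warmup sketch
    after_fork do |server, worker|
      # Per-process connection setup first: sockets must not be shared across the fork.
      ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)

      # Exercise the expensive lazily-initialized code paths before this worker
      # starts serving traffic; unicorn only calls accept() after this hook returns.
      begin
        ActiveRecord::Base.connection.execute("SELECT 1") if defined?(ActiveRecord::Base)
        # e.g. also force route/template/serializer caches to build here
      rescue => e
        server.logger.warn("warmup failed in worker #{worker.nr}: #{e.message}")
      end
    end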
On Thu, Nov 29, 2012 at 3:32 PM, Eric Wong <normalperson at yhbt.net> wrote:
> So it's one haproxy -> multiple nginx/unicorn servers?

Confirm.

> Do you mark the server down or lower the weight in haproxy when
> deploying the Ruby app? (Perhaps using monitor-uri in haproxy)

I assume you're talking about when we did rolling restarts at the server level? We don't do this presently, as we deploy to all servers at the same time. Doing single-server-at-a-time deploys still resulted in a backlog in our queue, which speaks to our capacity problems. It's being worked on, but we're looking for an interim solution. It'd also be nice not to have to do this, as it further extends the length of our (already gruelingly long) deploy process.

-- Tony Arcieri
On Thu, Nov 29, 2012 at 3:34 PM, Lawrence Pit <lawrence.pit at gmail.com> wrote:
> Perhaps it's possible to warm up the workers in the unicorn after_fork block?

Are people doing this in production (i.e. moving the termination of the old master from before_fork to after_fork)? My worry is that during this warming process you will have 2X the normal number of Unicorn workers active at the same time, which could potentially lead to exhaustion of system resources (i.e. RAM).

-- Tony Arcieri
Tony Arcieri <tony.arcieri at gmail.com> wrote:
> On Thu, Nov 29, 2012 at 3:34 PM, Lawrence Pit <lawrence.pit at gmail.com> wrote:
> >
> > Perhaps it's possible to warm up the workers in the unicorn after_fork block?
>
> Are people doing this in production (i.e. moving the termination of
> the old master from before_fork to after_fork)? My worry is that
> during this warming process you will have 2X the normal number of
> Unicorn workers active at the same time, which could potentially lead
> to exhausting of system resources (i.e. RAM)

I haven't done any terminations in the *_fork hooks for a long time. I just let 2x the normal workers run for a bit before sending SIGQUIT.

That said, I usually have plenty of RAM (and DB connections) to spare. Excessive CPU-bound loads are handled very well nowadays.
On 11/29/12 3:34 PM, Lawrence Pit wrote:
>> Unfortunately, while the new workers are forking and begin processing
>> requests, we're still seeing significant spikes in our haproxy request
>> queue. It seems as if after we restart, the unwarmed workers get
>> swamped by the incoming requests.
>
> Perhaps it's possible to warm up the workers in the unicorn after_fork block?

I've successfully applied this methodology to a nasty rails app that had a lot of latent initialization upon first request. Each worker gets a unique private secondary listen port, and each worker sends a warm-up request to a prior worker in the after_fork hook.

In our environment the load balancer drains each host as it's being deployed, and this does affect the length of deployment across many hosts in a cluster, but the warmup bucket brigade is effective at making sure workers on that host are responsive when they get added back to the available pool.

A better solution is to use a profiler to identify what extra work is being done when an unwarm worker gets its first request, and move that work into an initialization step which occurs before fork when run with app preload enabled.
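A rough sketch of the bucket-brigade setup described above; the port base, warmup URL, and timeout are invented for illustration, and the per-worker listener uses the pattern from unicorn's documented after_fork example.

    # config/unicorn.rb -- hypothetical "warmup bucket brigade" sketch
    require "net/http"

    WARMUP_PORT_BASE = 9293

    after_fork do |server, worker|
      # Give each worker a private secondary listener in addition to the shared socket.
      server.listen("127.0.0.1:#{WARMUP_PORT_BASE + worker.nr}", :tries => -1, :delay => 5)

      next if worker.nr == 0   # the first worker has no prior worker to warm up

      # Send a warm-up request to the previously forked worker's private port.
      begin
        http = Net::HTTP.new("127.0.0.1", WARMUP_PORT_BASE + worker.nr - 1)
        http.read_timeout = 60
        http.start { |h| h.get("/") }   # "/" is a placeholder; use a representative URL
      rescue => e
        server.logger.warn("warmup of worker #{worker.nr - 1} failed: #{e.message}")
      end
    end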
On Thu, Nov 29, 2012 at 5:28 PM, Devin Ben-Hur <dbenhur at whitepages.com> wrote:
> A better solution is to use a profiler to identify what extra work is being
> done when an unwarm worker gets its first request and move that work into an
> initialization step which occurs before fork when run with app preload
> enabled.

I've done that; unfortunately, that work is connection setup, which must happen after forking, as otherwise file descriptors would wind up shared between processes.

-- Tony Arcieri
In my experience, high loads and contention are a common issue when restarting the unicorn master process. In a previous project we dealt with this by:

1) performing some warmup requests in the master before starting to fork workers;
2) replacing old workers slowly, by having each new worker send a TTOU to the old master in after_fork and having the new master sleep for a couple of seconds between spawning workers.

It was a couple of years ago, so the details are not fresh, but IIRC before tuning a restart took 5-10 seconds, followed by load climbing to 10-20 (on a 4 proc machine) with a 2-5 minute slow recovery of long request times. In particularly pathological cases requests can start timing out, which results in workers being killed and new workers needing to warm up on an already overloaded system. After tuning, the rolling restart took 30-40 seconds, but the load barely budged and the request processing times stayed constant.

.seth

On Nov 29, 2012, at 5:24 PM, Eric Wong <normalperson at yhbt.net> wrote:
> Tony Arcieri <tony.arcieri at gmail.com> wrote:
>> On Thu, Nov 29, 2012 at 3:34 PM, Lawrence Pit <lawrence.pit at gmail.com> wrote:
>>>
>>> Perhaps it's possible to warm up the workers in the unicorn after_fork block?
>>
>> Are people doing this in production (i.e. moving the termination of
>> the old master from before_fork to after_fork)? My worry is that
>> during this warming process you will have 2X the normal number of
>> Unicorn workers active at the same time, which could potentially lead
>> to exhausting of system resources (i.e. RAM)
>
> I haven't done any terminations in the *_fork hooks for a long time.
> I just let 2x the normal workers run for a bit before sending SIGQUIT.
>
> That said, I usually have plenty of RAM (and DB connections) to spare.
> Excessive CPU-bound loads are handled very well nowadays.
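A rough sketch of the TTOU-in-after_fork approach Seth describes above. This is not his original configuration; the sleep interval is a guess, and the old-master pid-file handling mirrors unicorn's stock example config.

    # config/unicorn.rb -- hypothetical incremental-swap sketch
    before_fork do |server, worker|
      sleep 2 if worker.nr > 0   # runs in the new master: space out worker spawns
    end

    after_fork do |server, worker|
      # ... per-process connection setup / warmup would go here ...

      # Each freshly forked worker retires one of the old master's workers.
      old_pid = "#{server.config[:pid]}.oldbin"
      if old_pid != server.pid && File.exist?(old_pid)
        begin
          Process.kill(:TTOU, File.read(old_pid).to_i)
        rescue Errno::ENOENT, Errno::ESRCH
        end
      end
      # The old master itself still needs a QUIT (e.g. from the deploy script)
      # once all of its workers have been retired.
    end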
On Thu, Nov 29, 2012 at 3:32 PM, Eric Wong <normalperson at yhbt.net> wrote:
> If the server is down to haproxy, but still up, you can send some warmup
> requests to the server before enabling the monitor-uri for haproxy.

I've heard various solutions for exactly how to do warmup in this thread. Anyone have specific recommendations? Should I spin off a background thread that hits the local instance with a request/requests then does the SIGQUIT-style switchover?

-- Tony Arcieri
Tony Arcieri <tony.arcieri at gmail.com> wrote:
> On Thu, Nov 29, 2012 at 3:32 PM, Eric Wong <normalperson at yhbt.net> wrote:
> > If the server is down to haproxy, but still up, you can send some warmup
> > requests to the server before enabling the monitor-uri for haproxy.
>
> I've heard various solutions for exactly how to do warmup in this
> thread. Anyone have specific recommendations? Should I spin off a
> background thread that hits the local instance with a request/requests
> then does the SIGQUIT-style switchover?

I usually put that logic in the deployment script (probably just with "curl -sf"), but a background thread would probably work.

I think it's a good idea anyway to ensure your newly-deployed app is configured and running properly before throwing real traffic at it.
On Fri, Nov 30, 2012 at 2:27 PM, Eric Wong <normalperson at yhbt.net> wrote:
> I usually put that logic in the deployment script (probably just
> with "curl -sf"), but a background thread would probably work.

Are you doing something different than unicornctl restart? It seems like with unicornctl restart:

1) our deployment automation doesn't know when the restart has finished, since unicornctl is just sending signals
2) we don't have any way to send requests specifically to the new worker instead of the old one

Perhaps I'm misreading the unicorn source code, but here's what I see happening:

1) the old unicorn master forks a new master. They share the same TCP listen socket, but only the old master continues accepting requests
2) the new master loads the Rails app and runs the before_fork hook. It seems like normally this hook would send SIGQUIT to the old master, causing it to close its TCP listen socket
3) the new master forks and begins accepting on the TCP listen socket
4) the new workers run the after_fork hook and begin accepting requests

It seems like if we remove the logic which reaps the old master in the before_fork hook and attempt to warm the workers in the after_fork hook, then we're stuck in a state where both the old master and the new master are accepting requests but the new workers have not yet been warmed up. Is this the case, and if so, is there a way we can prevent the new master from accepting requests until warmup is complete?

Or how would we change the way we restart unicorn to support our deployment automation (Capistrano, in this case) handling starting and healthchecking a new set of workers? Would we have to start the new master on a separate port and use e.g. nginx to handle the switchover?

Something which doesn't involve massive changes to the way we presently restart Unicorn (i.e. unicornctl restart) would probably be the most practical solution for us. We have a "real solution" for all of these problems in the works. What I'm looking for in the interim is a band-aid.

-- Tony Arcieri
Tony Arcieri <tony.arcieri at gmail.com> wrote:
> On Fri, Nov 30, 2012 at 2:27 PM, Eric Wong <normalperson at yhbt.net> wrote:
> > I usually put that logic in the deployment script (probably just
> > with "curl -sf"), but a background thread would probably work.
>
> Are you doing something different than unicornctl restart? It seems
> like with unicornctl restart

I'm actually not sure what "unicornctl" is... Is it this? https://gist.github.com/1207003

I normally use a shell script (similar to examples/init.sh in the unicorn source tree).

> 1) our deployment automation doesn't know when the restart has
> finished, since unicornctl is just sending signals
> 2) we don't have any way to send requests specifically to the new
> worker instead of the old one
>
> Perhaps I'm misreading the unicorn source code, but here's what I see happening:
>
> 1) the old unicorn master forks a new master. They share the same TCP
> listen socket, but only the old master continues accepting requests

Correct.

> 2) the new master loads the Rails app and runs the before_fork hook. It
> seems like normally this hook would send SIGQUIT to the old master,
> causing it to close its TCP listen socket

Correct, if you're using preload_app true. Keep in mind you're never required to use the before_fork hook to send SIGQUIT.

> 3) the new master forks and begins accepting on the TCP listen socket

accept() never runs on the master, only in the workers.

> 4) the new workers run the after_fork hook and begin accepting requests

Instead of sending HTTP requests to warm up, can you put internal warmup logic in your after_fork hook? The worker won't accept a request until after_fork is done running.

Hell, maybe you can even use Rack::Mock in your after_fork to fake requests without going through sockets. (Random idea, I've never tried it.)

> It seems like if we remove the logic which reaps the old master in the
> before_fork hook and attempt to warm the workers in the after_fork
> hook, then we're stuck in a state where both the old master and the new
> master are accepting requests but the new workers have not yet been
> warmed up.

Yes, but if you have enough resources, the split should be even.

> Is this the case, and if so, is there a way we can prevent the new
> master from accepting requests until warmup is complete?

If the new processes never accept requests, can they ever complete warmup? :)

> Or how would we change the way we restart unicorn to support our
> deployment automation (Capistrano, in this case) handling starting and
> healthchecking a new set of workers?
>
> Would we have to start the new master on a separate port and use
> e.g. nginx to handle the switchover?

Maybe using a separate port for the new master will work.

> Something which doesn't involve massive changes to the way we
> presently restart Unicorn (i.e. unicornctl restart) would probably be
> the most practical solution for us. We have a "real solution" for all
> of these problems in the works. What I'm looking for in the interim is
> a band-aid.

It sounds like you're really in a bad spot :< Honestly, I've never had this combination of problems to deal with.
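An untested sketch of the Rack::Mock idea above. It assumes preload_app true, that server.app exposes the loaded Rack application, and that "/" and "/login" stand in for whatever endpoints are slow when cold.

    # config/unicorn.rb -- hypothetical socketless warmup via Rack::MockRequest
    require "rack/mock"

    after_fork do |server, worker|
      # Per-process connections first (must not be shared across the fork).
      ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)

      # Drive fake requests through the app before this worker starts accept()ing.
      mock = Rack::MockRequest.new(server.app)
      %w(/ /login).each do |path|
        res = mock.get(path)
        server.logger.warn("warmup #{path} -> #{res.status}") if res.status >= 400
      end
    end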