Ben Osheroff
2007-Feb-15  01:21 UTC
[Mongrel] mongrel process stopped listening but "phantom thread" still going
Hi,
I run a medium-sized website that uses mongrel/rails in the following
configuration:
-frontend reverse proxy to a cluster of 7 app servers
-each app server runs apache 2.2 with mod_proxy_balancer, which balances
the requests across a mongrel cluster of 35 servers.
-each mongrel is version 1.0.1
the mongrel cluster config looks like:
port: "1620"
environment: production
address: 127.0.0.1
pid_file: log/mongrel.pid
servers: 35
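As a rough illustration of what this config implies, the sketch below lists the ports such a cluster would listen on. It assumes mongrel_cluster assigns consecutive ports starting at `port`, one per server; `cluster_ports` is a hypothetical helper name, not part of Mongrel.

```ruby
require "yaml"

# The cluster config from the post, inlined so the sketch is self-contained.
CLUSTER_CONFIG = <<~YAML_TEXT
  port: "1620"
  environment: production
  address: 127.0.0.1
  pid_file: log/mongrel.pid
  servers: 35
YAML_TEXT

# Hypothetical helper: assumes one consecutive port per server,
# starting at the configured base port.
def cluster_ports(config)
  base = config.fetch("port").to_i
  count = config.fetch("servers").to_i
  (0...count).map { |i| base + i }
end

config = YAML.safe_load(CLUSTER_CONFIG)
ports = cluster_ports(config)
puts "#{ports.size} mongrels on ports #{ports.first}-#{ports.last}"
# prints "35 mongrels on ports 1620-1654"
```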
The problem is that when the mongrel processes are sent a USR2 signal,
some of them seem to have threads still running that are waiting for
data on a socket that was disconnected from the apache balancer long ago.
For example, process 27457 has been sent a USR2 signal and is waiting
for its thread to die:
[xxx at app04 config]$ /usr/sbin/lsof | grep 27457
[snip]
ruby      27457  ilike   28u     IPv4 214673805                   TCP *:49242 (LISTEN)
ruby      27457  ilike   29u     IPv4 214673810                   TCP app04:49244->10.1.2.10:mysql (ESTABLISHED)
[snip]
Here's the strace of what it's actually doing:
[ilike at app04 config]$ strace -p 27457
Process 27457 attached - interrupt to quit
select(29, [28], [], [], {0, 137000})   = 0 (Timeout)
gettimeofday({1171499150, 914277}, NULL) = 0
select(29, [28], [], [], {0, 175})      = 0 (Timeout)
gettimeofday({1171499150, 915224}, NULL) = 0
select(29, [28], [], [], {0, 0})        = 0 (Timeout)
time(NULL)                              = 1171499150
gettimeofday({1171499150, 915386}, NULL) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0
write(2, "Wed Feb 14 16:25:50 PST 2007: Re"..., 86) = 86
write(2, "\n", 1)                       = 1
gettimeofday({1171499150, 915596}, NULL) = 0
write(2, "Waiting for 1 requests to finish"..., 58) = 58
Port 49242 is not one of the cluster's ports - the mongrels listen on
ports 1620-1654 (35 servers starting at 1620), and the mongrel in
question has closed out its main port.  It seems that the mongrel is
waiting for data on a port that has become disconnected from the
mod_proxy_balancer.
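The check described above (which listening sockets does the process still hold, and are any of them outside the cluster's ports?) could be sketched roughly as follows. It parses the two lsof lines quoted earlier, with the wrapped lines rejoined. The expected range assumes 35 consecutive ports starting at 1620, and `listening_ports` is a hypothetical helper, not an lsof or Mongrel API.

```ruby
# lsof lines copied from the post (wrapped lines rejoined).
LSOF_SAMPLE = <<~OUT
  ruby      27457  ilike   28u     IPv4 214673805                   TCP *:49242 (LISTEN)
  ruby      27457  ilike   29u     IPv4 214673810                   TCP app04:49244->10.1.2.10:mysql (ESTABLISHED)
OUT

# Assumption: the cluster uses 35 consecutive ports starting at 1620.
EXPECTED_PORTS = (1620...1620 + 35)

# Hypothetical helper: returns the TCP ports that the given pid
# holds in LISTEN state, according to the lsof text.
def listening_ports(lsof_text, pid)
  lsof_text.each_line.map do |line|
    fields = line.split
    next unless fields[1] == pid.to_s && fields.last == "(LISTEN)"
    # The NAME column looks like "*:49242"; take the digits after the last colon.
    fields[-2][/:(\d+)\z/, 1]&.to_i
  end.compact
end

listeners = listening_ports(LSOF_SAMPLE, 27457)
unexpected = listeners.reject { |port| EXPECTED_PORTS.include?(port) }
puts "listening: #{listeners.inspect}, outside cluster range: #{unexpected.inspect}"
```

On a live box the sample text would be replaced by real lsof output for the process; the parsing above only handles lines shaped like the ones in the post.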
I could probably provide an end-to-end strace if needed, although
picking out the relevant bits could be tough.
-ben
Zed A. Shaw
2007-Feb-16  09:04 UTC
[Mongrel] mongrel process stopped listening but "phantom thread" still going
On Wed, 14 Feb 2007 17:21:22 -0800 Ben Osheroff <ben at gimbo.net> wrote:
> Hi,
>
> I run a medium-sized website that uses mongrel/rails in the following
> configuration:
>
> The problem is, when the mongrel processes are sent a USR2 signal, some
> of them seem to have threads running, that are waiting for data on a
> socket that's been disconnected from the apache balancer long ago.
USR2 restarting isn't reliable for many reasons outside of Mongrel. You
should just do a full stop and full restart.
--
Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.zedshaw.com/
http://www.awprofessional.com/title/0321483502 -- The Mongrel Book
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
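One way the "full stop and full restart" advice might be scripted is sketched below. It is only a sketch under assumptions: it shells out to the mongrel_cluster commands `mongrel_rails cluster::stop` and `cluster::start` with a config path passed via `-C`, and the helper names, the timeout, and the choice to refuse to start while old processes linger are illustrative, not something the thread prescribes.

```ruby
# Returns true if a process with this pid currently exists.
def alive?(pid)
  Process.kill(0, pid) # signal 0 only checks for existence
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # exists, but owned by another user
end

# Polls until every pid has exited or the timeout passes.
# Returns the pids that are still alive.
def wait_for_exit(pids, timeout: 30, interval: 0.5)
  deadline = Time.now + timeout
  remaining = pids.select { |pid| alive?(pid) }
  until remaining.empty? || Time.now >= deadline
    sleep interval
    remaining = remaining.select { |pid| alive?(pid) }
  end
  remaining
end

# Hypothetical wrapper: stop the cluster, wait for the old processes
# to go away, then start it again. Raises rather than starting on top
# of processes that never exited.
def full_restart(config_path, old_pids)
  system("mongrel_rails", "cluster::stop", "-C", config_path) or raise "cluster::stop failed"
  stuck = wait_for_exit(old_pids)
  raise "still running after stop: #{stuck.inspect}" unless stuck.empty?
  system("mongrel_rails", "cluster::start", "-C", config_path) or raise "cluster::start failed"
end

# Example call (not run here; path and pid list are placeholders):
#   full_restart("config/mongrel_cluster.yml", [27457])
```

Raising when processes are stuck leaves the decision about killing them to the operator, which seems safer than sending KILL automatically.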