Ben Osheroff
2007-Feb-15 01:21 UTC
[Mongrel] mongrel process stopped listening but "phantom thread" still going
Hi,

I run a medium-sized website that uses mongrel/rails in the following configuration:

- frontend reverse proxy to a cluster of 7 app servers
- each app server runs apache 2.2 with mod_proxy_balancer that balances the requests out into a mongrel cluster of 35 servers.
- each mongrel is version 1.0.1

The mongrel cluster config looks like:

    port: "1620"
    environment: production
    address: 127.0.0.1
    pid_file: log/mongrel.pid
    servers: 35

The problem is, when the mongrel processes are sent a USR2 signal, some of them seem to have threads running that are waiting for data on a socket that's been disconnected from the apache balancer long ago.

Process 27457 has been sent a USR2 signal, and is waiting for its thread to die:

    [xxx at app04 config]$ /usr/sbin/lsof | grep 27457
    [snip]
    ruby 27457 ilike 28u IPv4 214673805 TCP *:49242 (LISTEN)
    ruby 27457 ilike 29u IPv4 214673810 TCP app04:49244->10.1.2.10:mysql (ESTABLISHED)
    [snip]

Here's the strace of what it's actually doing:

    [ilike at app04 config]$ strace -p 27457
    Process 27457 attached - interrupt to quit
    select(29, [28], [], [], {0, 137000}) = 0 (Timeout)
    gettimeofday({1171499150, 914277}, NULL) = 0
    select(29, [28], [], [], {0, 175}) = 0 (Timeout)
    gettimeofday({1171499150, 915224}, NULL) = 0
    select(29, [28], [], [], {0, 0}) = 0 (Timeout)
    time(NULL) = 1171499150
    gettimeofday({1171499150, 915386}, NULL) = 0
    stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0
    write(2, "Wed Feb 14 16:25:50 PST 2007: Re"..., 86) = 86
    write(2, "\n", 1) = 1
    gettimeofday({1171499150, 915596}, NULL) = 0
    write(2, "Waiting for 1 requests to finish"..., 58) = 58

Port 49242 is an irrelevant port - the mongrels listen on ports 1620-1655 (the mongrel in question has closed out its main port).

It seems that the mongrel is waiting for data on a port that has become disconnected from the mod_proxy_balancer.

I could probably provide an end-to-end strace if needed, although picking out the relevant bits could be tough.

-ben
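The symptom described above (a shutdown that keeps "Waiting for 1 requests to finish" because one thread is blocked on a read that will never complete) can be reproduced in miniature. The following is a minimal sketch in plain Ruby, not Mongrel's actual code: `reap_workers` is a hypothetical helper, the 0.2 second grace period is arbitrary, and `UNIXSocket.pair` (Unix-like systems only) stands in for a connection whose peer has gone silent. It shows one way to bound the wait by joining each thread with a deadline and killing whatever is still stuck.

```ruby
require "socket"

# Hypothetical helper (not part of Mongrel): give all worker threads a shared
# grace period, then kill any thread that is still running.
# Returns the list of threads that had to be killed.
def reap_workers(workers, grace)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + grace
  stuck = workers.reject do |t|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Thread#join(limit) returns the thread if it finished, nil on timeout.
    t.join([remaining, 0].max)
  end
  stuck.each { |t| t.kill; t.join }
  stuck
end

# Simulate a peer that connected and then never sends or closes.
client, server_side = UNIXSocket.pair
phantom = Thread.new { server_side.read(1) } # blocks indefinitely
quick   = Thread.new { :done }               # finishes immediately

killed = reap_workers([phantom, quick], 0.2)
```

Here `killed` contains only the blocked thread, while the thread that finished on its own is left alone. Killing a thread mid-request can leave application state inconsistent, so this is an illustration of the failure mode rather than a recommended patch.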
Zed A. Shaw
2007-Feb-16 09:04 UTC
[Mongrel] mongrel process stopped listening but "phantom thread" still going
On Wed, 14 Feb 2007 17:21:22 -0800
Ben Osheroff <ben at gimbo.net> wrote:

> Hi,
>
> I run a medium-sized website that uses mongrel/rails in the following
> configuration:
>
> The problem is, when the mongrel processes are sent a USR2 signal, some
> of them seem to have threads running, that are waiting for data on a
> socket that's been disconnected from the apache balancer long ago.

USR2 restarting isn't reliable for many reasons outside of Mongrel. You should just do a full stop and full restart.

--
Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.zedshaw.com/
http://www.awprofessional.com/title/0321483502 -- The Mongrel Book
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.