Stuart Halloway
2007-Feb-26 23:10 UTC
[Mongrel] Apache+mod_proxy_balancer+Mongrel+Mephisto, Apache kills CPU
Our Mephisto install kills Mongrels and causes Apache to pound the CPU. This started when we moved to Apache+mod_proxy_balancer+Mongrel. Here''s what we know: The following things are working OK, except when used in the combination listed above: mongrel, mongrel_rails, MySQL, Apache, mod_proxy_balancer. We believe these are all OK because we moved five other Rails apps to this configuration on the same box, and they are working fine. Here''s what we have figured out: Connecting directly to the mongrel instance works, sort of. It''s S L O W. Clicking on a monthly blog archive link takes about 5 seconds, most of which is spent in the MySQL database. I turned the production.log up to debug, and found that Mephisto (0.7.3) makes dozens of round trips to build the monthly archive page. Here''s the weird part: Mephisto is slow but OK until we put Apache in front of the Mongrels. Then things go sideways. The mongrels become unresponsive, and Apache''s CPU usage spikes. A single unresponsive mongrel thread causes Apache''s utilization to stay about 50% forever, and a few dozen of them peg the processor. Here''s some stuff we have explored: * mod_proxy_balancer vs. ProxyPass to a single Mongrel. Same problem either way. * Monit. Very cool tool for diagnosing the problem, and killing the runaway processes. The problem occurs too frequently to just let monit bounce everything. * Apache server status. The trouble requests look like this: CPU Client Request 0-1 4401 0/5/5 R 30.88 348 0 0.0 0.06 0.06 ? ? ..reading.. Note CPU time is huge. These seem to hang around forever, but the CPU gets pegged long before we run out of threads. * Mongrel -USR1 debugging. Sample thread entry at the bottom of this post, nothing out of the ordinary in the other logs. * Shutdown portion of Mongrel''s main log also shown below. Could this explain the problem? The part that mystifies me: * Why is Mongrel fine without Apache, but not with it? * When the Mongrels start to suffer, why does Apache''s CPU usage peg? I would expect Apache''s threads to have about 0% CPU utilization while waiting on a struggling Mongrel. I am happy to try other experiments anyone might suggest, or to provide more detail from other logs. Thanks, Stuart Halloway www.relevancellc.com [Environment] Mac OS X Mephisto 0.7.3 Most recent version of all gems (nothing on edge) [Excerpt from threads.log] Mon Feb 26 14:37:16 PST 2007 REQUEST /archives/2006/5 0.0.0.0:8111 -- THREADS: 3 ----- KEYS: -- #<Thread:0x35e4030>: [:__inspect_key__, :started_on] -- #<Thread:0x36f8c78>: [:__inspect_key__, :started_on] -- #<Thread:0x35e352c>: [:__inspect_key__, :started_on] [Excerpt from mongrel.log when killing unresponsive mongrels] ** TERM signal received. Mon Feb 26 14:37:06 PST 2007: Reaping 3 threads for slow workers because of ''shutdown'' Waiting for 3 requests to finish, could take 60 seconds.object.log ERROR: undefined method `class'' for #<Liquid::Strainer:0x3132ba4> object.log ERROR: undefined method `class'' for #<Liquid::Strainer: 0x3132ba4> Mon Feb 26 14:37:13 PST 2007: Reaping 3 threads for slow workers because of ''shutdown'' Waiting for 3 requests to finish, could take 60 seconds.object.log ERROR: undefined method `class'' for #<Liquid::Strainer:0x3132ba4> Mon Feb 26 14:37:20 PST 2007: Reaping 3 threads for slow workers because of ''shutdown'' Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26 14:37:30 PST 2007: Reaping 3 threads for slow workers because of ''shutdown'' Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26 14:37:36 PST 2007: Reaping 2 threads for slow workers because of ''shutdown''
Roy Nicholson
2007-Feb-27 03:34 UTC
[Mongrel] Apache+mod_proxy_balancer+Mongrel+Mephisto, Apache kills CPU
Hi Stu, What version of apache httpd is this? 2.2.x? Is httpd serving static content with no issue? Anything strange in your httpd errors log? What does your proxy balancer config look like? ie should be something like: <Proxy balancer://my_mongrels> BalancerMember http://<some_ip>:<some_port> ... </Proxy> Got any rewrite rules defined? What modules are compiled in? (httpd -l) Is there a particular request that causes the issue? or is it the first request with apache that causes everything to ''go sideways''? If you''re really out of options one other thing to try is to hook in a traffic sniffer between apache and mongrel (like TCPmon) and look at the requests/responses between the two. Perhaps mongrel is returning something apache doesn''t like? or vise versa... -Roy Stuart Halloway wrote:> Our Mephisto install kills Mongrels and causes Apache to pound the > CPU. This started when we moved to Apache+mod_proxy_balancer+Mongrel. > > Here''s what we know: > The following things are working OK, except when used in the > combination listed above: mongrel, mongrel_rails, MySQL, Apache, > mod_proxy_balancer. We believe these are all OK because we moved five > other Rails apps to this configuration on the same box, and they are > working fine. > > Here''s what we have figured out: > Connecting directly to the mongrel instance works, sort of. It''s S L > O W. Clicking on a monthly blog archive link takes about 5 seconds, > most of which is spent in the MySQL database. I turned the > production.log up to debug, and found that Mephisto (0.7.3) makes > dozens of round trips to build the monthly archive page. > > Here''s the weird part: > Mephisto is slow but OK until we put Apache in front of the Mongrels. > Then things go sideways. The mongrels become unresponsive, and > Apache''s CPU usage spikes. A single unresponsive mongrel thread > causes Apache''s utilization to stay about 50% forever, and a few > dozen of them peg the processor. > > Here''s some stuff we have explored: > * mod_proxy_balancer vs. ProxyPass to a single Mongrel. Same problem > either way. > * Monit. Very cool tool for diagnosing the problem, and killing the > runaway processes. The problem occurs too frequently to just let > monit bounce everything. > * Apache server status. The trouble requests look like this: > CPU Client Request > 0-1 4401 0/5/5 R 30.88 348 0 0.0 0.06 0.06 ? ? ..reading.. > Note CPU time is huge. These seem to hang around forever, but the CPU > gets pegged long before we run out of threads. > * Mongrel -USR1 debugging. Sample thread entry at the bottom of this > post, nothing out of the ordinary in the other logs. > * Shutdown portion of Mongrel''s main log also shown below. Could this > explain the problem? > > The part that mystifies me: > * Why is Mongrel fine without Apache, but not with it? > * When the Mongrels start to suffer, why does Apache''s CPU usage peg? > I would expect Apache''s threads to have about 0% CPU utilization > while waiting on a struggling Mongrel. > > I am happy to try other experiments anyone might suggest, or to > provide more detail from other logs. > > Thanks, > Stuart Halloway > www.relevancellc.com > > [Environment] > Mac OS X > Mephisto 0.7.3 > Most recent version of all gems (nothing on edge) > > [Excerpt from threads.log] > Mon Feb 26 14:37:16 PST 2007 REQUEST /archives/2006/5 > 0.0.0.0:8111 -- THREADS: 3 ----- > KEYS: > -- #<Thread:0x35e4030>: [:__inspect_key__, :started_on] > -- #<Thread:0x36f8c78>: [:__inspect_key__, :started_on] > -- #<Thread:0x35e352c>: [:__inspect_key__, :started_on] > > [Excerpt from mongrel.log when killing unresponsive mongrels] > ** TERM signal received. > Mon Feb 26 14:37:06 PST 2007: Reaping 3 threads for slow workers > because of ''shutdown'' > Waiting for 3 requests to finish, could take 60 seconds.object.log > ERROR: undefined method `class'' for #<Liquid::Strainer:0x3132ba4> > object.log ERROR: undefined method `class'' for #<Liquid::Strainer: > 0x3132ba4> > Mon Feb 26 14:37:13 PST 2007: Reaping 3 threads for slow workers > because of ''shutdown'' > Waiting for 3 requests to finish, could take 60 seconds.object.log > ERROR: undefined method `class'' for #<Liquid::Strainer:0x3132ba4> > Mon Feb 26 14:37:20 PST 2007: Reaping 3 threads for slow workers > because of ''shutdown'' > Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26 > 14:37:30 PST 2007: Reaping 3 threads for slow workers because of > ''shutdown'' > Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26 > 14:37:36 PST 2007: Reaping 2 threads for slow workers because of > ''shutdown'' > > > > _______________________________________________ > Mongrel-users mailing list > Mongrel-users at rubyforge.org > http://rubyforge.org/mailman/listinfo/mongrel-users >
Stuart Halloway
2007-Feb-27 14:58 UTC
[Mongrel] Apache+mod_proxy_balancer+Mongrel+Mephisto, Apache kills CPU
Hi Roy, Responses inline:> What version of apache httpd is this? 2.2.x?2.2.0> Is httpd serving static content with no issue?Yes. In fact the site is mostly static so I am leaving httpd up and the mongrels down (http://www.relevancellc.com)> Anything strange in your httpd errors log?Sometimes getting a batch of these on restart: [Mon Feb 26 13:03:28 2007] [error] child process 6060 still did not exit, sending a SIGKILL> What does your proxy balancer config look like?My proxy balancer config looks just like that. I have also tried just ProxyPass. Note that there are two other Rails apps working just fine behind similar configurations on the same Apache instance.> Got any rewrite rules defined?RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f # RewriteRule ^/(.*)$ balancer://mephisto_cluster%{REQUEST_URI} [P,QSA,L] RewriteRule .* http://127.0.0.1:8110%{REQUEST_URI} [L,P,QSA] Either of the balancer or the pass lead to the same problem.> What modules are compiled in? (httpd -l)What''s the right answer to this question supposed to be? :-) I have # bin/httpd -l Compiled in modules: core.c worker.c http_core.c mod_so.c> Is there a particular request that causes the issue? or is it the > first > request with apache that causes everything to ''go sideways''?It is not the first request, but it happens fairly quickly under modest load.> If you''re really out of options one other thing to try is to hook in a > traffic sniffer between apache and mongrel (like TCPmon) and look > at the > requests/responses between the two. Perhaps mongrel is returning > something apache doesn''t like? or vise versa...Good idea. I''ll have to do something console-based since I don''t have GUI access to the server. I''ll try this afternoon. Is there some way to turn on a promiscuous mode in either Apache or Mongrel that would just dump all the traffic? That might be easier.> > -Roy > > Stuart Halloway wrote: >> Our Mephisto install kills Mongrels and causes Apache to pound the >> CPU. This started when we moved to Apache+mod_proxy_balancer+Mongrel. >> >> Here''s what we know: >> The following things are working OK, except when used in the >> combination listed above: mongrel, mongrel_rails, MySQL, Apache, >> mod_proxy_balancer. We believe these are all OK because we moved five >> other Rails apps to this configuration on the same box, and they are >> working fine. >> >> Here''s what we have figured out: >> Connecting directly to the mongrel instance works, sort of. It''s S L >> O W. Clicking on a monthly blog archive link takes about 5 seconds, >> most of which is spent in the MySQL database. I turned the >> production.log up to debug, and found that Mephisto (0.7.3) makes >> dozens of round trips to build the monthly archive page. >> >> Here''s the weird part: >> Mephisto is slow but OK until we put Apache in front of the Mongrels. >> Then things go sideways. The mongrels become unresponsive, and >> Apache''s CPU usage spikes. A single unresponsive mongrel thread >> causes Apache''s utilization to stay about 50% forever, and a few >> dozen of them peg the processor. >> >> Here''s some stuff we have explored: >> * mod_proxy_balancer vs. ProxyPass to a single Mongrel. Same problem >> either way. >> * Monit. Very cool tool for diagnosing the problem, and killing the >> runaway processes. The problem occurs too frequently to just let >> monit bounce everything. >> * Apache server status. The trouble requests look like this: >> CPU Client Request >> 0-1 4401 0/5/5 R 30.88 348 0 0.0 0.06 0.06 ? ? ..reading.. >> Note CPU time is huge. These seem to hang around forever, but the CPU >> gets pegged long before we run out of threads. >> * Mongrel -USR1 debugging. Sample thread entry at the bottom of this >> post, nothing out of the ordinary in the other logs. >> * Shutdown portion of Mongrel''s main log also shown below. Could this >> explain the problem? >> >> The part that mystifies me: >> * Why is Mongrel fine without Apache, but not with it? >> * When the Mongrels start to suffer, why does Apache''s CPU usage peg? >> I would expect Apache''s threads to have about 0% CPU utilization >> while waiting on a struggling Mongrel. >> >> I am happy to try other experiments anyone might suggest, or to >> provide more detail from other logs. >> >> Thanks, >> Stuart Halloway >> www.relevancellc.com >> >> [Environment] >> Mac OS X >> Mephisto 0.7.3 >> Most recent version of all gems (nothing on edge) >> >> [Excerpt from threads.log] >> Mon Feb 26 14:37:16 PST 2007 REQUEST /archives/2006/5 >> 0.0.0.0:8111 -- THREADS: 3 ----- >> KEYS: >> -- #<Thread:0x35e4030>: >> [:__inspect_key__, :started_on] >> -- #<Thread:0x36f8c78>: >> [:__inspect_key__, :started_on] >> -- #<Thread:0x35e352c>: >> [:__inspect_key__, :started_on] >> >> [Excerpt from mongrel.log when killing unresponsive mongrels] >> ** TERM signal received. >> Mon Feb 26 14:37:06 PST 2007: Reaping 3 threads for slow workers >> because of ''shutdown'' >> Waiting for 3 requests to finish, could take 60 seconds.object.log >> ERROR: undefined method `class'' for #<Liquid::Strainer:0x3132ba4> >> object.log ERROR: undefined method `class'' for #<Liquid::Strainer: >> 0x3132ba4> >> Mon Feb 26 14:37:13 PST 2007: Reaping 3 threads for slow workers >> because of ''shutdown'' >> Waiting for 3 requests to finish, could take 60 seconds.object.log >> ERROR: undefined method `class'' for #<Liquid::Strainer:0x3132ba4> >> Mon Feb 26 14:37:20 PST 2007: Reaping 3 threads for slow workers >> because of ''shutdown'' >> Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26 >> 14:37:30 PST 2007: Reaping 3 threads for slow workers because of >> ''shutdown'' >> Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26 >> 14:37:36 PST 2007: Reaping 2 threads for slow workers because of >> ''shutdown'' >> >>