thr3ads.net - Rails - Rails app going nuts at 100% cpu [Jul 2009]

If this information is useful, please help other people find it:
Share via:

Fernando Perez

2009-Jul-30 00:05 UTC

Rails app going nuts at 100% cpu

Hi,

My rails app has been growing in LOC, everything was running fine, until
someday (one or two weeks ago) where I pushed an update to my server:
after a random period of time, my ruby processes eat 100% of the cpu,
and the app becomes unresponsive. The problem is that I am unable to
tell which update started giving troubles.

$ netstat -anp shows connections not being properly closed between my
rails process and postgresql database, the rails app certainly is
hanging there.

I have yet been unable to identify the source of the problem even after:
- reinstalling on a fresh operating system (debian lenny)
- switching from connecting to postgresql through remote tcp to local
unix sockets
- updating nginx
- updating Rails and other gems
- updating plugins, and removing some that are not so useful
- moving from Thin instances to Nginx+Passenger
- removing suspicious and most recent lines of code that could be the
problem

Everything works fine on my dev machine. On the production server, after
a random amount of time, it suddenly goes crazy. It''s terribly painful
to hunt down and I don''t see any new potential areas to investigate.

Recently I have been seeing a new error message from time to time but
which disappears on the next request:
--
A copy of XX has been removed from the module tree but is still active!
--
Could that be related to some memory leak that will eventually lock a
rails process at 100% cpu after some time?


Has anyone had any troubles like this? Does anyone have an idea where
the problem could come from? How to tackle the problem?

As it''s random, I can make modifications then after 6 hours be happy
thinking that it all works, then 10 minutes later it fails...


Best regards,
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Jul-30 00:41 UTC

head link

Re: Rails app going nuts at 100% cpu

Houston we have a problem!

I wanted to stress test my website in order to make it fail faster. But 
using ab (apache bench) I get the following error:

$ ab http://www.digiprof.fr/
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, 
http://www.zeustech.net/
Copyright (c) 2006 The Apache Software Foundation, 
http://www.apache.org/

Benchmarking www.digiprof.fr (be patient)...
Test aborted after 10 failures

: Operation now in progress
--
 Then in quits.

But if I look in my log file on the server, I have 12 requests that 
appear from my IP address!!! ab by default should only make 1 request, 
is it Rails receiving the request but not sending it back and ab 
requesting again 11 more times? But if I check my website with Firefox 
or with curl it works perfectly.

What in hell could be happening?
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Jul-30 00:47 UTC

head link

Re: Rails app going nuts at 100% cpu

Hang on tight, because there is a good one:

Instead of Nginx+Passenger on my server, I loaded my rails app using 
good ole'' webrick, and now ab works and can stress test the site!?
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Jul-30 00:50 UTC

head link

Re: Rails app going nuts at 100% cpu

My munin graph show that when the rails app goes out of control, there 
is a raise in the number of  interrupt and context switches.
-- 
Posted via http://www.ruby-forum.com/.

bill walton

2009-Jul-30 02:14 UTC

head link

Re: Rails app going nuts at 100% cpu

Hi Fernandez,

On Thu, 2009-07-30 at 02:50 +0200, Fernando Perez wrote:> My munin graph show that when the rails app goes out of control, there 
> is a raise in the number of  interrupt and context switches.
Excellent investigation and evidence.  I hope you continue, and then do
something with it (i.e., a bug report to the appropriate forum).

Best regards,
Bill

Phlip

2009-Jul-30 02:39 UTC

head link

Re: Rails app going nuts at 100% cpu

Fernando Perez wrote:
> My munin graph show that when the rails app goes out of control, there
> is a raise in the number of  interrupt and context switches.
When you say 100%, do you mean the usage goes up and then bounces around, 
say between 95% and 100%?

Or do you mean it flatlines at exactly 100%, with no bouncing?

The former means an infinite loop that accesses some IO resource, such as 
the wire or the database. You could also have some kind of endless 
conversation, where event A (such as an Ajax hit) triggers event B (such as 
a page refresh), which triggers A again.

The latter means you have a simple infinite loop that is busy doing only 
Ruby statements, such as "nil while true".

How are your unit tests doing? Do they cover all this logic, so they might 
show a similar loop or dead spot?

Can you "comment out" entire blocks of your app, such as entire
controller
actions, and then run the app and see if the problem goes away? If it does, 
the problem is in the last action you clobbered, so put it back in and then 
clobber half of it. Keep clobbering until you find the region of coding 
doing it.

All generic techniques - no I don''t know the difference here between
Webrick
and Passenger - but they generally can''t hurt!

-- 
  Phlip

Marnen Laibow-Koser

2009-Jul-30 06:48 UTC

head link

Re: Rails app going nuts at 100% cpu

Phlip wrote:
[...]> All generic techniques - no I don''t know the difference here
between
> Webrick
> and Passenger - but they generally can''t hurt!
Total shot in the dark: resource acquisition deadlock?  I seem to recall 
that Passenger runs a cluter of Mongrels, while I assume you''re only 
running one Webrick process at a time...

I could be totally off base here, though.
> 
> --
>   Phlip
Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Jul-30 09:03 UTC

head link

Re: Rails app going nuts at 100% cpu

> Excellent investigation and evidence.  I hope you continue, and then do
> something with it (i.e., a bug report to the appropriate forum).I will report any progress here.

> When you say 100%, do you mean the usage goes up and then bounces 
> around, say between 95% and 100%?
> Or do you mean it flatlines at exactly 100%, with no bouncing?$ top shows that it''s locked at 100% From time to time anew ruby1.8 
process pops in then goes away, but all locked ruby1.8 processes are at 
16.7 to 17% in cpu usage and 8.4 to 8.5 memory usage.
> How are your unit tests doing? Do they cover all this logic, so they 
> might show a similar loop or dead spot?Nearly 99% of the code is covered and they all pass.
> Can you "comment out" entire blocks of your appYes that''s what I am doing right now. I have commented out 100% of my 
models code, unloaded a few not frequently updated plugins, and 
commented out all model calls in my controllers, so only empty pages 
will get returned.

I''ll see how it works out. The problem is that the last failure
happened
after 8 hours of working perfectly, so I can''t tell if it worked or not
unless I wait at least that amount of time if it doesn''t fail before.
> Total shot in the dark: resource acquisition deadlock?  I seem to recall 
> that Passenger runs a cluter of MongrelsI switched from Thin, to pure mongrel to Passenger, and they all failed 
pretty much the same way, but I''ll keep that in mind just in case, and 
try with one Webrick running alone.


Thank you all for your assistance, I need new eyes on this problem as I 
am sure I''m not looking in the correct place.
-- 
Posted via http://www.ruby-forum.com/.

Zach Karpinski

2009-Jul-30 15:03 UTC

head link

Re: Rails app going nuts at 100% cpu

Passenger uses mongrel?  Is that really true?  I have no idea but on
the modrails.com site it lists Passenger as being faster than mongrel
while running a couple apps.  I''d almost have to say this
isn''t true,
if not because of the performance difference,  then because it shares
none of the configuration settings of Mongrel.  If those files are
sitting around they are cleverly hidden.

On Jul 30, 1:48 am, Marnen Laibow-Koser <rails-mailing-l...@andreas-
s.net> wrote:> Total shot in the dark: resource acquisition deadlock?  I seem to recall
> that Passenger runs a cluter of Mongrels, while I assume you''re
only
> running one Webrick process at a time...
>
> I could be totally off base here, though.

Marnen Laibow-Koser

2009-Jul-30 15:16 UTC

head link

Re: Rails app going nuts at 100% cpu

Zach Karpinski wrote:> Passenger uses mongrel?  Is that really true?  
I thought I had read that Passenger used Mongrel internally, but on 
reading the description of the architecture, I think I may have been 
mistaken.

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Jul-31 08:10 UTC

head link

Re: Rails app going nuts at 100% cpu

Commenting out almost all my application yielded a 10hr run without any 
problem, so I added in the latest pieces of code I thought the bug(s) 
would be in, and still I got 10hr continuous run without any problem!

Now I am adding back the code little by little, I have no idea where the 
bug is, but I''m sure I''ll be very surprised once I nail it!
The worse
thing is that I''m probably looking at it right now and I don''t
know yet
it''s him.
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-01 18:01 UTC

head link

Re: Rails app going nuts at 100% cpu

I''m nailing it down more and more. I have added back most the 
functionnality of my website, and it still hasn''t failed, so I 
definitely have not a single clue where that bug is hidding! All the 
pieces of code I was suspecting have been re-enabled and they haven''t 
failed on me.

I''ll keep you updated.
-- 
Posted via http://www.ruby-forum.com/.

Phlip

2009-Aug-02 07:39 UTC

head link

Re: Rails app going nuts at 100% cpu

Fernando Perez wrote:> I''m nailing it down more and more.
We all forgot to ask...

  - how many CPUs on your motherboard?
  - can you reproduce on another machine?
    - how many CPUs on it?

(Sometimes a threaded application can abuse semaphores and turn an SMP into a 
single-threaded multi-CPU process...)

-- 
   Phlip

Fernando Perez

2009-Aug-02 08:58 UTC

head link

Re: Rails app going nuts at 100% cpu

Hi Philip,

Here is the cpuinfo of the server: (edited for brevity)
--
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping        : 5
cpu MHz         : 2999.964
cache size      : 2048 KB
cpu cores       : 1
--

It''s a P4 with Hyper Thread. The only other machine I have access to is
a PowerPC, and I have never had any problem at all on it, but as I said, 
the rails process goes out of control after 2 to 8 hours. Before that 
time lapse everything works perfectly fine. I guess there is somewhere a 
piece of code that aggregates objects and grows infinitely.

Now 100% of my own written code has been re-enabled and everything has 
been stable for the last 3 hours but I still have to wait. For the last 
3 days everything has been working since I have been reactivating the 
application bit by bit. The only things I have left out are a few 
plugins, I''ll gradually let them back in.

For some mysterious reason, I can''t use apache bench to hammer my
server
and accelerate the failure, it keeps getting errors although the app is 
reachable with FF or curl.
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-03 15:13 UTC

head link

Re: Rails app going nuts at 100% cpu

I have reactivated my whole application, except rcov_plugin for which I 
had not installed the gem on the production server. I don''t if
that''s
the cause, but now my app has been running flawlessly as if nothing ever 
happenned...
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-03 20:59 UTC

head link

Re: Rails app going nuts at 100% cpu

Yeah buddy!!! Lightweight!

It finally failed 7 hours and 40 minutes after my last update! I''m 
feeling lucky, because it''s a tie break between 2 things: acts_as_list 
plugin (which might be clashing with acts_as_list) and my sitemap 
generator.

This week long investigation and hunt will finally come to an end. Stay 
tuned for tomorrow''s last episode :D
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-03 21:00 UTC

head link

Re: Rails app going nuts at 100% cpu

Sorry, should read: acts_as_tree might be the trouble maker.
-- 
Posted via http://www.ruby-forum.com/.

Marnen Laibow-Koser

2009-Aug-03 22:31 UTC

head link

Re: Rails app going nuts at 100% cpu

Fernando Perez wrote:> Sorry, should read: acts_as_tree might be the trouble maker.
<soapbox>
You probably want awesome_nested_set anyway.
</soapbox>

Good luck with the rest of the tracking.  I''m not sure I''d
have the
patience for all the stuff you''ve done!

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-04 08:46 UTC

head link

Re: Rails app going nuts at 100% cpu

Nailed! It was acts_as_tree the problem. He has been sentenced to an 
unlimited time ban from my app.

Marnen Laibow-Koser wrote:> <soapbox>
> You probably want awesome_nested_set anyway.
> </soapbox>At the time I was reviewing various options, I should have taken 
awesome_blabla instead of acts_as_messy_tree. Damn!
> I''m not sure I''d have the patience for all the stuff
you''ve done!I have no other choice, as this rails app is my business ;-)

So remember, acts_as_tree doesn''t seem to play nicely with other stuff 
(acts_as_list?), so be careful. I hope this thread wil lsave other 
people days of work.

Best regards,
-- 
Posted via http://www.ruby-forum.com/.

Colin Law

2009-Aug-04 09:41 UTC

head link

Re: Rails app going nuts at 100% cpu

2009/8/4 Fernando Perez
<rails-mailing-list-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org>:>
> Nailed! It was acts_as_tree the problem. He has been sentenced to an
> unlimited time ban from my app.
That seems surprising as there is so little in acts_as_tree.  Unless
it is an interaction with something else that you have.  Is there a
possibility you could be setting up a loop in the tree somehow?  I
don''t know whether that would give the symptom described.  If there is
a flaw in your design that allows a loop to be setup you could get the
same issue with another acts_as..

Colin
>
> Marnen Laibow-Koser wrote:
>> <soapbox>
>> You probably want awesome_nested_set anyway.
>> </soapbox>
> At the time I was reviewing various options, I should have taken
> awesome_blabla instead of acts_as_messy_tree. Damn!
>
>> I''m not sure I''d have the patience for all the stuff
you''ve done!
> I have no other choice, as this rails app is my business ;-)
>
> So remember, acts_as_tree doesn''t seem to play nicely with other
stuff
> (acts_as_list?), so be careful. I hope this thread wil lsave other
> people days of work.
>
>
> Best regards,
> --
> Posted via http://www.ruby-forum.com/.
>
> >
>

Fernando Perez

2009-Aug-06 21:09 UTC

head link

Re: Rails app going nuts at 100% cpu

Hi, my app got fucked again!

So it might not be acts_as_tree''s fault. Reading my log book and my 
repository commits, I am now having high doubts about rcov_plugin: even 
though it''s not loaded in production.rb, it''s still sitting in
vendor/plugins and it might be causing a problem.

I''ll report tomorrow morning.
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-07 07:58 UTC

head link

Re: Rails app going nuts at 100% cpu

Damn it failed again! So I have to restart the hunt from the scratch, I 
might have not waited long enough to let it fail...
-- 
Posted via http://www.ruby-forum.com/.

Roger Pack

2009-Aug-08 16:08 UTC

head link

Re: Rails app going nuts at 100% cpu

Fernando Perez wrote:> Damn it failed again! So I have to restart the hunt from the scratch, I 
> might have not waited long enough to let it fail...
Some things to help with a frozen app:

strace
http://eigenclass.org/hiki/ruby+live+process+introspection
http://github.com/rogerdpack/live-console/tree/master
http://www.ruby-forum.com/topic/191210
GL!
-r
-- 
Posted via http://www.ruby-forum.com/.

Roger Pack

2009-Aug-08 16:13 UTC

head link

Re: Rails app going nuts at 100% cpu

> A copy of XX has been removed from the module tree but is still active!
Usually [for me] that means "you''re running in development node
not
production"


> Could that be related to some memory leak that will eventually lock a
> rails process at 100% cpu after some time?
might be related to
http://redmine.ruby-lang.org/issues/show/1484
-- 
Posted via http://www.ruby-forum.com/.

Fernando Perez

2009-Aug-21 07:49 UTC

head link

Re: Rails app going nuts at 100% cpu

My app is still failing. I installed the MemoryLogic plugin to check the 
memory usage after each request, but it''s stable.

I noted that it''s a single request that locks the app as it
doesn''t get
printed in the production.log file. Is it possible to debug a live 
running Rails app? I looked at LiveConsole but it''s only for ruby apps 
not rails.
-- 
Posted via http://www.ruby-forum.com/.

bill walton

2009-Aug-21 09:22 UTC

head link

Re: Rails app going nuts at 100% cpu

Hi Fernando,

On Fri, 2009-08-21 at 09:49 +0200, Fernando Perez wrote:> My app is still failing. I installed the MemoryLogic plugin to check the 
> memory usage after each request, but it''s stable.
> 
> I noted that it''s a single request that locks the app as it
doesn''t get
> printed in the production.log file. Is it possible to debug a live 
> running Rails app? I looked at LiveConsole but it''s only for ruby
apps
> not rails.
Sorry to hear you haven''t gotten this sorted out yet.

By way of alternatives, here''s a potentially productive debugging
approach.

Put production logging into debug mode by uncommenting the following
line in environment.rb:

# config.log_level = :debug

Your logs will be much bigger so you''ll want to clear them on a daily
basis.  But do not just delete them.  Save them somewhere first.

You say "it doesn''t get printed in the production.log file". 
That''s ok.
My guess is that the problem you''re having is almost surely
state-related.  The next time your app fails, walk the log backwards to
figure out what state your app was in just prior to the failure.  It''s
not what we''re shooting for, but in general most Rails apps
aren''t
stateless.  You can narrow the list of requests that *might* be the
problem by figuring out what state the app was in prior to the
failure.  

You''re in production mode so I assume you''ve got multiple
users on the
system.  That means you''ll have to sort them out.  Write a little
script
to do that by IP.  Then start doing your analysis.  Do it on *both* sets
of logs: ones where the app failed, but *especially* the ones where the
app did *not* fail.  Your analysis needs to ask: "what''s the
difference?"  What was each user doing just before the problem.  Were
all the users in one part of the app?  Or did one just do something new
while all the rest were doing X and Y ?  

Debugging a running production app is not impossible, but it''s damn
close to impossible IME.  The best I can suggest is start gathering as
much info as possible and doing some heavy analysis to narrow down the
possibilities.

Best of luck,
Bill

Fernando Perez

2009-Aug-21 15:22 UTC

head link

Re: Rails app going nuts at 100% cpu

Hi,

I finally solved my problem. It was a regular expression that failed on 
a specific URL which locked the process! I never thought such thing 
could happen!

Thank you all for your support and debugging ideas.


Best regards,
-- 
Posted via http://www.ruby-forum.com/.

Roger Pack

2009-Aug-21 22:46 UTC

head link

Re: Rails app going nuts at 100% cpu

Fernando Perez wrote:> Hi,
> 
> I finally solved my problem. It was a regular expression that failed on 
> a specific URL which locked the process! I never thought such thing 
> could happen!
What was the exact regex and ruby version and example?
Thanks!
-r
-- 
Posted via http://www.ruby-forum.com/.

pepe

2009-Aug-23 02:07 UTC

head link

Re: Rails app going nuts at 100% cpu

I''ve been reading all along but had nothing to offer to help.
I''m very
glad you got it solved. :)

On Aug 21, 11:22 am, Fernando Perez
<rails-mailing-l...-ARtvInVfO7ksV2N9l4h3zg@public.gmane.org>
wrote:> Hi,
>
> I finally solved my problem. It was a regular expression that failed on
> a specific URL which locked the process! I never thought such thing
> could happen!
>
> Thank you all for your support and debugging ideas.
>
> Best regards,
> --
> Posted viahttp://www.ruby-forum.com/.

why that

2013-Aug-28 14:04 UTC

head link

Re: Rails app going nuts at 100% cpu

check your yml config files

maybe you mixed ident spaces with a tab ?

-- 
Posted via http://www.ruby-forum.com/.

-- 
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to
rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to
rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit
https://groups.google.com/d/msgid/rubyonrails-talk/37c13c97627b81ecf0735ec3d6ab3500%40ruby-forum.com.
For more options, visit https://groups.google.com/groups/opt_out.

Rails - Jul 2009 - Rails app going nuts at 100% cpu

Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu

Re: Rails app going nuts at 100% cpu