Hi,

One of the two mongrel processes has died in the middle of the night four
times in the past 9 days, and I need help debugging this.

Each time the symptoms are the same:

* Each time I can restart the process via cap -a restart_app.
* Before the restart, there is nothing unusual in production.log or
  mongrel.log.
* During the restart, about 100 repetitions of an error message are
  generated in mongrel.log (see below).
* I followed the suggestions on mongrel.rubyforge.org/faq:

    lsof -i -P | grep CLOSE_WAIT
    99% CPU
    Memory Leak

  None of those show any problems.  Before the restart, top shows the
  largest memory processes are:

    top - 07:17:05 up 31 days, 20:38, 0 users, load average: 0.00, 0.00, 0.00
    Tasks:  46 total,  2 running, 44 sleeping,  0 stopped,  0 zombie
    Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
    Mem:   262316k total,  239700k used,   22616k free,    3412k buffers
    Swap:       0k total,       0k used,       0k free,   88320k cached

      PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
    18001 root   16  0 45712  39m 2584 S  0.0 15.6  0:20.79 mongrel_rails
    18004 root   16  0 43624  38m 2524 S  0.0 15.0  0:25.93 mongrel_rails
     2632 mysql  16  0  109m  27m 3100 S  0.0 10.8  5:37.37 mysqld

  After the restart, memory usage rapidly approaches the above values while
  the application runs normally.
* Yesterday I updated production.rb with this entry:

    ActiveRecord::Base.verification_timeout = 14400

  and this did not fix the problem.

Note I am using 0.3.13.5 because I had problems with 0.3.13.4 not restarting
when my railsmachine system reboots.  However, I would gladly switch back to
0.3.13.4 if it solved this problem.

Usage is very low - only about 50-200 requests / hour.

One more factoid - it may be irrelevant, but it appears that shortly before
each of the four crashes, one or two pdf files were downloaded from my
public directory - which I believe apache serves directly, bypassing mongrel.

Suggestions?

Thanks

Robert Vogel

Error messages in mongrel.log generated during restart:

Tue Oct 31 07:26:30 PST 2006: Error calling Dispatcher.dispatch #<Sync_m::Err::UnknownLocker: Thread(#<Thread:0xb7544c58 aborting>) not locked.>
/usr/lib/ruby/1.8/sync.rb:57:in `Fail'
/usr/lib/ruby/1.8/sync.rb:63:in `Fail'
/usr/lib/ruby/1.8/sync.rb:183:in `sync_unlock'
/usr/lib/ruby/1.8/sync.rb:231:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/rails.rb:81:in `process'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel.rb:583:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel.rb:582:in `process_client'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel.rb:689:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel.rb:689:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel.rb:676:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/configurator.rb:271:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/configurator.rb:270:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/bin/mongrel_rails:124:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/command.rb:211:in `run'
/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/bin/mongrel_rails:234
/usr/bin/mongrel_rails:18

Tue Oct 31 07:26:30 PST 2006: Error calling Dispatcher.dispatch #<Sync_m::Err::UnknownLocker: Thread(#<Thread:0xb7545540 aborting>) not locked.>
/usr/lib/ruby/1.8/sync.rb:57:in `Fail'
/usr/lib/ruby/1.8/sync.rb:63:in `Fail'
/usr/lib/ruby/1.8/sync.rb:183:in `sync_unlock'
etc.
The tail of mongrel.log after the restart is:

/usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/bin/mongrel_rails:234
/usr/bin/mongrel_rails:18
/var/www/apps/kd/current/config/../vendor/rails/activerecord/lib/active_record/transactions.rb:84:in `transaction': Transaction aborted (ActiveRecord::Transactions::TransactionError)
        from /usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/configurator.rb:293:in `join'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/configurator.rb:293:in `join'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/bin/mongrel_rails:133:in `run'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/lib/mongrel/command.rb:211:in `run'
        from /usr/lib/ruby/gems/1.8/gems/mongrel-0.3.13.5/bin/mongrel_rails:234
        from /usr/bin/mongrel_rails:18
deadlock 0xb71a8518: sleep:-  - /usr/lib/ruby/1.8/thread.rb:100
deadlock 0xb7e3c748: sleep:- (main) - /usr/lib/ruby/1.8/thread.rb:100
** TERM signal received.
** Daemonized, any open files are closed.  Look at log/mongrel.8000.pid and log/mongrel.log for info.
** Starting Mongrel listening at 127.0.0.1:8000
** Starting Rails with production environment...
** Daemonized, any open files are closed.  Look at log/mongrel.8001.pid and log/mongrel.log for info.
** Starting Mongrel listening at 127.0.0.1:8001
** Starting Rails with production environment...
** Rails loaded.
** Loading any Rails specific GemPlugins
** Signals ready.  TERM => stop.  USR2 => restart.  INT => stop (no restart).
** Rails signals registered.  HUP => reload (without restart).  It might not work well.
** Mongrel available at 127.0.0.1:8000
** Writing PID file to log/mongrel.8000.pid
** Rails loaded.
** Loading any Rails specific GemPlugins
** Signals ready.  TERM => stop.  USR2 => restart.  INT => stop (no restart).
** Rails signals registered.  HUP => reload (without restart).  It might not work well.
** Mongrel available at 127.0.0.1:8001
** Writing PID file to log/mongrel.8001.pid
On Tue, 31 Oct 2006 12:48:02 -0700
Robert Vogel <robert at kitchendemocracy.org> wrote:

> Hi
>
> One of the two mongrel processes has died in the middle of the night
> four times in the past 9 days, and I need help debugging this.
>
> Each time the symptoms are the same:

Really quick, but upgrade to the pre-release and then tell me if you still
get these:

  sudo gem install mongrel --source=http://mongrel.rubyforge.org/releases

If it does not fix the problem (remember, it's random, so let it run in
production for a while), then turn on USR1 logging and watch for the rails
action that is blocking things:

  sudo killall -USR1 mongrel_rails

Otherwise, keep in mind that many, many people use Mongrel without blocking
problems, so you need to rule out anything non-standard you're using that
can cause problems.  RMagick, frequent DNS calls, and working with files or
shared resources are all main culprits.

--
Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.zedshaw.com/
http://safari.oreilly.com/0321483502 -- The Mongrel Book
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
Hello -

I need help.  For the past 11 days, one of the two mongrel processes on my
railsmachine VPS has been crashing intermittently - it has crashed about 10
times, with increasing frequency in the past few days.  Unfortunately, after
many, many hours I still have not been able to reproduce this problem in a
controlled way - neither on my production railsmachine nor on my development
machine.  As far as I can tell, I have followed these suggestions from
Bradley and Zed and Zed's Mongrel book:

* lsof -i -P | grep CLOSE_WAIT shows nothing.
* 99% CPU is not associated with either mongrel process - the CPU is never
  above 5%, usually at 0%, both while the process is crashed and while they
  are running.
* A memory leak seems impossible.  %MEM for both processes is never above
  15%, both when crashed and when running.
* -B (debug) logging on my development machine shows that no object is
  steadily increasing its memory consumption - garbage collection seems to
  be working fine.
* -B logging on my development machine also shows no leaking files.  The
  number of open files is stable (at 6).
* Traffic is minuscule (< 100 requests / hour).
* Inserting ActiveRecord::Base.verification_timeout = 14400 in
  environments/production.rb had no effect.
* Upgrading to the pre-release mongrel had no effect:

    sudo gem install mongrel --source=http://mongrel.rubyforge.org/releases

My application is butt simple, and supported by oodles of unit, functional
and integration test code.  There is no transaction processing in the
application and no opportunity for a jammed request due to using shared
resources without proper locking.  The application does not use RMagick and
does not explicitly manipulate any files, and though I am not sure what you
mean by 'shared resources', I suspect I am not using any.  The only external
libraries (external to rails) are three gems which access geo-coding
services - but these were not in play when the processes crashed.

killall -USR1 mongrel_rails has been in effect now through the last two
crashes.  The rails action which held things up was different in both
cases - and is butt simple in both cases.  Here is the mongrel.log in the
vicinity of those two crashes:

Thu Nov 02 13:07:16 PST 2006: 0 threads sync_waiting for /, 1 still active in Mongrel.
Thu Nov 02 13:07:19 PST 2006: 0 threads sync_waiting for /login, 1 still active in Mongrel.
Thu Nov 02 13:07:27 PST 2006: 0 threads sync_waiting for /login, 1 still active in Mongrel.
Thu Nov 02 13:07:33 PST 2006: 0 threads sync_waiting for /admin/list_vote, 1 still active in Mongrel.
Thu Nov 02 13:07:42 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 1 still active in Mongrel.
Thu Nov 02 13:08:17 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 1 still active in Mongrel.
Thu Nov 02 13:08:26 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 3 still active in Mongrel.
Thu Nov 02 13:08:37 PST 2006: 0 threads sync_waiting for /admin/mark_reviewed, 3 still active in Mongrel.
Thu Nov 02 13:09:08 PST 2006: 1 threads sync_waiting for /admin/mark_reviewed, 4 still active in Mongrel.
Thu Nov 02 13:09:35 PST 2006: Error calling Dispatcher.dispatch #<Sync_m::Err::UnknownLocker: Thread(#<Thread:0xb7234be4 aborting>) not locked.>
/usr/lib/ruby/1.8/sync.rb:57:in `Fail'
/usr/lib/ruby/1.8/sync.rb:63:in `Fail'

and

Thu Nov 02 00:05:29 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:05:37 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:06:11 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:07:07 PST 2006: 0 threads sync_waiting for /berkeley/downzoning/comments, 1 still active in Mongrel.
Thu Nov 02 00:07:27 PST 2006: 0 threads sync_waiting for /email_updates, 1 still active in Mongrel.
Thu Nov 02 00:07:27 PST 2006: 0 threads sync_waiting for /email_updates_edit, 1 still active in Mongrel.
Thu Nov 02 00:07:53 PST 2006: 0 threads sync_waiting for /berkeley/bus_rapid_transit/page/brtqanda, 1 still active in Mongrel.
Thu Nov 02 00:08:11 PST 2006: 0 threads sync_waiting for /email_updates_edit, 2 still active in Mongrel.
Thu Nov 02 00:08:39 PST 2006: 0 threads sync_waiting for /robots.txt, 1 still active in Mongrel.
Thu Nov 02 00:08:39 PST 2006: 1 threads sync_waiting for /email_updates_edit, 3 still active in Mongrel.
Thu Nov 02 00:09:50 PST 2006: 0 threads sync_waiting for /howitworks.php, 1 still active in Mongrel.
Thu Nov 02 00:09:50 PST 2006: 3 threads sync_waiting for /email_updates_edit, 5 still active in Mongrel.

So - as you can tell, I am a newbie at wits' end, hoping you guys can
1) help me fix the problem, and 2) help me implement a temporary workaround
so I can stop checking every few hours to see if I need to cap -a restart_app
(which, so far, has always worked...).

Thanks for your careful attention.

Cheers

Robert Vogel

> -------- Original Message --------
> Subject: Re: [Mongrel] Problems with mongrel dying
> From: "Zed A. Shaw" <zedshaw at zedshaw.com>
> Date: Tue, October 31, 2006 2:36 pm
> To: mongrel-users at rubyforge.org
>
> On Tue, 31 Oct 2006 12:48:02 -0700
> Robert Vogel <robert at kitchendemocracy.org> wrote:
>
> > Hi
> >
> > One of the two mongrel processes has died in the middle of the night
> > four times in the past 9 days, and I need help debugging this.
> >
> > Each time the symptoms are the same:
>
> Really quick, but upgrade to the pre-release and then tell me if you still get these:
>
>   sudo gem install mongrel --source=http://mongrel.rubyforge.org/releases
>
> If it does not fix the problem (remember, it's random, so let it run in production for a while), then turn on USR1 logging and watch for the rails action that is blocking things:
>
>   sudo killall -USR1 mongrel_rails
>
> Otherwise, keep in mind that many, many people use Mongrel without blocking problems, so you need to rule out anything non-standard you're using that can cause problems.  RMagick, frequent DNS calls, and working with files or shared resources are all main culprits.
>
> --
> Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
> http://www.zedshaw.com/
> http://safari.oreilly.com/0321483502 -- The Mongrel Book
> http://mongrel.rubyforge.org/
> http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
> _______________________________________________
> Mongrel-users mailing list
> Mongrel-users at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mongrel-users
Hi Robert,

I have no idea about your problem, but the workaround is simple.  Use Monit
to regulate your mongrels (it'll do the checking and restarting for you
every 3 minutes).

I risk falling on Zed's sword of wrath, but my mongrels escalate in memory
usage under load and I use Monit to restart them and keep 'em tidy.
(I fully presume it's because of MY lousy code.)

Sam
On Fri, 3 Nov 2006 15:25:32 +1300
"Sam Giffney" <samuelgiffney at gmail.com> wrote:

> Hi Robert,
> I have no idea about your problem but the workaround is simple. Use
> Monit to regulate your mongrels (it'll do the checking and restarting
> for you every 3 minutes)
> I risk falling on Zed's sword of wrath but my mongrels escalate in
> memory usage under load and I use Monit to restart them and keep em
> tidy
> (I fully presume it's because of MY lousy code)

That's probably the best you can do until you can find out why it's leaking.
My past experience has been to never trust Ruby's GC or any external C
extensions you may be using.  I've combed through Mongrel to insane levels
to root out all possible leaks I can.  Yet, the Sync vs. Mutex bug and the
recent Array patch from Eric M. show that Ruby's GC has a long way to go.

If you can't find it right away with the -B option and
log/mongrel_debug/objects.log then it's probably too deep inside Ruby or a
C extension you're using (RMagick is a big one).

Just hit it with monit and at least your site keeps going.  Ugly, but it
works.

--
Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.zedshaw.com/
http://safari.oreilly.com/0321483502 -- The Mongrel Book
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
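For reference, a minimal monit stanza for one mongrel along the lines Sam
and Zed describe might look roughly like the following.  This is only a
sketch: the pidfile location, application path, port, and memory threshold
are illustrative assumptions, not values from this thread, and would need
to match your own deployment.

  check process mongrel_8000 with pidfile /var/www/apps/myapp/current/log/mongrel.8000.pid
    start program = "/usr/bin/mongrel_rails start -d -e production -p 8000 -P log/mongrel.8000.pid -c /var/www/apps/myapp/current"
    stop program  = "/usr/bin/mongrel_rails stop -P log/mongrel.8000.pid -c /var/www/apps/myapp/current"
    if totalmem > 100.0 MB for 4 cycles then restart
    if failed port 8000 protocol http then restart

A second, near-identical stanza would cover the mongrel on port 8001;
monit's poll interval and alert e-mail address go in its global
configuration.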
On Thu, 02 Nov 2006 19:13:54 -0700
Robert Vogel <robert at kitchendemocracy.org> wrote:

> Hello -
>
> I need help.  For the past 11 days, one of the two mongrel processes on
> my railsmachine VPS has been crashing intermittently - it has crashed
> about 10 times, with increasing frequency in the past few days.
> Unfortunately, after many, many hours I still have not been able to
> reproduce this problem in a controlled way - neither on my production
> railsmachine nor on my development machine.  As far as I can tell, I
> have followed these suggestions from Bradley and Zed and Zed's Mongrel
> book:

That's awesomely complete for an analysis.  And you say they just "poof" go
away?

Ok, so there *has* to be something different about your setup compared to
other folks.  I'm thinking the oom killer thing is to blame.  Usually when
someone says it "just goes boom" it's because of this.  While you may think
your mongrel process isn't using that much ram, keep in mind it's based on
how much is available vs. what's used.  You could have a mysql that eats
all the ram and Linux decides to kill your mongrel.

So, to confirm this isn't the case, monitor your memory usage totals during
the day, and install monit to e-mail you when it needs to restart.  Monit
will tell you it had to restart, then you can jump on and see what your
memory usage log was reporting.  I usually just run a little cron job that
spits out the free command to a file once every 5 minutes or so.

Also, look in the linux log files.  IIRC the oom killer doesn't say
anything, but check to see if they improved this "magic".

Zed
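One way to keep the kind of memory log Zed describes is a crontab entry
along these lines; the log path is only an example, and anywhere writable
will do:

  # append a timestamp and free's memory totals every 5 minutes
  */5 * * * * (date; free -m) >> /var/log/memory-usage.log 2>&1

After monit reports a restart, you can line the restart time up against
this log, and also grep the kernel log (e.g. /var/log/messages or the
output of dmesg) for oom-killer entries.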
Hello Zed -

Thanks for the response; I will follow your advice, and will get right on
installing monit.

Question - you also say 'look in the linux log files'.  Which log files
should I look at, where are they, and what should I look for?

Thanks

Robert

> That's awesomely complete for an analysis.  And you say they just "poof"
> go away?
>
> Ok, so there *has* to be something different about your setup compared to
> other folks.  I'm thinking the oom killer thing is to blame.  Usually when
> someone says it "just goes boom" it's because of this.  While you may
> think your mongrel process isn't using that much ram, keep in mind it's
> based on how much is available vs. what's used.  You could have a mysql
> that eats all the ram and Linux decides to kill your mongrel.
>
> So, to confirm this isn't the case, monitor your memory usage totals
> during the day, and install monit to e-mail you when it needs to restart.
> Monit will tell you it had to restart, then you can jump on and see what
> your memory usage log was reporting.  I usually just run a little cron job
> that spits out the free command to a file once every 5 minutes or so.
>
> Also, look in the linux log files.  IIRC the oom killer doesn't say
> anything, but check to see if they improved this "magic".
On 11/2/06, Zed A. Shaw <zedshaw at zedshaw.com> wrote:

> That's probably the best you can do until you can find out why it's leaking.  My past
> experience has been to never trust Ruby's GC or any external C extensions you may be
> using.  I've combed through Mongrel to insane levels to root out all possible leaks I can.
> Yet, the Sync vs. Mutex bug and recent Array patch from Eric M. show that Ruby's GC has
> a long way to go.

I'll 1/2 second Zed's comments here.  I've spent a lot of time poking
around in Ruby internals looking at memory usage.  It is very easy for
someone to write a C extension that mismanages memory and causes Ruby
to leak, so always look suspiciously at an extension if you have a
leak that you can't find another cause for, unless you know with great
confidence that the extension is solid.

The Ruby GC itself is pretty simple and does what it is supposed to.
It will tend to have performance issues as the set of objects in RAM
increases, though there are strategies a person can sometimes use to
manage that, if needed.

The Sync vs Mutex thing, though, can not be laid at the foot of the
Ruby GC.  The problem is with array.c not releasing its data in a way
that allows the Ruby GC to handle it.  Refer back to the beginning of
this email about how easy it is to screw up memory management in C
extensions....

Now, that said, if you are using arrays and are using push and shift
operations to manage an array like a queue (or any libraries, like
Mutex, that you are using do this), that _will_ bite you in the ass
with memory usage, because of this Array bug.

Mutex has much better performance than Sync, though, especially if
there are more than a very small number of threads, so in this
specific case I continue to use a Mutex, but have patched around the
problematic Array usage by creating my own copy of the Mutex class
that uses Array in a way that doesn't suffer from the bug.


Kirk Haines
On Nov 3, 2006, at 8:50 AM, Kirk Haines wrote:

> On 11/2/06, Zed A. Shaw <zedshaw at zedshaw.com> wrote:
>
>> That's probably the best you can do until you can find out why
>> it's leaking.  My past
>> experience has been to never trust Ruby's GC or any external C
>> extensions you may be
>> using.  I've combed through Mongrel to insane levels to root out
>> all possible leaks I can.
>> Yet, the Sync vs. Mutex bug and recent Array patch from Eric M.
>> show that Ruby's GC has
>> a long way to go.
>
> I'll 1/2 second Zed's comments here.  I've spent a lot of time poking
> around in Ruby internals looking at memory usage.  It is very easy for
> someone to write a C extension that mismanages memory and causes Ruby
> to leak, so always look suspiciously at an extension if you have a
> leak that you can't find another cause for, unless you know with great
> confidence that the extension is solid.
>
> The Ruby GC itself is pretty simple and does what it is supposed to.
> It will tend to have performance issues as the set of objects in RAM
> increases, though there are strategies a person can sometimes use to
> manage that, if needed.
>
> The Sync vs Mutex thing, though, can not be laid at the foot of the
> Ruby GC.  The problem is with array.c not releasing its data in a way
> that allows the Ruby GC to handle it.  Refer back to the beginning of
> this email about how easy it is to screw up memory management in C
> extensions....
>
> Now, that said, if you are using arrays and are using push and shift
> operations to manage an array like a queue (or any libraries, like
> Mutex, that you are using do this), that _will_ bite you in the ass
> with memory usage, because of this Array bug.
>
> Mutex has much better performance than Sync, though, especially if
> there are more than a very small number of threads, so in this
> specific case I continue to use a Mutex, but have patched around the
> problematic Array usage by creating my own copy of the Mutex class
> that uses Array in a way that doesn't suffer from the bug.
>
>
> Kirk Haines

Hey Kirk-

Would you mind sharing your patched mutex.rb file?  I would appreciate it.

Thanks-

-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)
On 11/3/06, Ezra Zygmuntowicz <ezmobius at gmail.com> wrote:

> On Nov 3, 2006, at 8:50 AM, Kirk Haines wrote:
>
> > On 11/2/06, Zed A. Shaw <zedshaw at zedshaw.com> wrote:
> >
> >> That's probably the best you can do until you can find out why
> >> it's leaking.  My past
> >> experience has been to never trust Ruby's GC or any external C
> >> extensions you may be
> >> using.  I've combed through Mongrel to insane levels to root out
> >> all possible leaks I can.
> >> Yet, the Sync vs. Mutex bug and recent Array patch from Eric M.
> >> show that Ruby's GC has
> >> a long way to go.
> >
> > I'll 1/2 second Zed's comments here.  I've spent a lot of time poking
> > around in Ruby internals looking at memory usage.  It is very easy for
> > someone to write a C extension that mismanages memory and causes Ruby
> > to leak, so always look suspiciously at an extension if you have a
> > leak that you can't find another cause for, unless you know with great
> > confidence that the extension is solid.
> >
> > The Ruby GC itself is pretty simple and does what it is supposed to.
> > It will tend to have performance issues as the set of objects in RAM
> > increases, though there are strategies a person can sometimes use to
> > manage that, if needed.
> >
> > The Sync vs Mutex thing, though, can not be laid at the foot of the
> > Ruby GC.  The problem is with array.c not releasing its data in a way
> > that allows the Ruby GC to handle it.  Refer back to the beginning of
> > this email about how easy it is to screw up memory management in C
> > extensions....
> >
> > Now, that said, if you are using arrays and are using push and shift
> > operations to manage an array like a queue (or any libraries, like
> > Mutex, that you are using do this), that _will_ bite you in the ass
> > with memory usage, because of this Array bug.
> >
> > Mutex has much better performance than Sync, though, especially if
> > there are more than a very small number of threads, so in this
> > specific case I continue to use a Mutex, but have patched around the
> > problematic Array usage by creating my own copy of the Mutex class
> > that uses Array in a way that doesn't suffer from the bug.
> >
> >
> > Kirk Haines
>
> Hey Kirk-
>
> Would you mind sharing your patched mutex.rb file?  I would appreciate it.

IIRC, the patch consists of replacing shift with pop, and push with
unshift respectively, thus reversing the queue direction.  The idea is to
avoid Array#shift, as it retains the reference to the removed item.
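Kirk's actual patched mutex.rb never made it into this thread, but the
pattern described above is easy to sketch.  The class and method names
below are made up for illustration; the point is only that unshift/pop
preserves first-in, first-out order while avoiding Array#shift, which (per
the discussion above) retained a reference to the removed element in Ruby
1.8's array.c:

  # Sketch only - not Kirk's real patch.
  class WaitQueue
    def initialize
      @items = []
    end

    # Enqueue at the front of the array instead of pushing onto the back...
    def enqueue(obj)
      @items.unshift(obj)
    end

    # ...and dequeue from the back instead of shifting off the front.
    # Same FIFO behaviour, but Array#shift is never called, so the removed
    # element is not kept alive by the array's internals.
    def dequeue
      @items.pop
    end

    def empty?
      @items.empty?
    end
  end

Inside a 1.8-era Mutex, the waiting-threads list would be handled the same
way: @waiting.unshift Thread.current when a thread blocks, and
@waiting.pop when choosing the next thread to wake.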