thr3ads.net - Mongrel users - [Mongrel] mongrel, monit, and the many, many messages [Jan 2008]

Greg Willits

2008-Jan-09 00:58 UTC

[Mongrel] mongrel, monit, and the many, many messages

Monit 4.9, Mongrel 1.0.1, Rails 1.2.6, Mac OS X 10.4.11 (PPC)

I don't know whether this is a mongrel issue or a monit issue.

I'm trying to poke my way around a system set up by someone else. I have
no more experience w/ mongrel than local Rails dev at this point, and a
conceptual understanding of how monit is working. I have the Deploying
Rails beta book, and I'm muddling my way thru mongrel and monit docs,
but I think some hints as to direction would be useful.

I am suspicious that all cannot be well on this setup as monit will send
dozens of messages a day, and occasionally hundreds of messages. The
worst day was 1400 alerts. Yes, 1400.

The bulk comes from there being 3 clusters (staging, beta, production),
10 mongrels per cluster, and two servers. So, we can reduce the
total quantity by these factors, I get that part, but still, there's an
awful lot of "this stopped" and "that does not exist" even factoring
the redundancy out.

I don't understand the implications of what each of these means. Mongrel
keeps crashing? Rails crashing? Monit crashing?

Thanks for any clues you can offer.

Sample messages I get are:

-- (A)----------------------------------
Monit instance changed Service [domain snipped]

  Date:        Tue, 08 Jan 2008 14:41:50 -0800
  Action:      alert
  Host:         [domain snipped]
  Description: Monit stopped

-- (B)----------------------------------
Does not exist Service mongrel-production-8300

  Date:        Tue, 08 Jan 2008 15:30:04 -0800
  Action:      restart
  Host:         [domain snipped]
  Description: 'mongrel-production-8300' process is not running

-- (C)----------------------------------
Execution failed Service mongrel-production-8301

  Date:        Tue, 08 Jan 2008 15:30:34 -0800
  Action:      alert
  Host:         [domain snipped]
  Description: 'mongrel-production-8301' failed to start
-- 
Posted via http://www.ruby-forum.com/.

Dave Cheney

2008-Jan-09 01:02 UTC

Sounds like you have a number of issues. Starting with mongrel, what
do the mongrel logs for the pids that have stopped running say? Also
check /var/log/system.log for monit messages.

It may be worth upgrading to monit 4.10.1, which includes a number of  
fixes for running monit under OSX.

Cheers

Dave

On 09/01/2008, at 11:58 AM, Greg Willits wrote:
> I don't understand the implications of what each of these means. Mongrel
> keeps crashing? Rails crashing? Monit crashing?

Erik Hetzner

2008-Jan-09 01:18 UTC

At Wed, 9 Jan 2008 01:58:58 +0100,
Greg Willits <lists at ruby-forum.com> wrote:
>
> Monit 4.9, Mongrel 1.0.1, Rails 1.2.6, Mac OS X 10.4.11 (PPC)
>
> I don't know whether this is a mongrel issue or a monit issue.
>
> I'm trying to poke my way around a system set up by someone else. I have
> no more experience w/ mongrel than local Rails dev at this point, and a
> conceptual understanding of how monit is working. I have the Deploying
> Rails beta book, and I'm muddling my way thru mongrel and monit docs,
> but I think some hints as to direction would be useful.
>
> [...]

I have seen a similar situation here. What happened was (more or less,
this is from memory) a mongrel instance would be locked up on an HTTP
response that would take a long time to complete. Because requests
would just queue up behind this one, monit would fail to get a
response in a reasonable time, would assume that the process was
non-responsive and try to restart it gracefully (using mongrel_rails
stop). Mongrel would take a long time to shut down because it was
still processing that long-running response, so we would get a message
that monit couldn't shut it down and it would fail to start (or
something like that). Finally the long-running rails process would
complete, mongrel would restart, and monit would let us know that the
process was back up.

The solution was to make sure that responses come back in a reasonable
amount of time.
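
To make the timing concrete, here is a rough sketch of the scenario above. It is only a toy model, not how monit is implemented: it assumes one mongrel handles a single request at a time, that a monitor probes it on a fixed interval, and that a probe arriving while the slow request is still running has to wait until that request finishes. The function name and all numbers are made up for illustration.

```python
def failed_checks(slow_request_seconds, check_interval, probe_timeout, horizon):
    """Return the check times (in seconds) at which a health probe would time out.

    Toy model: a slow request starts at t=0 and blocks the worker until
    t=slow_request_seconds. The monitor probes every check_interval seconds,
    and a probe fails if it has to wait longer than probe_timeout.
    """
    failures = []
    t = check_interval  # assumption: first probe lands one interval after the slow request starts
    while t <= horizon:
        # The probe queues behind the slow request until that request completes.
        wait = max(0, slow_request_seconds - t)
        if wait > probe_timeout:
            failures.append(t)
        t += check_interval
    return failures


# A 90-second request, probed every 30 seconds with a 10-second probe timeout.
print(failed_checks(90, 30, 10, 180))   # [30, 60]

# The same request with probes every 120 seconds is never noticed.
print(failed_checks(90, 120, 10, 240))  # []
```

In this model the alerts go away either when requests finish quickly or when the probes are spaced further apart than the slowest request, which is the trade-off this thread is circling around.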

best,
Erik Hetzner
;; Erik Hetzner, California Digital Library
;; gnupg key id: 1024D/01DB07E3

Evan Weaver

2008-Jan-09 03:28 UTC

Make sure your Monit check interval (not sure about the default) is
greater than your Mongrel request timeout interval (default 60
seconds).

Evan

-- 
Evan Weaver
Cloudburst, LLC

Greg Willits

2008-Jan-09 18:55 UTC

Thanks for the ideas so far. I'll look into the latest monit. Message
(A) is starting to look like a monit crash to me. It is always followed
by a bunch of similar messages suggesting that monit may be
stopping/starting all the mongrels.

Looks like the logs have little or no date/time stamps, so they're
semi-useless in trying to correlate to the email alerts.

I do have some requests that can take a while to process (depends on
response time from external services), so that's a valid lead.

Evan Weaver wrote:
> Make sure your Monit check interval (not sure about the default) is
> greater than your Mongrel request timeout interval (default 60
> seconds).

I have looked everywhere I can think of, and I don't see any mention of
this timeout value anywhere in Mongrel docs. This page
(http://mongrel.rubyforge.org/docs/howto.html) mentions a -t (timeout),
but the description doesn't match what you're referring to. It looks
like a delay between the end of responding to request A and starting to
handle request B, not when to give up on A.

I guess I'll assume the 60 secs, and play with monit accordingly.

-- gw

Evan Weaver

2008-Jan-09 19:36 UTC

That page is out of date. The RDoc is probably better. And there's
always the source...

Soon we'll do some work on the state of the documentation.

Evan


-- 
Evan Weaver
Cloudburst, LLC

Erik Hetzner

2008-Jan-09 20:12 UTC

At Wed, 9 Jan 2008 19:55:27 +0100,
Greg Willits <lists at ruby-forum.com> wrote:
>
> Thanks for the ideas so far. I'll look into the latest monit. Message
> (A) is starting to look like a monit crash to me. It is always followed
> by a bunch of similar messages suggesting that monit may be
> stopping/starting all the mongrels.
>
> [...]

I doubt a monit crash. This is the message I get when I start monit
with the "-I quit" option. It sounds like something (a cron job?) is
restarting monit, and monit is not noticing that the mongrels are
running when it restarts, so it tries to bring the mongrels up. Fool
around with the monitrc: perhaps monit is failing to notice the pid
files that exist for mongrel?
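
One way to rule the pid files in or out by hand is to check whether each file exists and whether the pid it names is actually alive. The sketch below does that on a POSIX system using signal 0, which tests for a process without disturbing it. It only illustrates the idea and is not how monit itself performs the check. The function name and the example path are hypothetical, so substitute whatever pid file paths your monitrc really points at.

```python
import os


def pidfile_status(pidfile_path):
    """Classify a pid file as 'missing', 'unreadable', 'stale', or 'running'."""
    try:
        with open(pidfile_path) as f:
            pid = int(f.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "unreadable"
    if pid <= 0:
        # Non-positive pids have special meanings for kill(); never signal them.
        return "unreadable"
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return "stale"  # the file is there but no such process is running
    except PermissionError:
        return "running"  # the process exists but belongs to another user
    return "running"


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(pidfile_status("/path/to/app/log/mongrel.8300.pid"))
```

If a pid file comes back "missing" or "stale" while the mongrel on that port is clearly serving requests, the monitor and the mongrel probably disagree about where the pid file lives.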

best,
Erik Hetzner
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url :
http://rubyforge.org/pipermail/mongrel-users/attachments/20080109/a60a434d/attachment.bin

Greg Willits

2008-Jan-09 21:09 UTC

Erik Hetzner wrote:
> At Wed, 9 Jan 2008 19:55:27 +0100,
> Greg Willits <lists at ruby-forum.com> wrote:
>>
>> Thanks for the ideas so far. I'll look into the latest monit. Message
>> (A) is starting to look like a monit crash to me. It is always followed
>> by a bunch of similar messages suggesting that monit may be
>> stopping/starting all the mongrels.
>>
>> [...]
>
> I doubt a monit crash. This is the message I get when I start monit
> with the "-I quit" option. It sounds like something (a cron job?) is
> restarting monit....

Yeah, we have launchd monitoring monit, so that could explain that.

When it was all set up it was explained to me that "mongrel/rails
crashes/has leaks, so we use monit to keep an eye on that, but monit
crashes/has leaks, so we'll use launchd to monitor monit."

Sounded like a house of cards to me, but I wasn't in a position to argue
it at the time. IIRC the monit thing may have been a leak specific to OS
X at the time. So hopefully the recent versions are the solution to
that. I should get a chance to look into that tonight.

Thanks.

-- gw


Nathan Vack

2008-Jan-09 21:40 UTC

On Jan 9, 2008, at 3:09 PM, Greg Willits wrote:
> Yeah we have launchd monitoring monit, so that could explain that.

Y'know, you can just have launchd monitor mongrel. That probably
makes more sense than launchd watching monit watching mongrel ;-)
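
For anyone wanting to try that, here is a hedged sketch of what one launchd job per mongrel might look like, generated with Python's plistlib so the XML stays well-formed. The label, the mongrel_rails path, the app directory and the port are all placeholders, not values from this thread. Two assumptions are worth calling out: the job runs mongrel_rails in the foreground (no -d), since launchd expects the processes it supervises not to daemonize themselves, and it uses the 10.4-era OnDemand key set to false to ask launchd to keep the job running (10.5 and later replaced that key with KeepAlive).

```python
import plistlib


def mongrel_job(label, app_dir, port, environment="production",
                mongrel_rails="/usr/bin/mongrel_rails"):
    """Build a launchd job description for one foreground mongrel (illustrative values)."""
    return {
        "Label": label,
        # No -d flag: the process must stay in the foreground under launchd.
        "ProgramArguments": [
            mongrel_rails, "start",
            "-e", environment,
            "-p", str(port),
            "-c", app_dir,
        ],
        "RunAtLoad": True,
        # Mac OS X 10.4: OnDemand=false keeps the job alive.
        # On 10.5 and later, use "KeepAlive": True instead.
        "OnDemand": False,
    }


# Hypothetical label and path, for illustration only.
plist_bytes = plistlib.dumps(
    mongrel_job("com.example.mongrel.production.8300", "/path/to/app", 8300)
)
print(plist_bytes.decode("utf-8"))
```

The output would be saved as one .plist file per port and loaded with launchctl. Each mongrel becomes its own job, so one of them dying does not touch the others.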

-n

Greg Willits

2008-Jan-09 21:56 UTC

Nathan Vack wrote:
> On Jan 9, 2008, at 3:09 PM, Greg Willits wrote:
>
>> Yeah we have launchd monitoring monit, so that could explain that.
>
> Y'know, you can just have launchd monitor mongrel. That probably
> makes more sense than launchd watching monit watching mongrel ;-)

Yep. Now that I've been poking around and getting more familiar with
this setup and see that launchd can monitor those details, that seemed
like a logical thing to me, so now I have a "second" :-) The original guy
was just learning OS X at the time and was more familiar with monit as
part of his overall Rails deployment package.

-- gw

