Jared Brown
2006-Sep-07 21:52 UTC
[Mongrel] How to setup a sweeper to restart stale or hung mongrel servers
How do I set up a sweeper to restart stale or hung Mongrel servers?

--
Jared Brown
jaredbrown at gmail.com
Kirk Haines
2006-Sep-07 22:08 UTC
[Mongrel] How to setup a sweeper to restart stale or hung mongrel servers
On 9/7/06, Jared Brown <jaredbrown at gmail.com> wrote:
> How do I setup a sweeper to restart stale or hung mongrel servers?

If Mongrel servers are hanging, doesn't that suggest some sort of very ungood bug, most likely in the application code?

I know this doesn't answer your question, but a hanging server seems like it falls into the category of things that just should not happen. It would be worthwhile to get to the bottom of why it is happening before working around the fact that it is happening.

Kirk Haines
Jared Brown
2006-Sep-07 23:35 UTC
[Mongrel] How to setup a sweeper to restart stale or hung mongrel servers
What happens is that the servers are waiting for their database requests to come back and are in a locked state. When mod_proxy_balancer sends a request their way, it gets queued; eventually Apache times it out and returns a proxy error.

So I am going to try increasing the number of servers in the cluster and raising the proxy pass-through timeout. But I am also interested in what the cluster does when a Mongrel server is hung or has gone stale: how does that get detected, and how does the server get restarted?

--
Jared Brown
jaredbrown at gmail.com
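A minimal sketch of the detection side, in Python for illustration only. Nothing here is part of Mongrel or mod_proxy_balancer: the function name `probe`, the default path, and the timeout value are all assumptions. The idea is that a server that accepts a connection but does not answer within the timeout is treated as hung, while one that refuses the connection is treated as down.

```python
import http.client
import socket


def probe(host, port, path="/", timeout=5.0):
    """Classify one backend as 'ok', 'hung', or 'down'.

    Hypothetical helper: any HTTP response at all (even an error status)
    counts as 'ok', because the process answered within the timeout.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        conn.getresponse().read()
        return "ok"
    except socket.timeout:
        # Connected (or tried to) but got no answer in time.
        return "hung"
    except (OSError, http.client.HTTPException):
        # Connection refused, reset, or a malformed reply.
        return "down"
    finally:
        conn.close()
```

The path should point at a cheap action; probing an action that itself waits on the database would report a hang whenever the database is slow.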
Zed Shaw
2006-Sep-07 23:43 UTC
[Mongrel] How to setup a sweeper to restart stale or hung mongrel servers
On Thu, 2006-09-07 at 16:08 -0600, Kirk Haines wrote:
> If Mongrel servers are hanging, doesn't that suggest some sort of very
> ungood bug somewhere, most likely in the application code somewhere?
>
> I know this doesn't answer your question, but a hanging server seems
> like it falls into the category of things that just should not happen,
> and that it'd be worthwhile to get to the bottom of why it is
> happening before working around the fact that it is happening.

Yeah, I think people are used to FCGI, where zombies and broken processes are just the way it goes. I've made sure there are *tons* of debugging features in Mongrel so people can figure out what is causing the hangups. Using these features, we've found that it's almost always one or two Rails actions, that they're easy to spot with USR1 logging, and that the fix involves a simple rewrite or moving that code to a DRb server.

But when you're backed against a wall, having a process monitor is the only thing you can do.

My favorite is monit; other folks like runit or daemontools.

--
Zed A. Shaw
http://www.zedshaw.com/
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
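For readers who want to see what a process monitor does in principle, here is a rough Python sketch of one monitoring pass. It is not how monit, runit, or daemontools are implemented; the function `sweep`, its parameters, and the three-strikes threshold are all hypothetical. The health check and the restart action are passed in as callables, so no particular restart command is assumed.

```python
def sweep(backends, is_healthy, restart, max_failures=3, failures=None):
    """Run one monitoring pass and return the backends that were restarted.

    backends     -- iterable of identifiers (for example, port numbers)
    is_healthy   -- callable(backend) returning a truthy value when it responds
    restart      -- callable(backend) that restarts it (supplied by the caller)
    max_failures -- consecutive failed checks required before restarting
    failures     -- dict of failure counts carried between passes
    """
    failures = {} if failures is None else failures
    restarted = []
    for backend in backends:
        if is_healthy(backend):
            failures[backend] = 0
            continue
        failures[backend] = failures.get(backend, 0) + 1
        if failures[backend] >= max_failures:
            restart(backend)
            failures[backend] = 0
            restarted.append(backend)
    return restarted
```

Calling `sweep` on a timer with the same `failures` dict each time gives the sweeper behaviour asked about at the top of the thread. Requiring several consecutive failures avoids restarting a server that was only briefly busy, which matters here because a restart hides the slow action rather than fixing it.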
Jared Brown
2006-Sep-08 03:59 UTC
[Mongrel] How to setup a sweeper to restart stale or hung mongrel servers
Thanks for the response, Zed. I will continue to look into the script debug output.

--
Jared Brown
jaredbrown at gmail.com