Hi,

I was wondering if anyone else has had a similar problem and knows the cause or a solution. Basically, my mongrels seem to work fine. I am running three clusters, all of which are monitored by monit. monit can restart a mongrel if it doesn't pass a port connection test. The problem is that some time after the clusters are started (approx. 6 to 20 hours), the mongrels get restarted by monit because monit cannot connect to the port. Not all of them at the same time; just some of them, sometimes. The server is pre-production and is getting no hits. Could this be the problem, and when the server is live with constant use, will the mongrels remain working? Or could this be a monit issue?

Any help would be truly appreciated.

Chris

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://rubyforge.org/pipermail/mongrel-users/attachments/20071107/47a94e45/attachment.html
What do your logs say? Why are the mongrels not responding? Since you're not in production, you should be able to pinpoint exactly when and why they stopped responding.

On Nov 7, 2007 1:15 PM, Eire Angel <chrisangileri at yahoo.com> wrote:
> [...]
> so the problem is that after some time. aprox. 6hrs. to 20hrs. after clusters are started, the mongrels get restarted by monit due to monit not being able to connect to said port.
> [...]
> _______________________________________________
> Mongrel-users mailing list
> Mongrel-users at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mongrel-users
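One way to pinpoint when a mongrel stops answering is to run a port probe of your own alongside monit and log the timestamps. The following is only a minimal sketch using Ruby's standard `socket` and `timeout` libraries; the helper names `port_open?` and `probe_report` are hypothetical, and so are any host and port values you pass in. It checks only that a TCP connection can be opened, which may or may not match exactly what your monit rule tests.

```ruby
require "socket"
require "timeout"

# Returns true if a TCP connection to host:port can be opened within
# `seconds`, false if it is refused, unreachable, or too slow.
def port_open?(host, port, seconds = 5)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue Timeout::Error, SystemCallError, SocketError
  false
end

# Builds one timestamped status line per port, suitable for appending
# to a log file from cron or a loop.
def probe_report(host, ports, seconds = 5)
  ports.map do |port|
    status = port_open?(host, port, seconds) ? "up" : "DOWN"
    "#{Time.now} #{host}:#{port} #{status}"
  end
end
```

For example, `puts probe_report("127.0.0.1", [8000, 8001, 8002])` would print one line per port; the port numbers here are placeholders, not the original poster's configuration. Comparing those timestamps against the mongrel and database logs should narrow down when the failures start.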
Hi Chris,

I hit this too, in the same kind of timeframe you mentioned. In my case, the mongrel processes do become non-responsive, making monit necessary to keep my webapp living + breathing.

The problem occurs on multiple machines: some running OpenSuse 64-bit and some running Ubuntu Feisty Fawn 64-bit.

Some folks had suggested this is related to not using the mysql gem for database access. This may be the case, but the mysql gem wasn't a possibility for me since it is very buggy in 64-bit (it crashed my webapps).

There is also an ActiveRecord timeout that is usually prescribed, but this had no effect for me.

I wonder... how many 64-bit mongrel users are out there?

Thanks,
Pete

On Nov 7, 2007, at 11:15 AM, Eire Angel wrote:
> [...]
On 11/7/07, Pete DeLaurentis <pete at nextengine.com> wrote:
> I hit this too at the same kind of timeframe you mentioned. In my [...]

That sort of a delay -- 6 to 20 hours is what the OP mentioned -- screams at me that the problem is probably related to the db handle timing out. Even if you change the AR timeout value to 14400 (the most often quoted value that I see), that is still just 4 hours. If your process sits quiescent for 6 to 20 hours while the timeout on the db handle is set at 4 hours, the db handle is still going to time out.

> I wonder... how many 64-bit mongrel users are out there?

My old servers are 32 bit machines, but my new ones are all 64 bit machines.

Kirk Haines
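The arithmetic behind that argument can be sketched in a few lines of Ruby. This is only an illustration of the comparison Kirk describes, under the assumption that the handle is completely idle for the whole period; `expires_while_idle?` is a hypothetical helper, not part of ActiveRecord or Mongrel.

```ruby
# Seconds in one hour, used to convert idle hours for comparison.
SECONDS_PER_HOUR = 3600

# True if a db handle with the given timeout (in seconds) would expire
# during an idle period of the given length (in hours).
def expires_while_idle?(timeout_seconds, idle_hours)
  idle_hours * SECONDS_PER_HOUR > timeout_seconds
end

# The 6 to 20 hour window from the original post, against a 14400 s timeout.
[6, 20].each do |idle_hours|
  expired = expires_while_idle?(14_400, idle_hours)
  puts "idle #{idle_hours}h, timeout 14400s: expires? #{expired}"
end
```

Both ends of the reported window exceed the 4 hour timeout, which is consistent with the handles timing out on a server that receives no traffic.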
> That sort of a delay -- 6 to 20 hours is what the OP mentioned --
> screams at me that the problem is probably related to the db handle
> timing out. Even if you change the AR timeout value to 14400 (the
> most often quoted value that I see), that is still just 4 hours. If
> your process sits quiescent for 6 to 20 hours while the timeout on the
> db handle is set at 4 hours, the db handle is still going to time out.

Thanks for this Kirk. Yep, I was using 14400. I'm switching this to 2 weeks: 1209600 and we'll see if any further restarts are needed by monit.

> My old servers are 32 bit machines, but my new ones are all 64 bit
> machines

Which 64-bit OS are you running?

Thanks,
Pete
>> I wonder... how many 64-bit mongrel users are out there?
>
> My old servers are 32 bit machines, but my new ones are all 64 bit
> machines.

All of our production boxes lately have been 64bit.

--
Jesse Proudman, Blue Box Group, LLC
Thanks for the replies. I will try setting that db timeout to about a week and see how it does.

Chris
On 11/7/07, Pete DeLaurentis <pete at nextengine.com> wrote:
> Thanks for this Kirk. Yep, I was using 14400. I'm switching this to
> 2 weeks: 1209600 and we'll see if any further restarts are needed by
> monit.

I've always wondered why 14400 is the number that is always passed around when talking about extending the timeout period. Maybe there is some db issue with a _really_ long timeout like 1209600?

> Which 64-bit OS are you running?

Right now I have Ubuntu and CentOS 64 bit machines.

Kirk Haines
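Since the timeout values in this thread are all given in raw seconds, a quick conversion makes them easier to compare. The snippet below is just a convenience sketch; `describe_timeout` is a hypothetical helper name, and it says nothing about whether a database accepts or behaves well with a given value.

```ruby
# Formats a timeout given in seconds as hours and days for easy comparison.
def describe_timeout(seconds)
  hours = seconds / 3600.0
  days = hours / 24
  format("%d s = %.1f hours = %.2f days", seconds, hours, days)
end

puts describe_timeout(14_400)     # 14400 s = 4.0 hours = 0.17 days
puts describe_timeout(1_209_600)  # 1209600 s = 336.0 hours = 14.00 days
```

So 14400 is the 4 hour value that usually gets passed around, and 1209600 is exactly two weeks.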
"Maybe there is some db issue with a _really_ long timeout like 1209600?" that was my thought. i set it mine to 115200, 32 hours more than enough but not too crazy Kirk Haines <wyhaines at gmail.com> wrote: On 11/7/07, Pete DeLaurentis wrote:> Thanks for this Kirk. Yep, I was using 14400. I''m switching this to > 2 weeks: 1209600 and we''ll see if any further restarts are needed by > monit.I''ve always wondered why 14400 is the number that is always passed around when talking about extending the timeout period. Maybe there is some db issue with a _really_ long timeout like 1209600?> Which 64-bit OS are you running?Right now I have Ubuntu and CentOS 64 bit machines. Kirk Haines _______________________________________________ Mongrel-users mailing list Mongrel-users at rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-users __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://rubyforge.org/pipermail/mongrel-users/attachments/20071107/a95c9c94/attachment.html