Sure
Mongrel 0.3.13.4
Mongrel Cluster 0.2.0
Ruby 1.8.4
Rails 1.1.6
Apache 2.2.2
RHEL 4
The symptom is that we are getting frequent application 500 errors.
Monitoring the mongrel cluster shows some of the servers in Err status at
any given moment. We run 40 mongrel instances in the cluster and only a few
of them are in Err status at a time. Does a mongrel instance in Err status
return an application 500 error? I have not found documentation on the
cluster monitor.
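One way I could try to answer the Err question myself is to bypass Apache and request a page from each mongrel instance directly, then see which ports answer with a 5xx or do not answer at all. Below is a minimal sketch in Python, not something we run today. The helper names `probe` and `survey`, the host 127.0.0.1 and the port range are assumptions for illustration only; the real ports are whatever the cluster is configured with.

```python
import http.client
import urllib.error
import urllib.request


def probe(host, port, path="/", timeout=5):
    """Return the HTTP status code from one backend, or None if unreachable."""
    url = "http://%s:%d%s" % (host, port, path)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; keep the code.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, dropped connection, and so on.
        return None


def survey(ports, host="127.0.0.1", fetch=probe):
    """Group backend ports by outcome: ok, server_error, unreachable."""
    result = {"ok": [], "server_error": [], "unreachable": []}
    for port in ports:
        status = fetch(host, port)
        if status is None:
            result["unreachable"].append(port)
        elif status >= 500:
            result["server_error"].append(port)
        else:
            result["ok"].append(port)
    return result
```

Calling `survey(range(8000, 8040))` (a hypothetical port range for 40 instances) while the monitor shows instances in Err status would show whether those same ports are the ones returning 500s.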
The Apache config setup is standard, taken from Coda Hale's blog.
The Ruby on Rails code is straightforward. Very few gems are used, and the
memory footprint is only 35MB for each mongrel instance. I have not been
able to find any error messages in the production log. What types of things
would be major no-nos? I don't use sessions. The web application mainly
uses AJAX to coordinate messaging between a server and the client.
I have not been able to put the servers into debug mode for the past couple
of days since they are running in production at our colo, which is the only
place we've seen these application errors occur. Restarting them causes
chats to get dropped, as well as searches our users are performing.
The Linux kernel's TCP/IP settings have been tweaked to the following:
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.icmp_echo_ignore_broadcasts = 0
net.ipv4.inet_peer_threshold = 16536
net.ipv4.ipfrag_high_thresh = 5000000
net.ipv4.ipfrag_low_thresh = 3000000
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.eth0.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.core.netdev_max_backlog = 2500
net.core.optmem_max = 102400
net.core.rmem_default = 262141
net.core.rmem_max = 262141
net.core.wmem_default = 262141
net.core.wmem_max = 262141
net.ipv4.route.gc_interval = 5
net.ipv4.route.gc_elasticity = 3
net.ipv4.route.gc_min_interval = 1
net.ipv4.route.gc_timeout = 30
net.ipv4.route.max_size = 65536
net.ipv4.route.gc_thresh = 256
fs.file-max = 32768
net.ipv4.ip_local_port_range = 1024 65535
However, we were noticing the application 500 errors before these tweaks, and
removing these tweaks did not seem to make a difference.
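To rule out the possibility that the kernel is running with different values than the ones listed above, the list can be compared against the live values under /proc/sys. This is a minimal sketch in Python under that assumption; the helper names (`parse_sysctl`, `proc_path`, `read_live`, `mismatches`) are made up for illustration, and the key-to-path mapping simply replaces dots with slashes, which works for the keys shown here.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, normalizing whitespace."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key):
    """Map a sysctl key to its file under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


def read_live(key):
    """Read the running kernel's value, or None if it cannot be read."""
    try:
        with open(proc_path(key)) as handle:
            return " ".join(handle.read().split())
    except OSError:
        return None


def mismatches(wanted, read=read_live):
    """Return {key: (wanted, live)} for every setting that differs."""
    differences = {}
    for key, value in wanted.items():
        live = read(key)
        if live != value:
            differences[key] = (value, live)
    return differences
```

Feeding the settings list above into `parse_sysctl` and passing the result to `mismatches` on the production box would list any setting that is not in effect.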
Switching servers seems to resolve the problem for a day or so.
On 10/30/06, Zed A. Shaw <zedshaw at zedshaw.com> wrote:
>
> On Mon, 30 Oct 2006 10:02:29 -0500
> "Jared Brown" <mongrelmail at gmail.com> wrote:
>
> > Configuration:
> >
> > (2) Dual Core Opterons
> > 8GB RAM
> > Apache used to balance 40 mongrel instances
> >
> > We receive Application 500 Errors. Nothing suspect appears in the log,
> > so we are at a loss as to what to do next.
> >
> > Any advice would be welcome and/or an explanation of what types of
> > things cause Application 500 Errors in mongrel.
>
> Jared, you write software right? What would you do if someone came
> running into your office babbling about some kind of bug, but couldn't
> tell you important information you needed to know to fix the bug?
>
> As a programmer I expect other programmers to treat me as they want to be
> treated. If you ask a question on the list also report: the versions of
> your software, how you're using it, if you're doing anything odd,
> operating systems used, etc.
>
> C'mon, you know how to do this right.
>
> --
> Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
> http://www.zedshaw.com/
> http://safari.oreilly.com/0321483502 -- The Mongrel Book
> http://mongrel.rubyforge.org/
> http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.
> _______________________________________________
> Mongrel-users mailing list
> Mongrel-users at rubyforge.org
> http://rubyforge.org/mailman/listinfo/mongrel-users
>