Hi,

I made an interesting observation using webservers (not just Mongrel) under Red Hat Enterprise Linux ES release 4 (Nahant Update 5). Maybe this is helpful, or somebody with deeper networking expertise can comment on it.

One client said that 1-2% of the responses from our server were unacceptably slow (really huge: 3s-21s). So I ran more ab and httperf tests and noticed that a very few requests do take a very long time. Being clueless, I first thought that the Ruby garbage collector or Mongrel was causing the effect, but after looking at a similar setup using Erlang's yaws, and at nginx alone, I noticed that it can still happen, especially when increasing the number of concurrent connections to a large number (e.g. 250-500). I ran the same tests on OS X and did not see these outliers.

After a lot of painful searching we found one cure, by luck: inet_peer_threshold was too small. Changing it to a much larger value made the problem go away:

echo 500000 > /proc/sys/net/ipv4/inet_peer_threshold

There is a trade-off here. Too small a value causes too many delays from inet peer storage cleaning, while too large a value works well for a limited time, but when it does hit you, it becomes really expensive.

Did you ever see this?

Thanks,
-Armin
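For anyone who wants to try this, here is a minimal sketch of inspecting and persisting the setting, assuming a 2.6-era kernel such as RHEL 4's; the value 500000 is the one from the post above, not a general recommendation:

    # Inspect the current threshold of the kernel's inet peer storage
    cat /proc/sys/net/ipv4/inet_peer_threshold

    # Raise it for the running kernel, as in the post above (run as root)
    echo 500000 > /proc/sys/net/ipv4/inet_peer_threshold

    # Make the change survive a reboot via /etc/sysctl.conf
    echo "net.ipv4.inet_peer_threshold = 500000" >> /etc/sysctl.conf
    sysctl -p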
On Sep 23, 2007, at 02:30, armin roehrl wrote:

> I made an interesting observation using webservers (not just Mongrel)
> under Red Hat Enterprise Linux ES release 4 (Nahant Update 5).
> [snip]
> Did you ever see this?

Armin,

We might put this in the documentation, will discuss with the dev team.

Thank you for this.

~Wayne

s///g
Wayne E. Seguin
Sr. Systems Architect & Systems Administrator
On Sep 23, 2007, at 02:30, armin roehrl wrote:

> Did you ever see this?

We run RedShit EL 4 in production also, and yes, we have come across this issue. We never tracked it down until now, though (thank you :) ).

~Wayne

s///g
Wayne E. Seguin
Sr. Systems Architect & Systems Administrator
Armin,

Since you're goofing around with that, how about some of the other settings, like the maxtime and ttl values?

It seems like you get delays when some garbage-collection operations are taking place, so maybe tweaking those a little more will give you the performance you're looking for.

I've never used this feature, but thought it might be interesting as well.

Mike B.

Quoting "Wayne E. Seguin" <wayneeseguin at gmail.com>:

> On Sep 23, 2007, at 02:30, armin roehrl wrote:
>> After a lot of painful searching we found one cure, by luck:
>> inet_peer_threshold was too small. Changing it to a much larger
>> value made the problem go away:
>>
>> echo 500000 > /proc/sys/net/ipv4/inet_peer_threshold
>> [snip]
> We might put this in the documentation, will discuss with the dev team.
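The "maxtime and ttl values" above presumably refer to the other inet peer storage sysctls. A short sketch of those knobs on a 2.6 kernel follows, with the caveat that the units (jiffies on kernels of this era, per Documentation/networking/ip-sysctl.txt) should be verified against the running kernel:

    # Related inet peer storage tunables on a 2.6 kernel; the TTL and
    # gc_* values are measured in jiffies on kernels of this era --
    # verify against your kernel's ip-sysctl.txt before tuning.
    sysctl net.ipv4.inet_peer_minttl      # minimum time-to-live of a peer entry
    sysctl net.ipv4.inet_peer_maxttl      # maximum time-to-live of a peer entry
    sysctl net.ipv4.inet_peer_gc_mintime  # shortest interval between cleanup passes
    sysctl net.ipv4.inet_peer_gc_maxtime  # longest interval between cleanup passes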
On Sep 23, 2007, at 1:41 PM, barsalou wrote:

> Since you're goofing around with that, how about some of the other
> settings, like the maxtime and ttl values?
> [snip]

I've searched all over the place to confirm this issue with RHEL 4 Update 5 and have come up empty. What's the original source of the "fix"?

Also, any suggestions on how to build a test harness to confirm that new values actually *improve* the situation rather than make it worse?

cr
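One rough way to build such a harness, sketched below under the assumption that ab is available and that http://localhost:3000/ stands in for the real test URL; the candidate thresholds, request count, and concurrency are placeholders, and the box should be otherwise idle so the percentile tables are comparable:

    #!/bin/sh
    # Rough sketch: sweep inet_peer_threshold and compare tail latency
    # under load.  Run as root; the URL and the candidate values are
    # placeholders, not recommendations.
    URL="http://localhost:3000/"
    SYSCTL=/proc/sys/net/ipv4/inet_peer_threshold
    ORIG=`cat $SYSCTL`                 # remember the original value

    for t in 65536 131072 500000; do
        echo $t > $SYSCTL
        echo "== inet_peer_threshold = $t =="
        # 10000 requests at 250 concurrent connections; multi-second
        # outliers show up in the 98%/99%/100% rows of ab's table.
        ab -n 10000 -c 250 "$URL" 2>/dev/null \
            | grep -A 10 "Percentage of the requests"
    done

    echo $ORIG > $SYSCTL               # restore the original setting

Repeating each sweep a few times, and alternating the order of the candidate values, would help rule out warm-up effects before concluding that one setting is better than another.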