I posted about qrp (http://qrp.rubyforge.org/) many weeks ago, but only
deployed it to a live site a few weeks ago (after a few bug fixes leading
up to qrp v0.4.0). qrp was deployed late on 2008-03-27 to roughly half
our servers, and then fully deployed on 2008-03-28. So far, the results
have been fairly decent (see below).

The only change I needed to make to Mongrel was the following patch,
which silences logging that became excessive once I disabled concurrency
in Mongrel:

--- a/mongrel.rb	2008-03-03 16:42:04.000000000 -0800
+++ b/mongrel.rb	2008-04-17 15:30:57.313952784 -0700
@@ -210,7 +210,7 @@
     # after the reap is done. It only runs if there are workers to reap.
     def reap_dead_workers(reason='unknown')
       if @workers.list.length > 0
-        STDERR.puts "#{Time.now}: Reaping #{@workers.list.length} threads for slow workers because of '#{reason}'"
+        #STDERR.puts "#{Time.now}: Reaping #{@workers.list.length} threads for slow workers because of '#{reason}'"
         error_msg = "Mongrel timed out this thread: #{reason}"
         mark = Time.now
         @workers.list.each do |worker|
@@ -278,7 +278,7 @@
           worker_list = @workers.list
 
           if worker_list.length >= @num_processors
-            STDERR.puts "Server overloaded with #{worker_list.length} processors (#@num_processors max). Dropping connection."
+            #STDERR.puts "Server overloaded with #{worker_list.length} processors (#@num_processors max). Dropping connection."
             client.close rescue nil
             reap_dead_workers("max processors")
           else

As far as I know, we're the only Rails site running qrp in our
configuration, but it should be safe now that we're doing it ;)

Results:

While I don't have hard numbers for average response time and standard
deviation, they have not changed much since qrp was deployed. However,
our metric for requests taking over 10 seconds has improved greatly
since qrp was deployed[1]. The Date column is shifted by one day (so the
report that I received on 2008-03-02 was actually for the previous day's
traffic).

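For anyone who wants to track the same kind of metric, here is a minimal Ruby sketch of how a "% of requests taking >10s" figure can be computed from a list of response times. This is not the reporting code behind the numbers in this post; the method name `pct_over` and the sample durations are hypothetical and only there for illustration.

```ruby
# Hypothetical helper: percentage (0-100) of requests that took longer
# than `threshold` seconds. `durations` is an array of response times
# in seconds.
def pct_over(durations, threshold = 10.0)
  return 0.0 if durations.empty?
  slow = durations.count { |d| d > threshold }
  100.0 * slow / durations.size
end

# Made-up sample: 10,000 requests, 4 of them slower than 10 seconds.
sample = Array.new(9_996, 0.25) + [10.5, 12.0, 31.7, 60.0]
printf("%.4f\n", pct_over(sample))
```

With this made-up sample, 4 slow requests out of 10,000 come out to 0.04 on the 0-100 scale, which is the same order of magnitude as the daily figures reported here.
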
While a few hundredths of one percent doesn't sound like a lot, that's
still a fair number of unhappy users who get bogged down.

   Date    | % of requests taking >10s (0-100)
2008-03-01 | 0.1192 | *****************
2008-03-02 | 0.1537 | ***********************
2008-03-03 | 0.0634 | *********
2008-03-04 | 0.1094 | ****************
2008-03-05 | 0.1241 | ******************
2008-03-06 | 0.1075 | ****************
2008-03-07 | 0.1086 | ****************
2008-03-08 | 0.1664 | ************************
2008-03-09 | 0.1647 | ************************
2008-03-10 | 0.0705 | **********
2008-03-11 | 0.1190 | *****************
2008-03-12 | 0.1754 | **************************
2008-03-13 | 0.1202 | ******************
2008-03-14 | 0.1351 | ********************
2008-03-15 | 0.1463 | *********************
2008-03-16 | 0.1468 | **********************
2008-03-17 | 0.1425 | *********************
2008-03-18 | 0.1271 | *******************
2008-03-19 | 0.1260 | ******************
2008-03-20 | 0.1209 | ******************
2008-03-21 | 0.1438 | *********************
2008-03-23 | 0.1139 | *****************
2008-03-24 | 0.0916 | *************
2008-03-25 | 0.1469 | **********************
2008-03-26 | 0.1316 | *******************
2008-03-26 | 0.1323 | *******************
2008-03-27 | 0.1397 | ********************
2008-03-28 | 0.0927 | ************* <partial qrp deployment>
2008-03-29 | 0.0425 | ****** <full qrp deployment>
2008-03-30 | 0.0440 | ******
2008-03-31 | 0.0461 | ******
2008-04-01 | 0.0357 | *****
2008-04-02 | 0.0319 | ****
2008-04-03 | 0.0325 | ****
2008-04-04 | 0.0314 | ****
2008-04-05 | 0.0664 | *********
2008-04-05 | 0.0652 | *********
2008-04-06 | 0.0823 | ************
2008-04-07 | 0.0605 | *********
2008-04-08 | 0.0553 | ********
2008-04-09 | 0.0537 | ********
2008-04-10 | 0.1166 | ***************** <something broke this day>
2008-04-11 | 0.0512 | *******
2008-04-12 | 0.0546 | ********
2008-04-13 | 0.0619 | *********
2008-04-14 | 0.0519 | *******
2008-04-15 | 0.0421 | ******
2008-04-16 | 0.0441 | ******
2008-04-17 | 0.0409 | ******

We had some internal problems on 2008-04-10 so things went to hell that
day.

Once again, qrp is needed for a Rails site I work on because:

a) we unfortunately use a web service run by folks who suck at the
   Internet. Unfortunately the tech folks like myself have little
   control over this.

b) One of our internal backend services has some pathologically bad
   corner cases we occasionally hit. Eliminating them isn't possible
   due to strange business requirements (and some of the troublesome
   backend code is proprietary and we can't improve it).

[1] yes, I realize that saying that the number of >10s responses has
    dropped is like saying we've won the Special Olympics :)

-- 
Eric Wong