The deployment scenario...

Apache2 on shared host, proxying to lighttpd, which has 3 external fcgis running on localhost. The fcgis are managed by spinner/spawner.

We're noticing a definite speed issue on "first requests" to this site.

For example:
* Hit the site a few times, paying no attention to load time
* Wait x period of time (haven't quite narrowed this down yet, but probably 5-10 mins)
* Hit site again once - this request will take anywhere from 5 - 30 or so seconds
* Reload site a few times - these requests will be very quick - less than one second

These load times are reflected not just in the "feel" we get from using the site, but are confirmed by the production.log

What's odd is that the time seems to be inconsistent with where it happens. The DB portion of the time is always very very small, even on the "first request" long requests. The overall completed time, for example, might be 10 seconds. Sometimes the 'Rendering' component would be 6-7 seconds of that overall time, but sometimes it will be very small (under 1 second) and the other 8-9 seconds that aren't explained by either Rendering or DB time are lost to.....something?

So, basically...
* Has anyone seen this issue before and know what the problem is?
* Are there settings in any of apache2, lighttpd or rails itself that I'm unaware of which might cure this?

The app uses the Globalize plugin, but is otherwise pretty standard. We've tried most combinations of switching between rails 1.0 and rails 1.1.2, tweaking ActionController::Base.allow_concurrency (we were also getting the "dropped mysql conn" errors in dev mode...), tweaking ActionView::Base.cache_template_loading (thought that might be slowing views down?), and so on, all to no avail.

Thoughts?

-Matt
Michael Greenly
2006-Apr-12 20:12 UTC
[Rails] Re: Production deployment speed "wakeup" issue
Matt Jankowski wrote:
> Thoughts?
>
> -Matt

I ran across this in Apache's proxy docs:

If you're using the ProxyBlock directive, hostnames' IP addresses are looked up and cached during startup for later match test. This may take a few seconds (or more) depending on the speed with which the hostname lookups occur.

--
Posted via http://www.ruby-forum.com/.
Matt Jankowski wrote:
> The deployment scenario...
>
> Apache2 on shared host, proxying to lighttpd, which has 3 external fcgis
> running on localhost. The fcgis are managed by spinner/spawner.
>
> We're noticing a definite speed issue on "first requests" to this site.
>
> For example:
> * Hit the site a few times, paying no attention to load time
> * Wait x period of time (haven't quite narrowed this down yet, but
>   probably 5-10 mins)
> * Hit site again once - this request will take anywhere from 5 - 30 or
>   so seconds
> * Reload site a few times - these requests will be very quick - less
>   than one second

This has come up a number of times on the list. It may be that your sleeping fcgi processes are swapped out, and take time to be brought back to life. Various people have recommended using cron and wget (or curl) to request a dynamic page every few minutes to keep response times short.

> These load times are reflected not just in the "feel" we get from using
> the site, but are confirmed by the production.log
>
> What's odd is that the time seems to be inconsistent with where it
> happens. The DB portion of the time is always very very small, even on
> the "first request" long requests. The overall completed time, for
> example, might be 10 seconds. Sometimes the 'Rendering' component would
> be 6-7 seconds of that overall time, but sometimes it will be very small
> (under 1 second) and the other 8-9 seconds that aren't explained by
> either Rendering or DB time are lost to.....something?

The slow rendering is more puzzling than the "missing time" - Rails couldn't measure the time taken to swap a process back in.

regards

Justin
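For anyone who wants to try the cron suggestion above from Ruby rather than wget or curl, here is a minimal sketch of a one-shot script that requests a page once and reports how long it took. The method name `timed_ping` and the injectable `fetcher` argument are made up for this illustration (the lambda only exists so the timing logic can be tried without a network); nothing here comes from the original posts.

```ruby
require 'net/http'
require 'uri'
require 'benchmark'

# Hypothetical helper: fetch `url` once and return [status, seconds].
# The default fetcher does a plain GET and returns the HTTP status code
# as a string; pass a different lambda to exercise the timing offline.
def timed_ping(url, fetcher = ->(u) { Net::HTTP.get_response(URI.parse(u)).code })
  status = nil
  seconds = Benchmark.realtime { status = fetcher.call(url) }
  [status, seconds]
rescue StandardError => e
  # Connection refused, DNS failure, timeouts and similar end up here.
  ["error: #{e.class}", nil]
end

# When run directly with a URL argument, print one log line and exit,
# so that cron (not the script) decides how often the site is hit.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  status, seconds = timed_ping(ARGV[0])
  elapsed = seconds ? format('%.2fs', seconds) : '-'
  puts "#{Time.now} #{ARGV[0]} #{status} #{elapsed}"
end
```

Saved under some name of your choosing and scheduled from a crontab entry with a `*/5 * * * *` time field, this would request the page every five minutes. Whether that actually keeps the fcgi processes from being swapped out depends on the host, so treat the logged times as a measurement rather than a guarantee.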
Matt Jankowski wrote:
>
> Thoughts?

No thoughts, but here's a hack:-)

I was seeing this problem (shared host, running Apache/fcgi), and the occasional long connect times made me think my fcgi dispatcher was getting swapped out. So I added this to one of my controllers:

    def ping
      render :text => "<html><head></head><body>Ping!</body></html>"
    end

And I run this script on my desktop machine:

---
require 'open-uri'

# Fetch url and return the body; on failure, report the error and
# return an empty string.
def pingit(url)
  stuff = ''
  begin
    open(url) do |f|
      stuff = f.read
    end
  rescue Exception => e
    puts("#{e} #{e.to_s} in #{url}\n")
  end
  stuff
end

# Hit the URL given on the command line every ten minutes.
while true
  puts Time.new
  s = pingit(ARGV[0])
  # puts s
  sleep(600)
end
----

This way I can say "./Pingit.rb http://domain/controller/ping" and it will hit the site every ten minutes, showing me any errors or failures to connect.

It seems to work fairly well -- site responds pretty consistently in a second or two -- but this is a totally heuristic approach.

--Al Evans
> * Hit the site a few times, paying no attention to load time
> * Wait x period of time (haven't quite narrowed this down yet, but
>   probably 5-10 mins)
> * Hit site again once - this request will take anywhere from 5 - 30
>   or so seconds

i can top that. with lighttpd, don't hit the site for a few hours, then the next request is a response 500. press F5, and then it's fine. nothing decidedly interesting in the logs other than the fastcgi process decided to disappear.

i'm going to try mongrel when getting around to deploying..
mjankowski@unicorngroomers.com
2006-Jun-07 08:44 UTC
[Rails] Production deployment speed "wakeup" issue
Just following up on my own post from a while back, with a report on how the issue resolved itself.

Biggest issues we found

* HUGE problem - the linux kernel which the machine was running was a release from the 2.4 series which had big VM/swap issues. This machine - which had been running a J2EE app with decent speeds and under very little load for the past year - was recently repurposed to do hosting for a few rails applications. I have no idea why the rails apps brought out the demons that the J2EE app had not, but they did. Moral of the story - make sure your kernel release is up to date.

* Remember to index your DB! Maybe it's because I'm thinking in terms of models and not in terms of DB tables/rows, but I consistently forget to add indexes to my tables while using migrations to create the DB. Needless to say, going back in and indexing frequently used associations provided a HUGE speedup for the application.

* Lighttpd / Apache issue - we found a strange condition with apache proxying back to lighty where, on certain requests (usually asset files - js, css, images, etc) that were over ~20k in size, we'd get a lockup between apache/lighty. With a high timeout on the proxy, this leads to a scenario where the browser has the entire HTML page, but it's waiting on some assets to render, and sits there until the proxy has timed out. We've since switched to Apache2.2.2/mod_proxy_balancer/mongrel, and aren't particularly interested in tracking down what the actual issue here is.

So, in conclusion, the mongrel/apache/proxy_balancer setup (along with rewrite rules in apache to serve static requests) is absolutely great, and easy to manage with capistrano and mongrel cluster. With the kernel fix, the removal of lighty, and the db indexing, the application is much much quicker and the "first page wait" issue is essentially gone.
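On the "I consistently forget to add indexes" point, one way to catch missed indexes is to scan the schema dump for foreign-key columns that no index starts with. What follows is only a heuristic sketch under stated assumptions: it assumes a db/schema.rb-style dump with top-level `add_index` lines, and the Rails naming convention that association columns end in `_id`. The method name `unindexed_foreign_keys` and the sample schema are invented for the example.

```ruby
# Heuristic sketch: return "table.column" for every *_id column in a
# schema.rb-style dump that is not the leading column of an add_index line.
# Assumes Rails naming conventions; polymorphic or oddly named keys are missed.
def unindexed_foreign_keys(schema_text)
  columns = []  # [table, column] pairs whose column name ends in _id
  indexed = {}  # "table.column" => true when it leads some index
  table = nil

  schema_text.each_line do |line|
    if line =~ /create_table\s+"(\w+)"/
      table = $1
    elsif table && line =~ /t\.\w+\s+"(\w+_id)"/
      # Matches both t.column "user_id", :integer and t.integer "user_id"
      columns << [table, $1]
    elsif line =~ /add_index\s+"(\w+)",\s*\["(\w+)"/
      indexed["#{$1}.#{$2}"] = true
    end
  end

  columns.map { |tbl, col| "#{tbl}.#{col}" }.reject { |key| indexed[key] }
end

# Made-up schema for illustration: post_id is indexed, user_id is not.
SAMPLE_SCHEMA = <<~SCHEMA
  create_table "comments", :force => true do |t|
    t.column "post_id", :integer
    t.column "user_id", :integer
    t.column "body", :text
  end

  add_index "comments", ["post_id"], :name => "index_comments_on_post_id"
SCHEMA

puts unindexed_foreign_keys(SAMPLE_SCHEMA)
```

For the sample above this prints `comments.user_id`. Any column it reports would then get an `add_index` call in a new migration. Whether a given index actually pays off still depends on the queries the application runs, so the output is a list of candidates, not a verdict.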