ehansen486
2010-Oct-11 18:44 UTC
streaming a large XML file; optimizing large file downloads in RAILS
I occasionally need to stream a large XML data file that represents key data in a DB. I'm porting an application over from PHP Symfony, and with my initial implementation it takes around 7 times as long with Rails. Also, with Symfony, data begins to download almost as soon as I invoke the URL, whereas with Rails all data is processed on the server side before the client gets the first byte.

I have a hand-crafted query to hit the DB, and I use fetch_hash to get the raw data from the mysql gem; that renders extremely quickly. I've also tried writing a tiny subset of the XML while reading the entire result set; with that I get much faster performance, but of course that way the full XML doesn't come out.

I spent most of the past weekend trying to determine how to optimize this (hoping to do at least as well as PHP Symfony) but can't do it. I tried:

- render :text => (lambda do |response, output| ... )
- Ruby 1.8.7 vs. Ruby 1.9.2
- Rails 2.3.5 vs. Rails 3
- XmlBuilder vs. Nokogiri::XML::Builder
- HAML vs. ERB
- Passenger vs. script/server

Nothing honestly moved the performance needle in a serious way.

I've finally come to the conclusion that Rails does not stream out as I'd expect. Here's a look at the perf stats rendered as the request runs:

    Rendered hgrants/_request_detail (2.2ms)
    Rendered hgrants/_request_detail (3.9ms)
    Rendered hgrants/_request_detail (2.4ms)
    Rendered hgrants/_request_detail (2.3ms)
    Rendered hgrants/_request_detail (242.7ms)
    Rendered hgrants/_request_detail (2.2ms)
    Rendered hgrants/_request_detail (1.9ms)
    Rendered hgrants/_request_detail (1.8ms)

We went from an average of 2ms up to 242ms, then back down. I saw this sporadically throughout the 1000 template renderings. That suggests to me that memory is getting garbage collected. Also, I'm invoking the request from curl, and it reports no data downloaded until after my logfile tells me Rails has finished processing all records in the view.

The model IDs that result in the over-sized ms count vary from one request to another, so I'm convinced there is nothing in the app that is doing this. I even tested this by removing the call to the HAML template and replacing it with a block of generated text, and observed similar behavior.

This is how I'm invoking HAML from the XML Builder template:

    xml << render(:partial => 'hgrants/request_detail.html.haml', :locals => { :model => model })

I also tried using this trick to try to get it to stream, but I observed exactly the same behavior; no data showed up in curl until all records had been processed.

    render :text => (lambda do |response, output|
      extend ApplicationHelper

      xml = Builder::XmlMarkup.new(
        :target => StreamingOutputWrapper.new(output),
        :indent => 2)
      eval(default_template.source, binding, default_template.path)
    end)

(Also, in Rails 3, when render :text is given a Proc, Rails renders the Proc as a string (to_str) rather than calling it.)

This particular issue I can certainly work around, but it's disappointing if it's true that there's no way in Rails to stream output to the browser for large pages. And particularly disappointing if PHP/Symfony can outgun Rails for streaming. I've been using Rails since 2006 and most requests have fairly small responses, so maybe the answer is to defer to a different technology for streaming larger files. But it seems like there should be a good solution for streaming data and flushing the output stream.

Any help is greatly appreciated!
Eric

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.
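The StreamingOutputWrapper class used in the render :text trick above is not shown in the post. As a rough sketch only, assuming it follows the usual shape of that trick: Builder::XmlMarkup requires its :target to respond to << and return itself, so a wrapper can forward each fragment to the output object's write method. The body below is hypothetical and is not the poster's actual class.

```ruby
require "stringio"

# Hypothetical sketch: adapts an object that only has #write (such as the
# `output` handed to the render :text lambda) to the `<<` interface that
# Builder::XmlMarkup expects from its :target.
class StreamingOutputWrapper
  def initialize(output)
    @output = output
  end

  # Forward every fragment as soon as the builder emits it, and return
  # self so that calls can be chained (target << "a" << "b").
  def <<(data)
    @output.write(data)
    self
  end
end

# Stand-alone demonstration with StringIO in place of the response output.
io = StringIO.new
target = StreamingOutputWrapper.new(io)
target << "<request>" << "42" << "</request>"
puts io.string
```

A wrapper like this only hands fragments to the next layer promptly; whether they reach the client before the action finishes still depends on Rails and the server not buffering them.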
Marnen Laibow-Koser
2010-Oct-11 20:59 UTC
Re: streaming a large XML file; optimizing large file downloads in RAILS
On Oct 11, 2:44 pm, ehansen486 <ehansen...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I occasionally need to stream a large XML data file that represents
> key data in a DB. I'm porting over an application from PHP Symfony,
> and with my initial implementation, it takes around 7 times as long
> with rails.
[...]
> I've finally come to the conclusion that rails does not stream out as
> I'd expect.
[...]

Have you tried send_data? I think that's what most people use to stream dynamic content.

Alternatively, how does Symfony do its streaming? Can you write something equivalent for Rails?

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org
Frederick Cheung
2010-Oct-11 21:19 UTC
Re: streaming a large XML file; optimizing large file downloads in RAILS
On Oct 11, 7:44 pm, ehansen486 <ehansen...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Nothing honestly moved the performance needle in a serious way.
>
> I've finally come to the conclusion that rails does not stream out as
> I'd expect. Here's a look at the perf stats rendered as the request
> runs:

It doesn't. Rails 3.1 will apparently change some of that (http://yehudakatz.com/2010/09/07/automatic-flushing-the-rails-3-1-plan/).

If you drop down to the Rack level (i.e. write this as a Rails Metal) you should be able to stream responses: the Rack response body can be anything that responds to each, and every string that each yields is written out in turn until you're done.

The docs also say that render :text => lambda { ... } allows streaming, but there are conflicting reports from actual users (I've never tried it). This may also depend on the server you use (Mongrel, Thin, etc.): it's no good streaming data to Rack if the next link down the chain sits on it until it is done.

Fred
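The Rack-level approach described above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the class name XmlRowStream, the sample rows, and the element names are all hypothetical, and the rows array stands in for a database cursor.

```ruby
# Hypothetical response body. Rack only requires that the body respond to
# #each and yield Strings, so the XML can be produced one row at a time
# instead of being built up in memory first.
class XmlRowStream
  def initialize(rows)
    @rows = rows # any Enumerable of hashes; stand-in for a DB result set
  end

  def each
    yield "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<requests>\n"
    @rows.each do |row|
      yield "  <request id=\"#{row[:id]}\">#{escape(row[:name])}</request>\n"
    end
    yield "</requests>\n"
  end

  private

  # Minimal escaping for element text.
  def escape(text)
    text.to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
  end
end

# A Rack application is any object that responds to #call(env) and returns
# [status, headers, body]. (Rack 3 expects lower-case header names.)
StreamingApp = lambda do |env|
  rows = [{ :id => 1, :name => "Alpha" }, { :id => 2, :name => "B & C" }]
  [200, { "Content-Type" => "application/xml" }, XmlRowStream.new(rows)]
end

# Stand-alone demonstration: iterate the body the way a server would.
_status, _headers, body = StreamingApp.call({})
body.each { |chunk| print chunk }
```

As noted above, this only helps if the server and any proxy in front of it pass each chunk along instead of buffering the whole body.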
ehansen486
2010-Oct-12 16:04 UTC
Re: streaming a large XML file; optimizing large file downloads in RAILS
Hi Fred-

What you're saying makes a lot of sense. As the automatic-flushing-the-rails-3-1-plan article you linked relates, for most Rails interactions it's difficult to stream because of all the evaluation that needs to occur. Larger file downloads really are a special case.

Using Rails Metal to respond seems logical. When I get a moment I'll create a brand-new Rails app and see if I can get Rails to stream as I'd expect; perhaps there is something in Rack that is preventing the streaming.

In Rails 3, render :text => lambda { ... } is definitely broken.

Thanks for the help!
Eric
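For an experiment like the fresh-app test mentioned above, one way to check from the client side whether the body arrives incrementally is to timestamp each piece as it is read. The following is a minimal sketch using Ruby's standard Net::HTTP; the helper name chunk_arrival_times and the URL in the usage note are hypothetical.

```ruby
require "net/http"
require "uri"

# Returns an array of [seconds_since_request_start, bytes] pairs, one per
# piece of the body as Net::HTTP hands it over. The pieces are whatever
# was read from the socket, which need not match the server's writes.
def chunk_arrival_times(uri)
  started = Time.now
  arrivals = []
  Net::HTTP.start(uri.host, uri.port) do |http|
    # With a block, request_get yields the response before the body has
    # been read, so read_body can consume it piece by piece.
    http.request_get(uri.request_uri) do |response|
      response.read_body do |chunk|
        arrivals << [Time.now - started, chunk.bytesize]
      end
    end
  end
  arrivals
end
```

Calling it as chunk_arrival_times(URI("http://localhost:3000/hgrants.xml")) (an assumed URL) and looking at the first few pairs gives a rough picture: if every timestamp clusters near the total request time, something between the view and the socket is buffering the whole response.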
Claudio Poli
2010-Oct-13 19:58 UTC
Re: streaming a large XML file; optimizing large file downloads in RAILS
On 11 Oct, 20:44, ehansen486 <ehansen...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> I occasionally need to stream a large XML data file that represents
> key data in a DB. I'm porting over an application from PHP Symfony,
[...]
> This particular issue I can certainly work around but it's
> disappointing if it's true that there's no way in rails to stream
> output to the browser for large pages. And particularly disappointing
> if PHP/Symfony can outgun rails for streaming. I've been using rails
> since 2006 and most requests have fairly small responses so maybe the
> answer is to defer to a different technology for streaming larger
> files. But it seems like there should be a good solution for
> streaming data and flushing the output stream.

I'm in the same boat on Rails 2-3-stable. output.flush is said to be deprecated and no longer works, but it seems that render :text => proc { |response, output| ... } doesn't send streamed data at all. I also tried send_data, without luck. After some research I thought the flush would happen after an output.write, but that does not seem to be the case, at least where I looked.

We have potentially very large Ajax requests (3 MB), and on a Java server we were able to cut the action time down greatly by manipulating the response; I'm trying to achieve the same with Rails, but nothing I have tried so far works.
Marnen Laibow-Koser
2010-Oct-13 20:04 UTC
Re: streaming a large XML file; optimizing large file downloads in RAILS
Claudio Poli wrote in post #949941:
[...]
> We have potentially very large ajax requests (3mb)

It sounds like Rails' streaming needs to improve, but a 3MB Ajax request is a huge design problem! For performance reasons, it should rarely be necessary to request more than 100K or so through Ajax.

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen-sbuyVjPbboAdnm+yROfE0A@public.gmane.org

--
Posted via http://www.ruby-forum.com/.
Robert Walker
2010-Oct-13 20:23 UTC
Re: streaming a large XML file; optimizing large file downloads in RAILS
ehansen486 wrote in post #949547:
> In rails 3, the render :text => lambda { ... } is definitely broken.

I suppose then it might not be a bad idea to submit a documentation patch to either remove this tip or note that it is broken in Rails 3.0. The documentation in question:

    send_data ...
    ...
    Tip: if you want to stream large amounts of on-the-fly generated
    data to the browser, then use render :text => proc { ... } instead.
    See ActionController::Base#render for more information.