I'm looking at writing a Mongrel handler that mimics the behavior of the Apache mod_put [1]. It allows for the streaming upload of very large (GB) files; it also supports resumable upload.

Before I get too involved, I'd like to ask if my reading of the mongrel source code is correct, i.e. what I want to do isn't currently possible.

Looking at the class HttpRequest I see that request bodies larger than Mongrel::MAX_BODY get streamed to a tempfile before they are handed off to any handlers registered in the chain. Things were looking up for me when I saw the HttpHandler#request_progress call, but it doesn't hand the body off to the handler (only the params). In other words, mongrel would need to stream the entire upload to a tempfile before it hands off to my "PUT handler," right?

If that's the case, any suggestions on how I could hack the code to hook into the HttpRequest class and redirect the write from "tempfile" to the real permanent file? This code path would only need to be activated for HTTP PUT where Content-Length is greater than Mongrel::MAX_BODY. Of course, I'd like that hack to allow all other requests to work normally.

Thanks for any input and pointers.

If you think I'm nuts for looking at Mongrel to do this operation, suggest another method that doesn't involve Apache and mod_put.

cr

[1] http://www.gknw.at/development/apache/httpd-2.0/unix/modules/mod_put-2.0.8.tar.gz
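For readers following along, the buffering behavior described above can be illustrated with a small sketch. This is not Mongrel's actual code: `MAX_BODY`, `CHUNK_SIZE`, and `buffer_body` are hypothetical names with tiny demo values, and a `StringIO` stands in for the client socket. It only shows the general pattern in which a large body is spooled to a tempfile in full before any handler would see it.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical threshold and chunk size, chosen small for the demo.
# Mongrel's real values differ.
MAX_BODY = 16
CHUNK_SIZE = 8

# Reads content_length bytes from socket and returns an IO holding the body.
# Small bodies stay in memory (StringIO); large ones go to a Tempfile.
def buffer_body(socket, content_length)
  body = content_length > MAX_BODY ? Tempfile.new('upload') : StringIO.new
  body.binmode
  remaining = content_length
  while remaining > 0
    chunk = socket.read([CHUNK_SIZE, remaining].min)
    break if chunk.nil? || chunk.empty? # client went away early
    body.write(chunk)
    remaining -= chunk.bytesize
  end
  body.rewind
  body # only at this point would a handler receive the body
end

small = buffer_body(StringIO.new('hello'), 5)
large = buffer_body(StringIO.new('x' * 40), 40)
```

In this sketch a handler has no chance to choose the destination file, which is the limitation the question is about.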
On Nov 26, 2006, at 1:17 PM, cremes.devlist at mac.com wrote:

> I'm looking at writing a Mongrel handler that mimics the behavior of
> the Apache mod_put [1]. It allows for the streaming upload of very
> large (GB) files; it also supports resumable upload.
>
> Before I get too involved, I'd like to ask if my reading of the
> mongrel source code is correct, i.e. what I want to do isn't
> currently possible.
>
> Looking at the class HttpRequest I see that request bodies larger
> than Mongrel::MAX_BODY get streamed to a tempfile before they are
> handed off to any handlers registered in the chain. Things were
> looking up for me when I saw the HttpHandler#request_progress call,
> but it doesn't hand the body off to the handler (only the params). In
> other words, mongrel would need to stream the entire upload to a
> tempfile before it hands off to my "PUT handler," right?
>
> If that's the case, any suggestions on how I could hack the code to
> hook into the HttpRequest class and redirect the write from
> "tempfile" to the real permanent file? This code path would only need
> to be activated for HTTP PUT where Content-Length is greater than
> Mongrel::MAX_BODY. Of course, I'd like that hack to allow all other
> requests to work normally.
>
> Thanks for any input and pointers.
>
> If you think I'm nuts for looking at Mongrel to do this operation,
> suggest another method that doesn't involve Apache and mod_put.

You are correct. The way Mongrel upload progress works is that Mongrel itself streams the body to a tmpfile before it passes the request to any handlers. Then, by the time your handler gets the request, it has to parse the MIME boundaries again, which is inefficient. Zed is working on a fast C MIME carver, and I am going to implement a way to grab the tmpfile Mongrel makes without reparsing it, but we haven't made progress yet.

So let me know if you end up working on this or if you want to work with me to come up with a way to get the first tmpfile without reparsing.
Cheers-
-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)
On Nov 26, 2006, at 4:14 PM, Ezra Zygmuntowicz wrote:

> On Nov 26, 2006, at 1:17 PM, cremes.devlist at mac.com wrote:
>
>> I'm looking at writing a Mongrel handler that mimics the behavior of
>> the Apache mod_put [1]. It allows for the streaming upload of very
>> large (GB) files; it also supports resumable upload.
>>
>> Before I get too involved, I'd like to ask if my reading of the
>> mongrel source code is correct, i.e. what I want to do isn't
>> currently possible.
>> [snip]
>
> You are correct. The way mongrel upload progress works is mongrel
> itself streams to a tmpfile before it passes the request to any
> handlers. Then by the time your handler gets the request it has to
> parse the mime boundaries again which is inefficient. Zed is working
> on a fast C mime carver and I am going to implement a way to grab the
> tmpfile mongrel makes without reparsing it but we haven't made
> progress yet.

Ezra,

I don't see where Mongrel is doing any MIME parsing in the HttpRequest class. As far as I can see, it's just dumping the contents of @params.http_body to a TempFile without any intermediate parsing.

> So let me know if you end up working on this or if you want to work
> with me to come up with a way to get the first tmpfile without
> reparsing.

Sure, I'd like to work with you on this. Either I get this working in Mongrel or I bite the bullet and use Apache/mod_put (which, in my particular circumstances, will require me to bend all sorts of other rules).

What ideas have you developed for grabbing the TempFile?

From a cursory look at the code, here's my first idea. Feel free to poke holes in it.

1. Modify the HttpRequest class to stop reading from @params.http_body into @body at Mongrel::MAX_BODY bytes. Set an internal flag on this request to indicate that the request is unfinished.
2. Add #done? to the HttpRequest class to test the internal flag and return its true/false state.
3. Modify HttpServer#process_client in the following ways:
   3a. Do not allocate an HttpResponse object unless request.done?
   3b. Do not finalize the response object until request.done?

This would have the effect of calling handlers with a nil 'response' and a potentially unfinalized 'request', which would probably break all current handlers. :-(

Alternately, create a second chain of handlers which register for 'partial requests', and only call the other handler chain when the request and response objects are whole/finalized.

We avoid the use of TempFiles altogether with this approach and allow this new class of handler to be the first to hit the disk. If the TempFile behavior is necessary for other handlers' proper operation, Mongrel could ship with a handler that restores this functionality as the default.

cr
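The proposal above (a #done? flag plus a second chain of 'partial request' handlers) might look roughly like the following sketch. Everything here is hypothetical and stands apart from Mongrel: `SketchRequest`, `StreamingPutHandler`, `process_client`, and the tiny `MAX_BODY` value are illustration names only, a `StringIO` plays the socket, and response handling (steps 3a and 3b) is left out.

```ruby
require 'stringio'
require 'tmpdir'

MAX_BODY = 16 # hypothetical threshold, standing in for Mongrel::MAX_BODY

# Toy request: hands out at most MAX_BODY bytes per read and tracks completion.
class SketchRequest
  attr_reader :http_method, :content_length

  def initialize(http_method, content_length, socket)
    @http_method = http_method
    @content_length = content_length
    @socket = socket
    @received = 0
  end

  # Step 2: true once the whole body has been consumed.
  def done?
    @received >= @content_length
  end

  # Step 1: never pull more than MAX_BODY bytes into memory at once.
  def read_chunk
    return nil if done?
    chunk = @socket.read([MAX_BODY, @content_length - @received].min)
    return nil if chunk.nil? || chunk.empty?
    @received += chunk.bytesize
    chunk
  end
end

# A 'partial request' handler: it is the first code to touch the disk.
class StreamingPutHandler
  def initialize(path)
    @path = path
  end

  def wants?(request)
    request.http_method == 'PUT' && request.content_length > MAX_BODY
  end

  def process_partial(request)
    File.open(@path, 'wb') do |out|
      while (chunk = request.read_chunk)
        out.write(chunk)
      end
    end
  end
end

# Step 3, simplified: partial handlers get first refusal; the regular chain
# runs only once the request is whole.
def process_client(request, partial_handlers, regular_chain)
  handler = partial_handlers.find { |h| h.wants?(request) }
  if handler
    handler.process_partial(request)
    body = nil # already written to its final location
  else
    body = +''
    while (chunk = request.read_chunk)
      body << chunk
    end
  end
  regular_chain.call(request, body) if request.done?
end

target = File.join(Dir.mktmpdir, 'upload.bin')
calls = []
chain = ->(req, body) { calls << [req.http_method, body] }
handlers = [StreamingPutHandler.new(target)]

payload = 'a' * 100
process_client(SketchRequest.new('PUT', payload.bytesize, StringIO.new(payload)), handlers, chain)
process_client(SketchRequest.new('GET', 5, StringIO.new('hello')), handlers, chain)
```

The large PUT goes straight to `target` without a tempfile, while the small request still reaches the regular chain with its body in memory. Resumable uploads would need an append or offset write instead of the truncating 'wb' mode used here.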