hi all, I want to move some routing tasks out of the router and into the controller. The goal is to make Merb feel less like mod_rewrite and give the user more control at the controller. The new Router is simple: it takes the path_info (not the whole request) then outputs a controller class and some parameters from the path matching. The rest of the routing would be done at the controller level. The controller is ultimately in charge of determining which action to call. In particular, only the controller gets to see the http method. Because resource controllers and normal controllers have very different routing rules, I think they should be different classes. I''d split them into two classes: Merb::Controllers::Standard and Merb::Controllers::Resource. These would both be subclasses of Controller, and would implement the method Controller#route Users would create controllers by subclassing one of these two. Some cute things can be done by letting the controller have control of action routing. For example, in the standard controller we could use punctuation to declare that an action receives a POST: def update! # POST controller/update def update # GET controller/update def index # GET controller/index The standard controller can safely ignore _method (for those of us who do not want emulated PUT and DELETE methods) We can put custom route rules nearer to the actions they effect instead of in a separate file. add_route ''test/:blah'', :index # assigns params[:blah] and routes to test action def test end Duane''s popular regular expression routing can be implemented in an add_route-like controller class method, say, match_route. In the end this will be a tighter, more readable, and more natural API than the current. Additionally this opportunity can be used to rid Merb of the increasingly complex @compiled_statement construction in route.rb and rely instead on a constructed data structure to match against. I suspect performance can be increased. This is a largish change and I would like to create it in an experimental branch to work on it. Thoughts? ry
Hi~ On Sep 17, 2007, at 6:14 PM, ry dahl wrote:> hi all, > > I want to move some routing tasks out of the router and into the > controller. The goal is to make Merb feel less like mod_rewrite and > give the user more control at the controller. The new Router is > simple: it takes the path_info (not the whole request) then outputs a > controller class and some parameters from the path matching. The rest > of the routing would be done at the controller level. The controller > is ultimately in charge of determining which action to call. In > particular, only the controller gets to see the http method. > > Because resource controllers and normal controllers have very > different routing rules, I think they should be different classes. I''d > split them into two classes: Merb::Controllers::Standard and > Merb::Controllers::Resource. These would both be subclasses of > Controller, and would implement the method Controller#route > Users would create controllers by subclassing one of these two. > > Some cute things can be done by letting the controller have control of > action routing. > > For example, in the standard controller we could use punctuation to > declare that an action receives a POST: > def update! # POST controller/update > def update # GET controller/update > def index # GET controller/index > > The standard controller can safely ignore _method (for those of us who > do not want emulated PUT and DELETE methods) > > We can put custom route rules nearer to the actions they effect > instead of in a separate file. > add_route ''test/:blah'', :index # assigns params[:blah] and routes > to test action > def test > end > Duane''s popular regular expression routing can be implemented in an > add_route-like controller class method, say, match_route. > > In the end this will be a tighter, more readable, and more natural API > than the current. Additionally this opportunity can be used to rid > Merb of the increasingly complex @compiled_statement construction in > route.rb and rely instead on a constructed data structure to match > against. I suspect performance can be increased. > > This is a largish change and I would like to create it in an > experimental branch to work on it. > > Thoughts? > > ryI''m not convinced that this would make things better. You''re welcome to work on a branch but I''m not going to merge it to trunk unless I think it''s a big improvement and I''m not seeing why this way would be better. I also don''t like the idea of two different types of controllers that are exactly the same and inherit from the same superclass but are different only in how they are routed to. I don''t want to discourage you from trying this out in a branch so go for it, but I have to reserve judgement until I see it or you show me a more complete mockup API. Cheers- -- Ezra Zygmuntowicz -- Founder & Ruby Hacker -- ez at engineyard.com -- Engine Yard, Serious Rails Hosting -- (866) 518-YARD (9273)
> I''m not going to merge it to trunk unless I > think it''s a big improvement and I''m not seeing why this way would be > better.Noted :) New branch created at /branches/ry_routes> I also don''t like the idea of two different types of > controllers that are exactly the same and inherit from the same > superclass but are different only in how they are routed to.Perhaps this distinction doesn''t need to be implemented as split classes, however I think from a user perspective it makes sense to think of resource controllers as fundamentally different than standard controllers - even if they share 99% of the code. ry
On Tue, Sep 18, 2007 at 03:14:14AM +0200, ry dahl wrote:> Some cute things can be done by letting the controller have control of > action routing. > > For example, in the standard controller we could use punctuation to > declare that an action receives a POST: > def update! # POST controller/update > def update # GET controller/update > def index # GET controller/indexI don''t understand why this can''t be done with "normal" routing, e.g. (untested) r.match(%r[/:controller(/::(/:id)?)?(\.:format)?], :method=>"post"). to(:action=>"[1]!") This is just a variant on default_routes> We can put custom route rules nearer to the actions they effect > instead of in a separate file. > add_route ''test/:blah'', :index # assigns params[:blah] and routes > to test action > def test > endIf the idea is to keep the route definition near to the controller, maybe it would be simpler just to allow the router to be re-opened in each controller source file (if it isn''t already), e.g. Merb::Router.prepare do |r| r.match("/whatever/test/:blah").to(:test) end class Whatever < Application def test end end However, the benefit of putting all the routing logic in one place is it makes it clear the order in which testing is made, to resolve ambiguities between routes.> In the end this will be a tighter, more readable, and more natural API > than the current. Additionally this opportunity can be used to rid > Merb of the increasingly complex @compiled_statement construction in > route.rb and rely instead on a constructed data structure to match > against. I suspect performance can be increased.I agree that the current linear search, i.e. a big if / elsif tree, is far from optimal. However I think it can be improved without moving the logic out of centralised routing. If I understand you correctly, you''re saying that all matches will be initially made against a fixed pattern /:controller/.*, and then the request is dispatched to the controller which performs its own sub-matching as required. By dispatching directly to a controller first, you avoid the linear search, which is a potentially a factor of N improvement when you have N controllers. Well, you could get the same benefit in the centralised routing model, by imposing an equivalent rule that all path regexps must start with /\w+. Bundle all the path routing rules into bins, perform a match on /(\w+), then perform a linear search under bins[$1]. That is, instead of having one @@compiled_statement, have multiple @@compiled_statements indexed by the first path component. (So it doesn''t make the complexity any less, I''m afraid!) However I suspect this can be generalised. One way I''m thinking is to combine all the path regexps into one mega path matching regexp, and let the regexp engine take care of it. You would group together all routing rules which have the same path regexp, and perform a linear search afterwards for the non-path tests. e.g. r.match("/admin/:action").to(www) r.match("/:flurble/:action", :method=>"delete").to(xxx) r.match("/:controller/:action", :method=>"post").to(yyy) r.match("/:controller/:action").to(zzz) The last three rules all match path /([^\/.,;?]+)/([^\/.,;?]+), and so you would compile to something like %r{ \A (/admin/([^\/.,;?]+)) | (/([^\/.,;?]+)/([^\/.,;?]+)) \z }x In the case of the first branch matching you''d dispatch to www, and in the case of the second branch matching you''d run compiled (pseudocode) if params[:method] == ''delete'' # set params[:flurble] = $1, params[:action] = $1 to(xxx) elsif params[:method] == ''post'' # set params[:controller] = $1, params[:action] = $1 to(yyy) else # set params[:controller] = $1, params[:action] = $1 to(zzz) end However I''m not sure if there''s a cheap way to identify which branch of a regexp has matched. If you have to do if $1 ... elsif $2 ... elsif $3 ... end then much of the benefit has been lost. And you also need to be sure of how ambiguous matching is resolved, e.g. in the above example /admin/foo matches the first regexp branch in preference to the second. Regards, Brian.
On Tue, Sep 18, 2007 at 12:01:07PM +0100, Brian Candler wrote:> However I suspect this can be generalised. One way I''m thinking is to > combine all the path regexps into one mega path matching regexp, and let the > regexp engine take care of it....> However I''m not sure if there''s a cheap way to identify which branch of a > regexp has matched.Here''s an outline of how I think it could be done. --- 8< --------------------------------------------------------------------- # Part one: show how it works manually RE = %r{\A(?: (?> ()/admin/(\w+)) | (?> ()/(\w+)/(\w+)) )\z}x puts RE.inspect ACTIONS = { 1 => proc { |m| puts "www: #{m.inspect}" }, 3 => proc { |m| puts "xxx/yyy/zzz: #{m.inspect}" }, } def test(patt) m = RE.match(patt).to_a puts m.inspect i = m.index("") ACTIONS[i].call(m[i+1..-1]) end test("/admin/wibble") test("/bibble/boing") puts "----" # Part two: proof of concept of automatic regexp construction class POC attr_reader :regexps, :re, :actions def initialize @regexps = [] # [[regexp,[action,action]], [regexp,[action,action], ...] end def match(regexp, action) regexp = Regexp.new(regexp) unless regexp.is_a? Regexp if entry = @regexps.find { |e| e[0] == regexp } entry[1] << action else @regexps << [regexp, [action]] end end def compile @re = %r{\A(?:#{@regexps.collect { |r| "(?>()#{r[0]})" }.join("|")})\z} @actions = {} i = 1 @regexps.each do |r,a| @actions[i] = a i += 1 + r.source.scan(%r{\((?!\?)}).size # ergh, count number of captures end end def run(patt, request={}) m = @re.match(patt).to_a i = m.index("") @actions[i].each { |a| res = a.call(request, m[i+1..-1]); return res if res } nil end end r = POC.new r.match(%r{/admin/(\w+)}, proc { |req,path| {:action=>"www"} }) r.match(%r{/(\w+)/(\w+)}, proc { |req,path| {:action=>"xxx"} if req[:method]=="delete" }) r.match(%r{/(\w+)/(\w+)}, proc { |req,path| {:action=>"yyy"} if req[:method]=="post" }) r.match(%r{/(\w+)/(\w+)}, proc { |req,path| {:action=>"zzz"} }) p r.regexps r.compile p r.re p r.actions p r.run("/admin/wibble", :method=>"get") p r.run("/bibble/boing", :method=>"delete") p r.run("/bibble/boing", :method=>"post") p r.run("/bibble/boing", :method=>"get") --- 8< --------------------------------------------------------------------- Seem feasible? Other things to do: - create procs automatically for the simple ''non-deferred'' routing, which is easy since procs are first-class objects. No eval required :-) - map m[0], m[1], m[2]... into the named params fields like :controller, :action etc if the original regexp had :foo placeholders Regards, Brian.
On Tue, Sep 18, 2007 at 01:17:41PM +0100, Brian Candler wrote:> Here''s an outline of how I think it could be done.Incidentally, there''s some scope for further optimisation here. Given that the applications with large numbers of routes are likely to be resource-based, e.g. /foos /foos/:id /foos/new /foos/:id/edit /foos/:id/:action then it may be worth reflecting this in the structure of the composed regexp by matching the shared prefix only once. Trying to combine adjacent regexps automatically may be hard, but resource routing could explicitly build regexps this way. The speed gains may not be significant in the bigger picture: $ cat speed.rb require ''benchmark'' include Benchmark re1 = "(?: () /)\n" re2 = "(?: () /)\n" ["things","bars","bazs","xyzs","abcs","foos","persons","admins"].each do |r| re1 << <<EOS | (?: () /#{r}) | (?: () /#{r}/(\\d+)) | (?: () /#{r}/new) | (?: () /#{r}/(\\d+)/edit) | (?: () /#{r}/(\\d+)/(\\w+)) EOS re2 << <<EOS | (/#{r} ( (?: () ) | (?: () /new) | (?: () (/\\d+) ( (?: () ) | (?: () /edit) | (?: () /(\\w+)) )) )) EOS end # Naive regexp RE1 = %r{^(?:#{re1})$}x # Prefix-optimised regexp RE2 = %r{^(?:#{re2})$}x N = 10_000 bm(6) do |test| test.report("Match 1:") { N.times { raise unless RE1 =~ "/foos/123/edit" } } test.report("Match 2:") { N.times { raise unless RE2 =~ "/foos/123/edit" } } test.report("Mismatch 1:") { N.times { raise if RE1 =~ "/flurble/baz" } } test.report("Mismatch 2:") { N.times { raise if RE2 =~ "/flurble/baz" } } end $ ruby speed.rb user system total real Match 1: 1.060000 0.000000 1.060000 ( 1.069919) Match 2: 0.620000 0.000000 0.620000 ( 0.620281) Mismatch 1: 0.640000 0.010000 0.650000 ( 0.636095) Mismatch 2: 0.150000 0.000000 0.150000 ( 0.156758) This is on a small laptop (Pentium M 1.1GHz), and shows an improvement of about 44 microseconds per match when matching the sixth out of eight resources. It would be interesting to see how this compares with current routing strategy though. Regards, Brian.
On Tue, Sep 18, 2007 at 01:17:41PM +0100, Brian Candler wrote:> Here''s an outline of how I think it could be done.This was broken as I was using (?> ...), which permanently eats characters and then won''t allow the data to match a different branch. Fixed version attached, with some other cleanups. Incidentally, instead of writing (?: () ...branch1... ) | (?: () ...branch2... ) it would be simpler to write ( ...branch1... ) | ( ...branch2... ) since the outer brackets also capture the entire match, which gives a suitable non-nil sentinel to mark the start of the branch. However this way means I need a fast way to detect the first non-nil element in the match array (rather than the first empty string element) and I couldn''t think of a way to do that faster than i = (0...m.size).find { |x| m[x] } Also, I decided to anchor the whole expression, which means the individual branch regexps don''t need to be anchored, but instead we could insist that the branches need to have their own anchors (which I think is the current behaviour) Regards, Brian. -------------- next part -------------- class POCRouter class Path attr_reader :regexp, :num_captures, :blocks def initialize(regexp, blocks=[]) @regexp = regexp @num_captures = @regexp.source.scan(%r{\((?!\?)}).size # FIXME, e.g. \( or [(] @blocks = blocks end end attr_reader :paths, :re, :capindex def initialize @paths = [] end def match(regexp, &blk) regexp = Regexp.new(regexp) unless regexp.is_a? Regexp if path = @paths.find { |p| p.regexp == regexp } path.blocks << blk else @paths << Path.new(regexp, [blk]) end end def compile @re = %r{\A(?: #{@paths.collect { |p| "(?: () #{p.regexp})" }.join("|\n")} )\z}x # extended RE is easier to read for debugging @capindex = {} i = 0 @paths.each do |p| @capindex[i] = p i += p.num_captures + 1 end end def inspect (@re || @paths).inspect end def run(path, request={}) m = @re.match(path).captures i = m.index("") || (return nil) path = capindex[i] path.blocks.each do |blk| res = blk.call(request, m[i+1, path.num_captures]) return res if res end nil end end r = POCRouter.new r.match(%r{/admin/(\w+)}) { |req,m| p m; {:action=>"www"} } r.match(%r{/(\w+)/(\w+)}) { |req,m| p m; {:action=>"xxx"} if req[:method]=="delete" } r.match(%r{/(\w+)/(\w+)}) { |req,m| {:action=>"yyy"} if req[:method]=="post" } r.match(%r{/(\w+)/(\w+)}) { |req,m| {:action=>"zzz"} } p r r.compile p r p r.run("/admin/wibble", :method=>"get") p r.run("/bibble/boing", :method=>"delete") p r.run("/bibble/boing", :method=>"post") p r.run("/bibble/boing", :method=>"get")
On Sep 18, 2007, at 6:55 AM, Brian Candler wrote:> On Tue, Sep 18, 2007 at 01:17:41PM +0100, Brian Candler wrote: >> Here''s an outline of how I think it could be done. > > Incidentally, there''s some scope for further optimisation here. > Given that > the applications with large numbers of routes are likely to be > resource-based, e.g. > > /foos > /foos/:id > /foos/new > /foos/:id/edit > /foos/:id/:action > > then it may be worth reflecting this in the structure of the > composed regexp > by matching the shared prefix only once. Trying to combine adjacent > regexps > automatically may be hard, but resource routing could explicitly build > regexps this way. > > The speed gains may not be significant in the bigger picture: > > $ cat speed.rb > require ''benchmark'' > include Benchmark > > re1 = "(?: () /)\n" > re2 = "(?: () /)\n" > ["things","bars","bazs","xyzs","abcs","foos","persons","admins"].each > do |r| > re1 << <<EOS > | (?: () /#{r}) > | (?: () /#{r}/(\\d+)) > | (?: () /#{r}/new) > | (?: () /#{r}/(\\d+)/edit) > | (?: () /#{r}/(\\d+)/(\\w+)) > EOS > re2 << <<EOS > | (/#{r} ( > (?: () ) | > (?: () /new) | > (?: () (/\\d+) ( > (?: () ) | > (?: () /edit) | > (?: () /(\\w+)) > )) > )) > EOS > end > > # Naive regexp > RE1 = %r{^(?:#{re1})$}x > # Prefix-optimised regexp > RE2 = %r{^(?:#{re2})$}x > > N = 10_000 > > bm(6) do |test| > test.report("Match 1:") { N.times { raise unless RE1 =~ "/foos/ > 123/edit" } } > test.report("Match 2:") { N.times { raise unless RE2 =~ "/foos/ > 123/edit" } } > test.report("Mismatch 1:") { N.times { raise if RE1 =~ "/flurble/ > baz" } } > test.report("Mismatch 2:") { N.times { raise if RE2 =~ "/flurble/ > baz" } } > end > $ ruby speed.rb > user system total real > Match 1: 1.060000 0.000000 1.060000 ( 1.069919) > Match 2: 0.620000 0.000000 0.620000 ( 0.620281) > Mismatch 1: 0.640000 0.010000 0.650000 ( 0.636095) > Mismatch 2: 0.150000 0.000000 0.150000 ( 0.156758) > > This is on a small laptop (Pentium M 1.1GHz), and shows an > improvement of > about 44 microseconds per match when matching the sixth out of eight > resources. > > It would be interesting to see how this compares with current routing > strategy though. > > Regards, > > Brian.I definitely want to keep all the routes in the router.rb file, I just think having them in one place instead of spread out is much better and more explicit in ordering etc. In the big picture of a merb dispatch the current routers compiled_statement is not even close to being a significant slice of the time spent dispatching a request. I don''t have any problems with trying to optimize it further but I don''t think it will gain us that much in the overall picture. I''m always happy to be proven wrong though ;) Cheers- -- Ezra Zygmuntowicz -- Founder & Ruby Hacker -- ez at engineyard.com -- Engine Yard, Serious Rails Hosting -- (866) 518-YARD (9273)
On Tue, Sep 18, 2007 at 01:45:58PM -0700, Ezra Zygmuntowicz wrote:> In the big picture of a > merb dispatch the current routers compiled_statement is not even > close to being a significant slice of the time spent dispatching a > request.Good point. I''m not sure what the current best way to profile merb is, so I hacked something very simple: --- /usr/lib/ruby/gems/1.8/gems/merb-0.4.0/lib/merb/server.rb.orig 2007-09-19 08:43:28.000000000 +0100 +++ /usr/lib/ruby/gems/1.8/gems/merb-0.4.0/lib/merb/server.rb 2007-09-19 08:47:03.000000000 +0100 @@ -439,6 +439,8 @@ start(@@merb_opts[:port]) else trap(''TERM'') { exit } + trap("USR1") { require ''profiler''; Profiler__::start_profile } + trap("USR2") { Profiler__::stop_profile; Profiler__::print_profile($stderr) } mongrel_start(@@merb_opts[:port]) end I tested with the following app, run in production mode (merb -e production): Merb::Router.prepare do |r| r.resources :foos r.resources :bars r.resources :bazs r.default_routes end class Hello < Application def world "Hello, world!" end end I then hit it 100 times on /hello/world, sent -USR1, hit it 100 times on /hello/world again, and sent -USR2. Here are the results: % cumulative self self total time seconds seconds calls ms/call ms/call name 5.41 0.84 0.84 1000 0.84 6.98 Logger#add 5.15 1.64 0.80 900 0.89 4.10 MonitorMixin.synchronize 3.41 2.17 0.53 900 0.59 0.79 Logger::Formatter#format_datetime 3.03 2.64 0.47 900 0.52 1.78 MonitorMixin.mon_exit 2.83 3.08 0.44 900 0.49 0.68 MonitorMixin.mon_release 2.25 3.43 0.35 900 0.39 0.43 Logger::Formatter#msg2str 2.12 3.76 0.33 900 0.37 0.40 Logger::LogDevice#check_shift_log 2.12 4.09 0.33 900 0.37 0.74 MonitorMixin.mon_enter 2.06 4.41 0.32 99 3.23 139.39 TCPServer#accept 2.00 4.72 0.31 100 3.10 4.80 Merb::Router#match 1.93 5.02 0.30 100 3.00 137.10 MerbHandler#process 1.93 5.32 0.30 3877 0.08 0.15 Hash#[] 1.93 5.62 0.30 900 0.33 1.77 Logger::Formatter#call 1.87 5.91 0.29 400 0.73 1.30 Hash#inspect 1.61 6.16 0.25 1900 0.13 0.14 Time#now 1.42 6.38 0.22 900 0.24 0.31 MonitorMixin.mon_check_owner 1.42 6.60 0.22 3600 0.06 0.06 Thread#critical 1.35 6.81 0.21 900 0.23 2.02 Logger#format_message 1.35 7.02 0.21 400 0.52 0.52 String#split 1.29 7.22 0.20 100 2.00 148.80 Mongrel::HttpServer#process_client 1.29 7.42 0.20 700 0.29 0.33 Mash#convert_key 1.29 7.62 0.20 900 0.22 0.24 MonitorMixin.mon_acquire 1.22 7.81 0.19 300 0.63 1.37 Hash#each_pair 1.22 8.00 0.19 300 0.63 0.73 Mongrel::HeaderOut#[] 1.16 8.18 0.18 2200 0.08 0.10 String#% 1.16 8.36 0.18 200 0.90 2.90 Mongrel::DirHandler#can_serve 1.09 8.53 0.17 1800 0.09 0.09 Array#shift 1.03 8.69 0.16 600 0.27 0.62 Class#read_inheritable_attribute 0.97 8.84 0.15 200 0.75 0.75 File#expand_path 0.97 8.99 0.15 1600 0.09 0.65 Class#new 0.97 9.14 0.15 100 1.50 2.70 Mongrel::HttpRequest#initialize 0.97 9.29 0.15 100 1.50 88.80 Merb::Dispatcher#handle 0.97 9.44 0.15 200 0.75 1.95 Merb::Request#query_parse 0.90 9.58 0.14 1800 0.08 0.08 Hash#default 0.90 9.72 0.14 78 1.79 19.87 Kernel.sleep 0.84 9.85 0.13 800 0.16 0.96 Merb::Request#route_params 0.77 9.97 0.12 200 0.60 0.80 Mash#default 0.77 10.09 0.12 800 0.15 0.78 Merb::Request#route_match 0.71 10.20 0.11 600 0.18 0.18 Class#inheritable_attributes 0.71 10.31 0.11 900 0.12 0.17 Logger#format_severity 0.64 10.41 0.10 1800 0.06 0.06 NilClass#nil? 0.64 10.51 0.10 1000 0.10 0.10 String#inspect 0.64 10.61 0.10 1300 0.08 0.08 Fixnum#= 0.64 10.71 0.10 900 0.11 0.11 Thread#pass 0.64 10.81 0.10 300 0.33 4.93 Merb::Request#params 0.64 10.91 0.10 106 0.94 1.60 Array#map 0.64 11.01 0.10 100 1.00 1.50 Merb::Request#path 0.64 11.11 0.10 200 0.50 0.70 Mongrel::HttpRequest#unescape 0.58 11.20 0.09 300 0.30 0.57 Merb::Request#controller_name 0.58 11.29 0.09 400 0.23 1.35 Merb::AbstractController#before_filters 0.58 11.38 0.09 100 0.90 4.00 Kernel.catch 0.58 11.47 0.09 753 0.12 18.46 Array#each 0.58 11.56 0.09 1200 0.08 0.08 Kernel.respond_to? 0.58 11.65 0.09 900 0.10 4.29 Logger::LogDevice#write 0.58 11.74 0.09 1300 0.07 0.07 IO#write 0.52 11.82 0.08 100 0.80 1.20 String#to_const_string 0.52 11.90 0.08 400 0.20 0.25 Mash#convert_value 0.52 11.98 0.08 100 0.80 0.80 Thread#initialize 0.52 12.06 0.08 400 0.20 0.28 Mongrel::HttpResponse#write 0.52 12.14 0.08 400 0.20 0.52 Merb::Controller#_session_id_key 0.52 12.22 0.08 100 0.80 5.60 Merb::Request#controller_class 0.52 12.30 0.08 1453 0.06 0.06 Kernel.nil? 0.45 12.37 0.07 400 0.18 0.88 Merb::AbstractController#after_filters 0.45 12.44 0.07 1800 0.04 0.04 Module#== 0.45 12.51 0.07 100 0.70 0.70 Fixnum#> 0.45 12.58 0.07 300 0.23 0.33 Array#include? 0.45 12.65 0.07 900 0.08 7.90 Logger#info 0.45 12.72 0.07 800 0.09 0.09 Symbol#inspect 0.45 12.79 0.07 200 0.35 1.00 Mongrel::HttpResponse#send_status 0.45 12.86 0.07 1000 0.07 0.07 Time#strftime 0.39 12.92 0.06 1000 0.06 0.06 String#<< 0.39 12.98 0.06 100 0.60 1.80 Hash#each 0.39 13.04 0.06 100 0.60 2.80 Mongrel::HttpResponse#initialize 0.39 13.10 0.06 200 0.30 0.35 Enumerable.inject 0.39 13.16 0.06 200 0.30 1.30 Merb::Request#body_and_query_params 0.39 13.22 0.06 300 0.20 0.20 Hash#key? 0.39 13.28 0.06 100 0.60 2.50 Object#full_const_get 0.39 13.34 0.06 1100 0.05 0.05 Fixnum#- 0.39 13.40 0.06 100 0.60 7.00 Merb::AbstractController#dispatch 0.39 13.46 0.06 1800 0.03 0.03 Thread#current 0.32 13.51 0.05 100 0.50 1.30 Merb::CookieStore#initialize 0.32 13.56 0.05 100 0.50 0.70 Mongrel::HttpParser#execute 0.32 13.61 0.05 100 0.50 2.80 Merb::Controller#set_dispatch_variables 0.32 13.66 0.05 200 0.25 0.80 Hash#to_mash 0.32 13.71 0.05 200 0.25 0.40 Mash#initialize 0.32 13.76 0.05 100 0.50 39.90 Merb::Controller#dispatch 0.26 13.80 0.04 400 0.10 0.20 Merb::Request#content_type 0.26 13.84 0.04 600 0.07 0.07 String#= 0.26 13.88 0.04 100 0.40 1.00 Time#httpdate 0.26 13.92 0.04 100 0.40 1.30 Merb::Request#query_params 0.26 13.96 0.04 200 0.20 0.20 String#tr 0.26 14.00 0.04 1100 0.04 0.04 Fixnum#< 0.26 14.04 0.04 100 0.40 0.40 Merb::AbstractController#default_thrown_content 0.26 14.08 0.04 1100 0.04 0.04 Array#[] ... snip everything below 0.25% 0.00 15.53 0.00 1 0.00 15530.00 #toplevel It looks like logging is the worst offender, with Logger#add and its mutex (MonitorMixin) accounting for the top 8 slots, 4.09 out of 15.53 seconds, or 26%. In fact, it looks like Logger#add and its children, at 6.98ms per call and called 1000 times, count for 6.98s in total. TCPAccept is next, and Merb::Router#match is third. However this only accounts for 2.00% Now, I''ve seen figures for Merb performance typically in the 400-600 requests per second range. Are these measured with logging turned off? If not, I think there is a strong argument for introducing a mutex-free logging system. (Simple idea: replace Logger#add with an append to a string buffer, which is atomic. Then once per second empty the buffer using String#slice! which should also be atomic, and write that to the log) Finally, I repeated the test using merb -e production -l WARN % cumulative self self total time seconds seconds calls ms/call ms/call name 4.62 0.42 0.42 1000 0.42 0.51 Logger#add 3.85 0.77 0.35 100 3.50 6.00 Merb::Router#match 3.74 1.11 0.34 100 3.40 10.30 IO#close 3.19 1.40 0.29 100 2.90 79.70 Mongrel::HttpServer#process_client 3.19 1.69 0.29 100 2.90 71.40 MerbHandler#process 2.75 1.94 0.25 900 0.28 0.81 Logger#info 2.42 2.16 0.22 1600 0.14 0.86 Class#new 2.31 2.37 0.21 600 0.35 0.50 Class#read_inheritable_attribute 2.09 2.56 0.19 400 0.48 0.73 Hash#inspect 2.09 2.75 0.19 400 0.47 0.47 String#split 2.09 2.94 0.19 3742 0.05 0.11 Hash#[] 1.98 3.12 0.18 738 0.24 10.15 Array#each 1.98 3.30 0.18 100 1.80 2.00 Time#httpdate 1.65 3.45 0.15 400 0.38 1.23 Merb::AbstractController#before_filters 1.65 3.60 0.15 300 0.50 5.93 Merb::Request#params 1.65 3.75 0.15 100 1.50 1.90 Merb::Request#json_params 1.54 3.89 0.14 100 1.40 2.10 Mongrel::HttpRequest#initialize 1.54 4.03 0.14 1000 0.14 0.17 Time#now 1.43 4.16 0.13 200 0.65 1.40 Hash#to_mash 1.43 4.29 0.13 100 1.30 55.30 Merb::Dispatcher#handle 1.32 4.41 0.12 100 1.20 17.30 Merb::Controller#dispatch 1.32 4.53 0.12 700 0.17 0.24 Mash#convert_key 1.32 4.65 0.12 300 0.40 0.80 Mongrel::HeaderOut#[] 1.21 4.76 0.11 100 1.10 3.90 Merb::Controller#build 1.10 4.86 0.10 800 0.13 1.03 Merb::Request#route_params 1.10 4.96 0.10 200 0.50 0.55 Mongrel::HttpResponse#send_header 0.99 5.05 0.09 300 0.30 1.00 Hash#each_pair 0.99 5.14 0.09 100 0.90 1.10 Merb::Request#method 0.99 5.23 0.09 55 1.64 24.73 Kernel.sleep 0.88 5.31 0.08 200 0.40 0.40 File#expand_path 0.88 5.39 0.08 400 0.20 0.20 String#% 0.88 5.47 0.08 200 0.40 0.70 Mash#initialize 0.88 5.55 0.08 200 0.40 0.55 Mash#include? 0.77 5.62 0.07 100 0.70 6.10 Merb::AbstractController#dispatch 0.77 5.69 0.07 100 0.70 4.90 Mongrel::HttpResponse#initialize 0.77 5.76 0.07 200 0.35 1.20 Mongrel::DirHandler#can_serve 0.77 5.83 0.07 1800 0.04 0.04 Hash#default 0.77 5.90 0.07 800 0.09 0.85 Merb::Request#route_match 0.77 5.97 0.07 100 0.70 1.60 Merb::Controller#set_dispatch_variables 0.66 6.03 0.06 400 0.15 0.62 Merb::AbstractController#after_filters 0.66 6.09 0.06 100 0.60 0.60 StringScanner#scan 0.66 6.15 0.06 100 0.60 0.70 Object#blank? 0.66 6.21 0.06 200 0.30 2.25 Merb::Request#query_parse 0.66 6.27 0.06 200 0.30 0.30 String#gsub 0.66 6.33 0.06 400 0.15 0.68 Merb::Controller#_session_id_key 0.66 6.39 0.06 1438 0.04 0.04 Kernel.nil? 0.66 6.45 0.06 1000 0.06 0.06 Hash#[] 0.55 6.50 0.05 99 0.51 52.63 TCPServer#accept 0.55 6.55 0.05 200 0.25 1.40 Merb::Request#cookies 0.55 6.60 0.05 900 0.06 0.06 Array#last 0.55 6.65 0.05 100 0.50 1.20 Mongrel::HttpResponse#finished 0.55 6.70 0.05 100 0.50 5.70 Merb::SessionMixin.setup_session 0.55 6.75 0.05 76 0.66 1.18 Array#map 0.55 6.80 0.05 400 0.13 0.22 Mash#convert_value 0.55 6.85 0.05 300 0.17 0.23 StringIO#initialize 0.55 6.90 0.05 400 0.13 0.22 Mongrel::HttpResponse#write 0.55 6.95 0.05 400 0.13 0.27 Merb::Request#content_type 0.55 7.00 0.05 100 0.50 4.70 Merb::Request#controller_class 0.55 7.05 0.05 100 0.50 2.60 Object#full_const_get 0.55 7.10 0.05 200 0.25 0.25 String#to_s 0.55 7.15 0.05 100 0.50 1.00 Thread#new 0.55 7.20 0.05 100 0.50 0.50 Merb::Controller#status ... snip everything below 0.5% 0.00 9.09 0.00 1 0.00 9090.00 #toplevel This is more interesting: even though nothing is actually logged, because of the large number of calls to Logger#add it still accounts for 4.62%, with another 2.75% for Logger#info and 1.54% for Time#now (the fact it has been called 1000 times suggests that timestamps are being made for log entries which are never written!) Merb::Router#match is now the number two offender at 3.85%, but this is probably still not enough by itself to warrant a rewrite. Regards, Brian.
OK, I hacked together a simple replacement logger, and this is what I get with logging *enabled* (merb -e production) % cumulative self self total time seconds seconds calls ms/call ms/call name 3.69 0.34 0.34 100 3.40 76.60 MerbHandler#process 3.25 0.64 0.30 800 0.38 0.73 Merb::Logger#info 3.15 0.93 0.29 3724 0.08 0.20 Hash#[] 3.04 1.21 0.28 400 0.70 1.47 Hash#inspect 2.93 1.48 0.27 300 0.90 2.10 Hash#each_pair 2.60 1.72 0.24 52 4.62 43.27 Kernel.sleep 2.60 1.96 0.24 200 1.20 1.75 Mongrel::HttpResponse#send_status 2.49 2.19 0.23 400 0.58 1.02 Merb::AbstractController#after_filters 2.17 2.39 0.20 100 2.00 2.60 Mongrel::HttpRequest#initialize 2.17 2.59 0.20 100 2.00 25.80 IO#close 2.06 2.78 0.19 1600 0.12 0.73 Class#new 2.06 2.97 0.19 100 1.90 4.60 Merb::Request#query_params 1.95 3.15 0.18 200 0.90 1.60 Mash#include? 1.95 3.33 0.18 400 0.45 0.45 String#split 1.84 3.50 0.17 100 1.70 8.50 Merb::AbstractController#dispatch 1.63 3.65 0.15 700 0.21 0.29 Mash#convert_key 1.52 3.79 0.14 100 1.40 19.90 Merb::Controller#dispatch 1.52 3.93 0.14 100 1.40 70.10 Mongrel::HttpServer#process_client 1.41 4.06 0.13 100 1.30 2.80 Merb::Router#match 1.41 4.19 0.13 800 0.16 0.59 Merb::Request#route_params 1.41 4.32 0.13 300 0.43 0.67 Mongrel::HeaderOut#[] 1.30 4.44 0.12 300 0.40 0.40 Hash#key? 1.30 4.56 0.12 400 0.30 0.30 NilClass#inspect 1.19 4.67 0.11 400 0.28 0.28 Mash#convert_value 1.19 4.78 0.11 1300 0.08 0.08 Kernel.class 1.19 4.89 0.11 900 0.12 0.12 String#[] 1.19 5.00 0.11 100 1.10 56.20 Merb::Dispatcher#handle 1.08 5.10 0.10 400 0.25 0.70 Merb::AbstractController#before_filters 1.08 5.20 0.10 200 0.50 2.10 Mash#default ... snip less than 1% 0.00 9.22 0.00 1 0.00 9220.00 #toplevel Performance test using ab -n 1000 http://localhost:4000/hello/world [Before patch] Requests per second: 276.78 [#/sec] (mean) Time per request: 3.613 [ms] (mean) [After patch] Requests per second: 401.49 [#/sec] (mean) Time per request: 2.491 [ms] (mean) This is on a 1.1GHz Pentium M. It would be interesting to see what people with higher-performance servers achieve. Cheers, Brian. P.S. In this code, a single write is done at the end of each request. I''ve not protected it with a mutex, because I think that IO#write is atomic, but I may be wrong. More performance could perhaps be achieved by not flushing after every request, but by doing this in a background thread. However I think it might make more sense if we''re going this route to give each request its own log buffer, so that logs don''t get interspersed from overlapping requests in a multi-threaded application. Also, I notice that the logger class and initialisation is currently hard-coded into merb.rb; perhaps it would be better to make this configurable. -------------- next part -------------- --- lib/merb.rb.orig 2007-09-19 09:55:24.000000000 +0100 +++ lib/merb.rb 2007-09-19 09:56:37.000000000 +0100 @@ -10,7 +10,7 @@ end require ''fileutils'' require ''merb/erubis_ext'' -require ''logger'' +require ''merb/logger'' require ''json'' require ''set'' autoload :MerbUploadHandler, ''merb/upload_handler'' @@ -67,8 +67,8 @@ MERB_VIEW_ROOT = MERB_ROOT / "app/views" MERB_SKELETON_DIR = File.join(MERB_FRAMEWORK_ROOT, ''../app_generators/merb/templates'') logpath = $TESTING ? "#{MERB_ROOT}/merb_test.log" : "#{MERB_ROOT}/log/merb.#{Merb::Server.config[:port]}.log" -MERB_LOGGER = Logger.new(logpath) -MERB_LOGGER.level = Logger.const_get(Merb::Server.config[:log_level].upcase) rescue Logger::INFO +MERB_LOGGER = Merb::Logger.new(logpath) +MERB_LOGGER.level = Merb::Logger.const_get(Merb::Server.config[:log_level].upcase) rescue Merb::Logger::INFO MERB_PATHS = [ "/app/models/**/*.rb", "/app/controllers/application.rb", @@ -82,4 +82,4 @@ if $TESTING test_files = File.join(lib, ''test'', ''*.rb'') Dir[test_files].each { |file| require file } -end \ No newline at end of file +end --- lib/merb/mongrel_handler.rb.orig 2007-09-16 19:32:04.000000000 +0100 +++ lib/merb/mongrel_handler.rb 2007-09-19 10:10:39.000000000 +0100 @@ -145,8 +145,8 @@ total_request_time = Time.now - start benchmarks[:total_request_time] = total_request_time - MERB_LOGGER.info("Request Times: #{benchmarks.inspect}") - MERB_LOGGER.info("Response status: #{response.status}\nComplete Request took: #{total_request_time} seconds, #{1.0/total_request_time} Requests/Second\n\n") + MERB_LOGGER.info("Request Times: #{benchmarks.inspect}\nResponse status: #{response.status}\nComplete Request took: #{total_request_time} seconds, #{1.0/total_request_time} Requests/Second\n\n") + MERB_LOGGER.flush end rescue Object => e # if an exception is raised here then something is --- lib/merb/logger.rb.orig 2007-09-19 10:14:33.000000000 +0100 +++ lib/merb/logger.rb 2007-09-19 10:06:37.000000000 +0100 @@ -0,0 +1,70 @@ +module Merb + # Logging is called several times for each request, so keep it short and + # sweet. Use << for atomic add to buffer, and slice! for atomic removal. + + class Logger + module Severity + DEBUG = 0 + INFO = 1 + WARN = 2 + ERROR = 3 + FATAL = 4 + UNKNOWN = 5 + end + include Severity + + attr_accessor :level + attr_reader :buf + def initialize(log) + @level = level + @buf = "" + if log.respond_to?(:write) + @log = log + elsif File.exist?(log) + @log = open(log, (File::WRONLY | File::APPEND)) + @log.sync = true + else + @log = open(log, (File::WRONLY | File::APPEND | File::CREAT)) + @log.sync = true + @log.write("# Logfile created on %s" % [Time.now.to_s]) + end + end + + def flush + return if @buf.size == 0 + @log.write @buf.slice!(0..-1) + end + + def close + flush + @log.close if @log.respond_to? :close + @log = nil + end + + def debug(msg) + return if @level > DEBUG + msg << "\n" unless msg[-1] == ?\n + @buf << msg + end + def info(msg) + return if @level > INFO + msg << "\n" unless msg[-1] == ?\n + @buf << msg + end + def warn(msg) + return if @level > WARN + msg << "\n" unless msg[-1] == ?\n + @buf << msg + end + def error(msg) + return if @level > ERROR + msg << "\n" unless msg[-1] == ?\n + @buf << msg + end + def error(msg) + return if @level > FATAL + msg << "\n" unless msg[-1] == ?\n + @buf << msg + end + end +end
Hi~ On Sep 19, 2007, at 2:36 AM, Brian Candler wrote:> OK, I hacked together a simple replacement logger, and this is what > I get > with logging *enabled* (merb -e production) > > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 3.69 0.34 0.34 100 3.40 76.60 > MerbHandler#process > 3.25 0.64 0.30 800 0.38 0.73 > Merb::Logger#info > 3.15 0.93 0.29 3724 0.08 0.20 Hash#[] > 3.04 1.21 0.28 400 0.70 1.47 Hash#inspect > 2.93 1.48 0.27 300 0.90 2.10 Hash#each_pair > 2.60 1.72 0.24 52 4.62 43.27 Kernel.sleep > 2.60 1.96 0.24 200 1.20 1.75 > Mongrel::HttpResponse#send_status > 2.49 2.19 0.23 400 0.58 1.02 > Merb::AbstractController#after_filters > 2.17 2.39 0.20 100 2.00 2.60 > Mongrel::HttpRequest#initialize > 2.17 2.59 0.20 100 2.00 25.80 IO#close > 2.06 2.78 0.19 1600 0.12 0.73 Class#new > 2.06 2.97 0.19 100 1.90 4.60 > Merb::Request#query_params 1.95 3.15 0.18 200 > 0.90 1.60 Mash#include? > 1.95 3.33 0.18 400 0.45 0.45 String#split > 1.84 3.50 0.17 100 1.70 8.50 > Merb::AbstractController#dispatch > 1.63 3.65 0.15 700 0.21 0.29 Mash#convert_key > 1.52 3.79 0.14 100 1.40 19.90 > Merb::Controller#dispatch > 1.52 3.93 0.14 100 1.40 70.10 > Mongrel::HttpServer#process_client > 1.41 4.06 0.13 100 1.30 2.80 > Merb::Router#match > 1.41 4.19 0.13 800 0.16 0.59 > Merb::Request#route_params 1.41 4.32 0.13 300 > 0.43 0.67 Mongrel::HeaderOut#[]> 1.30 4.44 0.12 300 0.40 0.40 Hash#key? > 1.30 4.56 0.12 400 0.30 0.30 NilClass#inspect > 1.19 4.67 0.11 400 0.28 0.28 > Mash#convert_value > 1.19 4.78 0.11 1300 0.08 0.08 Kernel.class > 1.19 4.89 0.11 900 0.12 0.12 String#[] > 1.19 5.00 0.11 100 1.10 56.20 > Merb::Dispatcher#handle > 1.08 5.10 0.10 400 0.25 0.70 > Merb::AbstractController#before_filters > 1.08 5.20 0.10 200 0.50 2.10 Mash#default > ... snip less than 1% > 0.00 9.22 0.00 1 0.00 9220.00 #toplevel > > Performance test using ab -n 1000 http://localhost:4000/hello/world > > [Before patch] > > Requests per second: 276.78 [#/sec] (mean) > Time per request: 3.613 [ms] (mean) > > [After patch] > > Requests per second: 401.49 [#/sec] (mean) > Time per request: 2.491 [ms] (mean) > > This is on a 1.1GHz Pentium M. It would be interesting to see what > people > with higher-performance servers achieve. > > Cheers, > > Brian. > > P.S. In this code, a single write is done at the end of each > request. I''ve > not protected it with a mutex, because I think that IO#write is > atomic, but > I may be wrong. > > More performance could perhaps be achieved by not flushing after every > request, but by doing this in a background thread. However I think > it might > make more sense if we''re going this route to give each request its > own log > buffer, so that logs don''t get interspersed from overlapping > requests in a > multi-threaded application. > > Also, I notice that the logger class and initialisation is currently > hard-coded into merb.rb; perhaps it would be better to make this > configurable. > <newlog.diff>Nice one. We were actually just talking about fixing the logger the other day after some profiling and I was going to make a logger that collected up the strings and then output them once per request. I;ll take a look at your patch and will probably make the logger configurable so we can test different loggers. Cheers- -- Ezra Zygmuntowicz -- Founder & Ruby Hacker -- ez at engineyard.com -- Engine Yard, Serious Rails Hosting -- (866) 518-YARD (9273)