thr3ads.net - Merb devel - Routes [Sep 2007]

If this information is useful, please help other people find it:
Share via:

ry dahl

2007-Sep-18 01:14 UTC

Routes

hi all,

I want to move some routing tasks out of the router and into the
controller. The goal is to make Merb feel less like mod_rewrite and
give the user more control at the controller. The new Router is
simple: it takes the path_info (not the whole request) then outputs a
controller class and some parameters from the path matching. The rest
of the routing would be done at the controller level. The controller
is ultimately in charge of determining which action to call. In
particular, only the controller gets to see the http method.

Because resource controllers and normal controllers have very
different routing rules, I think they should be different classes. I''d
split them into two classes: Merb::Controllers::Standard and
Merb::Controllers::Resource. These would both be subclasses of
Controller, and would implement the method Controller#route
Users would create controllers by subclassing one of these two.

Some cute things can be done by letting the controller have control of
action routing.

For example, in the standard controller we could use punctuation to
declare that an action receives a POST:
  def update! # POST controller/update
  def update  # GET controller/update
  def index    # GET controller/index

The standard controller can safely ignore _method (for those of us who
do not want emulated PUT and DELETE methods)

We can put custom route rules nearer to the actions they effect
instead of in a separate file.
  add_route ''test/:blah'', :index  # assigns params[:blah] and
routes
to test action
  def test
  end
Duane''s popular regular expression routing can be implemented in an
add_route-like controller class method, say, match_route.

In the end this will be a tighter, more readable, and more natural API
than the current. Additionally this opportunity can be used to rid
Merb of the increasingly complex @compiled_statement construction in
route.rb and rely instead on a constructed data structure to match
against. I suspect performance can be increased.

This is a largish change and I would like to create it in an
experimental branch to work on it.

Thoughts?

ry

Ezra Zygmuntowicz

2007-Sep-18 02:13 UTC

head link

Routes

Hi~

On Sep 17, 2007, at 6:14 PM, ry dahl wrote:
> hi all,
>
> I want to move some routing tasks out of the router and into the
> controller. The goal is to make Merb feel less like mod_rewrite and
> give the user more control at the controller. The new Router is
> simple: it takes the path_info (not the whole request) then outputs a
> controller class and some parameters from the path matching. The rest
> of the routing would be done at the controller level. The controller
> is ultimately in charge of determining which action to call. In
> particular, only the controller gets to see the http method.
>
> Because resource controllers and normal controllers have very
> different routing rules, I think they should be different classes.
I''d
> split them into two classes: Merb::Controllers::Standard and
> Merb::Controllers::Resource. These would both be subclasses of
> Controller, and would implement the method Controller#route
> Users would create controllers by subclassing one of these two.
>
> Some cute things can be done by letting the controller have control of
> action routing.
>
> For example, in the standard controller we could use punctuation to
> declare that an action receives a POST:
>   def update! # POST controller/update
>   def update  # GET controller/update
>   def index    # GET controller/index
>
> The standard controller can safely ignore _method (for those of us who
> do not want emulated PUT and DELETE methods)
>
> We can put custom route rules nearer to the actions they effect
> instead of in a separate file.
>   add_route ''test/:blah'', :index  # assigns params[:blah]
and routes
> to test action
>   def test
>   end
> Duane''s popular regular expression routing can be implemented in
an
> add_route-like controller class method, say, match_route.
>
> In the end this will be a tighter, more readable, and more natural API
> than the current. Additionally this opportunity can be used to rid
> Merb of the increasingly complex @compiled_statement construction in
> route.rb and rely instead on a constructed data structure to match
> against. I suspect performance can be increased.
>
> This is a largish change and I would like to create it in an
> experimental branch to work on it.
>
> Thoughts?
>
> ry

	I''m not convinced that this would make things better. You''re
welcome
to work on a branch but I''m not going to merge it to trunk unless I  
think it''s a big improvement and I''m not seeing why this way
would be
better. I also don''t like the idea of two different types of  
controllers that are exactly the same and inherit from the same  
superclass but are different only in how they are routed to.

	I don''t want to discourage you from trying this out in a branch so  
go for it, but I have to reserve judgement until I see it or you show  
me a more complete mockup API.

Cheers-

-- Ezra Zygmuntowicz 
-- Founder & Ruby Hacker
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

ry dahl

2007-Sep-18 03:15 UTC

head link

Routes

> I''m not going to merge it to trunk unless I
> think it''s a big improvement and I''m not seeing why this
way would be
> better.
Noted :)
New branch created at /branches/ry_routes
> I also don''t like the idea of two different types of
> controllers that are exactly the same and inherit from the same
> superclass but are different only in how they are routed to.
Perhaps this distinction doesn''t need to be implemented as split
classes, however I think from a user perspective it makes sense to
think of resource controllers as fundamentally different than standard
controllers - even if they share 99% of the code.


ry

Brian Candler

2007-Sep-18 11:01 UTC

head link

Routes

On Tue, Sep 18, 2007 at 03:14:14AM +0200, ry dahl wrote:> Some cute things can be done by letting the controller have control of
> action routing.
> 
> For example, in the standard controller we could use punctuation to
> declare that an action receives a POST:
>   def update! # POST controller/update
>   def update  # GET controller/update
>   def index    # GET controller/index
I don''t understand why this can''t be done with
"normal" routing, e.g.
(untested)

  r.match(%r[/:controller(/::(/:id)?)?(\.:format)?],
:method=>"post").
    to(:action=>"[1]!")

This is just a variant on default_routes
> We can put custom route rules nearer to the actions they effect
> instead of in a separate file.
>   add_route ''test/:blah'', :index  # assigns params[:blah]
and routes
> to test action
>   def test
>   end
If the idea is to keep the route definition near to the controller, maybe it
would be simpler just to allow the router to be re-opened in each controller
source file (if it isn''t already), e.g.

Merb::Router.prepare do |r|
  r.match("/whatever/test/:blah").to(:test)
end
class Whatever < Application
  def test
  end
end

However, the benefit of putting all the routing logic in one place is it
makes it clear the order in which testing is made, to resolve ambiguities
between routes.
> In the end this will be a tighter, more readable, and more natural API
> than the current. Additionally this opportunity can be used to rid
> Merb of the increasingly complex @compiled_statement construction in
> route.rb and rely instead on a constructed data structure to match
> against. I suspect performance can be increased.
I agree that the current linear search, i.e. a big if / elsif tree, is far
from optimal. However I think it can be improved without moving the logic
out of centralised routing.

If I understand you correctly, you''re saying that all matches will be
initially made against a fixed pattern /:controller/.*, and then the request
is dispatched to the controller which performs its own sub-matching as
required. By dispatching directly to a controller first, you avoid the
linear search, which is a potentially a factor of N improvement when you
have N controllers.

Well, you could get the same benefit in the centralised routing model, by
imposing an equivalent rule that all path regexps must start with /\w+.
Bundle all the path routing rules into bins, perform a match on /(\w+),
then perform a linear search under bins[$1]. That is, instead of having one
@@compiled_statement, have multiple @@compiled_statements indexed by the
first path component. (So it doesn''t make the complexity any less,
I''m
afraid!)

However I suspect this can be generalised. One way I''m thinking is to
combine all the path regexps into one mega path matching regexp, and let the
regexp engine take care of it. You would group together all routing rules
which have the same path regexp, and perform a linear search afterwards for
the non-path tests. e.g.

  r.match("/admin/:action").to(www)
  r.match("/:flurble/:action", :method=>"delete").to(xxx)
  r.match("/:controller/:action",
:method=>"post").to(yyy)
  r.match("/:controller/:action").to(zzz)

The last three rules all match path /([^\/.,;?]+)/([^\/.,;?]+), and so you
would compile to something like

    %r{ \A
       (/admin/([^\/.,;?]+)) |
       (/([^\/.,;?]+)/([^\/.,;?]+))
       \z
    }x

In the case of the first branch matching you''d dispatch to www, and in
the
case of the second branch matching you''d run compiled (pseudocode)

   if params[:method] == ''delete''
     # set params[:flurble] = $1, params[:action] = $1
     to(xxx)
   elsif params[:method] == ''post''
     # set params[:controller] = $1, params[:action] = $1
     to(yyy)
   else
     # set params[:controller] = $1, params[:action] = $1
     to(zzz)
   end

However I''m not sure if there''s a cheap way to identify which
branch of a
regexp has matched. If you have to do

   if $1
     ...
   elsif $2
     ...
   elsif $3
     ...
   end

then much of the benefit has been lost. And you also need to be sure of how
ambiguous matching is resolved, e.g. in the above example /admin/foo matches
the first regexp branch in preference to the second.

Regards,

Brian.

Brian Candler

2007-Sep-18 12:17 UTC

head link

Routes

On Tue, Sep 18, 2007 at 12:01:07PM +0100, Brian Candler
wrote:> However I suspect this can be generalised. One way I''m thinking is
to
> combine all the path regexps into one mega path matching regexp, and let
the
> regexp engine take care of it.
...> However I''m not sure if there''s a cheap way to identify
which branch of a
> regexp has matched.
Here''s an outline of how I think it could be done.

--- 8< ---------------------------------------------------------------------
# Part one: show how it works manually
RE = %r{\A(?:
  (?> ()/admin/(\w+)) |
  (?> ()/(\w+)/(\w+))
)\z}x
puts RE.inspect

ACTIONS = {
        1 => proc { |m| puts "www: #{m.inspect}" },
        3 => proc { |m| puts "xxx/yyy/zzz: #{m.inspect}" },
}

def test(patt)
  m = RE.match(patt).to_a
  puts m.inspect
  i = m.index("")
  ACTIONS[i].call(m[i+1..-1])
end

test("/admin/wibble")
test("/bibble/boing")
puts "----"

# Part two: proof of concept of automatic regexp construction
class POC
  attr_reader :regexps, :re, :actions
  def initialize
    @regexps = []   #  [[regexp,[action,action]], [regexp,[action,action], ...]
  end

  def match(regexp, action)
    regexp = Regexp.new(regexp) unless regexp.is_a? Regexp
    if entry = @regexps.find { |e| e[0] == regexp }
      entry[1] << action
    else
      @regexps << [regexp, [action]]
    end
  end

  def compile
    @re = %r{\A(?:#{@regexps.collect { |r| "(?>()#{r[0]})"
}.join("|")})\z}
    @actions = {}
    i = 1
    @regexps.each do |r,a|
      @actions[i] = a
      i += 1 + r.source.scan(%r{\((?!\?)}).size   # ergh, count number of
captures
    end
  end

  def run(patt, request={})
    m = @re.match(patt).to_a
    i = m.index("")
    @actions[i].each { |a| res = a.call(request, m[i+1..-1]); return res if res
}
    nil
  end
end

r = POC.new
r.match(%r{/admin/(\w+)}, proc { |req,path| {:action=>"www"} })
r.match(%r{/(\w+)/(\w+)}, proc { |req,path| {:action=>"xxx"} if
req[:method]=="delete" })
r.match(%r{/(\w+)/(\w+)}, proc { |req,path| {:action=>"yyy"} if
req[:method]=="post" })
r.match(%r{/(\w+)/(\w+)}, proc { |req,path| {:action=>"zzz"} })
p r.regexps
r.compile
p r.re
p r.actions
p r.run("/admin/wibble", :method=>"get")
p r.run("/bibble/boing", :method=>"delete")
p r.run("/bibble/boing", :method=>"post")
p r.run("/bibble/boing", :method=>"get")
--- 8< ---------------------------------------------------------------------

Seem feasible?

Other things to do:

- create procs automatically for the simple ''non-deferred''
routing, which is
easy since procs are first-class objects. No eval required :-)

- map m[0], m[1], m[2]... into the named params fields like :controller,
:action etc if the original regexp had :foo placeholders

Regards,

Brian.

Brian Candler

2007-Sep-18 13:55 UTC

head link

Routes

On Tue, Sep 18, 2007 at 01:17:41PM +0100, Brian Candler
wrote:> Here''s an outline of how I think it could be done.
Incidentally, there''s some scope for further optimisation here. Given
that
the applications with large numbers of routes are likely to be
resource-based, e.g.

/foos
/foos/:id
/foos/new
/foos/:id/edit
/foos/:id/:action

then it may be worth reflecting this in the structure of the composed regexp
by matching the shared prefix only once. Trying to combine adjacent regexps
automatically may be hard, but resource routing could explicitly build
regexps this way.

The speed gains may not be significant in the bigger picture:

$ cat speed.rb
require ''benchmark''
include Benchmark

re1 = "(?: () /)\n"
re2 = "(?: () /)\n"
["things","bars","bazs","xyzs","abcs","foos","persons","admins"].each
do |r|
  re1 << <<EOS
| (?: () /#{r})
| (?: () /#{r}/(\\d+))
| (?: () /#{r}/new)
| (?: () /#{r}/(\\d+)/edit)
| (?: () /#{r}/(\\d+)/(\\w+))
EOS
  re2 << <<EOS
| (/#{r} (
  (?: () ) |
  (?: () /new) |
  (?: () (/\\d+) (
    (?: () ) |
    (?: () /edit) |
    (?: () /(\\w+))
  ))
))
EOS
end

# Naive regexp
RE1 = %r{^(?:#{re1})$}x
# Prefix-optimised regexp
RE2 = %r{^(?:#{re2})$}x

N = 10_000

bm(6) do |test|
  test.report("Match 1:") { N.times { raise unless RE1 =~
"/foos/123/edit" } }
  test.report("Match 2:") { N.times { raise unless RE2 =~
"/foos/123/edit" } }
  test.report("Mismatch 1:") { N.times { raise if RE1 =~
"/flurble/baz" } }
  test.report("Mismatch 2:") { N.times { raise if RE2 =~
"/flurble/baz" } }
end
$ ruby speed.rb
            user     system      total        real
Match 1:  1.060000   0.000000   1.060000 (  1.069919)
Match 2:  0.620000   0.000000   0.620000 (  0.620281)
Mismatch 1:  0.640000   0.010000   0.650000 (  0.636095)
Mismatch 2:  0.150000   0.000000   0.150000 (  0.156758)

This is on a small laptop (Pentium M 1.1GHz), and shows an improvement of
about 44 microseconds per match when matching the sixth out of eight
resources.

It would be interesting to see how this compares with current routing
strategy though.

Regards,

Brian.

Brian Candler

2007-Sep-18 20:30 UTC

head link

Routes

On Tue, Sep 18, 2007 at 01:17:41PM +0100, Brian Candler
wrote:> Here''s an outline of how I think it could be done.
This was broken as I was using (?> ...), which permanently eats characters
and then won''t allow the data to match a different branch. Fixed
version
attached, with some other cleanups.

Incidentally, instead of writing

    (?: () ...branch1... ) |
    (?: () ...branch2... )

it would be simpler to write

    ( ...branch1... ) |
    ( ...branch2... )

since the outer brackets also capture the entire match, which gives a
suitable non-nil sentinel to mark the start of the branch.

However this way means I need a fast way to detect the first non-nil element
in the match array (rather than the first empty string element) and I
couldn''t think of a way to do that faster than

    i = (0...m.size).find { |x| m[x] }

Also, I decided to anchor the whole expression, which means the individual
branch regexps don''t need to be anchored, but instead we could insist
that
the branches need to have their own anchors (which I think is the current
behaviour)

Regards,

Brian.
-------------- next part --------------
class POCRouter
  class Path
    attr_reader :regexp, :num_captures, :blocks
    def initialize(regexp, blocks=[])
      @regexp = regexp
      @num_captures = @regexp.source.scan(%r{\((?!\?)}).size  # FIXME, e.g. \(
or [(]
      @blocks = blocks
    end
  end

  attr_reader :paths, :re, :capindex
  def initialize
    @paths = []
  end

  def match(regexp, &blk)
    regexp = Regexp.new(regexp) unless regexp.is_a? Regexp
    if path = @paths.find { |p| p.regexp == regexp }
      path.blocks << blk
    else
      @paths << Path.new(regexp, [blk])
    end
  end

  def compile
    @re = %r{\A(?:
#{@paths.collect { |p| "(?: () #{p.regexp})" }.join("|\n")}
)\z}x  # extended RE is easier to read for debugging
    @capindex = {}
    i = 0
    @paths.each do |p|
      @capindex[i] = p
      i += p.num_captures + 1
    end
  end

  def inspect
    (@re || @paths).inspect
  end

  def run(path, request={})
    m = @re.match(path).captures
    i = m.index("") || (return nil)
    path = capindex[i]
    path.blocks.each do |blk|
      res = blk.call(request, m[i+1, path.num_captures])
      return res if res
    end
    nil
  end
end

r = POCRouter.new
r.match(%r{/admin/(\w+)}) { |req,m| p m; {:action=>"www"} }
r.match(%r{/(\w+)/(\w+)}) { |req,m| p m; {:action=>"xxx"} if
req[:method]=="delete" }
r.match(%r{/(\w+)/(\w+)}) { |req,m| {:action=>"yyy"} if
req[:method]=="post" }
r.match(%r{/(\w+)/(\w+)}) { |req,m| {:action=>"zzz"} }
p r
r.compile
p r
p r.run("/admin/wibble", :method=>"get")
p r.run("/bibble/boing", :method=>"delete")
p r.run("/bibble/boing", :method=>"post")
p r.run("/bibble/boing", :method=>"get")

Ezra Zygmuntowicz

2007-Sep-18 20:45 UTC

head link

Routes

On Sep 18, 2007, at 6:55 AM, Brian Candler wrote:
> On Tue, Sep 18, 2007 at 01:17:41PM +0100, Brian Candler wrote:
>> Here''s an outline of how I think it could be done.
>
> Incidentally, there''s some scope for further optimisation here.  
> Given that
> the applications with large numbers of routes are likely to be
> resource-based, e.g.
>
> /foos
> /foos/:id
> /foos/new
> /foos/:id/edit
> /foos/:id/:action
>
> then it may be worth reflecting this in the structure of the  
> composed regexp
> by matching the shared prefix only once. Trying to combine adjacent  
> regexps
> automatically may be hard, but resource routing could explicitly build
> regexps this way.
>
> The speed gains may not be significant in the bigger picture:
>
> $ cat speed.rb
> require ''benchmark''
> include Benchmark
>
> re1 = "(?: () /)\n"
> re2 = "(?: () /)\n"
>
["things","bars","bazs","xyzs","abcs","foos","persons","admins"].each
> do |r|
>   re1 << <<EOS
> | (?: () /#{r})
> | (?: () /#{r}/(\\d+))
> | (?: () /#{r}/new)
> | (?: () /#{r}/(\\d+)/edit)
> | (?: () /#{r}/(\\d+)/(\\w+))
> EOS
>   re2 << <<EOS
> | (/#{r} (
>   (?: () ) |
>   (?: () /new) |
>   (?: () (/\\d+) (
>     (?: () ) |
>     (?: () /edit) |
>     (?: () /(\\w+))
>   ))
> ))
> EOS
> end
>
> # Naive regexp
> RE1 = %r{^(?:#{re1})$}x
> # Prefix-optimised regexp
> RE2 = %r{^(?:#{re2})$}x
>
> N = 10_000
>
> bm(6) do |test|
>   test.report("Match 1:") { N.times { raise unless RE1 =~
"/foos/
> 123/edit" } }
>   test.report("Match 2:") { N.times { raise unless RE2 =~
"/foos/
> 123/edit" } }
>   test.report("Mismatch 1:") { N.times { raise if RE1 =~
"/flurble/
> baz" } }
>   test.report("Mismatch 2:") { N.times { raise if RE2 =~
"/flurble/
> baz" } }
> end
> $ ruby speed.rb
>             user     system      total        real
> Match 1:  1.060000   0.000000   1.060000 (  1.069919)
> Match 2:  0.620000   0.000000   0.620000 (  0.620281)
> Mismatch 1:  0.640000   0.010000   0.650000 (  0.636095)
> Mismatch 2:  0.150000   0.000000   0.150000 (  0.156758)
>
> This is on a small laptop (Pentium M 1.1GHz), and shows an  
> improvement of
> about 44 microseconds per match when matching the sixth out of eight
> resources.
>
> It would be interesting to see how this compares with current routing
> strategy though.
>
> Regards,
>
> Brian.

	I definitely want to keep all the routes in the router.rb file, I  
just think having them in one place instead of spread out is much  
better and more explicit in ordering etc.  In the big picture of a  
merb dispatch the current routers compiled_statement is not even  
close to being a significant slice of the time spent dispatching a  
request. I don''t have any problems with trying to optimize it further  
but I don''t think it will gain us that much in the overall picture.

	I''m always happy to be proven wrong though ;)


Cheers-
-- Ezra Zygmuntowicz 
-- Founder & Ruby Hacker
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

Brian Candler

2007-Sep-19 08:41 UTC

head link

Routes

On Tue, Sep 18, 2007 at 01:45:58PM -0700, Ezra Zygmuntowicz
wrote:> In the big picture of a  
> merb dispatch the current routers compiled_statement is not even  
> close to being a significant slice of the time spent dispatching a  
> request.
Good point.

I''m not sure what the current best way to profile merb is, so I hacked
something very simple:

--- /usr/lib/ruby/gems/1.8/gems/merb-0.4.0/lib/merb/server.rb.orig     
2007-09-19 08:43:28.000000000 +0100
+++ /usr/lib/ruby/gems/1.8/gems/merb-0.4.0/lib/merb/server.rb   2007-09-19
08:47:03.000000000 +0100
@@ -439,6 +439,8 @@
           start(@@merb_opts[:port])
         else
           trap(''TERM'') { exit }
+          trap("USR1") { require ''profiler'';
Profiler__::start_profile }
+          trap("USR2") { Profiler__::stop_profile;
Profiler__::print_profile($stderr) }
           mongrel_start(@@merb_opts[:port])
         end

I tested with the following app, run in production mode (merb -e production):

Merb::Router.prepare do |r|
  r.resources :foos
  r.resources :bars
  r.resources :bazs
  r.default_routes
end

class Hello < Application
  def world
    "Hello, world!"
  end
end

I then hit it 100 times on /hello/world, sent -USR1, hit it 100 times on
/hello/world again, and sent -USR2.

Here are the results:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  5.41     0.84      0.84     1000     0.84     6.98  Logger#add
  5.15     1.64      0.80      900     0.89     4.10  MonitorMixin.synchronize
  3.41     2.17      0.53      900     0.59     0.79 
Logger::Formatter#format_datetime
  3.03     2.64      0.47      900     0.52     1.78  MonitorMixin.mon_exit
  2.83     3.08      0.44      900     0.49     0.68  MonitorMixin.mon_release
  2.25     3.43      0.35      900     0.39     0.43  Logger::Formatter#msg2str
  2.12     3.76      0.33      900     0.37     0.40 
Logger::LogDevice#check_shift_log
  2.12     4.09      0.33      900     0.37     0.74  MonitorMixin.mon_enter
  2.06     4.41      0.32       99     3.23   139.39  TCPServer#accept
  2.00     4.72      0.31      100     3.10     4.80  Merb::Router#match
  1.93     5.02      0.30      100     3.00   137.10  MerbHandler#process
  1.93     5.32      0.30     3877     0.08     0.15  Hash#[]
  1.93     5.62      0.30      900     0.33     1.77  Logger::Formatter#call
  1.87     5.91      0.29      400     0.73     1.30  Hash#inspect
  1.61     6.16      0.25     1900     0.13     0.14  Time#now
  1.42     6.38      0.22      900     0.24     0.31 
MonitorMixin.mon_check_owner
  1.42     6.60      0.22     3600     0.06     0.06  Thread#critical  1.35    
6.81      0.21      900     0.23     2.02  Logger#format_message
  1.35     7.02      0.21      400     0.52     0.52  String#split
  1.29     7.22      0.20      100     2.00   148.80 
Mongrel::HttpServer#process_client
  1.29     7.42      0.20      700     0.29     0.33  Mash#convert_key
  1.29     7.62      0.20      900     0.22     0.24  MonitorMixin.mon_acquire
  1.22     7.81      0.19      300     0.63     1.37  Hash#each_pair
  1.22     8.00      0.19      300     0.63     0.73  Mongrel::HeaderOut#[] 
1.16     8.18      0.18     2200     0.08     0.10  String#%
  1.16     8.36      0.18      200     0.90     2.90 
Mongrel::DirHandler#can_serve
  1.09     8.53      0.17     1800     0.09     0.09  Array#shift
  1.03     8.69      0.16      600     0.27     0.62 
Class#read_inheritable_attribute
  0.97     8.84      0.15      200     0.75     0.75  File#expand_path
  0.97     8.99      0.15     1600     0.09     0.65  Class#new
  0.97     9.14      0.15      100     1.50     2.70 
Mongrel::HttpRequest#initialize
  0.97     9.29      0.15      100     1.50    88.80  Merb::Dispatcher#handle
  0.97     9.44      0.15      200     0.75     1.95  Merb::Request#query_parse
  0.90     9.58      0.14     1800     0.08     0.08  Hash#default
  0.90     9.72      0.14       78     1.79    19.87  Kernel.sleep
  0.84     9.85      0.13      800     0.16     0.96  Merb::Request#route_params
0.77     9.97      0.12      200     0.60     0.80  Mash#default
  0.77    10.09      0.12      800     0.15     0.78  Merb::Request#route_match
  0.71    10.20      0.11      600     0.18     0.18 
Class#inheritable_attributes
  0.71    10.31      0.11      900     0.12     0.17  Logger#format_severity
  0.64    10.41      0.10     1800     0.06     0.06  NilClass#nil?
  0.64    10.51      0.10     1000     0.10     0.10  String#inspect
  0.64    10.61      0.10     1300     0.08     0.08  Fixnum#=  0.64    10.71   
0.10      900     0.11     0.11  Thread#pass
  0.64    10.81      0.10      300     0.33     4.93  Merb::Request#params
  0.64    10.91      0.10      106     0.94     1.60  Array#map
  0.64    11.01      0.10      100     1.00     1.50  Merb::Request#path
  0.64    11.11      0.10      200     0.50     0.70 
Mongrel::HttpRequest#unescape
  0.58    11.20      0.09      300     0.30     0.57 
Merb::Request#controller_name
  0.58    11.29      0.09      400     0.23     1.35 
Merb::AbstractController#before_filters
  0.58    11.38      0.09      100     0.90     4.00  Kernel.catch
  0.58    11.47      0.09      753     0.12    18.46  Array#each
  0.58    11.56      0.09     1200     0.08     0.08  Kernel.respond_to?
  0.58    11.65      0.09      900     0.10     4.29  Logger::LogDevice#write
  0.58    11.74      0.09     1300     0.07     0.07  IO#write
  0.52    11.82      0.08      100     0.80     1.20  String#to_const_string
  0.52    11.90      0.08      400     0.20     0.25  Mash#convert_value
  0.52    11.98      0.08      100     0.80     0.80  Thread#initialize
  0.52    12.06      0.08      400     0.20     0.28 
Mongrel::HttpResponse#write
  0.52    12.14      0.08      400     0.20     0.52 
Merb::Controller#_session_id_key
  0.52    12.22      0.08      100     0.80     5.60 
Merb::Request#controller_class
  0.52    12.30      0.08     1453     0.06     0.06  Kernel.nil?
  0.45    12.37      0.07      400     0.18     0.88 
Merb::AbstractController#after_filters
  0.45    12.44      0.07     1800     0.04     0.04  Module#==  0.45    12.51  
0.07      100     0.70     0.70  Fixnum#>  0.45    12.58      0.07      300  
0.23     0.33  Array#include?
  0.45    12.65      0.07      900     0.08     7.90  Logger#info
  0.45    12.72      0.07      800     0.09     0.09  Symbol#inspect
  0.45    12.79      0.07      200     0.35     1.00 
Mongrel::HttpResponse#send_status
  0.45    12.86      0.07     1000     0.07     0.07  Time#strftime
  0.39    12.92      0.06     1000     0.06     0.06  String#<<
  0.39    12.98      0.06      100     0.60     1.80  Hash#each
  0.39    13.04      0.06      100     0.60     2.80 
Mongrel::HttpResponse#initialize
  0.39    13.10      0.06      200     0.30     0.35  Enumerable.inject
  0.39    13.16      0.06      200     0.30     1.30 
Merb::Request#body_and_query_params
  0.39    13.22      0.06      300     0.20     0.20  Hash#key?
  0.39    13.28      0.06      100     0.60     2.50  Object#full_const_get
  0.39    13.34      0.06     1100     0.05     0.05  Fixnum#-
  0.39    13.40      0.06      100     0.60     7.00 
Merb::AbstractController#dispatch
  0.39    13.46      0.06     1800     0.03     0.03  Thread#current
  0.32    13.51      0.05      100     0.50     1.30 
Merb::CookieStore#initialize
  0.32    13.56      0.05      100     0.50     0.70 
Mongrel::HttpParser#execute
  0.32    13.61      0.05      100     0.50     2.80 
Merb::Controller#set_dispatch_variables
  0.32    13.66      0.05      200     0.25     0.80  Hash#to_mash
  0.32    13.71      0.05      200     0.25     0.40  Mash#initialize
  0.32    13.76      0.05      100     0.50    39.90  Merb::Controller#dispatch
  0.26    13.80      0.04      400     0.10     0.20  Merb::Request#content_type
0.26    13.84      0.04      600     0.07     0.07  String#=  0.26    13.88     
0.04      100     0.40     1.00  Time#httpdate
  0.26    13.92      0.04      100     0.40     1.30  Merb::Request#query_params
0.26    13.96      0.04      200     0.20     0.20  String#tr
  0.26    14.00      0.04     1100     0.04     0.04  Fixnum#<
  0.26    14.04      0.04      100     0.40     0.40 
Merb::AbstractController#default_thrown_content
  0.26    14.08      0.04     1100     0.04     0.04  Array#[]
... snip everything below 0.25%
  0.00    15.53      0.00        1     0.00 15530.00  #toplevel

It looks like logging is the worst offender, with Logger#add and its mutex
(MonitorMixin) accounting for the top 8 slots, 4.09 out of 15.53 seconds, or
26%. In fact, it looks like Logger#add and its children, at 6.98ms per call
and called 1000 times, count for 6.98s in total.

TCPAccept is next, and Merb::Router#match is third. However this only
accounts for 2.00%

Now, I''ve seen figures for Merb performance typically in the 400-600
requests per second range. Are these measured with logging turned off? If
not, I think there is a strong argument for introducing a mutex-free logging
system.

(Simple idea: replace Logger#add with an append to a string buffer, which is
atomic. Then once per second empty the buffer using String#slice! which
should also be atomic, and write that to the log)

Finally, I repeated the test using merb -e production -l WARN

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  4.62     0.42      0.42     1000     0.42     0.51  Logger#add
  3.85     0.77      0.35      100     3.50     6.00  Merb::Router#match
  3.74     1.11      0.34      100     3.40    10.30  IO#close
  3.19     1.40      0.29      100     2.90    79.70 
Mongrel::HttpServer#process_client
  3.19     1.69      0.29      100     2.90    71.40  MerbHandler#process
  2.75     1.94      0.25      900     0.28     0.81  Logger#info
  2.42     2.16      0.22     1600     0.14     0.86  Class#new
  2.31     2.37      0.21      600     0.35     0.50 
Class#read_inheritable_attribute
  2.09     2.56      0.19      400     0.48     0.73  Hash#inspect
  2.09     2.75      0.19      400     0.47     0.47  String#split
  2.09     2.94      0.19     3742     0.05     0.11  Hash#[]
  1.98     3.12      0.18      738     0.24    10.15  Array#each
  1.98     3.30      0.18      100     1.80     2.00  Time#httpdate
  1.65     3.45      0.15      400     0.38     1.23 
Merb::AbstractController#before_filters
  1.65     3.60      0.15      300     0.50     5.93  Merb::Request#params
  1.65     3.75      0.15      100     1.50     1.90  Merb::Request#json_params
  1.54     3.89      0.14      100     1.40     2.10 
Mongrel::HttpRequest#initialize
  1.54     4.03      0.14     1000     0.14     0.17  Time#now
  1.43     4.16      0.13      200     0.65     1.40  Hash#to_mash
  1.43     4.29      0.13      100     1.30    55.30  Merb::Dispatcher#handle
  1.32     4.41      0.12      100     1.20    17.30  Merb::Controller#dispatch
  1.32     4.53      0.12      700     0.17     0.24  Mash#convert_key
  1.32     4.65      0.12      300     0.40     0.80  Mongrel::HeaderOut#[] 
1.21     4.76      0.11      100     1.10     3.90  Merb::Controller#build
  1.10     4.86      0.10      800     0.13     1.03  Merb::Request#route_params
1.10     4.96      0.10      200     0.50     0.55 
Mongrel::HttpResponse#send_header
  0.99     5.05      0.09      300     0.30     1.00  Hash#each_pair
  0.99     5.14      0.09      100     0.90     1.10  Merb::Request#method
  0.99     5.23      0.09       55     1.64    24.73  Kernel.sleep
  0.88     5.31      0.08      200     0.40     0.40  File#expand_path
  0.88     5.39      0.08      400     0.20     0.20  String#%
  0.88     5.47      0.08      200     0.40     0.70  Mash#initialize
  0.88     5.55      0.08      200     0.40     0.55  Mash#include?
  0.77     5.62      0.07      100     0.70     6.10 
Merb::AbstractController#dispatch
  0.77     5.69      0.07      100     0.70     4.90 
Mongrel::HttpResponse#initialize
  0.77     5.76      0.07      200     0.35     1.20 
Mongrel::DirHandler#can_serve
  0.77     5.83      0.07     1800     0.04     0.04  Hash#default
  0.77     5.90      0.07      800     0.09     0.85  Merb::Request#route_match
  0.77     5.97      0.07      100     0.70     1.60 
Merb::Controller#set_dispatch_variables
  0.66     6.03      0.06      400     0.15     0.62 
Merb::AbstractController#after_filters
  0.66     6.09      0.06      100     0.60     0.60  StringScanner#scan
  0.66     6.15      0.06      100     0.60     0.70  Object#blank?
  0.66     6.21      0.06      200     0.30     2.25  Merb::Request#query_parse
  0.66     6.27      0.06      200     0.30     0.30  String#gsub
  0.66     6.33      0.06      400     0.15     0.68 
Merb::Controller#_session_id_key
  0.66     6.39      0.06     1438     0.04     0.04  Kernel.nil?
  0.66     6.45      0.06     1000     0.06     0.06  Hash#[]  0.55     6.50    
0.05       99     0.51    52.63  TCPServer#accept
  0.55     6.55      0.05      200     0.25     1.40  Merb::Request#cookies
  0.55     6.60      0.05      900     0.06     0.06  Array#last
  0.55     6.65      0.05      100     0.50     1.20 
Mongrel::HttpResponse#finished
  0.55     6.70      0.05      100     0.50     5.70 
Merb::SessionMixin.setup_session
  0.55     6.75      0.05       76     0.66     1.18  Array#map
  0.55     6.80      0.05      400     0.13     0.22  Mash#convert_value
  0.55     6.85      0.05      300     0.17     0.23  StringIO#initialize
  0.55     6.90      0.05      400     0.13     0.22 
Mongrel::HttpResponse#write
  0.55     6.95      0.05      400     0.13     0.27  Merb::Request#content_type
0.55     7.00      0.05      100     0.50     4.70 
Merb::Request#controller_class
  0.55     7.05      0.05      100     0.50     2.60  Object#full_const_get
  0.55     7.10      0.05      200     0.25     0.25  String#to_s
  0.55     7.15      0.05      100     0.50     1.00  Thread#new
  0.55     7.20      0.05      100     0.50     0.50  Merb::Controller#status
... snip everything below 0.5%
  0.00     9.09      0.00        1     0.00  9090.00  #toplevel

This is more interesting: even though nothing is actually logged, because of
the large number of calls to Logger#add it still accounts for 4.62%, with
another 2.75% for Logger#info and 1.54% for Time#now (the fact it has been
called 1000 times suggests that timestamps are being made for log entries
which are never written!)

Merb::Router#match is now the number two offender at 3.85%, but this is
probably still not enough by itself to warrant a rewrite.

Regards,

Brian.

Brian Candler

2007-Sep-19 09:36 UTC

head link

Routes

OK, I hacked together a simple replacement logger, and this is what I get
with logging *enabled* (merb -e production)

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  3.69     0.34      0.34      100     3.40    76.60  MerbHandler#process
  3.25     0.64      0.30      800     0.38     0.73  Merb::Logger#info
  3.15     0.93      0.29     3724     0.08     0.20  Hash#[]
  3.04     1.21      0.28      400     0.70     1.47  Hash#inspect
  2.93     1.48      0.27      300     0.90     2.10  Hash#each_pair
  2.60     1.72      0.24       52     4.62    43.27  Kernel.sleep
  2.60     1.96      0.24      200     1.20     1.75 
Mongrel::HttpResponse#send_status
  2.49     2.19      0.23      400     0.58     1.02 
Merb::AbstractController#after_filters
  2.17     2.39      0.20      100     2.00     2.60 
Mongrel::HttpRequest#initialize
  2.17     2.59      0.20      100     2.00    25.80  IO#close
  2.06     2.78      0.19     1600     0.12     0.73  Class#new
  2.06     2.97      0.19      100     1.90     4.60  Merb::Request#query_params
1.95     3.15      0.18      200     0.90     1.60  Mash#include?
  1.95     3.33      0.18      400     0.45     0.45  String#split
  1.84     3.50      0.17      100     1.70     8.50 
Merb::AbstractController#dispatch
  1.63     3.65      0.15      700     0.21     0.29  Mash#convert_key
  1.52     3.79      0.14      100     1.40    19.90  Merb::Controller#dispatch
  1.52     3.93      0.14      100     1.40    70.10 
Mongrel::HttpServer#process_client
  1.41     4.06      0.13      100     1.30     2.80  Merb::Router#match
  1.41     4.19      0.13      800     0.16     0.59  Merb::Request#route_params
1.41     4.32      0.13      300     0.43     0.67  Mongrel::HeaderOut#[]  1.30 
4.44      0.12      300     0.40     0.40  Hash#key?
  1.30     4.56      0.12      400     0.30     0.30  NilClass#inspect
  1.19     4.67      0.11      400     0.28     0.28  Mash#convert_value
  1.19     4.78      0.11     1300     0.08     0.08  Kernel.class
  1.19     4.89      0.11      900     0.12     0.12  String#[]
  1.19     5.00      0.11      100     1.10    56.20  Merb::Dispatcher#handle
  1.08     5.10      0.10      400     0.25     0.70 
Merb::AbstractController#before_filters
  1.08     5.20      0.10      200     0.50     2.10  Mash#default
... snip less than 1%
  0.00     9.22      0.00        1     0.00  9220.00  #toplevel

Performance test using ab -n 1000 http://localhost:4000/hello/world

[Before patch]

Requests per second:    276.78 [#/sec] (mean)
Time per request:       3.613 [ms] (mean)

[After patch]

Requests per second:    401.49 [#/sec] (mean)
Time per request:       2.491 [ms] (mean)

This is on a 1.1GHz Pentium M. It would be interesting to see what people
with higher-performance servers achieve.

Cheers,

Brian.

P.S. In this code, a single write is done at the end of each request.
I''ve
not protected it with a mutex, because I think that IO#write is atomic, but
I may be wrong.

More performance could perhaps be achieved by not flushing after every
request, but by doing this in a background thread. However I think it might
make more sense if we''re going this route to give each request its own
log
buffer, so that logs don''t get interspersed from overlapping requests
in a
multi-threaded application.

Also, I notice that the logger class and initialisation is currently
hard-coded into merb.rb; perhaps it would be better to make this
configurable.
-------------- next part --------------
--- lib/merb.rb.orig	2007-09-19 09:55:24.000000000 +0100
+++ lib/merb.rb	2007-09-19 09:56:37.000000000 +0100
@@ -10,7 +10,7 @@
 end
 require ''fileutils''
 require ''merb/erubis_ext''
-require ''logger''
+require ''merb/logger''
 require ''json''
 require ''set''
 autoload :MerbUploadHandler, ''merb/upload_handler''
@@ -67,8 +67,8 @@
 MERB_VIEW_ROOT = MERB_ROOT / "app/views"
 MERB_SKELETON_DIR = File.join(MERB_FRAMEWORK_ROOT,
''../app_generators/merb/templates'')
 logpath = $TESTING ? "#{MERB_ROOT}/merb_test.log" :
"#{MERB_ROOT}/log/merb.#{Merb::Server.config[:port]}.log"
-MERB_LOGGER = Logger.new(logpath)
-MERB_LOGGER.level = Logger.const_get(Merb::Server.config[:log_level].upcase)
rescue Logger::INFO
+MERB_LOGGER = Merb::Logger.new(logpath)
+MERB_LOGGER.level =
Merb::Logger.const_get(Merb::Server.config[:log_level].upcase) rescue
Merb::Logger::INFO
 MERB_PATHS = [ 
   "/app/models/**/*.rb",
   "/app/controllers/application.rb",
@@ -82,4 +82,4 @@
 if $TESTING
   test_files = File.join(lib, ''test'',
''*.rb'')
   Dir[test_files].each { |file| require file }
-end
\ No newline at end of file
+end
--- lib/merb/mongrel_handler.rb.orig	2007-09-16 19:32:04.000000000 +0100
+++ lib/merb/mongrel_handler.rb	2007-09-19 10:10:39.000000000 +0100
@@ -145,8 +145,8 @@
       total_request_time = Time.now - start
       benchmarks[:total_request_time] = total_request_time
         
-      MERB_LOGGER.info("Request Times: #{benchmarks.inspect}")
-      MERB_LOGGER.info("Response status: #{response.status}\nComplete
Request took: #{total_request_time} seconds, #{1.0/total_request_time}
Requests/Second\n\n")
+      MERB_LOGGER.info("Request Times: #{benchmarks.inspect}\nResponse
status: #{response.status}\nComplete Request took: #{total_request_time}
seconds, #{1.0/total_request_time} Requests/Second\n\n")
+      MERB_LOGGER.flush
     end
   rescue Object => e
     # if an exception is raised here then something is 
--- lib/merb/logger.rb.orig	2007-09-19 10:14:33.000000000 +0100
+++ lib/merb/logger.rb	2007-09-19 10:06:37.000000000 +0100
@@ -0,0 +1,70 @@
+module Merb
+  # Logging is called several times for each request, so keep it short and
+  # sweet. Use << for atomic add to buffer, and slice! for atomic
removal.
+
+  class Logger
+    module Severity
+      DEBUG = 0
+      INFO = 1
+      WARN = 2
+      ERROR = 3
+      FATAL = 4
+      UNKNOWN = 5
+    end
+    include Severity
+
+    attr_accessor :level
+    attr_reader :buf
+    def initialize(log)
+      @level = level
+      @buf = ""
+      if log.respond_to?(:write)
+        @log = log
+      elsif File.exist?(log)
+        @log = open(log, (File::WRONLY | File::APPEND))
+        @log.sync = true
+      else
+        @log = open(log, (File::WRONLY | File::APPEND | File::CREAT))
+        @log.sync = true
+        @log.write("# Logfile created on %s" % [Time.now.to_s])
+      end
+    end
+
+    def flush
+      return if @buf.size == 0
+      @log.write @buf.slice!(0..-1)
+    end
+
+    def close
+      flush
+      @log.close if @log.respond_to? :close
+      @log = nil
+    end
+
+    def debug(msg)
+      return if @level > DEBUG
+      msg << "\n" unless msg[-1] == ?\n
+      @buf << msg
+    end
+    def info(msg)
+      return if @level > INFO
+      msg << "\n" unless msg[-1] == ?\n
+      @buf << msg
+    end
+    def warn(msg)
+      return if @level > WARN
+      msg << "\n" unless msg[-1] == ?\n
+      @buf << msg
+    end
+    def error(msg)
+      return if @level > ERROR
+      msg << "\n" unless msg[-1] == ?\n
+      @buf << msg
+    end
+    def error(msg)
+      return if @level > FATAL
+      msg << "\n" unless msg[-1] == ?\n
+      @buf << msg
+    end
+  end
+end

Ezra Zygmuntowicz

2007-Sep-19 17:33 UTC

head link

Routes

Hi~

On Sep 19, 2007, at 2:36 AM, Brian Candler wrote:
> OK, I hacked together a simple replacement logger, and this is what  
> I get
> with logging *enabled* (merb -e production)
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>   3.69     0.34      0.34      100     3.40    76.60   
> MerbHandler#process
>   3.25     0.64      0.30      800     0.38     0.73   
> Merb::Logger#info
>   3.15     0.93      0.29     3724     0.08     0.20  Hash#[]
>   3.04     1.21      0.28      400     0.70     1.47  Hash#inspect
>   2.93     1.48      0.27      300     0.90     2.10  Hash#each_pair
>   2.60     1.72      0.24       52     4.62    43.27  Kernel.sleep
>   2.60     1.96      0.24      200     1.20     1.75   
> Mongrel::HttpResponse#send_status
>   2.49     2.19      0.23      400     0.58     1.02   
> Merb::AbstractController#after_filters
>   2.17     2.39      0.20      100     2.00     2.60   
> Mongrel::HttpRequest#initialize
>   2.17     2.59      0.20      100     2.00    25.80  IO#close
>   2.06     2.78      0.19     1600     0.12     0.73  Class#new
>   2.06     2.97      0.19      100     1.90     4.60   
> Merb::Request#query_params  1.95     3.15      0.18      200      
> 0.90     1.60  Mash#include?
>   1.95     3.33      0.18      400     0.45     0.45  String#split
>   1.84     3.50      0.17      100     1.70     8.50   
> Merb::AbstractController#dispatch
>   1.63     3.65      0.15      700     0.21     0.29  Mash#convert_key
>   1.52     3.79      0.14      100     1.40    19.90   
> Merb::Controller#dispatch
>   1.52     3.93      0.14      100     1.40    70.10   
> Mongrel::HttpServer#process_client
>   1.41     4.06      0.13      100     1.30     2.80   
> Merb::Router#match
>   1.41     4.19      0.13      800     0.16     0.59   
> Merb::Request#route_params  1.41     4.32      0.13      300      
> 0.43     0.67  Mongrel::HeaderOut#[]>   1.30     4.44      0.12      300
0.40     0.40  Hash#key?
>   1.30     4.56      0.12      400     0.30     0.30  NilClass#inspect
>   1.19     4.67      0.11      400     0.28     0.28   
> Mash#convert_value
>   1.19     4.78      0.11     1300     0.08     0.08  Kernel.class
>   1.19     4.89      0.11      900     0.12     0.12  String#[]
>   1.19     5.00      0.11      100     1.10    56.20   
> Merb::Dispatcher#handle
>   1.08     5.10      0.10      400     0.25     0.70   
> Merb::AbstractController#before_filters
>   1.08     5.20      0.10      200     0.50     2.10  Mash#default
> ... snip less than 1%
>   0.00     9.22      0.00        1     0.00  9220.00  #toplevel
>
> Performance test using ab -n 1000 http://localhost:4000/hello/world
>
> [Before patch]
>
> Requests per second:    276.78 [#/sec] (mean)
> Time per request:       3.613 [ms] (mean)
>
> [After patch]
>
> Requests per second:    401.49 [#/sec] (mean)
> Time per request:       2.491 [ms] (mean)
>
> This is on a 1.1GHz Pentium M. It would be interesting to see what  
> people
> with higher-performance servers achieve.
>
> Cheers,
>
> Brian.
>
> P.S. In this code, a single write is done at the end of each  
> request. I''ve
> not protected it with a mutex, because I think that IO#write is  
> atomic, but
> I may be wrong.
>
> More performance could perhaps be achieved by not flushing after every
> request, but by doing this in a background thread. However I think  
> it might
> make more sense if we''re going this route to give each request its
> own log
> buffer, so that logs don''t get interspersed from overlapping  
> requests in a
> multi-threaded application.
>
> Also, I notice that the logger class and initialisation is currently
> hard-coded into merb.rb; perhaps it would be better to make this
> configurable.
> <newlog.diff>

	Nice one. We were actually just talking about fixing the logger the  
other day after some profiling and I was going to make a logger that  
collected up the strings and then output them once per request. I;ll  
take a look at your patch and will probably make the logger  
configurable so we can test different loggers.

Cheers-
-- Ezra Zygmuntowicz 
-- Founder & Ruby Hacker
-- ez at engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

Possibly Parallel Threads

Search for more possibly parallel threads

Merb devel - Sep 2007 - Routes

Routes

Routes

Routes

Routes

Routes

Routes

Routes

Routes

Routes

Routes

Routes

Possibly Parallel Threads