tfpt review "/shelveset:re;REDMOND\sborde" Comment : Adds tests in library\regexp\match_specs.rb and language\regexp_specs.rb Fixes the issues found by it. \c support had a typo had checked for \C instead. Added support for predefined character classes like [:alpha:]. I created a new class called RegexpTransformer for this to convert from Ruby regexp to CLR regexp pattern. Its state-driven and so can be extended if we need to do more complex analysis if needed. There are a few cases where we might need to do this in the future, and also if we want to give better error messages for bad regexps. Added Debug-only command line option called -compileRegexps to check perf impact of compiling Regexps to IL. It gives a 50%-300% improvement in throughput. Have not measured startup impact. The command line option will let us play with it easily. Added -ruby19 command line option to RunRSpec There are few more known issues with regexps that I will get to next. -------------- next part -------------- A non-text attachment was scrubbed... Name: re.diff Type: application/octet-stream Size: 27662 bytes Desc: re.diff URL: <http://rubyforge.org/pipermail/ironruby-core/attachments/20090205/5b2f8d68/attachment-0001.obj>
RegexpSpecs.syntax_error, and the similar test in language/regexp_specs.rb is
testing the parser, not regexps.
For the others, you are obscuring the code too much. Methods in the fixture
files shouldn''t hide the behavior, methods in the fixture files should
be a convenient way to get at data.
  it "returns nil if the object is nil" do
    /\w+/.send(@method, nil).should == nil
  end
  it ''supports escape characters'' do
    RegexpSpecs.match(@method, /\t/, "\t", ["\t"]) #
horizontal tab
    #...
  End
I know what the first one is doing, and if I want to see how Regexp#match, or
the other shared methods, behave I can see that easily. RegexpSpecs.match is
obscuring a lot of the behavior in the fixture files. In addition, since :=~ and
:match have slightly different behaviors, they shouldn''t be using a
shared spec. One way to reduce the code in a case like this is to use fixtures
to define character classes (like RegexpSpecs.blanks), then you could do loops
in each spec file to spec the behaviors in one line.
So the it ''supports escape characters'' do line becomes:
In classes.rb: 
#....
def self.escape_characters
  %w{\t \v \n \r \f \a \e}
end
In match_spec.rb under describe "Regexp#match":
it "supports escape characters" do
  RegexpSpecs.escape_characters.each do |char|
    char.send(@method, /#{char}/).to_a.should == [char]
  end
end
In match_spec.rb under describe "Regexp#=~ on a successful match":
it "supports escape characters" do 
  RegexpSpecs.escape_characters.each do |char|
    (/#{char}/ =~ char).should == 0
  end
end
Similar simplifications can be done for the others.
The idea is that each spec tests a facet of behavior of a method. If you are
trying to combine two facets (via case statements in this case), then you really
have two specs.
You can see my discussion with Brian Ford and Evan Phoenix about this here:
http://logs.jruby.org/rubyspec/2009-02-06.html.
JD
-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at
rubyforge.org] On Behalf Of Shri Borde
Sent: Thursday, February 05, 2009 11:41 PM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: [Ironruby-core] Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Adds tests in library\regexp\match_specs.rb and language\regexp_specs.rb
  Fixes the issues found by it. \c support had a typo had checked for \C
instead. Added support for predefined character classes like [:alpha:]. I
created a new class called RegexpTransformer for this to convert from Ruby
regexp to CLR regexp pattern. Its state-driven and so can be extended if we need
to do more complex analysis if needed. There are a few cases where we might need
to do this in the future, and also if we want to give better error messages for
bad regexps.
  Added Debug-only command line option called -compileRegexps to check perf
impact of compiling Regexps to IL. It gives a 50%-300% improvement in
throughput. Have not measured startup impact. The command line option will let
us play with it easily.
  Added -ruby19 command line option to RunRSpec
  
  There are few more known issues with regexps that I will get to next.
Test review feedback has been incorporated.
  tfpt review "/shelveset:re2;REDMOND\sborde"
I still need a code review...
-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at
rubyforge.org] On Behalf Of Jim Deville
Sent: Friday, February 06, 2009 12:57 PM
To: ironruby-core at rubyforge.org; IronRuby External Code Reviewers
Subject: Re: [Ironruby-core] Code Review: re
RegexpSpecs.syntax_error, and the similar test in language/regexp_specs.rb is
testing the parser, not regexps.
For the others, you are obscuring the code too much. Methods in the fixture
files shouldn''t hide the behavior, methods in the fixture files should
be a convenient way to get at data.
  it "returns nil if the object is nil" do
    /\w+/.send(@method, nil).should == nil
  end
  it ''supports escape characters'' do
    RegexpSpecs.match(@method, /\t/, "\t", ["\t"]) #
horizontal tab
    #...
  End
I know what the first one is doing, and if I want to see how Regexp#match, or
the other shared methods, behave I can see that easily. RegexpSpecs.match is
obscuring a lot of the behavior in the fixture files. In addition, since :=~ and
:match have slightly different behaviors, they shouldn''t be using a
shared spec. One way to reduce the code in a case like this is to use fixtures
to define character classes (like RegexpSpecs.blanks), then you could do loops
in each spec file to spec the behaviors in one line.
So the it ''supports escape characters'' do line becomes:
In classes.rb: 
#....
def self.escape_characters
  %w{\t \v \n \r \f \a \e}
end
In match_spec.rb under describe "Regexp#match":
it "supports escape characters" do
  RegexpSpecs.escape_characters.each do |char|
    char.send(@method, /#{char}/).to_a.should == [char]
  end
end
In match_spec.rb under describe "Regexp#=~ on a successful match":
it "supports escape characters" do 
  RegexpSpecs.escape_characters.each do |char|
    (/#{char}/ =~ char).should == 0
  end
end
Similar simplifications can be done for the others.
The idea is that each spec tests a facet of behavior of a method. If you are
trying to combine two facets (via case statements in this case), then you really
have two specs.
You can see my discussion with Brian Ford and Evan Phoenix about this here:
http://logs.jruby.org/rubyspec/2009-02-06.html.
JD
-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at
rubyforge.org] On Behalf Of Shri Borde
Sent: Thursday, February 05, 2009 11:41 PM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: [Ironruby-core] Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Adds tests in library\regexp\match_specs.rb and language\regexp_specs.rb
  Fixes the issues found by it. \c support had a typo had checked for \C
instead. Added support for predefined character classes like [:alpha:]. I
created a new class called RegexpTransformer for this to convert from Ruby
regexp to CLR regexp pattern. Its state-driven and so can be extended if we need
to do more complex analysis if needed. There are a few cases where we might need
to do this in the future, and also if we want to give better error messages for
bad regexps.
  Added Debug-only command line option called -compileRegexps to check perf
impact of compiling Regexps to IL. It gives a 50%-300% improvement in
throughput. Have not measured startup impact. The command line option will let
us play with it easily.
  Added -ruby19 command line option to RunRSpec
  
  There are few more known issues with regexps that I will get to next.
_______________________________________________
Ironruby-core mailing list
Ironruby-core at rubyforge.org
http://rubyforge.org/mailman/listinfo/ironruby-core
-------------- next part --------------
A non-text attachment was scrubbed...
Name: re2.diff
Type: application/octet-stream
Size: 27381 bytes
Desc: re2.diff
URL:
<http://rubyforge.org/pipermail/ironruby-core/attachments/20090208/726e6c71/attachment-0001.obj>
Looks good.
Tomas
-----Original Message-----
From: Shri Borde 
Sent: Sunday, February 08, 2009 2:25 PM
To: ironruby-core at rubyforge.org; IronRuby External Code Reviewers
Subject: RE: Code Review: re
Test review feedback has been incorporated.
  tfpt review "/shelveset:re2;REDMOND\sborde"
I still need a code review...
-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at
rubyforge.org] On Behalf Of Jim Deville
Sent: Friday, February 06, 2009 12:57 PM
To: ironruby-core at rubyforge.org; IronRuby External Code Reviewers
Subject: Re: [Ironruby-core] Code Review: re
RegexpSpecs.syntax_error, and the similar test in language/regexp_specs.rb is
testing the parser, not regexps.
For the others, you are obscuring the code too much. Methods in the fixture
files shouldn''t hide the behavior, methods in the fixture files should
be a convenient way to get at data.
  it "returns nil if the object is nil" do
    /\w+/.send(@method, nil).should == nil
  end
  it ''supports escape characters'' do
    RegexpSpecs.match(@method, /\t/, "\t", ["\t"]) #
horizontal tab
    #...
  End
I know what the first one is doing, and if I want to see how Regexp#match, or
the other shared methods, behave I can see that easily. RegexpSpecs.match is
obscuring a lot of the behavior in the fixture files. In addition, since :=~ and
:match have slightly different behaviors, they shouldn''t be using a
shared spec. One way to reduce the code in a case like this is to use fixtures
to define character classes (like RegexpSpecs.blanks), then you could do loops
in each spec file to spec the behaviors in one line.
So the it ''supports escape characters'' do line becomes:
In classes.rb: 
#....
def self.escape_characters
  %w{\t \v \n \r \f \a \e}
end
In match_spec.rb under describe "Regexp#match":
it "supports escape characters" do
  RegexpSpecs.escape_characters.each do |char|
    char.send(@method, /#{char}/).to_a.should == [char]
  end
end
In match_spec.rb under describe "Regexp#=~ on a successful match":
it "supports escape characters" do 
  RegexpSpecs.escape_characters.each do |char|
    (/#{char}/ =~ char).should == 0
  end
end
Similar simplifications can be done for the others.
The idea is that each spec tests a facet of behavior of a method. If you are
trying to combine two facets (via case statements in this case), then you really
have two specs.
You can see my discussion with Brian Ford and Evan Phoenix about this here:
http://logs.jruby.org/rubyspec/2009-02-06.html.
JD
-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at
rubyforge.org] On Behalf Of Shri Borde
Sent: Thursday, February 05, 2009 11:41 PM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: [Ironruby-core] Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Adds tests in library\regexp\match_specs.rb and language\regexp_specs.rb
  Fixes the issues found by it. \c support had a typo had checked for \C
instead. Added support for predefined character classes like [:alpha:]. I
created a new class called RegexpTransformer for this to convert from Ruby
regexp to CLR regexp pattern. Its state-driven and so can be extended if we need
to do more complex analysis if needed. There are a few cases where we might need
to do this in the future, and also if we want to give better error messages for
bad regexps.
  Added Debug-only command line option called -compileRegexps to check perf
impact of compiling Regexps to IL. It gives a 50%-300% improvement in
throughput. Have not measured startup impact. The command line option will let
us play with it easily.
  Added -ruby19 command line option to RunRSpec
  
  There are few more known issues with regexps that I will get to next.
_______________________________________________
Ironruby-core mailing list
Ironruby-core at rubyforge.org
http://rubyforge.org/mailman/listinfo/ironruby-core
tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Fixes couple of remaining blocking bugs in re:
  Ruby''s /(?:m)/m should map to CLR''s "(?:s)"
  Hacky fix for /(expr){N}+/ should be treated like /(?:expr{N})+/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: re.diff
Type: application/octet-stream
Size: 7664 bytes
Desc: re.diff
URL:
<http://rubyforge.org/pipermail/ironruby-core/attachments/20090211/91ad8e71/attachment.obj>
Test looks good. I''ll also work on making the diff script include our
tests in diff''s so the list can see them.
JD
-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at
rubyforge.org] On Behalf Of Shri Borde
Sent: Wednesday, February 11, 2009 10:47 AM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: [Ironruby-core] Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Fixes couple of remaining blocking bugs in re:
  Ruby''s /(?:m)/m should map to CLR''s "(?:s)"
  Hacky fix for /(expr){N}+/ should be treated like /(?:expr{N})+/
Looks good.
Tomas
-----Original Message-----
From: Shri Borde 
Sent: Wednesday, February 11, 2009 10:47 AM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Fixes couple of remaining blocking bugs in re:
  Ruby''s /(?:m)/m should map to CLR''s "(?:s)"
  Hacky fix for /(expr){N}+/ should be treated like /(?:expr{N})+/
tfpt review "/shelveset:re;REDMOND\sborde" Comment : Regexp literal support for /o. /foo/ should compare equal with /foo/o, so we ensure that RubyRegexpOptions.Once is passed into the RubyOps runtime helper, but we then remove it in the RubyRegex constructor Caching of the RubyRegex (and the underlying System.Text.RegularExpressions.Regex) for regexp literals so that the expression is processed only once Sort the method names in the generated ReflectionCache. This will make reduce the chances of conflicts during merging. Currently, the order is whatever System.Reflection choses. -------------- next part -------------- A non-text attachment was scrubbed... Name: re.diff Type: application/octet-stream Size: 96529 bytes Desc: re.diff URL: <http://rubyforge.org/pipermail/ironruby-core/attachments/20090223/1be07af3/attachment-0001.obj>
ReflectionCacheGenerator: More concise way how to sort an array: 
var methods = ReflectMethods(typeof(RubyOps)).Values;
Array.Sort(methods, (m1, m2) => m1.Name.CompareTo(m2.Name));
ScringConstructor:
I would prefer TransformConcatenation to take 2 parameters (Expr regexOptions,
Expr regexCache) rather than params array.
Then this would not be necessary:
                    List<MSA.Expression> allArgs = new
List<MSA.Expression>(2 + additionalArgs.Length);
                    allArgs.Add(paramArray);
                    allArgs.Add(codePage);
                    allArgs.AddRange(additionalArgs);
and we could do just:
			  if (regexOptions != null) 
                      opFactory("N").OpCall(paramArray, codePage,
regexOptions, regexCache) : ...
Other than that looks good.
Tomas
-----Original Message-----
From: Shri Borde 
Sent: Monday, February 23, 2009 10:11 AM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Regexp literal support for /o.
  /foo/ should compare equal with /foo/o, so we ensure that
RubyRegexpOptions.Once is passed into the RubyOps runtime helper, but we then
remove it in the RubyRegex constructor
  Caching of the RubyRegex (and the underlying
System.Text.RegularExpressions.Regex) for regexp literals so that the expression
is processed only once
  Sort the method names in the generated ReflectionCache. This will make reduce
the chances of conflicts during merging. Currently, the order is whatever
System.Reflection choses.
Array.Sort(methods, ...) does not work because methods is an IEnumerable, not
IList, and so I have to create an auxiliary list. I did use the inline lambda
for the comparision delegate. Changed TransformConcatenation too.
Thanks,
Shri
-----Original Message-----
From: Tomas Matousek 
Sent: Monday, February 23, 2009 11:15 AM
To: Shri Borde; IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: RE: Code Review: re
ReflectionCacheGenerator: More concise way how to sort an array: 
var methods = ReflectMethods(typeof(RubyOps)).Values;
Array.Sort(methods, (m1, m2) => m1.Name.CompareTo(m2.Name));
ScringConstructor:
I would prefer TransformConcatenation to take 2 parameters (Expr regexOptions,
Expr regexCache) rather than params array.
Then this would not be necessary:
                    List<MSA.Expression> allArgs = new
List<MSA.Expression>(2 + additionalArgs.Length);
                    allArgs.Add(paramArray);
                    allArgs.Add(codePage);
                    allArgs.AddRange(additionalArgs);
and we could do just:
			  if (regexOptions != null) 
                      opFactory("N").OpCall(paramArray, codePage,
regexOptions, regexCache) : ...
Other than that looks good.
Tomas
-----Original Message-----
From: Shri Borde 
Sent: Monday, February 23, 2009 10:11 AM
To: IronRuby External Code Reviewers
Cc: ironruby-core at rubyforge.org
Subject: Code Review: re
  tfpt review "/shelveset:re;REDMOND\sborde"
  Comment  : 
  Regexp literal support for /o.
  /foo/ should compare equal with /foo/o, so we ensure that
RubyRegexpOptions.Once is passed into the RubyOps runtime helper, but we then
remove it in the RubyRegex constructor
  Caching of the RubyRegex (and the underlying
System.Text.RegularExpressions.Regex) for regexp literals so that the expression
is processed only once
  Sort the method names in the generated ReflectionCache. This will make reduce
the chances of conflicts during merging. Currently, the order is whatever
System.Reflection choses.