Is the DLR going to be fixed so that it properly supports Unicode source files or is this an issue with IronRuby? If you attempt to create a new Code File with Visual Studio 2008 and call it test.rb and then execute it with:

    ScriptRuntime runtime = IronRuby.Ruby.CreateRuntime();
    runtime.ExecuteFile( "test.rb" );

it blows up on the Unicode byte-order marker with:

    Unhandled Exception: Microsoft.Scripting.SyntaxErrorException: Invalid character '?' in expression
       at Microsoft.Scripting.ErrorSink.Add(SourceUnit source, String message, SourceSpan span, Int32 errorCode, Severity severity) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\ErrorSink.cs:line 34
       at Microsoft.Scripting.ErrorCounter.Add(SourceUnit source, String message, SourceSpan span, Int32 errorCode, Severity severity) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\ErrorSink.cs:line 92
       at IronRuby.Compiler.Tokenizer.Report(String message, Int32 errorCode, SourceSpan location, Severity severity) in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Tokenizer.cs:line 430
       at IronRuby.Compiler.Tokenizer.ReportError(ErrorInfo info, Object[] args) in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Tokenizer.cs:line 442
       at IronRuby.Compiler.Tokenizer.Tokenize(Boolean whitespaceSeen, Boolean cmdState) in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Tokenizer.cs:line 966
       at IronRuby.Compiler.Tokenizer.Tokenize() in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Tokenizer.cs:line 739
       at IronRuby.Compiler.Tokenizer.GetNextToken() in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Tokenizer.cs:line 711
       at IronRuby.Compiler.Parser.GetNextToken() in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Parser.cs:line 99
       at IronRuby.Compiler.ShiftReduceParser`2.Parse() in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\GPPG.cs:line 310
       at IronRuby.Compiler.Parser.Parse(SourceUnit sourceUnit, RubyCompilerOptions options, ErrorSink errorSink) in C:\Users\ted\Desktop\IronRuby\src\ironruby\Compiler\Parser\Parser.cs:line 158
       at IronRuby.Runtime.RubyContext.ParseSourceCode(SourceUnit sourceUnit, RubyCompilerOptions options, ErrorSink errorSink) in C:\Users\ted\Desktop\IronRuby\src\ironruby\Runtime\RubyContext.cs:line 203
       at IronRuby.Runtime.RubyContext.CompileSourceCode(SourceUnit sourceUnit, CompilerOptions options, ErrorSink errorSink) in C:\Users\ted\Desktop\IronRuby\src\ironruby\Runtime\RubyContext.cs:line 179
       at Microsoft.Scripting.SourceUnit.Compile(CompilerOptions options, ErrorSink errorSink) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\SourceUnit.cs:line 215
       at Microsoft.Scripting.SourceUnit.Execute(Scope scope, ErrorSink errorSink) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\SourceUnit.cs:line 225
       at Microsoft.Scripting.Hosting.ScriptSource.Execute(ScriptScope scope) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\Hosting\ScriptSource.cs:line 129
       at Microsoft.Scripting.Hosting.ScriptEngine.ExecuteFile(String path, ScriptScope scope) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\Hosting\ScriptEngine.cs:line 159
       at Microsoft.Scripting.Hosting.ScriptEngine.ExecuteFile(String path) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\Hosting\ScriptEngine.cs:line 148
       at Microsoft.Scripting.Hosting.ScriptRuntime.ExecuteFile(String path) in C:\Users\ted\Desktop\IronRuby\src\Microsoft.Scripting\Hosting\ScriptRuntime.cs:line 257
       at HostingDLRConsole.Program.Main(String[] args) in C:\Users\ted\Documents\Visual Studio 2008\Projects\Books\IronRuby in Action\HostingDLRConsole\HostingDLRConsole\Program.cs:line 14
    Press any key to continue . . .

I know I can fix this by using the Advanced Save Options but the DLR spec talks about Unicode support, so I assume this means that ScriptRuntime.ExecuteFile() should also support Unicode source files.
We do this for compatibility with Ruby 1.8.6, though as you can see, we don't have the error message quite right:

    PS F:\> C:\ruby\bin\ruby.exe x.rb
    x.rb:1: Invalid char `\377' in expression
    x.rb:1: Invalid char `\376' in expression

:)

I believe you'll need to save as UTF-8 and then manually strip the BOM in order to use Unicode source files -- hopefully Tomas will tell me if I'm wrong.

Source encoding for Ruby is extremely tricky, and (from what I can tell) hasn't even yet been finalized for 1.9.x. We will eventually support whatever the Ruby standards are.

-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at rubyforge.org] On Behalf Of Ted Milker
Sent: Sunday, October 26, 2008 9:38 AM
To: ironruby-core at rubyforge.org
Subject: [Ironruby-core] Unicode Source Files

_______________________________________________
Ironruby-core mailing list
Ironruby-core at rubyforge.org
http://rubyforge.org/mailman/listinfo/ironruby-core
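Curt's suggestion to strip the BOM by hand can be sketched in Ruby. This is only an illustration under assumptions: it targets a modern Ruby (2.0 or later, for String#b), it handles only the three-byte UTF-8 BOM (EF BB BF), and strip_utf8_bom is a hypothetical helper name, not part of Ruby or IronRuby.

```ruby
# The UTF-8 byte-order mark as raw bytes.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper (not a Ruby or IronRuby API): returns the given
# bytes without a leading UTF-8 BOM, or unchanged if there is none.
def strip_utf8_bom(bytes)
  data = bytes.b # binary copy, so the comparison is byte-wise
  data.start_with?(UTF8_BOM) ? data.byteslice(3..-1) : data
end

# Simulated contents of a file saved as "UTF-8 with signature".
source  = "\xEF\xBB\xBFputs 'hello'".b
cleaned = strip_utf8_bom(source)

puts cleaned
```

To rewrite a file in place, something along the lines of File.binwrite(path, strip_utf8_bom(File.binread(path))) should do. A UTF-16 file (the \377 and \376 bytes in the MRI output above are a UTF-16 BOM) would have to be transcoded to UTF-8 first; stripping alone would not be enough.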
Why so rigorous? I understand the need to maintain compatibility, but this effectively eliminates Visual Studio as an editor for .rb files, without some kind of clunky build mechanism. I guess I will just use an extension method to get around the behavior for the time being.

From the things I have read about Ruby and UTF-8, it seems more like it is just extremely broken, rather than extremely tricky. I still cannot even get pure Ruby stuff in Windows to work properly with UTF-8, like when using the Shoes toolkit for example.

On Sun, Oct 26, 2008 at 11:52 AM, Curt Hagenlocher <curth at microsoft.com> wrote:
> We do this for compatibility with Ruby 1.8.6, though as you can see, we don't have the error message quite right:
>
> PS F:\> C:\ruby\bin\ruby.exe x.rb
> x.rb:1: Invalid char `\377' in expression
> x.rb:1: Invalid char `\376' in expression
>
> :)
>
> I believe you'll need to save as UTF-8 and then manually strip the BOM in order to use Unicode source files -- hopefully Tomas will tell me if I'm wrong.
>
> Source encoding for Ruby is extremely tricky, and (from what I can tell) hasn't even yet been finalized for 1.9.x. We will eventually support whatever the Ruby standards are.
Here is the extension method I am using if anyone else is interested:

    public static object ExecuteUnicodeFile( this ScriptRuntime rt, string filename )
    {
        string rbCode;

        // OpenText will strip the BOM and keep the Unicode intact
        using( var rdr = File.OpenText( filename ) )
        {
            rbCode = rdr.ReadToEnd();
        }

        return IronRuby.Ruby.GetEngine( rt ).Execute( rbCode );
    }

It works great for using Japanese in strings in Ruby with IronRuby and WPF.
A lot of the work in 1.9 and 2.0 has gone to better Unicode support. Most string handling functions are now codepoint aware, and there is now the ability for the source file to have an encoding attached to it. Like Curt said, these are in flux, but they are spec'd in RubySpec, so they are more than just fleeting ideas.

If you are able to solve this with an extension method, then it looks likely that any VS integration work for IRb will take care of that. As it is, I use GVim for most of my Ruby coding these days. :)

JD

-----Original Message-----
From: Ted Milker <tmilker at gmail.com>
Sent: October 26, 2008 12:08 PM
To: ironruby-core at rubyforge.org <ironruby-core at rubyforge.org>
Subject: Re: [Ironruby-core] Unicode Source Files
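The per-file encoding Jim mentions is declared, in Ruby 1.9 as it eventually shipped, with a "magic comment" at the top of the file. A minimal sketch, assuming a 1.9 or later interpreter and a file saved as UTF-8 without a BOM; the variable name and sample text are only for illustration:

```ruby
# encoding: utf-8
# The magic comment above must be the first line of the file
# (or the second, when the first is a shebang line).

greeting = "こんにちは" # non-ASCII literal, read according to the declared encoding

puts greeting.encoding # encoding attached to the string
puts greeting.length   # counts characters on 1.9+
puts greeting.bytesize # counts bytes
```

The distinction between length and bytesize is the "codepoint aware" behavior described above: on 1.8.6, where a string is a plain byte sequence, length would report the byte count instead.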
If you save in "Western European (Windows) - Codepage 1252" from within Visual Studio, you'll get the right result -- as long as you're not using any characters with a codepoint greater than 127. And if you are, you're probably better off anyway expressing this code point as an explicit set of UTF-8 compatible bytes because -- as you've noticed -- Ruby's currently a bit weird in its Unicode support.

-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at rubyforge.org] On Behalf Of Ted Milker
Sent: Sunday, October 26, 2008 11:34 AM
To: ironruby-core at rubyforge.org
Subject: Re: [Ironruby-core] Unicode Source Files
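Expressing a character as explicit UTF-8 bytes, as Curt suggests, keeps the source file pure ASCII, so the encoding the editor saves in no longer matters. A minimal sketch, assuming Ruby 1.9 or later for the encoding calls (on 1.8.6 a string is just bytes, so the escaped literal alone is the whole story); the variable name and the sample text are illustrative:

```ruby
# Two kanji written as explicit UTF-8 byte escapes:
#   U+65E5 -> E6 97 A5, U+672C -> E6 9C AC
# .dup yields a mutable copy so force_encoding can tag the bytes as UTF-8.
nihon = "\xE6\x97\xA5\xE6\x9C\xAC".dup.force_encoding(Encoding::UTF_8)

puts nihon.length          # number of characters
puts nihon.bytesize        # number of bytes
puts nihon.valid_encoding? # whether the bytes are well-formed UTF-8
```

The trade-off is readability: the literal is opaque to anyone reading the source, so a comment naming the characters is worth keeping next to it.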
On Sun, Oct 26, 2008 at 3:17 PM, Jim Deville <jdeville at microsoft.com> wrote:
> If you are able to solve this with an extension method, then it looks likely that any VS integration work for IRb will take care of that. As it is, I use GVim for most of my Ruby coding these days. :)

I use ViEmu for the best of both worlds. :)
You can switch to 1.9 compat mode by passing the -19 argument on the command line.

Tomas

-----Original Message-----
From: ironruby-core-bounces at rubyforge.org [mailto:ironruby-core-bounces at rubyforge.org] On Behalf Of Ted Milker
Sent: Sunday, October 26, 2008 1:57 PM
To: ironruby-core at rubyforge.org
Subject: Re: [Ironruby-core] Unicode Source Files