takumi iino
2007-Jan-12 10:39 UTC
[Mechanize-users] why does to_absolute_uri use URI.unescape?
Hello.

This code aborts with Mechanize 0.6.4.

----------------------------
# sample.rb
require "rubygems"
require "mechanize"
agent = WWW::Mechanize.new
agent.user_agent_alias = 'Windows Mozilla'
# top page of the Japanese Wikipedia
agent.get("http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8")
----------------------------

> ruby sample.rb
C:/opt/ruby-1.8/lib/ruby/1.8/uri/common.rb:432:in `split': bad URI(is not URI?): http://ja.wikipedia.org/wiki/??????????? (URI::InvalidURIError)
        from C:/opt/ruby-1.8/lib/ruby/1.8/uri/common.rb:481:in `parse'
        from C:/opt/ruby-1.8/lib/ruby/gems/1.8/gems/mechanize-0.6.4/lib/mechanize.rb:272:in `to_absolute_uri'
        from C:/opt/ruby-1.8/lib/ruby/gems/1.8/gems/mechanize-0.6.4/lib/mechanize.rb:141:in `get'
        from sample.rb:6

This is the relevant part of to_absolute_uri in mechanize.rb:

    url = URI.parse(
            URI.unescape(Util.html_unescape(url.to_s.strip)).gsub(/ /, '%20')
          ) unless url.is_a? URI

This code cannot handle escaped multibyte characters: URI.unescape turns
the percent-encoded bytes back into raw multibyte characters, and
URI.parse then rejects the result.

Why is URI.unescape( "uri" ).gsub(/ /, '%20') used here? I guess it is
not needed, and the following would be enough:

    url = URI.parse(
            Util.html_unescape(url.to_s.strip)
          ) unless url.is_a? URI

---------
takumi
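
The failure can be reproduced without Mechanize. The sketch below is only an illustration under some assumptions: `URI.unescape` no longer exists in current Ruby (it was removed in 3.0), so `URI.decode_www_form_component` is used as a stand-in for it, and `parseable?` is a hypothetical helper that is not part of Mechanize or the `uri` library.

```ruby
require "uri"

# URL from the report above: the path is already percent-encoded UTF-8.
ESCAPED_URL = "http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8"

# Stand-in for URI.unescape (an assumption, see above): decode the
# percent-escapes back into raw multibyte characters.
UNESCAPED_URL = URI.decode_www_form_component(ESCAPED_URL)

# Hypothetical helper: true if URI.parse accepts the string,
# false if it raises URI::InvalidURIError.
def parseable?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

puts "escaped:   #{parseable?(ESCAPED_URL)}"
puts "unescaped: #{parseable?(UNESCAPED_URL)}"

# Parsing the escaped form keeps the percent-encoded path untouched.
puts URI.parse(ESCAPED_URL).path
```

If this behaves as expected, only the escaped form parses, which would support dropping the unescape step for URLs that arrive already percent-encoded. It does not cover the case the `gsub(/ /, '%20')` call seems meant for, namely input URLs containing literal spaces.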