FUJINAKA Tohru
2007-May-02 13:31 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Hi all,
I have just started using the iCalendar library, version 0.98,
and noticed that `to_ical()' can unfortunately split a multibyte
character into two meaningless halves with "\r\n".
Here is a patch.
--- [patch begin] ---
cd /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/
diff -c /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb
*** /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org Wed May 2 13:45:15 2007
--- /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb Wed May 2 19:15:06 2007
***************
*** 145,151 ****
value = ":#{val.to_ical}"
escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.slice!(0, MAX_LINE_LENGTH) << "\r\n " while escaped.size > MAX_LINE_LENGTH
s << escaped << "\r\n"
s.gsub!(/ *$/, '')
end
--- 145,151 ----
value = ":#{val.to_ical}"
escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << $1 << "\r\n " while escaped.sub!(/^(.{#{MAX_LINE_LENGTH}})/, '')
s << escaped << "\r\n"
s.gsub!(/ *$/, '')
end
--- [patch end] ---
The problem was caused by using a method of the String class that is
not multibyte-aware.
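The effect can be reproduced on a current Ruby. The following is only a
minimal sketch, not code from the library. It assumes Ruby 1.9.3 or
later, where String#byteslice cuts by octets much as String#slice! did
on the Ruby 1.8 used in this thread; the sample text and the name
`limit' are made up for illustration.

```ruby
# Assumption: a UTF-8 property line whose value is made of 3-octet characters.
text  = "SUMMARY:" + "\u4E88\u5B9A" * 30
limit = 75

# Octet-oriented cut: lands in the middle of a 3-octet character.
byte_cut = text.byteslice(0, limit)

# Character-oriented cut: always ends on a character boundary.
char_cut = text[0, limit]

puts byte_cut.valid_encoding?   # broken trailing character
puts char_cut.valid_encoding?   # intact
```

The octet-oriented cut leaves an incomplete character at the end of the
line, which is what ends up in the folded output.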
With the patch applied, users who need to handle text that includes
multibyte characters should set `$KCODE' to an appropriate value
(e.g. "UTF8") before calling `to_ical()', like this:
--- [sample begin] ---
#!/usr/bin/ruby -w
$KCODE = 'UTF8'
$vsave, $VERBOSE = $VERBOSE, false
require 'rubygems'
require 'icalendar'
$VERBOSE = $vsave
cal = Icalendar::Calendar.new
...
print cal.to_ical
--- [sample end] ---
Uh, at this point I have not worried about the performance penalty;
it just works well functionally. :)
Best regards,
--
FUJINAKA Tohru <tohru at nakarika.com>
Jeff Rose
2007-May-02 13:58 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Great, thanks a lot. I'll include the patch and try to push a new version out.
-Jeff
FUJINAKA Tohru
2007-May-02 14:03 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Hi,
While taking a bath after writing the previous mail
I found a bug in the patch I wrote...
The following is the fixed version against the original component.rb.
--- [patch begin] ---
cd /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/
diff -c /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb
*** /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org Wed May 2 13:45:15 2007
--- /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb Wed May 2 22:57:44 2007
***************
*** 145,152 ****
value = ":#{val.to_ical}"
escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.slice!(0, MAX_LINE_LENGTH) << "\r\n " while escaped.size > MAX_LINE_LENGTH
! s << escaped << "\r\n"
s.gsub!(/ *$/, '')
end
--- 145,152 ----
value = ":#{val.to_ical}"
escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << $1 << "\r\n " while escaped.sub!(/^(.{#{MAX_LINE_LENGTH}})/, '')
! s << escaped << "\r\n" unless escaped.empty?
s.gsub!(/ *$/, '')
end
--- [patch end] ---
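Compared with the first patch, the only change is the
`unless escaped.empty?' guard. The sketch below is one reading of what
the guard affects, not a statement from the patch author. It assumes a
tiny limit of 4 instead of MAX_LINE_LENGTH, and the helper names
fold_without_guard and fold_with_guard are hypothetical; neither exists
in the library.

```ruby
LIMIT = 4  # assumption: small limit so the output is easy to read

# Folding loop as in the first patch (no guard on the remainder).
def fold_without_guard(text)
  escaped = text.dup
  s = +""
  s << $1 << "\r\n " while escaped.sub!(/^(.{#{LIMIT}})/, '')
  s << escaped << "\r\n"
  s.gsub(/ *$/, '')
end

# Folding loop as in the second patch (remainder skipped when empty).
def fold_with_guard(text)
  escaped = text.dup
  s = +""
  s << $1 << "\r\n " while escaped.sub!(/^(.{#{LIMIT}})/, '')
  s << escaped << "\r\n" unless escaped.empty?
  s.gsub(/ *$/, '')
end

# Text whose length is an exact multiple of the limit.
p fold_without_guard("abcdefgh")  # "abcd\r\n efgh\r\n \r\n"
p fold_with_guard("abcdefgh")     # "abcd\r\n efgh\r\n"
```

Without the guard, a text whose length is an exact multiple of the limit
gets an extra continuation line that holds only the folding space.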
regards,
--
FUJINAKA Tohru <tohru at nakarika.com>
FUJINAKA Tohru
2007-May-07 07:34 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Hi,
RFC 2445, sec. 4.1, says:
"Lines of text SHOULD NOT be longer than 75 octets, excluding the line
break."
Ruby's match(/./) matches one character, and in UTF-8 a character may
be longer than one octet, so the patch I posted previously can produce
lines that break the spec.
The spec says "SHOULD NOT" rather than "MUST NOT", so lines over
75 octets should still be acceptable to applications, but complying
with the spec would be better. So I wrote more code.
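The gap between characters and octets is easy to see on a current Ruby.
This is a minimal sketch under assumptions: Ruby 1.9 or later, a
made-up line of 3-octet UTF-8 characters, and a local constant
MAX_LINE_LENGTH standing in for the library's value.

```ruby
MAX_LINE_LENGTH = 75  # assumption: same value the library folds at

# 80 copies of a character that takes 3 octets in UTF-8.
line = "\u4E88" * 80

# Same pattern as the earlier patch: take the first 75 characters.
chunk = line[/^(.{#{MAX_LINE_LENGTH}})/, 1]

puts chunk.length    # 75  (characters)
puts chunk.bytesize  # 225 (octets, three times the RFC limit)
```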
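A method for folding a string:
class String
  # Fold the string so that no line is longer than `octet' octets.
  # Relies on $KCODE being set, so that /./ matches one multibyte
  # character while String#length still counts octets (Ruby 1.8).
  def ical_fold(octet, split_by = "\r\n ")
    return clone if octet <= 0
    return clone if length <= octet
    # Octets that split_by leaves at the start of a continuation line.
    lead_octets = split_by.match(/\n([^\n]+)$/m) ? $1.length : 0
    result = ""
    n = 0
    scan(/./m) do |c|
      n += c.length
      if octet < n
        # Adding c would exceed the limit: start a new line first.
        result << split_by
        n = lead_octets + c.length
      end
      result << c
    end
    return result
  end
end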
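On Ruby 1.9 and later String#length counts characters rather than
octets, so the method above would need String#bytesize to keep counting
octets. The following is a sketch of the same idea for a current Ruby,
not part of the posted patch; the name fold_octets and its parameter
names are hypothetical.

```ruby
# Fold `text` so that no physical line exceeds `limit` octets.
# Hypothetical helper; assumes Ruby 1.9+ and a valid UTF-8 string.
def fold_octets(text, limit = 75, separator = "\r\n ")
  return text.dup if limit <= 0 || text.bytesize <= limit

  # Octets the separator leaves at the start of a continuation line.
  lead = separator[/\n([^\n]+)\z/, 1].to_s.bytesize

  result = +""
  used = 0
  text.each_char do |ch|
    used += ch.bytesize
    if used > limit
      # Adding this character would exceed the limit: fold first.
      result << separator
      used = lead + ch.bytesize
    end
    result << ch
  end
  result
end

original = "SUMMARY:" + "\u4E88\u5B9A" * 30
folded   = fold_octets(original)
folded.split("\r\n").each { |line| puts line.bytesize }
```

Each folded line stays within 75 octets and ends on a character
boundary, and removing every "\r\n " gives back the original text.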
And a patch against the original code in version 0.98:
diff -c /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb
*** /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org Wed May 2 13:45:15 2007
--- /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb Mon May 7 15:56:02 2007
***************
*** 145,152 ****
value = ":#{val.to_ical}"
escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.slice!(0, MAX_LINE_LENGTH) << "\r\n " while escaped.size > MAX_LINE_LENGTH
! s << escaped << "\r\n"
s.gsub!(/ *$/, '')
end
--- 145,151 ----
value = ":#{val.to_ical}"
escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.ical_fold(MAX_LINE_LENGTH, "\r\n ") << "\r\n"
s.gsub!(/ *$/, '')
end
Regards,
--
FUJINAKA Tohru <tohru at nakarika.com>