FUJINAKA Tohru
2007-May-02 13:31 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Hi all,

I have just started using the iCalendar library, version 0.98.
And I noticed that `to_ical()' would unfortunately split a multibyte
character into two meaningless pieces with "\r\n".

Here is a patch.

--- [patch begin] ---

cd /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/
diff -c /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb
*** /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org Wed May 2 13:45:15 2007
--- /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb Wed May 2 19:15:06 2007
***************
*** 145,151 ****
  value = ":#{val.to_ical}"

  escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.slice!(0, MAX_LINE_LENGTH) << "\r\n " while escaped.size > MAX_LINE_LENGTH
  s << escaped << "\r\n"
  s.gsub!(/ *$/, '')
  end
--- 145,151 ----
  value = ":#{val.to_ical}"

  escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << $1 << "\r\n " while escaped.sub!(/^(.{#{MAX_LINE_LENGTH}})/, '')
  s << escaped << "\r\n"
  s.gsub!(/ *$/, '')
  end

--- [patch end] ---

The problem was caused by using a method of the String class that is
not multibyte-aware.

With the patch applied, users who need to handle text that includes
multibyte characters should set `$KCODE' to an appropriate string (e.g.
"UTF8") before calling `to_ical()', like this:

--- [sample begin] ---
#!/usr/bin/ruby -w
$KCODE = 'UTF8'
$vsave, $VERBOSE = $VERBOSE, false
require 'rubygems'
require 'icalendar'
$VERBOSE = $vsave

cal = Icalendar::Calendar.new
...
print cal.to_ical
--- [sample end] ---

Uh, for now I have not worried about the performance penalty; it just
works correctly. :)

Best regards,
-- 
FUJINAKA Tohru <tohru at nakarika.com>
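The failure described in this message can be reproduced in isolation. The following is only a minimal sketch, assuming Ruby 1.9 or later (where strings carry an encoding and `$KCODE' is no longer used); `LIMIT`, `cut_by_bytes`, `cut_by_chars` and the sample text are hypothetical names for illustration and are not part of the icalendar gem.

```ruby
# Sketch for Ruby 1.9 or later. LIMIT and the helpers below are illustrative only.
LIMIT = 75

# Byte-oriented cut: what a slice that is not multibyte-aware effectively does.
def cut_by_bytes(text, limit)
  text.byteslice(0, limit)
end

# Character-oriented cut: what matching /.{n}/ against a UTF-8 string does.
def cut_by_chars(text, limit)
  text[/\A.{#{limit}}/m]
end

# 8 ASCII octets followed by 100 three-octet characters (U+65E5).
line = "SUMMARY:" + "\u65E5" * 100

puts cut_by_bytes(line, LIMIT).valid_encoding?   # false: the cut lands inside a character
puts cut_by_chars(line, LIMIT).valid_encoding?   # true: the cut lands on a character boundary
```

The byte-oriented cut keeps 8 + 67 octets, and 67 is not a multiple of 3, so the last character is left incomplete. The character-oriented cut never splits a character, although the piece it returns can be much longer than 75 octets.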
Jeff Rose
2007-May-02 13:58 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Great, thanks a lot. I'll include the patch and try to push a new version out.

-Jeff
FUJINAKA Tohru
2007-May-02 14:03 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Hi,

While taking a bath after writing the previous mail I found a bug in
the patch I wrote...

The following is the fixed version against the original component.rb.

--- [patch begin] ---

cd /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/
diff -c /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb
*** /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org Wed May 2 13:45:15 2007
--- /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb Wed May 2 22:57:44 2007
***************
*** 145,152 ****
  value = ":#{val.to_ical}"

  escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.slice!(0, MAX_LINE_LENGTH) << "\r\n " while escaped.size > MAX_LINE_LENGTH
! s << escaped << "\r\n"
  s.gsub!(/ *$/, '')
  end
--- 145,152 ----
  value = ":#{val.to_ical}"

  escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << $1 << "\r\n " while escaped.sub!(/^(.{#{MAX_LINE_LENGTH}})/, '')
! s << escaped << "\r\n" unless escaped.empty?
  s.gsub!(/ *$/, '')
  end

--- [patch end] ---

regards,
-- 
FUJINAKA Tohru <tohru at nakarika.com>
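The message does not say what the bug was. Judging only from the diff, one plausible reading is the case where the escaped text is an exact multiple of MAX_LINE_LENGTH characters long: the loop then consumes everything, and the unconditional final append leaves a stray continuation line containing only a space. The sketch below isolates the two variants; `fold_first_patch` and `fold_second_patch` are hypothetical wrappers, and the value 75 for `MAX_LINE_LENGTH` is an assumption.

```ruby
MAX_LINE_LENGTH = 75  # assumed value, for illustration

# Folding as in the first patch: the final append is unconditional.
def fold_first_patch(text)
  escaped = text.dup
  s = +""
  s << $1 << "\r\n " while escaped.sub!(/^(.{#{MAX_LINE_LENGTH}})/, '')
  s << escaped << "\r\n"
  s.gsub(/ *$/, '')
end

# Folding as in the second patch: the final append is skipped when nothing is left.
def fold_second_patch(text)
  escaped = text.dup
  s = +""
  s << $1 << "\r\n " while escaped.sub!(/^(.{#{MAX_LINE_LENGTH}})/, '')
  s << escaped << "\r\n" unless escaped.empty?
  s.gsub(/ *$/, '')
end

exact = "a" * MAX_LINE_LENGTH
p fold_first_patch(exact).end_with?("\r\n \r\n")   # true: stray line holding a single space
p fold_second_patch(exact) == exact + "\r\n"       # true: one clean line
```

In the second variant the trailing space of the last "\r\n " is removed by the final substitution, because `$` also matches at the end of the string. For inputs that are not an exact multiple of the limit, both variants give the same output.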
FUJINAKA Tohru
2007-May-07 07:34 UTC
[iCalendar-devel] [patch] fix a bug divides a multibytes character into two by crlf
Hi,

RFC 2445, sec. 4.1 says:
"Lines of text SHOULD NOT be longer than 75 octets, excluding the line
break."

Since Ruby's match(/./) matches one character, whose length in octets
may be more than 1 in UTF-8 encoding, the patch I posted previously
can produce lines that break the spec.

The spec says "SHOULD NOT" rather than "MUST NOT", so applications
should accept lines longer than 75 octets, but complying with the spec
would be better.

So I wrote more code.

A method for folding a string:

class String
  # Fold the string so that no line exceeds `octet' octets.
  # Relies on Ruby 1.8 behaviour: String#length counts octets, and
  # scan(/./m) yields whole characters when $KCODE is set.
  def ical_fold(octet, split_by = "\r\n ")
    return clone if octet <= 0
    return clone if length <= octet
    # Octets that follow the last newline in split_by (the leading space).
    lead_octets = split_by.match(/\n([^\n]+)$/m) ? $1.length : 0
    result = ""
    n = 0
    scan(/./m) do |c|
      n += c.length
      if octet < n
        result << split_by
        n = lead_octets + c.length
      end
      result << c
    end
    return result
  end
end

And a patch against the original code in version 0.98:

diff -c /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb
*** /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb.org Wed May 2 13:45:15 2007
--- /home/trac/ruby/gem/gems/icalendar-0.98/lib/icalendar/component.rb Mon May 7 15:56:02 2007
***************
*** 145,152 ****
  value = ":#{val.to_ical}"

  escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.slice!(0, MAX_LINE_LENGTH) << "\r\n " while escaped.size > MAX_LINE_LENGTH
! s << escaped << "\r\n"
  s.gsub!(/ *$/, '')
  end
--- 145,151 ----
  value = ":#{val.to_ical}"

  escaped = prelude + value.gsub("\\", "\\\\").gsub("\n", "\\n").gsub(",", "\\,").gsub(";", "\\;")
! s << escaped.ical_fold(MAX_LINE_LENGTH, "\r\n ") << "\r\n"
  s.gsub!(/ *$/, '')
  end

Regards,
-- 
FUJINAKA Tohru <tohru at nakarika.com>
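The method above depends on Ruby 1.8 semantics, where String#length counts octets. On Ruby 1.9 or later, length counts characters, so the same idea has to be written with bytesize. The following is a rough sketch under that assumption; `fold_by_octets` is a hypothetical standalone helper and is not code from the icalendar gem.

```ruby
# Hypothetical helper for Ruby 1.9 or later; not part of the icalendar gem.
# Folds text so that each physical line is at most `limit` octets,
# breaking only on character boundaries.
def fold_by_octets(text, limit = 75, split_by = "\r\n ")
  return text.dup if limit <= 0 || text.bytesize <= limit

  # Octets the separator contributes to the start of each continuation line.
  lead = split_by[/\n([^\n]+)\z/, 1].to_s.bytesize

  result = +""
  used = 0
  text.each_char do |ch|
    used += ch.bytesize
    if used > limit
      # This character would overflow the line: start a continuation line.
      result << split_by
      used = lead + ch.bytesize
    end
    result << ch
  end
  result
end

line = "SUMMARY:" + "\u65E5" * 100
folded = fold_by_octets(line)
p folded.split("\r\n").map(&:bytesize).max <= 75   # true
p folded.gsub("\r\n ", "") == line                  # true: unfolding restores the text
```

Counting with bytesize while iterating with each_char keeps both properties at once: no line exceeds the octet limit, and no character is cut in the middle.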