thr3ads.net - Rails core - Deprecate ActiveSupport::SafeBuffer modifications [Nov 2014]

If this information is useful, please help other people find it:
Share via:
Nash Bridges
2014-Nov-21 18:42 UTC
Deprecate ActiveSupport::SafeBuffer modifications

In this proposal every method which is considered as "unsafe" for 
`ActiveSupport::SafeBuffer` is going to be deprecated, both bang and 
non-destructive versions.

At first glance, it seems that this would violate Liskov substitution 
principle, because  `ActiveSupport::SafeBuffer` is a subclass of `String`. 
However, let's step back and remember what's the purpose of safe
strings.

Safebuffer strings were introduced in Rails 3 in order to distinguish 
between strings that are already sanitized, that don't need to be sanitized 
and non-safe strings. Consider this ERB template:

<div>Hello!</div>
<%= params[:evil_input] %>
<%= link_to params[:evil_input], users_path %>

The first string is static therefore it is completely safe and shouldn't be 
sanitized at all.
The second one is definitely unsafe and should be sanitized.
The third one is made by Rails helper which returns safe strings and 
shouldn't be sanitized. However, parameter itself is unsafe and should be 
sanitized by Rails before wrapping it into <a></a>.

When ERB is compiled into a Ruby method, it wraps each dynamic string into 
a sanitizing method call, which in turn must have a way to distinguish 
between second and third row. This is when Safebuffers are really used. 
They respond with +true+ to +html_safe?+, while other strings and objects 
(except numbers) respond with +false+.

So, this is completely a view concept and even could be considered as a 
part of ActionView private API, because nobody uses safebuffer to pass data 
throughout whole application. SafeBuffers were not designed to enhance Ruby 
String in some way that other parsing/converter libraries could benefit 
from. As for subclassing String, [here][yehudakatz] explains that this was 
performance only decision ("Because SafeBuffer inherits from String, Ruby 
creates this wrapper extremely efficiently (just sharing the internal char 
* storage).") Should we bother if some methods from String are missing
then?

Back to modifications and "unsafe" methods. It turned out that if we
would
perform, for example, +gsub+ on already sanitized string that was wrapped 
into a SafeBuffer, we couldn't call it safe anymore. This is because one 
could revert any changes made by a sanitizer by another series of 
+gsub("&lt;", "<")+ etc. So the decision was made to
return raw string for
non-destructive versions of such unsafe methods and mark SafeBuffer as 
non-safe for destructive (bang) ones.

As for +gsub!+ and its friends. Not only "unsafe SafeBuffer" sounds 
ridiculous, but this highly complicated the class implementation! Let's 
look how it could look like:

module ActiveSupport #:nodoc:
  class SafeBuffer < String
    UNSAFE_STRING_METHODS = %w(
      capitalize chomp chop delete downcase gsub lstrip next reverse rstrip
      slice squeeze strip sub succ swapcase tr tr_s upcase
    )

    alias_method :safe_concat, :concat

    class SafeConcatError < StandardError
      def initialize
        super 'Could not concatenate to the buffer because it is not html 
safe.'
      end
    end

    def initialize_copy(other)
      raise SafeConcatError unless other.html_safe?
      super
    end

    def clone_empty
      self[0, 0]
    end

    def concat(value)
      super(ERB.unwrapped_html_escape(value))
    end
    alias << concat

    def prepend(value)
      super(ERB.unwrapped_html_escape(value))
    end

    def +(other)
      dup.concat(other)
    end

    def %(args)
      case args
      when Hash
        escaped_args = Hash[args.map { |k,arg| [k, 
ERB.unwrapped_html_escape(arg)] }]
      else
        escaped_args = Array(args).map { |arg| 
ERB.unwrapped_html_escape(arg) }
      end

      self.class.new(super(escaped_args))
    end

    def html_safe?
      true
    end

    def to_s
      self
    end
    alias_method :html_safe, :to_s

    def to_param
      to_str
    end

    def encode_with(coder)
      coder.represent_scalar nil, to_str
    end

    UNSAFE_STRING_METHODS.each do |unsafe_method|
      if unsafe_method.respond_to?(unsafe_method)
        class_eval <<-EOT, __FILE__, __LINE__ + 1
          def #{unsafe_method}(*args, &block)
            raise NoMethodError
          end

          def #{unsafe_method}!(*args)
            raise NoMethodError
          end
        EOT
      end
    end
  end
end

Now, aint' that nice? [Here][output_safety] you can compare it with the 
current implementation. By declaring safebuffers *always* safe we make it 
easier to understand, and we have less code to perform! 
([see][tenderlove_rant] for more information about +#safe_concat+). We also 
always return +self+ for +html_safe+ (there were a 
[couple][html_safe_pull_1] rejected [approaches][html_safe_pull_2] to 
enhance that).

One would agree that destructive modifications is true evil, but what's 
wrong with +gsub+? Well, first of all, it [doesn't work as 
expected][broken_gsub]. [Here][deprecate_gsub_block] is a proposal to fix 
that, but we see that this doesn't solve the problem entirely. Moreover, it 
leads to writing less performant and (possibly) vulnerable code. Consider 
this example code:

user_input = ERB.unwrapped_html_escape(params[:evil_input]
string_that_is_going_to_view =
"<div>#{user_input}</div>".html_safe

This code is 100% safe and efficient. Suddenly, a programmer decides to 
change something

patched_string = string_that_is_going_to_view.gsub(",", ".")

Now, his "<div"> is escaped by ERB, this is no good. What's
the solution?
Throw another +html_safe+ !

patched_string = string_that_is_going_to_view.gsub(",",
".").html_safe

Wha'ts wrong with this code? First of all, it creates two SafeBuffer 
objects while first is disposed immediately. [Here][tenderlove_unwrapped] 
@tenderlove clearly states that this is wrong approach and every object 
counts! Next, it can possibly lead to XSS attack. Consider this variant of 
code above:

patched_string = string_that_is_going_to_view.gsub("[",
"<").gsub("]",
">").html_safe

Now, even if +user_input+ was previously sanitized it can be turned into 
unsafe again.

One could ask "isn't it a user decision to take care of such
situations?" I
would answer that Rails is opinionated and should force a user to write 
better code. When I turned deprecation warnings on all unsafe methods and 
run Rails test suite, I saw *a lot* of deprecations on my screen. There're 
even specially written tests that checks if SafeBuffer works correctly with 
+#titleize+ and its family (which obviously modies the passed string). Not 
to mention that Rails views itself use +gsub!+ as well.

So, if those methods are being deprecated what should programmer do? He/she 
must realise that any change should be made *before* wrapping a string into 
Safebuffer. This would be fast and efficient. If it's impossible, then call 
+to_str+, do a modification and instantiate a new safebuffer. No changes 
with current approach, except it is explicit now (so a user would feel the 
pain and maybe decide to rewrite the code). For a library writer (who 
doesn't know beforehand what kind of data will be passed into a method) I 
would propose ActiveSupport to have monkeypatched +Kernel.String+ method, 
so it would return string from a safebuffer, otherwise call +to_s_ as usual.

[yehudakatz]: http://yehudakatz.com/2010/02/01/safebuffers-and-rails-3-0/
[output_safety]:
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/string/output_safety.rb#L131
[tenderlove_rant]:
http://tenderlovemaking.com/2014/06/04/yagni-methods-slow-us-down.html
[html_safe_pull_1]: https://github.com/rails/rails/pull/17206
[html_safe_pull_2]: https://github.com/rails/rails/pull/17250
[broken_gsub]: https://github.com/rails/rails/issues/1555
[deprecate_gsub_block]: https://github.com/rails/rails/pull/16957
[tenderlove_unwrapped]:
https://github.com/rails/rails/commit/8899503f62a556a918c45b3e7b5c2effaaa943f4

-- 
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to rubyonrails-core+unsubscribe@googlegroups.com.
To post to this group, send email to rubyonrails-core@googlegroups.com.
Visit this group at http://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/d/optout.
Rails core - Nov 2014 - Deprecate ActiveSupport::SafeBuffer modifications

Deprecate ActiveSupport::SafeBuffer modifications