In this proposal every method which is considered as "unsafe" for
`ActiveSupport::SafeBuffer` is going to be deprecated, both bang and
non-destructive versions.
At first glance, it seems that this would violate Liskov substitution
principle, because `ActiveSupport::SafeBuffer` is a subclass of `String`.
However, let's step back and remember what's the purpose of safe
strings.
Safebuffer strings were introduced in Rails 3 in order to distinguish
between strings that are already sanitized, that don't need to be sanitized
and non-safe strings. Consider this ERB template:
<div>Hello!</div>
<%= params[:evil_input] %>
<%= link_to params[:evil_input], users_path %>
The first string is static therefore it is completely safe and shouldn't be
sanitized at all.
The second one is definitely unsafe and should be sanitized.
The third one is made by Rails helper which returns safe strings and
shouldn't be sanitized. However, parameter itself is unsafe and should be
sanitized by Rails before wrapping it into <a></a>.
When ERB is compiled into a Ruby method, it wraps each dynamic string into
a sanitizing method call, which in turn must have a way to distinguish
between second and third row. This is when Safebuffers are really used.
They respond with +true+ to +html_safe?+, while other strings and objects
(except numbers) respond with +false+.
So, this is completely a view concept and even could be considered as a
part of ActionView private API, because nobody uses safebuffer to pass data
throughout whole application. SafeBuffers were not designed to enhance Ruby
String in some way that other parsing/converter libraries could benefit
from. As for subclassing String, [here][yehudakatz] explains that this was
performance only decision ("Because SafeBuffer inherits from String, Ruby
creates this wrapper extremely efficiently (just sharing the internal char
* storage).") Should we bother if some methods from String are missing
then?
Back to modifications and "unsafe" methods. It turned out that if we
would
perform, for example, +gsub+ on already sanitized string that was wrapped
into a SafeBuffer, we couldn't call it safe anymore. This is because one
could revert any changes made by a sanitizer by another series of
+gsub("<", "<")+ etc. So the decision was made to
return raw string for
non-destructive versions of such unsafe methods and mark SafeBuffer as
non-safe for destructive (bang) ones.
As for +gsub!+ and its friends. Not only "unsafe SafeBuffer" sounds
ridiculous, but this highly complicated the class implementation! Let's
look how it could look like:
module ActiveSupport #:nodoc:
class SafeBuffer < String
UNSAFE_STRING_METHODS = %w(
capitalize chomp chop delete downcase gsub lstrip next reverse rstrip
slice squeeze strip sub succ swapcase tr tr_s upcase
)
alias_method :safe_concat, :concat
class SafeConcatError < StandardError
def initialize
super 'Could not concatenate to the buffer because it is not html
safe.'
end
end
def initialize_copy(other)
raise SafeConcatError unless other.html_safe?
super
end
def clone_empty
self[0, 0]
end
def concat(value)
super(ERB.unwrapped_html_escape(value))
end
alias << concat
def prepend(value)
super(ERB.unwrapped_html_escape(value))
end
def +(other)
dup.concat(other)
end
def %(args)
case args
when Hash
escaped_args = Hash[args.map { |k,arg| [k,
ERB.unwrapped_html_escape(arg)] }]
else
escaped_args = Array(args).map { |arg|
ERB.unwrapped_html_escape(arg) }
end
self.class.new(super(escaped_args))
end
def html_safe?
true
end
def to_s
self
end
alias_method :html_safe, :to_s
def to_param
to_str
end
def encode_with(coder)
coder.represent_scalar nil, to_str
end
UNSAFE_STRING_METHODS.each do |unsafe_method|
if unsafe_method.respond_to?(unsafe_method)
class_eval <<-EOT, __FILE__, __LINE__ + 1
def #{unsafe_method}(*args, &block)
raise NoMethodError
end
def #{unsafe_method}!(*args)
raise NoMethodError
end
EOT
end
end
end
end
Now, aint' that nice? [Here][output_safety] you can compare it with the
current implementation. By declaring safebuffers *always* safe we make it
easier to understand, and we have less code to perform!
([see][tenderlove_rant] for more information about +#safe_concat+). We also
always return +self+ for +html_safe+ (there were a
[couple][html_safe_pull_1] rejected [approaches][html_safe_pull_2] to
enhance that).
One would agree that destructive modifications is true evil, but what's
wrong with +gsub+? Well, first of all, it [doesn't work as
expected][broken_gsub]. [Here][deprecate_gsub_block] is a proposal to fix
that, but we see that this doesn't solve the problem entirely. Moreover, it
leads to writing less performant and (possibly) vulnerable code. Consider
this example code:
user_input = ERB.unwrapped_html_escape(params[:evil_input]
string_that_is_going_to_view =
"<div>#{user_input}</div>".html_safe
This code is 100% safe and efficient. Suddenly, a programmer decides to
change something
patched_string = string_that_is_going_to_view.gsub(",", ".")
Now, his "<div"> is escaped by ERB, this is no good. What's
the solution?
Throw another +html_safe+ !
patched_string = string_that_is_going_to_view.gsub(",",
".").html_safe
Wha'ts wrong with this code? First of all, it creates two SafeBuffer
objects while first is disposed immediately. [Here][tenderlove_unwrapped]
@tenderlove clearly states that this is wrong approach and every object
counts! Next, it can possibly lead to XSS attack. Consider this variant of
code above:
patched_string = string_that_is_going_to_view.gsub("[",
"<").gsub("]",
">").html_safe
Now, even if +user_input+ was previously sanitized it can be turned into
unsafe again.
One could ask "isn't it a user decision to take care of such
situations?" I
would answer that Rails is opinionated and should force a user to write
better code. When I turned deprecation warnings on all unsafe methods and
run Rails test suite, I saw *a lot* of deprecations on my screen. There're
even specially written tests that checks if SafeBuffer works correctly with
+#titleize+ and its family (which obviously modies the passed string). Not
to mention that Rails views itself use +gsub!+ as well.
So, if those methods are being deprecated what should programmer do? He/she
must realise that any change should be made *before* wrapping a string into
Safebuffer. This would be fast and efficient. If it's impossible, then call
+to_str+, do a modification and instantiate a new safebuffer. No changes
with current approach, except it is explicit now (so a user would feel the
pain and maybe decide to rewrite the code). For a library writer (who
doesn't know beforehand what kind of data will be passed into a method) I
would propose ActiveSupport to have monkeypatched +Kernel.String+ method,
so it would return string from a safebuffer, otherwise call +to_s_ as usual.
[yehudakatz]: http://yehudakatz.com/2010/02/01/safebuffers-and-rails-3-0/
[output_safety]:
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/string/output_safety.rb#L131
[tenderlove_rant]:
http://tenderlovemaking.com/2014/06/04/yagni-methods-slow-us-down.html
[html_safe_pull_1]: https://github.com/rails/rails/pull/17206
[html_safe_pull_2]: https://github.com/rails/rails/pull/17250
[broken_gsub]: https://github.com/rails/rails/issues/1555
[deprecate_gsub_block]: https://github.com/rails/rails/pull/16957
[tenderlove_unwrapped]:
https://github.com/rails/rails/commit/8899503f62a556a918c45b3e7b5c2effaaa943f4
--
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to rubyonrails-core+unsubscribe@googlegroups.com.
To post to this group, send email to rubyonrails-core@googlegroups.com.
Visit this group at http://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/d/optout.