JSON / ActiveSupport Rails gotcha: Avoid XSS exploits when passing HTML in JSON

Ran into a tricky situation today. We're working on supporting JavaScript on Posterous blogs, and in the Theme Editor we found that </script> tags inside JSON were causing problems. Browsers see </script> and interpret it as the end of the entire script block, rather than as text inside a JSON string. When you're dealing with user-generated content, that opens your site up to a pretty serious XSS attack.
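To make the failure concrete, here is a minimal sketch using only Ruby's bundled json library. The theme hash, its 'footer' key, and the page string are made up for illustration and are not Posterous code. It shows that the first </script> in the page falls inside the JSON string, which is where an HTML parser would end the script element.

```ruby
require 'json'

# Hypothetical theme data; the 'footer' value stands in for user-supplied HTML.
theme = { 'footer' => '</script><script>alert("xss")</script>' }

# Embed the JSON in an inline script, as a theme editor page might.
page = "<script>var theme = #{JSON.generate(theme)};</script>"

# An HTML parser closes the script element at the first "</script>" it finds,
# without regard to JavaScript string quoting.
first_close = page.index('</script>')
intended_close = page.rindex('</script>')

puts page
puts "script ends early: #{first_close < intended_close}"
```

Everything after that first closing tag is parsed as HTML again, so the injected second script tag would run.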

ActiveSupport has been modified to translate < and > into their Unicode escapes (\u003C and \u003E), which avoids this problem. However, if you, like many people, require the JSON gem, the ActiveSupport to_json implementation is stripped entirely and the escaping goes with it.

The simple fix: make your own String subclass that carries the necessary to_json method. Be sure to use the new class in place of the standard String class wherever you want angle brackets escaped.

Here's the code:
# this is needed so that we can still access the original ActiveSupport version of JSON encoding
# JSON gem is faster but does not support automatic unicode conversion for < and >, which can cause
# problems for </script> in JSON output (browser interprets as exiting the script area, and results in XSS exploit)
#
# e.g. EscapableJsonString.new('<no_xss>').to_json
# => "\u003Cno_xss\u003E"
#
class EscapableJsonString < String
  def to_json(options = nil) #:nodoc:
    json = '"' + gsub(ActiveSupport::JSON::Encoding.escape_regex) { |s|
      ActiveSupport::JSON::Encoding::ESCAPED_CHARS[s]
    }
    json.force_encoding('ascii-8bit') if respond_to?(:force_encoding)
    json.gsub(/([\xC0-\xDF][\x80-\xBF]|
             [\xE0-\xEF][\x80-\xBF]{2}|
             [\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
      s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/, '\\\\u\&')
    } + '"'
  end
end
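If the ActiveSupport internals used above are not available in your version, the same idea can be sketched with only the bundled json library. This is a rough alternative, not the ActiveSupport implementation: script_safe_json and ANGLE_ESCAPES are hypothetical names introduced here. The sketch relies on < and > never being structural characters in JSON, so any occurrence in the output must sit inside a string and can be swapped for its \u escape.

```ruby
require 'json'

# Hypothetical lookup table; single quotes keep the backslashes literal.
ANGLE_ESCAPES = { '<' => '\u003C', '>' => '\u003E' }.freeze

# Hypothetical helper: generate JSON as usual, then escape angle brackets.
# < and > only ever occur inside JSON strings, so a global replace is safe.
def script_safe_json(value)
  JSON.generate(value).gsub(/[<>]/, ANGLE_ESCAPES)
end

payload = { 'html' => '</script><script>alert(1)</script>' }
encoded = script_safe_json(payload)

puts encoded
# The escapes decode back to the original characters.
puts JSON.parse(encoded) == payload
```

Because \u003C and \u003E are ordinary JSON string escapes, any JSON parser or JavaScript engine reads the value back unchanged, while the HTML parser never sees a literal </script>.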