[PATCH] templatefilters: don't escape <> in JSON

Gregory Szorc gregory.szorc at gmail.com
Fri Jan 16 11:39:32 CST 2015


On Fri, Jan 16, 2015 at 6:58 AM, Augie Fackler <raf at durin42.com> wrote:

> On Fri, Jan 16, 2015 at 12:48:15AM -0800, Pierre-Yves David wrote:
> >
> >
> > On 01/15/2015 09:04 PM, Gregory Szorc wrote:
> > ># HG changeset patch
> > ># User Gregory Szorc <gregory.szorc at gmail.com>
> > ># Date 1421384385 28800
> > >#      Thu Jan 15 20:59:45 2015 -0800
> > ># Node ID a07b22eefd8e4c629b739778b3ca5f3d53a8b1de
> > ># Parent  049a9e3a078d7c988cb12ed456aad6ec2779ea69
> > >templatefilters: don't escape <> in JSON
> > >
> > >55c763926a28 added escaping of "<" and ">" in JSON. I could not find any
> > >specification claiming that these are special characters that need to be
> > >escaped. Furthermore, feeding these characters through both Python's and
> > >SpiderMonkey's JSON serialization API revealed no escaping.
> >
> > CCing Matt as he is the paranoid one who did this. Tests or a more
> > extensive explanation would be cool.
>
> I believe this defends against certain types of XSS attacks. This[0]
> site reinforces that belief. I'm not aware of any specific modern
> browsers this protects, but I'm happy to retain this escaping.
>
> 0:
> https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#Output_Encoding_Rules_Summary
>


I'm pretty sure that if you are doing something like .innerHTML = <output
from Mercurial>, you are going to get XSS no matter what.

The \uXXXX encoding here does nothing more than change the representation
in the JSON serialization: anything that parses it will normalize \u003c to
<, as SpiderMonkey shows:

    js> JSON.parse('"\\u003c\\u003e"')
    "<>"

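The same check can be run against Python's standard json module. This is
only a sketch of the comparison described above, not code from the patch;
the variable names are illustrative:

```python
import json

# Python's encoder leaves "<" and ">" untouched by default.
plain = json.dumps("<>")

# The \u-escaped form is just another serialization of the same string.
escaped = '"\\u003c\\u003e"'

decoded_plain = json.loads(plain)
decoded_escaped = json.loads(escaped)

# Both serializations decode to the identical two-character string.
print(plain, decoded_plain == decoded_escaped)
```

Once parsed, a consumer cannot tell which of the two serializations it
was given.
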
FTR, I've seen some insanely weird JSON encoders that perform the \u
escaping on all non-reserved characters, even those in the ASCII range.
It's perfectly valid JSON. It comes out as ASCII on the other end:

    js> JSON.parse('"\\u0068\\u0065\\u006c\\u006c\\u006f"')
    "hello"

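For illustration, here is a minimal sketch of such an encoder in Python.
escape_everything is a hypothetical helper, not something from Mercurial
or any library, and it only handles characters in the Basic Multilingual
Plane:

```python
import json

def escape_everything(s):
    # Hypothetical encoder: emit every character as a \uXXXX escape,
    # including plain ASCII. Assumes all code points fit in 4 hex digits.
    return '"' + "".join("\\u%04x" % ord(c) for c in s) + '"'

encoded = escape_everything("hello")

# A conforming parser undoes the escaping and returns ordinary text.
decoded = json.loads(encoded)
print(encoded, decoded)
```

The over-escaped output is still valid JSON, and the parser hands back the
original string.
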
If you care about preventing XSS, you need to HTML-escape content from
JSON. Period. The alternative would be to HTML-escape strings before they
go into the JSON payload. But that would be a layering violation because
Mercurial's JSON is not always destined for HTML. Mercurial's JSON should be
"pure" and target agnostic. In other words, it should follow the spec
(RFC 4627).
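
A minimal sketch of escaping at the HTML boundary, using Python's standard
json and html modules. The payload and its "desc" key are made up for
illustration and are not Mercurial's actual output format:

```python
import html
import json

# Hypothetical payload with "<" and ">" \u-escaped by the producer.
payload = '{"desc": "\\u003cscript\\u003ealert(1)\\u003c/script\\u003e"}'

# After parsing, the \u escaping is gone: desc holds raw "<" and ">".
desc = json.loads(payload)["desc"]

# The consumer that builds HTML is the one that must escape for HTML.
safe = html.escape(desc)
print(desc)
print(safe)
```

The \u escaping in the payload has no effect on what the consumer sees
after parsing; only the html.escape call makes the value safe to embed in
markup.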