[PATCH] templatefilters: don't escape <> in JSON

Augie Fackler raf at durin42.com
Fri Jan 16 11:41:37 CST 2015


On Jan 16, 2015, at 12:39 PM, Gregory Szorc <gregory.szorc at gmail.com> wrote:

> On Fri, Jan 16, 2015 at 6:58 AM, Augie Fackler <raf at durin42.com> wrote:
> On Fri, Jan 16, 2015 at 12:48:15AM -0800, Pierre-Yves David wrote:
> >
> >
> > On 01/15/2015 09:04 PM, Gregory Szorc wrote:
> > ># HG changeset patch
> > ># User Gregory Szorc <gregory.szorc at gmail.com>
> > ># Date 1421384385 28800
> > >#      Thu Jan 15 20:59:45 2015 -0800
> > ># Node ID a07b22eefd8e4c629b739778b3ca5f3d53a8b1de
> > ># Parent  049a9e3a078d7c988cb12ed456aad6ec2779ea69
> > >templatefilters: don't escape <> in JSON
> > >
> > >55c763926a28 added escaping of "<" and ">" in JSON. I could not find any
> > >specification claiming that these are special characters that need to be
> > >escaped. Furthermore, feeding these characters through both Python's and
> > >SpiderMonkey's JSON serialization API revealed no escaping.
> >
> > CCing Matt, as he is the paranoid one who did this. Tests or a more
> > extensive explanation would be cool.
> 
> I believe this defends against certain types of XSS attacks. This[0]
> site reinforces that belief. I'm not aware of any specific modern
> browsers this defends against, but I'm happy to retain this escaping.
> 
> 0: https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#Output_Encoding_Rules_Summary
> 
> 
> I'm pretty sure that if you are doing, e.g., .innerHTML = <output from Mercurial>, you are going to get XSS no matter what.
> 
> The \uXXXX encoding here does nothing more than change the representation in the JSON serialization: anything that parses it will normalize \u003c to <. As evidenced by SpiderMonkey:
> 
>     js> JSON.parse('"\\u003c\\u003e"')
>     "<>"
> 
> FTR, I've seen some insanely weird JSON encoders that perform the \u escaping on all non-reserved characters, even those in the ASCII range. It's perfectly valid JSON. It comes out as ASCII on the other end:
> 
>     js> JSON.parse('"\\u0068\\u0065\\u006c\\u006c\\u006f"')
>     "hello"

Yes, after it's parsed. I believe the OWASP guidelines (and similar ones I've seen internally at Google) are aimed at people doing stupid stuff with the raw, unparsed JSON.
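
A minimal sketch of the kind of misuse meant here, assuming a page that pastes serialized JSON straight into an inline <script> element. The helper name, payload, and page template are hypothetical illustrations, not Mercurial's actual filter code:

```python
import json


def escape_angle_brackets(serialized):
    # Hypothetical helper, not Mercurial's real filter: rewrite "<" and ">"
    # in already-serialized JSON as \u003c and \u003e. In valid JSON these
    # characters can only occur inside string values, so the result still
    # parses to the same data.
    return serialized.replace("<", "\\u003c").replace(">", "\\u003e")


# Made-up payload standing in for attacker-controlled text.
payload = {"desc": "</script><script>alert(1)</script>"}

plain = json.dumps(payload)             # "<" and ">" are left untouched
escaped = escape_angle_brackets(plain)

# After parsing, both serializations yield the same value.
assert json.loads(plain) == json.loads(escaped) == payload

# The risky pattern: raw serialized JSON dropped into an inline script.
page_plain = "<script>var data = %s;</script>" % plain
page_escaped = "<script>var data = %s;</script>" % escaped

# An HTML parser closes a script element at the first "</script>" it sees,
# regardless of JavaScript string quoting.
print(page_plain.count("</script>"))    # 3: the payload closes the element early
print(page_escaped.count("</script>"))  # 1: only the template's own closing tag
```

The escaped form changes nothing for a consumer that parses the JSON, but it keeps the raw bytes from containing a literal closing tag when they are embedded unparsed.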

> 
> If you care about preventing XSS, you need to HTML escape content from JSON. Period. The alternative would be to HTML escape strings before they go into the JSON payload. But that would be a violation because Mercurial's JSON is not always destined for HTML. Mercurial's JSON should be "pure" and target agnostic. In other words, it should follow the spec (RFC 4627).
