[PATCH stable] templatefilters: make json filter handle multibyte characters correctly

Yuya Nishihara yuya at tcha.org
Sun Aug 8 11:54:46 CDT 2010


Patrick Mézard wrote:
> On 07/08/10 09:36, Yuya Nishihara wrote:
> > # HG changeset patch
> > # User Yuya Nishihara <yuya at tcha.org>
> > # Date 1281166036 -32400
> > # Branch stable
> > # Node ID 0e36aafcca8fedbf60e05b985d5f6426045c8e28
> > # Parent  36e25f25dec11e68fc3240326999c02b3879ab10
> > templatefilters: make json filter handle multibyte characters correctly
> > 
> > This aims to fix a JavaScript error in hgweb's graph view when the
> > Japanese 'cp932' encoding is used.
> > 
> > 'cp932' contains multibyte characters ending with '\x5c' (backslash),
> > e.g. '\x94\x5c' for the Japanese Kanji 'Noh'.
> > Because the json filter escapes '\' to '\\', a multibyte string ending
> > with '\x5c' is translated to "xxx\", resulting in a JavaScript parse
> > error in the web browser.
> > 
> > This patch changes json() to pass unicode to jsonescape().
> > 
> > diff --git a/mercurial/templatefilters.py b/mercurial/templatefilters.py
> > --- a/mercurial/templatefilters.py
> > +++ b/mercurial/templatefilters.py
> > @@ -156,9 +156,13 @@ def json(obj):
> >      elif isinstance(obj, int) or isinstance(obj, float):
> >          return str(obj)
> >      elif isinstance(obj, str):
> > -        return '"%s"' % jsonescape(obj)
> > +        try:
> > +            return '"%s"' % jsonescape(unicode(
> > +                obj, encoding.encoding)).encode(encoding.encoding)
> > +        except (UnicodeEncodeError, UnicodeDecodeError):
> > +            return '"%s"' % jsonescape(obj)
> 
> So, if we fail to decode/encode the string, we still may generate an invalid
> JSON string, right? Shouldn't we "unicode(obj, encoding.encoding, 'replace')"
> or something similar instead?

If we can assume that the encoding is set up correctly, 'replace' seems better.
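
To make the difference concrete, here is a rough sketch in Python 3 syntax
(not the patch itself). json_escape() is a hypothetical stand-in for
jsonescape() that only handles backslash and double quote, and
json_string() / json_string_bytewise() are names made up for this example.
It assumes the configured encoding is cp932:

```python
# Hypothetical minimal stand-in for jsonescape(): escapes only \ and ".
def json_escape(text: str) -> str:
    return text.replace("\\", "\\\\").replace('"', '\\"')


def json_string(raw: bytes, encoding_name: str = "cp932") -> bytes:
    # Decode first so escaping sees whole characters, not individual bytes.
    # 'replace' substitutes undecodable bytes instead of raising.
    text = raw.decode(encoding_name, "replace")
    return ('"%s"' % json_escape(text)).encode(encoding_name, "replace")


def json_string_bytewise(raw: bytes) -> bytes:
    # Escaping the raw bytes treats the 0x5c trail byte as a backslash.
    return b'"' + raw.replace(b"\\", b"\\\\").replace(b'"', b'\\"') + b'"'


noh = b"\x94\x5c"  # the Kanji 'Noh' in cp932
print(json_string_bytewise(noh))  # trail byte gets doubled
print(json_string(noh))           # character passes through intact
```

With the bytewise version the output ends in an extra '\x5c', which the
browser reads as a backslash escaping the closing quote. With 'replace'
there is no except branch left that could fall back to the bytewise path.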

AFAIK, the only problematic encodings are Japanese cp932 (aka Shift_JIS),
iso-2022-jp and the simplified Chinese ones.
Most decent encodings don't use ASCII byte values inside their multibyte
sequences.
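
A quick way to check a given codec, again only as a sketch in Python 3
syntax: has_backslash_trail_byte() is a name invented for this example,
and it only looks at two-byte sequences, so it says nothing about
stateful encodings such as iso-2022-jp.

```python
def has_backslash_trail_byte(encoding_name: str) -> bool:
    """Return True if some two-byte character ends with 0x5c."""
    for lead in range(0x80, 0x100):
        try:
            decoded = bytes([lead, 0x5C]).decode(encoding_name)
        except UnicodeDecodeError:
            continue
        # Exactly one character means 0x5c was consumed as a trail byte,
        # not decoded as a separate backslash.
        if len(decoded) == 1:
            return True
    return False


for name in ("cp932", "gbk", "euc_jp", "utf-8"):
    print(name, has_backslash_trail_byte(name))
```

I would expect cp932 and gbk to be flagged here, and euc_jp and utf-8 not.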

Yuya,

