[PATCH STABLE] extdiff: escape path for docstring (issue5301)

Matt Mackall mpm at selenic.com
Mon Jul 25 17:26:03 EDT 2016


# HG changeset patch
# User Matt Mackall <mpm at selenic.com>
# Date 1468877135 18000
#      Mon Jul 18 16:25:35 2016 -0500
# Node ID 13a9fb0d88ac8544e28b17e1663315d31f11859b
# Parent  953839de96ab574caa40557c542c262286c6287c
extdiff: escape path for docstring (issue5301)

The existing code (a) assumed path would be specified in
encoding.encoding and (b) assumed unicode() objects wouldn't cause
other parts of Mercurial to blow up. Both are dangerous assumptions.

Since we don't know the encoding of path and can't pass non-ASCII
through docstrings, just escape the path and drop the early _(). Will
have to suffice until we can teach docstrings to handle UTF-8b
escaping.

This has the side-effect that the line containing the path is now
variable by the time it reaches _() and thus can't be translated.

diff -r 953839de96ab -r 13a9fb0d88ac hgext/extdiff.py
--- a/hgext/extdiff.py	Sun Jul 17 15:13:51 2016 -0700
+++ b/hgext/extdiff.py	Mon Jul 18 16:25:35 2016 -0500
@@ -365,7 +365,10 @@
                 if options:
                     options = ' ' + options
                 return dodiff(ui, repo, cmdline + options, pats, opts)
-            doc = _('''\
+            # We can't pass non-ASCII through docstrings (and path is
+            # in an unknown encoding anyway)
+            docpath = path.encode("string-escape")
+            mydiff.__doc__ = '''\
 use %(path)s to diff repository (or selected files)
 
     Show differences between revisions for the specified files, using
@@ -376,15 +379,7 @@
     that revision is compared to the working directory, and, when no
     revisions are specified, the working directory files are compared
     to its parent.\
-''') % {'path': util.uirepr(path)}
-
-            # We must translate the docstring right away since it is
-            # used as a format string. The string will unfortunately
-            # be translated again in commands.helpcmd and this will
-            # fail when the docstring contains non-ASCII characters.
-            # Decoding the string to a Unicode string here (using the
-            # right encoding) prevents that.
-            mydiff.__doc__ = doc.decode(encoding.encoding)
+''' % {'path': util.uirepr(docpath)}
             return mydiff
         command(cmd, extdiffopts[:], _('hg %s [OPTION]... [FILE]...') % cmd,
                 inferrepo=True)(save(cmdline))
diff -r 953839de96ab -r 13a9fb0d88ac tests/test-extdiff.t
--- a/tests/test-extdiff.t	Sun Jul 17 15:13:51 2016 -0700
+++ b/tests/test-extdiff.t	Mon Jul 18 16:25:35 2016 -0500
@@ -389,3 +389,23 @@
   $ cd ..
 
 #endif
+
+Test handling of non-ASCII paths in generated docstrings (issue5301)
+
+  >>> open("u", "w").write("\xa5\xa5")
+  $ U=`cat u`
+
+  $ HGPLAIN=1 hg --config hgext.extdiff= --config extdiff.cmd.td=hi help -k xyzzy
+  abort: no matches
+  (try "hg help" for a list of topics)
+  [255]
+
+  $ HGPLAIN=1 hg --config hgext.extdiff= --config extdiff.cmd.td=hi help td > /dev/null
+
+  $ LC_MESSAGES=ja_JP.UTF-8 hg --config hgext.extdiff= --config extdiff.cmd.td=$U help -k xyzzy
+  abort: no matches
+  (try "hg help" for a list of topics)
+  [255]
+
+  $ LC_MESSAGES=ja_JP.UTF-8 hg --config hgext.extdiff= --config extdiff.cmd.td=$U help td | grep "^use"
+  use '\xa5\xa5' to diff repository (or selected files)


More information about the Mercurial-devel mailing list