D6184: changelog: extract a _string_unescape() to mirror _string_escape()

Tue Apr 2 19:30:45 UTC 2019

martinvonz created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  We use our own _string_escape() to encode the "extras" field. Then we
  use codecs.escape_decode() to unescape it. But there's also a little
  workaround for escaped text that looks like an octal number (an
  escaped NUL followed by digits, such as "\012"), in place since the
  fix for
  https://bz.mercurial-scm.org/show_bug.cgi?id=3156. This patch extracts
  the call to codecs.escape_decode() along with the fix for octal
  numbers and puts it in a _string_unescape(). It also updates the test
  to check for the octal-number case from the aforementioned bug.
  
  As you may have suspected, I want to be able to reuse this new
  function later.
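
  For illustration, here is a minimal standalone sketch of the octal
  problem and the workaround. It is not the code in the patch: it
  assumes Python 3 bytes, calls codecs.escape_decode() directly as a
  stand-in for stringutil.unescapestr(), and the names string_escape and
  string_unescape are made up for this example.

```python
import codecs


def string_escape(text):
    # Same subset of the string_escape codec as _string_escape() in the diff.
    text = text.replace(b'\\', b'\\\\').replace(b'\n', b'\\n')
    text = text.replace(b'\r', b'\\r')
    return text.replace(b'\0', b'\\0')


def string_unescape(text):
    if b'\\0' in text:
        # Mark every escaped backslash with a raw newline so that the "0"
        # following it cannot be mistaken for part of an escaped NUL.
        text = text.replace(b'\\\\', b'\\\\\n')
        # Turn each escaped NUL into a raw NUL before the decoder can read
        # it together with following digits as an octal escape.
        text = text.replace(b'\\0', b'\0')
        # Drop the temporary markers again.
        text = text.replace(b'\n', b'')
    # Assumption: stands in for stringutil.unescapestr().
    return codecs.escape_decode(text)[0]


# A NUL followed by digits, plus a literal backslash followed by "0".
s = b'a\x0012b\\0c\n'
escaped = string_escape(s)

# Decoding directly reads the escaped NUL and the digits as the octal
# escape \012, so the round trip fails.
naive = codecs.escape_decode(escaped)[0]
print(naive == s)

# With the fix-up applied first, the original bytes come back.
print(string_unescape(escaped) == s)
```

  The raw newline is usable as a temporary marker because the escaping
  step has already replaced every real newline with a backslash
  sequence, so escaped text never contains one.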

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6184

AFFECTED FILES
  mercurial/changelog.py

CHANGE DETAILS

diff --git a/mercurial/changelog.py b/mercurial/changelog.py
--- a/mercurial/changelog.py
+++ b/mercurial/changelog.py
@@ -35,17 +35,25 @@
     """
     >>> from .pycompat import bytechr as chr
     >>> d = {b'nl': chr(10), b'bs': chr(92), b'cr': chr(13), b'nul': chr(0)}
-    >>> s = b"ab%(nl)scd%(bs)s%(bs)sn%(nul)sab%(cr)scd%(bs)s%(nl)s" % d
+    >>> s = b"ab%(nl)scd%(bs)s%(bs)sn%(nul)s12ab%(cr)scd%(bs)s%(nl)s" % d
     >>> s
-    'ab\\ncd\\\\\\\\n\\x00ab\\rcd\\\\\\n'
+    'ab\\ncd\\\\\\\\n\\x0012ab\\rcd\\\\\\n'
     >>> res = _string_escape(s)
-    >>> s == stringutil.unescapestr(res)
+    >>> s == _string_unescape(res)
     True
     """
     # subset of the string_escape codec
     text = text.replace('\\', '\\\\').replace('\n', '\\n').replace('\r', '\\r')
     return text.replace('\0', '\\0')
 
+def _string_unescape(text):
+    if '\\0' in text:
+        # fix up \0 without getting into trouble with \\0
+        text = text.replace('\\\\', '\\\\\n')
+        text = text.replace('\\0', '\0')
+        text = text.replace('\n', '')
+    return stringutil.unescapestr(text)
+
 def decodeextra(text):
     """
     >>> from .pycompat import bytechr as chr
@@ -60,12 +68,7 @@
     extra = _defaultextra.copy()
     for l in text.split('\0'):
         if l:
-            if '\\0' in l:
-                # fix up \0 without getting into trouble with \\0
-                l = l.replace('\\\\', '\\\\\n')
-                l = l.replace('\\0', '\0')
-                l = l.replace('\n', '')
-            k, v = stringutil.unescapestr(l).split(':', 1)
+            k, v = _string_unescape(l).split(':', 1)
             extra[k] = v
     return extra
 



To: martinvonz, #hg-reviewers
Cc: mercurial-devel