[PATCH] changelog: fix decoding of extra data (issue3156)

Matt Mackall mpm at selenic.com
Fri Dec 16 18:15:24 CST 2011


On Fri, 2011-12-16 at 11:26 -0700, Lars Boehnke wrote:
> # HG changeset patch
> # User lboehnke <lboehnke at symmetricom.com>
> # Date 1324058856 25200
> # Node ID 56a4a7dd2adc50e4db9486b5b90c4b1a5fb06ccd
> # Parent  c7b0bedbb07ac3752243db046f54c9a4bbc3273d
> changelog: fix decoding of extra data (issue3156)
> 
> The 'string_escape' codec is not compatible with the custom
> encoding used in _string_escape.  This adds a custom decoder that
> is compatible.

Thanks for researching this. Unfortunately, this code path is a bit
performance-sensitive, so we'd like to use the built-in decoder and
string replacement functions wherever possible rather than doing
character-by-character manipulation in Python.

Here's the patch I'm going to go with. It passes your excellent test
harness, and I've embedded a couple of tests for future reference:

diff -r cf96c922295e mercurial/changelog.py
--- a/mercurial/changelog.py	Fri Dec 16 12:08:10 2011 -0600
+++ b/mercurial/changelog.py	Fri Dec 16 18:03:57 2011 -0600
@@ -27,11 +27,18 @@
     """
     >>> decodeextra(encodeextra({'foo': 'bar', 'baz': chr(0) + '2'}))
     {'foo': 'bar', 'baz': '\\x002'}
+    >>> decodeextra(encodeextra({'foo': 'bar', 'baz': chr(92) + chr(0) + '2'}))
+    {'foo': 'bar', 'baz': '\\\\\\x002'}
     """
     extra = {}
     for l in text.split('\0'):
         if l:
-            k, v = l.replace('\\0', '\0').decode('string_escape').split(':', 1)
+            if '\\0' in l:
+                # fix up \0 without getting into trouble with \\0
+                l = l.replace('\\\\', '\\\\\n')
+                l = l.replace('\\0', '\0')
+                l = l.replace('\n', '')
+            k, v = l.decode('string_escape').split(':', 1)
             extra[k] = v
     return extra
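
To see why the three-step replace is needed, here is a minimal sketch of
the same idea. It is written for Python 3 bytes rather than the Python 2
str used in the patch, and the names escape, fix_nul_naive, fix_nul and
unescape_rest are hypothetical stand-ins, not Mercurial functions. The
encoder is assumed to escape only backslash, LF, CR and NUL, and
unescape_rest stands in for decode('string_escape') on that subset only.

```python
import re

def escape(text: bytes) -> bytes:
    # Assumed stand-in for the custom encoder: backslash, LF, CR, NUL only.
    text = text.replace(b'\\', b'\\\\').replace(b'\n', b'\\n').replace(b'\r', b'\\r')
    return text.replace(b'\0', b'\\0')

def fix_nul_naive(l: bytes) -> bytes:
    # Old approach: also rewrites the tail of an escaped backslash + "0".
    return l.replace(b'\\0', b'\0')

def fix_nul(l: bytes) -> bytes:
    # Approach from the patch. Escaped text never contains a raw LF, so LF
    # can act as a temporary separator after every escaped backslash.
    if b'\\0' in l:
        l = l.replace(b'\\\\', b'\\\\\n')  # shield each escaped backslash
        l = l.replace(b'\\0', b'\0')       # only real NUL escapes match now
        l = l.replace(b'\n', b'')          # drop the separators again
    return l

_UNESCAPE = {b'\\\\': b'\\', b'\\n': b'\n', b'\\r': b'\r'}

def unescape_rest(l: bytes) -> bytes:
    # Hypothetical stand-in for the built-in decoder, limited to this subset.
    return re.sub(rb'\\[\\nr]', lambda m: _UNESCAPE[m.group(0)], l)

path = b'C:\\0dir'   # literal backslash followed by the digit 0
enc = escape(path)   # the backslash is doubled, the 0 is left alone
assert unescape_rest(fix_nul(enc)) == path
assert unescape_rest(fix_nul_naive(enc)) != path  # naive pass inserts a NUL
```

The point of the design is that all the work stays in whole-string
replace calls, so no per-character loop runs in Python.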

-- 
Mathematics is the supreme nostalgia of our time.