[Bug 5031] New: toutf8b <-> fromutf8b roundtrip error

mercurial-bugs at selenic.com
Sat Jan 2 23:12:15 UTC 2016


https://bz.mercurial-scm.org/show_bug.cgi?id=5031

            Bug ID: 5031
           Summary: toutf8b <-> fromutf8b roundtrip error
           Product: Mercurial
           Version: default branch
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: feature
          Priority: wish
         Component: Mercurial
          Assignee: bugzilla at selenic.com
          Reporter: gregory.szorc at gmail.com
                CC: mercurial-devel at selenic.com

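Fuzzing with hypothesis has revealed another issue with UTF-8b roundtripping:
for some byte strings, encoding.toutf8b followed by encoding.fromutf8b does not
return the original value.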

--- /home/gps/src/hg/tests/test-encoding.t
+++ /home/gps/src/hg/tests/test-encoding.t.err
@@ -280,6 +280,21 @@
   >>> from hypothesishelpers import *
   >>> from mercurial import encoding
   >>> roundtrips(st.binary(), encoding.fromutf8b, encoding.toutf8b)
-  Round trip OK
+  Falsifying example: testroundtrips(value='\xf1\x80\x80\x80\x80')
+  Traceback (most recent call last):
+    File "/home/gps/src/hg/tests/hypothesishelpers.py", line 50, in roundtrips
+      testroundtrips()
+    File "/home/gps/src/hg/tests/hypothesishelpers.py", line 40, in testroundtrips
+      def testroundtrips(value):
+    File "/home/gps/venvs/hg_dev/lib/python2.7/site-packages/hypothesis/core.py", line 585, in wrapped_test
+      print_example=True, is_final=True
+    File "/home/gps/venvs/hg_dev/lib/python2.7/site-packages/hypothesis/executors/executors.py", line 25, in default_executor
+      return function()
+    File "/home/gps/venvs/hg_dev/lib/python2.7/site-packages/hypothesis/core.py", line 365, in run
+      return test(*args, **kwargs)
+    File "/home/gps/src/hg/tests/hypothesishelpers.py", line 47, in testroundtrips
+      decoded
+  ValueError: Round trip failed: toutf8b('\xf1\x80\x80\x80\x80') -> fromutf8b('\xf1\x80\x80\x80\xed\xb2\x80') -> '\xed\xa3\x80\x00\x80'
+  ValueError("Round trip failed: toutf8b('\\xf1\\x80\\x80\\x80\\x80') -> fromutf8b('\\xf1\\x80\\x80\\x80\\xed\\xb2\\x80') -> '\\xed\\xa3\\x80\\x00\\x80'",)

 #endif

Other sequences that don't roundtrip include:

\x80\xf1\x80\x80\x80
\x80\xf0\x90\x80\x80
