[PATCH 0 of 1] Decode UTF-8 e-mail headers present in git formatted patches

funman at videolan.org funman at videolan.org
Tue Nov 12 10:13:17 UTC 2013


This new patch adds a test which is currently failing.

http://mercurial.selenic.com/wiki/EncodingStrategy#The_encoding_tracking_problem
says user names are 'owned and managed' by Mercurial and should be encoded as UTF-8.

This does not seem to be the case: with HGENCODING=ascii (as used by the test suite),
the user name 'ë' (0xc3 0xab) is rendered as '?'.
Is encoding.fromlocal being used? I couldn't figure it out.
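
To illustrate the '?' rendering, here is a minimal standard-library sketch,
assuming Python 3. It does not call Mercurial's encoding module, and
to_ascii_lossy is a hypothetical helper that only mimics a lossy conversion
to an ASCII local encoding:

```python
# Sketch only: plain Python 3 codecs, not Mercurial's encoding module.
# to_ascii_lossy is a hypothetical helper introduced for illustration.

def to_ascii_lossy(utf8_data: bytes) -> bytes:
    """Decode UTF-8 bytes, then re-encode as ASCII, replacing
    every character that ASCII cannot represent with '?'."""
    return utf8_data.decode("utf-8").encode("ascii", "replace")

# 'ë' is the two-byte UTF-8 sequence 0xc3 0xab.
utf8_name = "ë".encode("utf-8")
assert utf8_name == b"\xc3\xab"

# One unencodable character becomes one '?'.
print(to_ascii_lossy(utf8_name))
```

If the stored user name really were kept as UTF-8 internally, a replacement
like this should only happen when displaying it, not when committing it.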

Having email_decode return a UTF-8 string instead of using tolocal makes the test
fail with:

transaction abort!
rollback completed
abort: decoding near '\xc3\xab': 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)! (esc)

So it seems that user names are being treated as ASCII instead of UTF-8.
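
The codec error quoted in the abort message can be reproduced outside
Mercurial. The following is a sketch assuming Python 3 and only the standard
library; decode_strict_ascii is a hypothetical helper, not the code path
Mercurial actually takes:

```python
# Sketch only: shows why UTF-8 bytes cannot pass through a strict ASCII decode.
# decode_strict_ascii is a hypothetical helper introduced for illustration.

def decode_strict_ascii(data: bytes) -> str:
    """Decode bytes as ASCII, raising on any byte >= 0x80."""
    return data.decode("ascii")

raw = b"\xc3\xab"  # UTF-8 encoding of 'ë'

try:
    decode_strict_ascii(raw)
except UnicodeDecodeError as exc:
    # The first byte, 0xc3, is already outside the ASCII range.
    print(exc)

# The same bytes decode cleanly when treated as UTF-8.
assert raw.decode("utf-8") == "ë"
```

The failure is raised on the first byte, which matches "position 0" in the
abort message above.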

Remarks welcome!

Thanks,


More information about the Mercurial-devel mailing list