[PATCH] mail: take --encoding and HGENCODING into account
Yuya Nishihara
yuya at tcha.org
Sat Oct 8 04:59:00 EDT 2016
On Fri, 07 Oct 2016 09:56:10 -0500, Gábor Stefanik wrote:
> # HG changeset patch
> # User Gábor Stefanik <gabor.stefanik at nng.com>
> # Date 1475667922 -7200
> # Wed Oct 05 13:45:22 2016 +0200
> # Node ID 31350841be0c6af1c335fb02b28b8fd1f79089b9
> # Parent 91a3c58ecf938ed675f5364b88f0d663f12b0047
> mail: take --encoding and HGENCODING into account
The new encoding strategy looks good. Can you update the tests and resend?
Also, I found a couple of nits. Please see the inline comments.
> --- a/mercurial/mail.py
> +++ b/mercurial/mail.py
> @@ -205,22 +205,40 @@
>
>  def mimetextpatch(s, subtype='plain', display=False):
>      '''Return MIME message suitable for a patch.
> -    Charset will be detected as utf-8 or (possibly fake) us-ascii.
> +    Charset will be detected by first trying to decode as us-ascii, then utf-8,
> +    and finally the global encodings. If all those fail, fall back to
> +    ISO-8859-1, an encoding with that allows all byte sequences.
>      Transfer encodings will be used if necessary.'''
>
> -    cs = 'us-ascii'
> +    def codec2iana(encoding):
> +        encoding = email.charset.Charset(encoding).input_charset.lower()
> +
> +        if encoding.startswith("iso") and not encoding.startswith("iso-"):
> +            return "iso-" + encoding[3:]
> +        return encoding
- email.charset is a module. We need "import email.charset" in case
  it isn't imported yet.
- We generally define this kind of function at module scope, since it
  doesn't need to capture any local variables.
- Better not to shadow the global "encoding" module.
- Can you add a comment explaining why we have to fix the 'iso' aliases?
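Taken together, the points above could look roughly like the following. This is only a sketch, not tested against the Mercurial tree: the parameter name "cs" is just one way to avoid the shadowing, and the comment about the 'iso' aliases is my guess at the reason (Python names these codecs 'iso8859-1' and so on, while the IANA charset names used in mail headers have a hyphen after "iso").

```python
import email.charset  # import the submodule explicitly; it may not be loaded yet


def codec2iana(cs):
    """Return the IANA-style charset name for a Python codec name."""
    cs = email.charset.Charset(cs).input_charset.lower()

    # Assumed rationale: Python spells the ISO 8859 codecs without a hyphen
    # after "iso" (e.g. 'iso8859-1'), but the registered charset names are
    # spelled 'iso-8859-1', so put the hyphen back.
    if cs.startswith("iso") and not cs.startswith("iso-"):
        return "iso-" + cs[3:]
    return cs
```

Because the helper takes everything it needs as an argument, it can live at module level next to the other functions in mail.py.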
> +    cs = "iso-8859-1" # a "safe" encoding with no invalid byte sequences
>      if not display:
This change accounts for most of the test failures. Maybe we can move it
into the "not display" block.
>          try:
>              s.decode('us-ascii')
> +            cs = 'us-ascii'
>          except UnicodeDecodeError:
>              try:
>                  s.decode('utf-8')
>                  cs = 'utf-8'
>              except UnicodeDecodeError:
> -                # We'll go with us-ascii as a fallback.
> -                pass
> +                try:
> +                    s.decode(encoding.encoding)
> +                    cs = encoding.encoding
> +                except UnicodeDecodeError:
> +                    try:
> +                        s.decode(encoding.fallbackencoding)
> +                        cs = encoding.fallbackencoding
> +                    except UnicodeDecodeError
SyntaxError: the trailing colon is missing.
> +                        # fall back to ISO-8859-1
> +                        pass
Maybe it's time to rewrite these nested try blocks as a for loop?
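Something along these lines would flatten the nesting. It is a minimal sketch only: "detectcharset" is a hypothetical name, and in the patch the candidates would presumably be 'us-ascii', 'utf-8', encoding.encoding and encoding.fallbackencoding, in that order.

```python
def detectcharset(s, candidates, fallback='iso-8859-1'):
    """Return the first charset in candidates that decodes s cleanly.

    If none of them works, return the fallback, which accepts any
    byte sequence.
    """
    for cs in candidates:
        try:
            s.decode(cs)
            return cs
        except UnicodeDecodeError:
            # this candidate cannot represent the data; try the next one
            continue
    return fallback
```

For example, detectcharset(b'caf\xc3\xa9', ['us-ascii', 'utf-8']) picks 'utf-8', while a lone b'\xe9' fails both candidates and ends up with the ISO-8859-1 fallback. Adding another candidate then means extending the list rather than adding another level of indentation.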