email encoding help needed

Gregory Szorc gregory.szorc at gmail.com
Fri Feb 15 16:28:12 EST 2019


https://www.mercurial-scm.org/repo/hg/rev/9b3be572ff0c documented my
findings when I looked at this a few days back.

Something feels "off" with regard to our handling of encodings here, but
I'm not sure exactly what we should change.
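
For reference, the failure in the traceback quoted below can be reproduced
outside Mercurial with only the standard library. This is a minimal sketch,
not what mail.py does: the body string and the helper name try_set_payload
are made up for illustration. It hands set_payload() a str containing
non-ASCII characters, once with an ASCII charset and once with utf-8.

```python
from email.message import Message

# Hypothetical body: the five characters the test's commit message is
# meant to contain, written as escapes so this file stays ASCII.
body = "description: \u00e0\u00e1\u00e2\u00e3\u00e4\n"


def try_set_payload(text, charset):
    """Return the Message on success, or the UnicodeEncodeError raised."""
    msg = Message()
    try:
        # For a str payload, set_payload() encodes the text with the
        # charset's output_charset before storing it.
        msg.set_payload(text, charset)
    except UnicodeEncodeError as exc:
        return exc
    return msg


ascii_result = try_set_payload(body, "us-ascii")
utf8_result = try_set_payload(body, "utf-8")

# The ASCII attempt fails the same way the hook does; utf-8 goes through.
print(type(ascii_result).__name__)
print(utf8_result["Content-Type"])
print(utf8_result["Content-Transfer-Encoding"])
```

So the exception itself only says that the charset passed down to
set_payload() could not represent the text it was given. It does not say
which of the two is wrong.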

On Fri, Feb 15, 2019 at 12:41 PM Augie Fackler <raf at durin42.com> wrote:

> Howdy folks! We're down to only a few (single digits!) failing tests on
> Python 3, but one in particular has us stuck:
>
> cd tests && python3 run-tests.py test-notify.t
> running 1 tests using 1 parallel processes
>
> --- tests/test-notify.t
> +++ /tests/test-notify.t.err
> @@ -415,36 +415,28 @@
>    >   -m `"$PYTHON" -c
> 'print("\xc3\xa0\xc3\xa1\xc3\xa2\xc3\xa3\xc3\xa4")'`
>    $ hg --traceback --cwd b --encoding utf-8 pull ../a | \
>    >   "$PYTHON" $TESTTMP/filter.py
> +  error: incoming.notify hook raised an exception: 'ascii' codec can't
> encode characters in position 42-51: ordinal not in range(128)
> +  Traceback (most recent call last):
> +    File "hgtests.fckrh2v2/install/lib/python/mercurial/hook.py", line
> 98, in pythonhook
> +      r = obj(ui=ui, repo=repo, hooktype=htype,
> **pycompat.strkwargs(args))
> +    File "/hgtests.fckrh2v2/install/lib/python/hgext/notify.py", line
> 519, in hook
> +      n.send(ctx, count, data)
> +    File "hgtests.fckrh2v2/install/lib/python/hgext/notify.py", line 384,
> in send
> +      msg = mail.mimeencode(self.ui, payload, self.charsets, self.test)
> +    File "hgtests.fckrh2v2/install/lib/python/mercurial/mail.py", line
> 366, in mimeencode
> +      return mimetextqp(s, 'plain', cs)
> +    File "hgtests.fckrh2v2/install/lib/python/mercurial/mail.py", line
> 253, in mimetextqp
> +      msg.set_payload(body, cs)
> +    File "lib/python3.7/email/message.py", line 315, in set_payload
> +      payload = payload.encode(charset.output_charset)
> +  UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 42-51: ordinal not in range(128)
>
>
> The wrinkle is that the commit message comes from this:
>
>   $ hg --cwd a --encoding utf-8 commit -A -d '0 0' \
>   >   -m `"$PYTHON" -c 'print("\xc3\xa0\xc3\xa1\xc3\xa2\xc3\xa3\xc3\xa4")'`
>
>
> IOW, it's intentionally some UTF-8. For commit messages we can expect
> UTF-8, but for patch bodies we're not so lucky, so I'm curious what we
> should do. Does anyone have an informed opinion on an encoding we should
> (or should not!) use for plain-text patches in message bodies? I'm pretty
> convinced at this point that we're doing invalid things in our emails
> today, and they're largely working by good fortune.
>
> Thanks,
> Augie
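
One possible direction for the question above, sketched under loud
assumptions: try a list of candidate charsets in order and use the first
one that can encode the whole body. The names pick_charset and
build_plain_message and the default candidate list are hypothetical, and
this is not a claim about what mail.py does or should do. It also assumes
the body is already a str, which sidesteps the harder problem of patch
bodies whose byte encoding is unknown.

```python
from email.message import Message


def pick_charset(text, candidates=("us-ascii", "utf-8")):
    """Return the first candidate charset that can encode all of text."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text; try the next
        return name
    raise ValueError("no candidate charset can encode the text")


def build_plain_message(text, candidates=("us-ascii", "utf-8")):
    """Build a text/plain message using the first workable charset."""
    msg = Message()
    msg.set_payload(text, pick_charset(text, candidates))
    return msg


plain = build_plain_message("plain ASCII body\n")
accented = build_plain_message("\u00e0\u00e1\u00e2\u00e3\u00e4\n")
print(plain.get_content_charset(), accented.get_content_charset())
```

Putting us-ascii first keeps pure-ASCII mail unchanged and only switches
to utf-8 when the body needs it.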