[PATCH 0 of 6] Encode non-ascii chars in mails (issue814)

Christian Ebert blacktrash at gmx.net
Sat Jul 12 19:32:47 UTC 2008


Hi,

This patch series makes it possible to send messages containing
non-ascii characters. Message parts containing patches are
mime-encoded only if they are clean utf-8.

Patch 6 addresses issue #814.

There are two main tasks:

1. Patches

Patches must not depend on any charset convention between sender
and recipient. They are sent as ascii, as utf-8, or as fake ascii
(the current behaviour; see also TODO). utf-8 can be detected
reliably.
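
To illustrate the three cases, here is a minimal sketch in modern
Python 3 (not the code from the patches). The function name
classify_patch and the label "unknown-8bit" are made up for this
example, and it assumes that "fake ascii" covers everything that is
neither plain ascii nor valid utf-8.

```python
def classify_patch(data: bytes) -> str:
    """Return 'us-ascii', 'utf-8' or 'unknown-8bit' for raw patch bytes.

    Hypothetical helper: 'unknown-8bit' stands for the case that would
    be sent as fake ascii.
    """
    try:
        data.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        # A strict utf-8 decode rejects nearly all legacy 8-bit text,
        # which is why a successful decode is a reliable signal.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown-8bit"
```

Only the "utf-8" case would be mime-encoded; the other two would be
sent as they are today.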

2. Mail parts that do not contain patches

Introduce a new charsets setting in the [email] section (default:
util._encoding). us-ascii is always implied and tried first.

[email]
# for westerners
charsets = iso-8859-1, iso-8859-15, windows-1252
# other examples:
# iso-8859-1, iso-8859-15, windows-1252, iso-8859-2, windows-1250
# iso-8859-1, iso-8859-15, windows-1252, iso-8859-2, windows-1250, iso-2022-jp, iso-2022-jp-ms

(idea stolen from Mutt)

For headers and message parts that do not contain patches, the
convert function tries the configured charsets one after another,
in order of descending preference, until a conversion succeeds.

Both $HGENCODING and util._fallbackencoding are tried for input.

As a last resort the conversion falls back to fake ascii (i.e. the
current behaviour).
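
The procedure described above could look roughly like the following
sketch in modern Python 3. It is not the code from the patches: the
name convert_sketch and its parameters are assumptions, with
input_encodings standing in for $HGENCODING and
util._fallbackencoding, and charsets for the [email] charsets list.

```python
def convert_sketch(raw, input_encodings, charsets):
    """Return (bytes, charset) for a header or a non-patch body part.

    Hypothetical helper, not Mercurial's API.
    """
    text = None
    for enc in input_encodings:
        try:
            text = raw.decode(enc)
            break
        except (UnicodeDecodeError, LookupError):
            continue
    if text is None:
        # Input could not be decoded at all: fake ascii.
        return raw, "us-ascii"
    # us-ascii is always implied and tried first.
    for cs in ["us-ascii"] + list(charsets):
        try:
            return text.encode(cs), cs
        except (UnicodeEncodeError, LookupError):
            continue
    # No configured charset can represent the text: fake ascii.
    return raw, "us-ascii"
```

With charsets = iso-8859-1, iso-8859-15 a text containing a euro sign
would skip iso-8859-1 and be sent as iso-8859-15, while plain ascii
text is never labelled with anything wider than us-ascii.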

New methods in mail module:
mimetextpatch
charsets
headencode
addressencode
mimeencode

TODO:
No conversion for "hg email --test"
Force quoted-printable instead of base64?
Add proper tests.
Allow other utf charsets for patches?
How to handle patches containing 8-bit text and, of course,
changeset patches containing more than one charset? The cleanest
solution would probably be to send them as binary attachments
(application/x-patch?), which is difficult w/o
email.MIME.MIMEApplication, introduced in Python 2.5.
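
For reference, the binary attachment idea from the last TODO item
might look like this with the modern Python 3 spelling of that class
(email.mime.application.MIMEApplication). This is only a sketch of the
idea, not part of the series; patch_as_attachment and the filename
argument are made up for the example.

```python
from email.mime.application import MIMEApplication


def patch_as_attachment(data: bytes, filename: str) -> MIMEApplication:
    """Wrap raw patch bytes as an application/x-patch attachment.

    Hypothetical helper. The bytes are base64-encoded by default, so
    no charset has to be declared for them.
    """
    part = MIMEApplication(data, "x-patch")
    part.add_header("Content-Disposition", "attachment", filename=filename)
    return part
```

The recipient gets the bytes back unchanged, whatever mix of charsets
the patch contains.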

The patch queue is also available at
<http://www.blacktrash.org/hg/hg-mail-mq/>

Crew repo with queue on top at
<http://www.blacktrash.org/hg/hg-crew-mq/>

c

