[PATCH 0 of 4] mailutil: extension for international email support

Christian Ebert blacktrash at gmx.net
Thu Mar 6 02:37:31 CST 2008


* Matt Mackall on Wednesday, March 05, 2008 at 09:55:19 -0600
> On Wed, Mar 05, 2008 at 01:03:09PM +0100, Christian Ebert wrote:
>> This patch series implements support to properly encode email
>> headers and bodies, as long as the bodies do not contain patches.
> 
> Given the subtlety of the issues involved here, I'm going to need a
> lot more description of what this thinks it's doing.

It thinks (oh, the power of illusions) it's doing something like
what Mutt's $send_charset variable does. At least it knows that
it's stolen and only longs to be improved, corrected etc. by
other proposals.

More seriously:

The aim is to make Mercurial mails more readable, and send them
in the "least bit-intensive" charset possible.

1. it introduces a configlist item for [email]: sendcharsets

sendcharsets is a list of charsets that are cycled by the methods
in mailutil in the given order to make non-ascii mail headers and
bodies readable in the receiving mail client. This list does not
need to include us-ascii and utf-8, as these are tried first and
last, respectively. If strict conversion succeeds the encoded
string is returned ready to be used for its respective purpose
(header, address, body).

If none of the above succeed it falls back to util.tolocal()
using util._encoding as "sendcharset".

At the moment it relies on util._encoding being compatible with the
current locale. If $HGENCODING is set to a different value, it
declares "us-ascii" as the charset and handles errors by replacing.
See also the description of the first patch.
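
A rough sketch of the cycling described in 1., written for present-day
Python 3 and taking a unicode string, whereas the patch works on byte
strings in the local encoding. The function name encode_for_mail and
the local_encoding parameter are made up for illustration and are not
names from the patch. The final fallback is simplified: it stands in
for util.tolocal()/util._encoding and always replaces on error.

```python
def encode_for_mail(text, sendcharsets=(), local_encoding="utf-8"):
    """Return (encoded_bytes, charset) for the first charset that fits.

    Hypothetical helper: us-ascii is tried first, then the configured
    sendcharsets in the given order, then utf-8 last.
    """
    skip = ("us-ascii", "utf-8")
    candidates = ["us-ascii"]
    candidates += [cs for cs in sendcharsets if cs.lower() not in skip]
    candidates.append("utf-8")

    for cs in candidates:
        try:
            # strict conversion: any unencodable character rejects this charset
            return text.encode(cs, "strict"), cs
        except (UnicodeEncodeError, LookupError):
            # LookupError covers charset names Python does not know
            continue

    # Assumed stand-in for the util._encoding fallback: replace on error
    return text.encode(local_encoding, "replace"), local_encoding
```

With sendcharsets = iso-8859-1, a plain ascii string is declared as
us-ascii, a string with Western European accents as iso-8859-1, and
anything else ends up as utf-8.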

2. it does nothing in itself, but its methods are meant to be
   used by other extensions that send mail. Methods:

headencode: encodes a header

    probably subject most of the time, but also used internally
    by:

addressencode: encodes an email address

    as a side effect the address is checked for validity.

mimeencode: encodes body or message parts, returning a mimetext
object with corresponding charset and content-transfer-encoding
declaration.
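
To give an idea of how the three methods relate, here is a minimal
sketch on top of the standard email package, in Python 3. It is not
the code from the patch: the signatures, the _pick_charset helper and
the very crude validity check (only looks for an "@") are assumptions
for illustration.

```python
from email.header import Header
from email.mime.text import MIMEText
from email.utils import parseaddr


def _pick_charset(text, charsets=()):
    """Hypothetical helper: first charset that encodes text strictly."""
    for cs in ["us-ascii", *charsets, "utf-8"]:
        try:
            text.encode(cs)
            return cs
        except (UnicodeEncodeError, LookupError):
            continue
    return "utf-8"


def headencode(text, charsets=()):
    """Encode a header value (e.g. a subject) as an RFC 2047 encoded word."""
    cs = _pick_charset(text, charsets)
    if cs == "us-ascii":
        return text  # nothing to encode
    return Header(text, cs).encode()


def addressencode(address, charsets=()):
    """Encode the display name of an address; reject obviously bad ones."""
    name, addr = parseaddr(address)
    if "@" not in addr:
        raise ValueError("invalid email address: %r" % address)
    try:
        addr.encode("ascii")
    except UnicodeEncodeError:
        raise ValueError("invalid email address: %r" % address)
    if not name:
        return addr
    # only the display name is encoded, the address itself stays ascii
    return "%s <%s>" % (headencode(name, charsets), addr)


def mimeencode(text, charsets=()):
    """Return a MIMEText part declaring the charset that was picked."""
    cs = _pick_charset(text, charsets)
    # MIMEText sets Content-Type charset and Content-Transfer-Encoding
    return MIMEText(text, "plain", cs)
```

An extension that sends mail would then call headencode() for the
subject, addressencode() for From/To/Cc, and mimeencode() for the
description part.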

The last is *not* meant for patches, as said in the introductory
message. An example is the 2nd patch where patch data is left as
pseudo-ascii at the moment. But if you do "hg email -a" the patch
description /part/ should be readable in a mail client.

c
-- 
I would prefer not to.
--Herman Melville, Bartleby

