[PATCH] patchbomb: Use "unknown-8bit" as fallback character set

Christian Ebert blacktrash at gmx.net
Tue May 12 03:17:01 CDT 2009


* Rocco Rutte on Tuesday, May 12, 2009 at 09:19:05 +0200
>> While this seems like the right thing to do, I'm a little worried that
>> not all clients will handle it correctly?
> 
> I don't know, I only have mutt and apple mail to test. unknown-8bit
> instead of latin1 works in both.
> 
>> There are two important considerations here:
> 
>> 1) If I send a patch containing an \xe1 byte, I want it
>> to arrive on the client's disk as \xe1, regardless of the locales
>> involved on the sender and receiver's machines or their mail clients'
>> particular ideas about transcoding.
> 
> Of course. However, I wouldn't bet that a message labeled us-ascii gets
> through untouched in all cases if it's actually 8bit. This patch was
> aimed at making the transport as safe as possible. But I don't have hard
> numbers on how many gateways still convert messages. Everybody should
> support 8bit by now.
> 
> unknown-8bit isn't necessarily a better fallback than us-ascii, but at
> least it doesn't claim that we know something about the input which
> isn't true. application/octet-stream would be safest here, though it's
> not an option.
> 
> unknown-8bit is also sort of problematic since it's supposed to be
> inserted by MTAs and never by MUAs, and the RFC doesn't specify what to
> do when encountering this charset on the receiving side. On the other
> hand, we need a charset for text media types.
> 
>> 2) At the same time, I don't want the entire message to become
>> unreadable -in- the mail client.
> 
> Though maybe not most convenient: every mail client should have a way to
> display raw messages.
> 
> Btw, how about a command line/config option that specifies a charset to
> use so the user can override it if he knows better? Maybe even a list so
> we can use the best match? This would at least help where there's only
> one non-ascii charset used.

Perhaps mail.mimeencode() (uses mail._charsets() derived from
email.charsets) could be used optionally for patches ... but if
someone has this option "hardcoded" in [defaults] ...

btw, mail.mimeencode() should probably set unknown-8bit instead
of us-ascii as last resort too.
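
The "list of charsets, best match first, unknown-8bit as last resort"
idea could look roughly like the sketch below. The names pick_charset
and candidates are hypothetical and only for illustration; this is not
the actual mail.mimeencode() code, which may work differently.

```python
def pick_charset(data, candidates=("us-ascii", "utf-8")):
    """Return the first candidate charset that decodes `data` cleanly.

    `data` is the raw patch as bytes. If no candidate fits, fall back
    to "unknown-8bit" rather than claiming a charset we cannot verify.
    """
    for charset in candidates:
        try:
            data.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Either the bytes are invalid in this charset, or the
            # charset name is unknown to Python: try the next one.
            continue
        return charset
    return "unknown-8bit"


# A lone \xe1 byte is neither ASCII nor valid UTF-8.
print(pick_charset(b"plain ascii"))
print(pick_charset(b"\xe1"))
print(pick_charset(b"\xe1", candidates=("us-ascii", "utf-8", "latin-1")))
```

Note that a single-byte charset such as latin-1 decodes any byte
sequence, so it only makes sense as the last entry in the list, and a
match there says nothing about whether the label is actually right.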

c
-- 
\black\trash movie    _C O W B O Y_  _C A N O E_  _C O M A_
Ein deutscher Western/A German Western
-->> http://www.blacktrash.org/underdogma/ccc.html
-->> http://www.blacktrash.org/underdogma/ccc-en.html

