[PATCH] patchbomb: Encode overly long lines

Thu May 7 18:44:30 CDT 2009

Rocco Rutte <pdmef at gmx.net> writes:

> # HG changeset patch
> # User Rocco Rutte <pdmef at gmx.net>
> # Date 1241711034 -7200
> # Node ID 09fb4ed3b1e1fd6bf28ca090654f2e83d7d040b5
> # Parent  db52cc4f2f97e6a125d2f71363230357c0100570
> patchbomb: Encode overly long lines
>
> A limit of 998 byte + CRLF is mandated by RfC2822.

We had a discussion about this just the other day:

  http://markmail.org/message/pmo5dlro2vqpuvcp

There the symptom was that lines were broken after 990 chars, so maybe
we should break a bit earlier to be on the safe side? The RFC says that
lines SHOULD be no more than 78 chars.
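
To make the "break a bit earlier" idea concrete, here is a minimal
sketch of such a check. The helper name has_long_line and the margin of
950 are assumptions made up for illustration; neither appears in the
patch, and the right margin would need to be agreed on.

```python
# Hypothetical helper, not part of the patch under review.
RFC_HARD_LIMIT = 998  # max line length excluding CRLF (RFC 2822, 2.1.1)
SAFE_LIMIT = 950      # assumed safety margin; the exact value is a guess


def has_long_line(body, limit=SAFE_LIMIT):
    """Return True if any line of body is longer than limit."""
    return any(len(line) > limit for line in body.splitlines())
```

With the conservative default, a 990-character line like the one in the
linked thread would already trigger the encoding, while it would pass a
check against the hard limit of 998.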

> diff --git a/mercurial/mail.py b/mercurial/mail.py
> --- a/mercurial/mail.py
> +++ b/mercurial/mail.py
> @@ -6,7 +6,7 @@
>  # GNU General Public License version 2, incorporated herein by reference.
>  
>  from i18n import _
> -import os, smtplib, socket
> +import os, smtplib, socket, quopri

There is an email.Encoders.encode_quopri function which does the same and
sets the Content-Transfer-Encoding header at the same time.
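
A rough sketch of what that could look like. It is written against the
Python 3 module names (email.encoders, email.mime.nonmultipart) rather
than the Python 2 ones used in the patch, and the function name
qp_text_message is hypothetical. It builds the message without MIMEText
so that no Content-Transfer-Encoding header exists before encode_quopri
adds its own.

```python
from email import encoders
from email.mime.nonmultipart import MIMENonMultipart


def qp_text_message(body, subtype="plain", charset="us-ascii"):
    """Build a text/* message whose payload is quoted-printable.

    Hypothetical helper for illustration; body is a str.
    """
    # MIMENonMultipart sets Content-Type and MIME-Version but, unlike
    # MIMEText, does not add a Content-Transfer-Encoding header.
    msg = MIMENonMultipart("text", subtype, charset=charset)
    msg.set_payload(body.encode(charset))
    # Encodes the payload in place and sets the header in one step.
    encoders.encode_quopri(msg)
    return msg
```

Because the header is only set once, the del/assign pair from the patch
is not needed in this variant.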

>  import email.Header, email.MIMEText, email.Utils
>  import util, encoding
>  
> @@ -88,14 +88,37 @@ def validateconfig(ui):
>  
>  def mimetextpatch(s, subtype='plain', display=False):
>      '''If patch in utf-8 transfer-encode it.'''
> +
> +    def encode_qp(str):
> +        for line in str.split('\n'):
> +            if len(line) > 998:
> +                return quopri.encodestring(str), "quoted-printable"
> +        return str, None
> +
> +    passed = False
>      if not display:
>          for cs in ('us-ascii', 'utf-8'):
>              try:
>                  s.decode(cs)
> -                return email.MIMEText.MIMEText(s, subtype, cs)
> +                s, enc = encode_qp(s)
> +                passed = True
> +                msg = email.MIMEText.MIMEText(s, subtype, cs)
> +                if enc is not None:
> +                    del msg['Content-Transfer-Encoding']
> +                    msg['Content-Transfer-Encoding'] = enc
> +                return msg
>              except UnicodeDecodeError:
>                  pass
> -    return email.MIMEText.MIMEText(s, subtype)
> +
> +    if passed:
> +        return email.MIMEText.MIMEText(s, subtype)
> +
> +    s, enc = encode_qp(s)
> +    msg = email.MIMEText.MIMEText(s, subtype)
> +    if enc is not None:
> +        del msg['Content-Transfer-Encoding']
> +        msg['Content-Transfer-Encoding'] = enc
> +    return msg

It's getting late here, but is the above code not sort of repeated? :-)

Also, if passed is set to True, then msg will always have been returned
from the loop -- or can MIMEText also throw a UnicodeDecodeError?
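
One way the repetition could be avoided is to let the loop only pick the
charset and to build the message in a single place afterwards. The
following is a sketch under assumptions, not a drop-in replacement: it
uses Python 3 module names, takes the patch as bytes, and the helper
pick_charset is a name invented here. Unlike the patch, it leaves the
charset parameter off entirely when neither charset fits.

```python
from email import encoders
from email.mime.nonmultipart import MIMENonMultipart

LIMIT = 998  # same threshold as the patch; a smaller value may be safer


def pick_charset(data, display=False):
    """Return the first charset that decodes data, or None (hypothetical)."""
    if not display:
        for cs in ("us-ascii", "utf-8"):
            try:
                data.decode(cs)
                return cs
            except UnicodeDecodeError:
                pass
    return None


def mimetextpatch(data, subtype="plain", display=False):
    """Build the message once; encode only if a line is too long."""
    cs = pick_charset(data, display)
    params = {"charset": cs} if cs else {}
    msg = MIMENonMultipart("text", subtype, **params)
    msg.set_payload(data)
    if any(len(line) > LIMIT for line in data.splitlines()):
        encoders.encode_quopri(msg)    # sets quoted-printable header
    else:
        encoders.encode_7or8bit(msg)   # sets 7bit or 8bit header
    return msg
```

With this shape the passed flag disappears, since the only thing the
try/except guards is the decode call itself.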

-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.