[PATCH] patchbomb: fix address parsing, allow multiple addrs in --to/cc/bcc

Marti Raudsepp marti at juffo.org
Tue Nov 3 18:49:17 CST 2009


Here is version 2 of my patch to allow multiple addresses in --to/cc/bcc
arguments.

On Mon, 2009-11-02 at 09:45 +0200, timeless wrote: 
> >  addrs = ','.join(opts.get(opt)).split(',')
> 
> from a purity perspective, this is wrong :)
> 
> at work, an email address might be of the form:
> "Smith, David" <david.smith at example.com>
> 
> proper parsing of email addresses is non trivial

You are right. The current code already breaks your example because it
splits configuration variables/interactive input by ','. In fact, MIME
doesn't even require a comma between multiple addresses. So instead of
trying to extend this broken behavior, I will fix it.

I change mecurial/mail.py to add a new function addrlistencode(). This
gets rid of the ugly list comprehension etc, because Python's
email.utils.getaddresses() already does stripping, splitting and
everything else for us and it knows more about proper MIME syntax than
we do.

For reference, here are the outputs of the new testcase: 
hg email --date '1980-1-1 0:1' -m tmp.mbox -f quux -t 'spam<spam><eggs>' \
 -t toast -c 'foo,bar at example.com' -c '"A, B <>" <a at example.com>' -s test -r 0 \
 --config email.bcc='"Quux, A." <quux>'

with patch:
  To: spam <spam>, eggs, toast
  Cc: foo, bar at example.com, "A, B <>" <a at example.com>
  Bcc: "Quux, A." <quux>

without patch:
  To: spam <spam>, toast
  Cc: foo, "A, B <>" <a at example.com>
  Bcc: Quux, A.

Should I split this patch in two? One to change mail.py, the other to
implement this in patchbomb.py?

Marti 

# HG changeset patch
# User Marti Raudsepp <marti at juffo.org>
# Date 1257295431 -7200
# Node ID b1521074fae86dcfed479e2e3885c1fce5fda347
# Parent  dd5a16ad420e6d5845d45dc7bde717a90900ccc2
patchbomb: fix address parsing, allow multiple addrs in --to/cc/bcc

Instead of using custom code to split apart addresses, we now use Python's
email.Utils.getaddresses() which always does the Right Thing.

Adds a new function addrlistencode to mercurial.mail, like addressencode, which
accepts a list of addresses as input, and returns a list of formatted
addresses. Each element in the input list can contain multiple addresses.

Previously, 'hg email --to=foo,bar' only respected foo and discarded bar. Also,
when something like "Lastname, Firstname <foo>" was specified interactively or
in hgrc, patchbomb would get confused.

The testcase uses '-m tmp.mbox' because -n disables address mangling.

diff --git a/hgext/patchbomb.py b/hgext/patchbomb.py
--- a/hgext/patchbomb.py
+++ b/hgext/patchbomb.py
@@ -379,20 +379,21 @@
     else:
         msgs = getpatchmsgs(list(getpatches(revs)))
 
-    def getaddrs(opt, prpt, default = None):
-        addrs = opts.get(opt) or (ui.config('email', opt) or
-                                  ui.config('patchbomb', opt) or
-                                  prompt(ui, prpt, default)).split(',')
-        return [mail.addressencode(ui, a.strip(), _charsets, opts.get('test'))
-                for a in addrs if a.strip()]
+    def getaddrs(opt, prpt = None, default = None):
+        if opts.get(opt):
+            return mail.addrlistencode(ui, opts.get(opt), _charsets,
+                                       opts.get('test'))
+
+        addrs = (ui.config('email', opt) or
+                 ui.config('patchbomb', opt) or '')
+        if not addrs and prpt:
+            addrs = prompt(ui, prpt, default)
+
+        return mail.addrlistencode(ui, [addrs], _charsets, opts.get('test'))
 
     to = getaddrs('to', 'To')
     cc = getaddrs('cc', 'Cc', '')
-
-    bcc = opts.get('bcc') or (ui.config('email', 'bcc') or
-                          ui.config('patchbomb', 'bcc') or '').split(',')
-    bcc = [mail.addressencode(ui, a.strip(), _charsets, opts.get('test'))
-           for a in bcc if a.strip()]
+    bcc = getaddrs('bcc')
 
     ui.write('\n')
 
diff --git a/mercurial/mail.py b/mercurial/mail.py
--- a/mercurial/mail.py
+++ b/mercurial/mail.py
@@ -160,11 +160,7 @@
         return str(email.Header.Header(s, cs))
     return s
 
-def addressencode(ui, address, charsets=None, display=False):
-    '''Turns address into RFC-2047 compliant header.'''
-    if display or not address:
-        return address or ''
-    name, addr = email.Utils.parseaddr(address)
+def _addressencode(ui, name, addr, charsets=None):
     name = headencode(ui, name, charsets)
     try:
         acc, dom = addr.split('@')
@@ -181,6 +177,26 @@
             raise util.Abort(_('invalid local address: %s') % addr)
     return email.Utils.formataddr((name, addr))
 
+def addressencode(ui, address, charsets=None, display=False):
+    '''Turns address into RFC-2047 compliant header.'''
+    if display or not address:
+        return address or ''
+    name, addr = email.Utils.parseaddr(address)
+    return _addressencode(ui, name, addr, charsets)
+
+def addrlistencode(ui, addrs, charsets=None, display=False):
+    '''Turns a list of addresses into a list of RFC-2047 compliant headers.
+    A single element of input list may contain multiple addresses, but output
+    always has one address per item'''
+    if display:
+        return [a.strip() for a in addrs if a.strip()]
+
+    result = []
+    for name, addr in email.Utils.getaddresses(addrs):
+        if name or addr:
+            result.append(_addressencode(ui, name, addr, charsets))
+    return result
+
 def mimeencode(ui, s, charsets=None, display=False):
     '''creates mime text object, encodes it if needed, and sets
     charset and transfer-encoding accordingly.'''
diff --git a/tests/test-patchbomb b/tests/test-patchbomb
--- a/tests/test-patchbomb
+++ b/tests/test-patchbomb
@@ -170,3 +170,10 @@
 echo "% test multiple flags for multiple patches"
 hg email --date '1970-1-1 0:1' -n --flag fooFlag --flag barFlag -f quux -t foo \
  -c bar -s test -r 0:1 | fixheaders
+
+echo "% test address parsing"
+hg email --date '1980-1-1 0:1' -m tmp.mbox -f quux -t 'spam<spam><eggs>' \
+ -t toast -c 'foo,bar at example.com' -c '"A, B <>" <a at example.com>' -s test -r 0 \
+ --config email.bcc='"Quux, A." <quux>'
+cat tmp.mbox | fixheaders
+
diff --git a/tests/test-patchbomb.out b/tests/test-patchbomb.out
--- a/tests/test-patchbomb.out
+++ b/tests/test-patchbomb.out
@@ -1469,3 +1469,36 @@
 @@ -0,0 +1,1 @@
 +b
 
+% test address parsing
+This patch series consists of 1 patches.
+
+
+Writing [PATCH] test ...
+From quux Tue Jan  1 00:01:01 1980
+Content-Type: text/plain; charset="us-ascii"
+MIME-Version: 1.0
+Content-Transfer-Encoding: 7bit
+Subject: [PATCH] test
+X-Mercurial-Node: 8580ff50825a50c8f716709acdf8de0deddcd6ab
+Message-Id: <8580ff50825a50c8f716.315532860@
+User-Agent: Mercurial-patchbomb
+Date: Tue, 01 Jan 1980 00:01:00 +0000
+From: quux
+To: spam <spam>, eggs, toast
+Cc: foo, bar at example.com, "A, B <>" <a at example.com>
+Bcc: "Quux, A." <quux>
+
+# HG changeset patch
+# User test
+# Date 1 0
+# Node ID 8580ff50825a50c8f716709acdf8de0deddcd6ab
+# Parent  0000000000000000000000000000000000000000
+a
+
+diff -r 000000000000 -r 8580ff50825a a
+--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
++++ b/a	Thu Jan 01 00:00:01 1970 +0000
+@@ -0,0 +1,1 @@
++a
+
+




More information about the Mercurial-devel mailing list