[PATCH 3 of 4 py3 V2] pycompat: custom implementation of urllib.parse.quote()

Gregory Szorc gregory.szorc at gmail.com
Wed Mar 22 02:00:01 EDT 2017


On Tue, Mar 14, 2017 at 1:43 AM, Yuya Nishihara <yuya at tcha.org> wrote:

> On Mon, 13 Mar 2017 13:08:51 -0700, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc at gmail.com>
> > # Date 1489432607 25200
> > #      Mon Mar 13 12:16:47 2017 -0700
> > # Node ID ae4c0d8f0ca7218c8b1d259b72660c99b4c085aa
> > # Parent  4dca07be3c215ee7f47ccd7473d78514968f1bb4
> > pycompat: custom implementation of urllib.parse.quote()
> >
> > urllib.parse.quote() accepts either str or bytes and returns str.
> >
> > There exists a urllib.parse.quote_from_bytes() which only accepts
> > bytes. We should probably use that to retain strong typing and
> > avoid surprises.
> >
> > In addition, since nearly all strings in Mercurial are bytes, we
> > probably don't want quote() returning unicode.
> >
> > So, this patch implements a custom quote() that only accepts bytes
> > and returns bytes. The quoted URL should only contain URL safe
> > characters which is a strict subset of ASCII. So
> > `.encode('ascii', 'strict')` should be safe.
> >
> > diff --git a/mercurial/pycompat.py b/mercurial/pycompat.py
> > --- a/mercurial/pycompat.py
> > +++ b/mercurial/pycompat.py
> > @@ -269,7 +269,6 @@ if not ispy3:
> >  else:
> >      import urllib.parse
> >      urlreq._registeraliases(urllib.parse, (
> > -        "quote",
> >          "splitattr",
> >          "splitpasswd",
> >          "splitport",
> > @@ -313,3 +312,12 @@ else:
> >          "SimpleHTTPRequestHandler",
> >          "CGIHTTPRequestHandler",
> >      ))
> > +
> > +    # urllib.parse.quote() accepts both str and bytes, decodes bytes
> > +    # (if necessary), and returns str. This is wonky. We provide a
> custom
> > +    # implementation that only accepts bytes and emits bytes.
> > +    def quote(s, safe=r'/'):
> > +        s = urllib.parse.quote_from_bytes(s, safe=safe)
> > +        return s.encode('ascii', 'strict')
> > +
> > +    urlreq.quote = quote
>
> unquote() is exported as pycompat.urlunquote(). Maybe we should choose one
> way.
>

I thought this change was trivial. But it turned into an 8 part series
which I just patchpombed. Thanks for spotting the redundancy.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20170321/33db35c6/attachment.html>


More information about the Mercurial-devel mailing list