[PATCH py3] ui: construct _keepalnum list in a python3-friendly way

Yuya Nishihara yuya at tcha.org
Mon Feb 20 08:35:13 EST 2017


On Sun, 19 Feb 2017 20:05:50 -0500, Augie Fackler wrote:
> 
> > On Feb 19, 2017, at 9:29 AM, Yuya Nishihara <yuya at tcha.org> wrote:
> > 
> > On Sat, 18 Feb 2017 22:58:10 +0000, Martijn Pieters wrote:
> >> On 16 Feb 2017, at 16:35, Augie Fackler <raf at durin42.com> wrote:
> >>> +if pycompat.ispy3:
> >>> +    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
> >>> +    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
> >> 
> >> ...
> >>> +_keepalnum = ''.join(_notalnum)
> >> 
> >> This could be more cheaply calculated as
> >> 
> >>    _keepalnum = bytes(c for c in range(256) if not chr(c).isalnum())
> >> 
> >> This takes a third of the time.
> > 
> > Good catch, but I found that both of them are incorrect, since
> > str.isalnum() is Unicode-aware on Python 3. We'll need to use
> > bytes.isalnum() or the string.* constants.
> 
> Oh, gross. I missed that. I think this patch fixes it, though not with the perf wins Martijn suggested:
> 
> diff --git a/mercurial/ui.py b/mercurial/ui.py
> --- a/mercurial/ui.py
> +++ b/mercurial/ui.py
> @@ -40,8 +40,8 @@ urlreq = util.urlreq
>  
>  # for use with str.translate(None, _keepalnum), to keep just alphanumerics
>  if pycompat.ispy3:
> -    _unicodes = [bytes([c]).decode('latin1') for c in range(256)]
> -    _notalnum = [s.encode('latin1') for s in _unicodes if not s.isalnum()]
> +    _bytes = [bytes([c]) for c in range(256)]
> +    _notalnum = [s for s in _bytes if not s.isalnum()]
>  else:
>      _notalnum = [c for c in map(chr, range(256)) if not c.isalnum()]
>  _keepalnum = ''.join(_notalnum)
> 
> Feel free to amend that into what’s already queued, or I can do a followup or resend as feels appropriate.

Applied this and rebased the other patches, thanks.
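
For the record, a minimal Python 3 sketch of why the str-based check was
wrong. The names below (str_alnum, bytes_alnum, extra, keepalnum,
alnum_only) are made up for illustration and are not part of ui.py; the
delete table is only assumed to be equivalent to what the amended patch
builds.

```python
# Python 3 only. Compare the str-based and bytes-based checks for
# every byte value 0..255.
str_alnum = {c for c in range(256) if chr(c).isalnum()}
bytes_alnum = {c for c in range(256) if bytes([c]).isalnum()}

# Byte values that str.isalnum() accepts but bytes.isalnum() rejects.
# These are all non-ASCII Latin-1 code points such as 0xe9.
extra = sorted(str_alnum - bytes_alnum)

# Hypothetical delete table: every byte that is not ASCII alphanumeric.
keepalnum = b''.join(
    bytes([c]) for c in range(256) if not bytes([c]).isalnum())


def alnum_only(data):
    """Return data with every non-alphanumeric byte removed."""
    return data.translate(None, keepalnum)
```

With the str-based list, bytes such as 0xe9 would have been treated as
alphanumeric and kept by translate(); with bytes.isalnum() they are
deleted like any other non-ASCII byte.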

