[PATCH 2 of 2] py3: handle multiple arguments in .encode() and .decode()
Yuya Nishihara
yuya at tcha.org
Fri Oct 7 00:58:47 EDT 2016
On Wed, 05 Oct 2016 20:05:18 +0530, Pulkit Goyal wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pulkit at gmail.com>
> # Date 1475596407 -19800
> # Tue Oct 04 21:23:27 2016 +0530
> # Node ID 535c77a356a09c0319c9a794bdbec18e9ebb57b2
> # Parent 51e49c041614b463953b3973d5b58d8bbdcbbab3
> py3: handle multiple arguments in .encode() and .decode()
>
> There is at least one case, and there may be more, where these functions
> take multiple arguments. Our transformer used to handle only the first
> argument, so add a loop to handle any further arguments.
>
> diff -r 51e49c041614 -r 535c77a356a0 mercurial/__init__.py
> --- a/mercurial/__init__.py Tue Oct 04 20:56:03 2016 +0530
> +++ b/mercurial/__init__.py Tue Oct 04 21:23:27 2016 +0530
> @@ -278,24 +278,30 @@
> # .encode() and .decode() on str/bytes/unicode don't accept
> # byte strings on Python 3. Rewrite the token to include the
> # unicode literal prefix so the string transformer above doesn't
> - # add the byte prefix.
> + # add the byte prefix. The loop helps in handling multiple
> + # arguments to them.
> if (fn in ('encode', 'decode') and
> prevtoken.type == token.OP and prevtoken.string == '.'):
> # (OP, '.')
> # (NAME, 'encode')
> # (OP, '(')
> # (STRING, 'utf-8')
> + # [(OP, ',')]
> + # [(STRING, 'ascii')]
> # (OP, ')')
> - try:
> - st = tokens[i + 2]
> - if (st.type == token.STRING and
> - st.string[0] in ("'", '"')):
> - rt = tokenize.TokenInfo(st.type, 'u%s' % st.string,
> - st.start, st.end, st.line)
> - tokens[i + 2] = rt
> - except IndexError:
> - pass
> -
> + j = i
> + while (tokens[j + 1].string != ')'):
> + try:
> + st = tokens[j + 2]
> + if (st.type == token.STRING and
> + st.string[0] in ("'", '"')):
> + rt = tokenize.TokenInfo(st.type,
> + 'u%s' % st.string,
> + st.start, st.end, st.line)
> + tokens[j + 2] = rt
> + except IndexError:
> + pass
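The first tokens[j + 1], in the "while" condition, is outside the try block,
so IndexError could be raised there. Since we have "while", the bound could
be written into the condition as j + 2 < len(tokens) instead of catching
IndexError.
Also, we'll need to check that a ',' token actually separates the arguments.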
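To make the two points above concrete (bound the loop by the token count, and only continue past an argument when a ',' follows it), here is a standalone sketch. It is not the patch itself: prefix_codec_args and tokenize_source are hypothetical names, and the function assumes that tokens[i] is the NAME token 'encode' or 'decode'. Keyword arguments and nested calls are deliberately left untouched.

```python
import io
import token
import tokenize


def tokenize_source(source):
    """Tokenize a source string into a list of TokenInfo tuples."""
    return list(tokenize.generate_tokens(io.StringIO(source).readline))


def prefix_codec_args(tokens, i):
    """Add a ``u`` prefix to plain string-literal positional arguments.

    Assumes tokens[i] is the NAME token ``encode`` or ``decode``
    (hypothetical helper, for illustration only).
    """
    # Require '(' right after the method name; bail out if it is missing.
    if i + 1 >= len(tokens) or tokens[i + 1].string != '(':
        return tokens
    j = i + 2  # index of the first candidate argument
    while j < len(tokens):  # bounds check replaces the try/except
        st = tokens[j]
        if st.type == token.STRING and st.string[0] in ("'", '"'):
            tokens[j] = tokenize.TokenInfo(st.type, 'u%s' % st.string,
                                           st.start, st.end, st.line)
        # Move on only if a ',' token follows this argument.
        nxt = j + 1
        if (nxt >= len(tokens) or tokens[nxt].type != token.OP
                or tokens[nxt].string != ','):
            break
        j += 2
    return tokens
```

For example, after tokenize_source("x.encode('utf-8', 'ignore')\n") the 'encode' token sits at index 2, and prefix_codec_args(tokens, 2) rewrites both string arguments, while a truncated token list is returned unchanged rather than raising IndexError.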
> # ``replacetoken`` or any mechanism that changes semantics of module
> # loading is changed. Otherwise cached bytecode may get loaded without
> # the new transformation mechanisms applied.
> - BYTECODEHEADER = b'HG\x00\x02'
> + BYTECODEHEADER = b'HG\x00\x04'
Just curious, why not '\x03'?
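For context, the quoted comment describes the header as a version stamp on cached bytecode. A minimal sketch of that idea, assuming the loader simply compares the leading bytes of the cached data with the current header (strip_header is a hypothetical helper, not the loader code in mercurial/__init__.py):

```python
OLD_HEADER = b'HG\x00\x02'  # header before the patch
NEW_HEADER = b'HG\x00\x04'  # header after the patch


def strip_header(cached, current_header):
    """Return the payload if the cache carries the current header.

    Returns None when the header is missing or stale, signalling that
    the source must be transformed and compiled again (assumed
    behaviour, for illustration only).
    """
    if cached[:len(current_header)] != current_header:
        return None
    return cached[len(current_header):]


stale = OLD_HEADER + b'payload'
fresh = NEW_HEADER + b'payload'
print(strip_header(stale, NEW_HEADER))
print(strip_header(fresh, NEW_HEADER))
```

Under that assumption, any header value that differs from the previous one is enough to invalidate stale caches.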