[PATCH 8 of 8] py3: have bytes version of sys.argv

Yuya Nishihara yuya at tcha.org
Mon Nov 7 07:56:50 EST 2016


On Sun, 6 Nov 2016 23:55:50 +0530, Pulkit Goyal wrote:
> On Sun, Nov 6, 2016 at 9:04 AM, Yuya Nishihara <yuya at tcha.org> wrote:
> > On Sun, 06 Nov 2016 04:46:25 +0530, Pulkit Goyal wrote:
> >> # HG changeset patch
> >> # User Pulkit Goyal <7895pulkit at gmail.com>
> >> # Date 1478387186 -19800
> >> #      Sun Nov 06 04:36:26 2016 +0530
> >> # Node ID b5fc4e71286dd4f33336e4f38e0b9fb17f51f1e3
> >> # Parent  6eed3ee0df425da61d03bfe024dd082f3176ce5d
> >> py3: have bytes version of sys.argv
> >>
> >> sys.argv returns unicodes on Python 3. We need a bytes version for us.
> >> There was also a python bug/feature request which wanted then to implement
> >> one. They rejected and it is quoted in one of the comments that we can use
> >> fsencode() to get a bytes version of sys.argv. Though not sure about its
> >> correctness.
> >>
> >> Link to the comment: http://bugs.python.org/issue8776#msg217416
> >>
> >> After this patch we will have pycompat.sysargv which will return us bytes
> >> version of sys.argv. If this patch goes in, i will like to make transformer
> >> rewrite sys.argv with pycompat.argv because there are lot of occurences.
> >>
> >> diff -r 6eed3ee0df42 -r b5fc4e71286d mercurial/pycompat.py
> >> --- a/mercurial/pycompat.py   Sun Nov 06 04:17:19 2016 +0530
> >> +++ b/mercurial/pycompat.py   Sun Nov 06 04:36:26 2016 +0530
> >> @@ -41,6 +41,7 @@
> >>      osname = os.name.encode('ascii')
> >>      ospathsep = os.pathsep.encode('ascii')
> >>      ossep = os.sep.encode('ascii')
> >> +    sysargv = list(map(os.fsencode, sys.argv))
> >
> > Looks good to me. Can you add a comment why we can use os.fsencode() here
> > (and the weirdness of Python 3 on Unix.) We might need a Windows workaround
> > because the situation is slightly different, but we wouldn't want to care
> > for now.
> 
> Well I will resend this patch because I am not sure about its
> correctness still. I followed that issue where Victor Stinner, one who
> wrote os.environb commented this. There are few doubts/confusions or
> maybe I want to just confirm it with MJ once.

It's generally wrong to assume argv is in filesystem encoding, but I think
that's okay for Python 3 on Unix. They builds "wchar_t argv" from "char argv"
by Py_DecodeLocale(), which would be identical to fsdecode() on Unix.

https://hg.python.org/cpython/file/v3.5.1/Programs/python.c#l55

On Windows, the native argv appears to be wchar_t, so we'll need a different
hack to simulate the Python 2 (i.e. ANSI Win32 API) behavior.


More information about the Mercurial-devel mailing list