[PATCH 3 of 3] py3: make a bytes version of getopt.getopt()

Pulkit Goyal 7895pulkit at gmail.com
Wed Dec 7 09:35:22 EST 2016


On Wed, Dec 7, 2016 at 7:37 PM, Yuya Nishihara <yuya at tcha.org> wrote:
> On Tue, 06 Dec 2016 07:26:41 +0530, Pulkit Goyal wrote:
>> # HG changeset patch
>> # User Pulkit Goyal <7895pulkit at gmail.com>
>> # Date 1480986396 -19800
>> #      Tue Dec 06 06:36:36 2016 +0530
>> # Node ID ef3b7f10f4ade315a42f5f9383a8ad1794bb1f01
>> # Parent  559b73b5d7b919da68bf2ce5b05dd9677ddc1c2d
>> py3: make a bytes version of getopt.getopt()
>
>> +    # getopt.getopt() on Python 3 deals with unicodes internally so we cannot
>> +    # pass bytes there. Passing unicodes will result in unicodes as return
>> +    # values which we need to convert again to bytes. This does all these
>> +    # decoding and encoding using fsdecode() and fsencode().
>> +    def getoptb(args, shortlist, namelist):
>> +        args = [fsdecode(a) for a in args]
>> +        shortlist = fsdecode(shortlist)
>> +        namelist = fsdecode(shortlist)
>> +        opts, args = getopt.getopt(args, shortlist, namelist)
>> +        opts = [fsencode(a) for a in opts]
>> +        args = [fsencode(a) for a in args]
>
> "opts" is a list of (option, value) pairs.

Yeah, drop this patch; I will send a new version of it.
>
> I don't think using fsdecode/fsencode here is appropriate. Encoding conversion
> is lossy in general even if no error occurred. There's n:m mapping between
> some crazy encodings (read: Shift_JIS variants) and unicode, for example.
>
> Instead, maybe we can use 'latin1' to convince Python3 by abusing unicode as
> a fat bytes?

In that case, pycompat.sysstr() should be fine for decoding, with
.encode('latin-1') for encoding the results back to bytes.
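
For illustration, a rough sketch of what the latin-1 approach could look like.
This is not the follow-up patch, only an assumption about its shape: it calls
.decode('latin-1') directly instead of pycompat.sysstr() so that it runs on its
own, and it treats namelist as a list of bytes, which the original patch did
not spell out.

```python
import getopt


def getoptb(args, shortlist, namelist):
    """Bytes-in, bytes-out wrapper around getopt.getopt() (sketch only)."""
    # latin-1 maps every byte 0x00-0xff to the code point with the same
    # value, so decoding and re-encoding round-trips arbitrary bytes.
    uargs = [a.decode('latin-1') for a in args]
    ushort = shortlist.decode('latin-1')
    # Assumption: namelist is a list of bytes long-option names.
    ulong = [n.decode('latin-1') for n in namelist]
    opts, rest = getopt.getopt(uargs, ushort, ulong)
    # opts is a list of (option, value) pairs, so encode both members.
    opts = [(o.encode('latin-1'), v.encode('latin-1')) for o, v in opts]
    rest = [a.encode('latin-1') for a in rest]
    return opts, rest
```

With this, a value that is not valid in any particular encoding, such as
b'\xff\x80', should come back byte-for-byte, since nothing is ever interpreted
as text in the filesystem encoding.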

