[PATCH 3 of 3] py3: make a bytes version of getopt.getopt()

Pulkit Goyal 7895pulkit at gmail.com
Wed Dec 7 10:47:28 EST 2016


On Wed, Dec 7, 2016 at 8:05 PM, Pulkit Goyal <7895pulkit at gmail.com> wrote:
> On Wed, Dec 7, 2016 at 7:37 PM, Yuya Nishihara <yuya at tcha.org> wrote:
>> On Tue, 06 Dec 2016 07:26:41 +0530, Pulkit Goyal wrote:
>>> # HG changeset patch
>>> # User Pulkit Goyal <7895pulkit at gmail.com>
>>> # Date 1480986396 -19800
>>> #      Tue Dec 06 06:36:36 2016 +0530
>>> # Node ID ef3b7f10f4ade315a42f5f9383a8ad1794bb1f01
>>> # Parent  559b73b5d7b919da68bf2ce5b05dd9677ddc1c2d
>>> py3: make a bytes version of getopt.getopt()
>>
>>> +    # getopt.getopt() on Python 3 deals with unicodes internally so we cannot
>>> +    # pass bytes there. Passing unicodes will result in unicodes as return
>>> +    # values which we need to convert again to bytes. This does all these
>>> +    # decoding and encoding using fsdecode() and fsencode().
>>> +    def getoptb(args, shortlist, namelist):
>>> +        args = [fsdecode(a) for a in args]
>>> +        shortlist = fsdecode(shortlist)
>>> +        namelist = fsdecode(shortlist)
Ah, this line is wrong. Sorry, I just noticed it; please drop this version.
>>> +        opts, args = getopt.getopt(args, shortlist, namelist)
>>> +        opts = [fsencode(a) for a in opts]
>>> +        args = [fsencode(a) for a in args]
>>
>> "opts" is a list of (option, value) pairs.
>
> Yeah, drop this patch, I will send a new version of this one.
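
To illustrate the point about "opts", here is a small standalone check
(plain Python 3, not Mercurial code; the option names are made up for the
example). It shows that getopt.getopt() returns the options as tuples, so
calling fsencode() on each element of "opts" cannot work:

```python
import getopt
import os

# Hypothetical command line: -v is a flag, -o takes a value.
opts, args = getopt.getopt(['-v', '-o', 'out.txt', 'input'], 'vo:', [])

# opts holds (option, value) tuples; args holds the leftover strings.
print(opts)
print(args)

# os.fsencode() only accepts str, bytes or path-like objects, so a tuple
# is rejected with TypeError.
try:
    os.fsencode(opts[0])
    tuple_accepted = True
except TypeError:
    tuple_accepted = False
print(tuple_accepted)
```

So each pair has to be unpacked and both members converted separately.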
>>
>> I don't think using fsdecode/fsencode here is appropriate. Encoding conversion
>> is lossy in general even if no error occurred. There's n:m mapping between
>> some crazy encodings (read: Shift_JIS variants) and unicode, for example.
>>
>> Instead, maybe we can use 'latin1' to convince Python3 by abusing unicode as
>> a fat bytes?
>
> In that case pycompat.sysstr() is fine for decoding, and
> .encode('latin-1') can be used for encoding back.
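
For reference, a rough sketch of what the latin-1 variant could look like.
This is only an illustration under my own assumptions, not the follow-up
patch: it calls .decode('latin-1') directly instead of going through
pycompat, and it assumes namelist is a list of bytes. latin-1 maps each
byte 0..255 to the code point with the same value, so the round trip is
lossless for arbitrary bytes:

```python
import getopt


def getoptb(args, shortlist, namelist):
    """Bytes-in, bytes-out wrapper around getopt.getopt() (sketch only)."""
    # Decode everything as latin-1 so that Python 3's getopt sees str.
    uargs = [a.decode('latin-1') for a in args]
    ushort = shortlist.decode('latin-1')
    ulong = [n.decode('latin-1') for n in namelist]

    opts, rest = getopt.getopt(uargs, ushort, ulong)

    # opts is a list of (option, value) pairs, so encode both members.
    opts = [(o.encode('latin-1'), v.encode('latin-1')) for o, v in opts]
    rest = [a.encode('latin-1') for a in rest]
    return opts, rest


# Non-UTF-8 bytes survive the round trip unchanged.
result = getoptb([b'-r', b'\xff\xfe', b'--verbose', b'file'],
                 b'r:', [b'verbose'])
print(result)
```

The str values that getopt sees here are not meaningful text for non-ASCII
input; they are only a carrier for the original bytes.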

