[PATCH] command options: handle unicode defaults gracefully

Christophe de Vienne christophe at cdevienne.info
Thu Aug 31 04:31:48 EDT 2017



Le 30/08/2017 à 20:32, Augie Fackler a écrit :
> On Wed, Aug 30, 2017 at 12:11:18AM +0200, Christophe de Vienne wrote:
>> # HG changeset patch
>> # User Christophe de Vienne <christophe at cdevienne.info>
>> # Date 1504023891 -7200
>> #      Tue Aug 29 18:24:51 2017 +0200
>> # Node ID afba1cb303db6797e565907c277fd4150e76c141
>> # Parent  b1f75d8e887a4c06e6b120807f3defc5c7b78d33
>> command options: handle unicode defaults gracefully
> 
> I'm conflicted here. I feel like we should just abort entirely rather
> than even trying to do something smart.

It makes sens indeed, and still fix the issue.


> What inspired working on this, out of curiousity? Trying to do
> something on Python 3?

Nope, I was just working on a small extension which had a 'from
__future__ import unicode_literals', and when I added a string option to
its command, I could never get the value passed on the cli.

It took me a while to find out what was wrong, and I want to prevent
other people from falling into the trap.

> 
>>
>> If the default value of an option is a unicode string (something
>> than happen easily when using a 'from __future__ import unicode_literals'),
>> any value passed on the command line will be ignored because the fancyopts
>> module only checks for byte strings and not unicode strings.
>>
>> Changing fancyopts behavior is easy but would make assumptions on how
>> the python3 port should be done, which is outside the scope of this patch.
> 
> In what ways might we change fancyopts to try and do something better?
> I'm curious.
> 

My initial patch was:

diff -r b1f75d8e887a mercurial/fancyopts.py
--- a/mercurial/fancyopts.py Tue Aug 22 16:59:02 2017 -0400
+++ b/mercurial/fancyopts.py Thu Aug 31 10:24:07 2017 +0200
@@ -148,7 +148,7 @@
             except ValueError:
                 raise error.Abort(_('invalid value %r for option %s, '
                                    'expected int') % (val, opt))
-        elif t is type(''):
+        elif t is type('') or t is type(u''):
             state[name] = val
         elif t is type([]):
             state[name].append(val)


But talking with Pulkit I realized it may mess with the python 3 port,
for which the choice seems to use bytes everywhere. In that context,
allowing unicode string can seem nice, but could encourage developers to
write non-future proof code.

>>
>> The chosen approach is to encode the default value to ascii when parsing
>> the command line, and issue a devel warning for the sake of the extension
>> developers.
>>
>> If encoding to ascii fails, the command is aborted.
> 
> Typically our development warnings have a time horizon: does it seem
> reasonable to just hard abort starting after the 4.4 release?

Yes is does. The warning does not seem necessary when I think about it.

If I remove the warning, is there a test file more suitable that
test-devel-warnings.t? Or should I create a new one?

--
Christophe


More information about the Mercurial-devel mailing list