The 'is' operator and check-code
kbullock+mercurial at ringworld.org
Tue Nov 16 13:59:20 CST 2010
> I'm throwing this out in case any Python experts have opinions.
> The 'is' operator is used in a bunch of places as a synonym for ==. This
> happens to work with short strings and small integers (0 to 256) in
> CPython, but that's by no means guaranteed by the language, so there are
> places where we're just getting lucky. So I propose we disallow using it
> for anything that's not one of the True/False/None singletons. Here's my
> regex hack:
> diff -r 340c46028ebd contrib/check-code.py
> --- a/contrib/check-code.py Tue Nov 16 12:55:07 2010 -0600
> +++ b/contrib/check-code.py Tue Nov 16 13:35:04 2010 -0600
> @@ -130,6 +130,7 @@
> (r'[\x80-\xff]', "non-ASCII character literal"),
> (r'("\')\.format\(', "str.format() not available in Python 2.4"),
> (r'^\s*with\s+', "with not available in Python 2.4"),
> + (r' is\s+(not\s+)?[^TFNn]', "object comparison with non-singleton"),
> "any/all/format not available in Python 2.4"),
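One way to see what the proposed pattern does and does not catch is to run it over a few lines by hand. This is only a rough sketch: the regex is copied from the patch above, but the sample lines and the flagged() helper are made up for illustration, and it assumes Python 3 rather than the 2.x used in this thread.

```python
import re

# Pattern copied verbatim from the proposed check-code.py rule.
IS_NON_SINGLETON = re.compile(r' is\s+(not\s+)?[^TFNn]')

# Hypothetical sample lines, invented for this illustration.
SAMPLES = [
    "if x is None:",
    "if x is not None:",
    "if x is True:",
    "if name is 'default':",
    "if count is not 0:",
    "if a is node:",
]


def flagged(line):
    """Return True if the proposed rule would complain about this line."""
    return IS_NON_SINGLETON.search(line) is not None


for line in SAMPLES:
    print("%-26s %s" % (line, "FLAGGED" if flagged(line) else "ok"))
```

Note that the character class only looks at the first character of the right-hand operand, so a comparison against any name starting with T, F, N or n (such as the "node" line) passes unflagged.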
I'm no Python expert, but the first thing I wondered about was the performance difference:
❧ time python -c $'for i in xrange(10000000):\n if "a" is "a": pass'
❧ time python -c $'for i in xrange(10000000):\n if "a" == "a": pass'
(Python 2.6.1, OS X 10.6.5)
Not much difference in the simple string case. Now, what if the two operators give different results (a == b holds, but a is b does not)?
>>> 'ab' is ''.join(['a','b'])
>>> 'ab' is 'a' + 'b'
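The same comparison can be written as a small script. This is a sketch assuming Python 3; the compare() helper is hypothetical and just reports both results side by side. Only the equality result is guaranteed by the language; the identity result depends on how the implementation happens to allocate and intern strings.

```python
def compare(x, y):
    """Return (equal by value, same object) for x and y."""
    return (x == y, x is y)


# Built at run time, so typically a different object from the literal
# (an implementation detail, not a language guarantee).
a = "".join(["a", "b"])
b = "ab"

equal, identical = compare(a, b)
print("a == b:", equal)
print("a is b:", identical)
```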
❧ time python -c $'a="".join(["a","b"])\nb="ab"\nfor i in xrange(10000000):\n if a is b: pass'
❧ time python -c $'a="".join(["a","b"])\nb="ab"\nfor i in xrange(10000000):\n if a == b: pass'
Still not much difference.
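For anyone who wants to repeat the measurement without the interpreter startup included in the shell timings, a rough equivalent using the standard timeit module might look like this. It is a sketch assuming Python 3; the bench() helper and the iteration count are arbitrary choices, not anything taken from the runs above.

```python
import timeit

# Same setup as the shell one-liners: one string built at run time,
# one literal with the same value.
SETUP = 'a = "".join(["a", "b"]); b = "ab"'


def bench(expr, number=1000000):
    """Return total seconds taken to evaluate expr `number` times."""
    return timeit.timeit(expr, setup=SETUP, number=number)


if __name__ == "__main__":
    for expr in ("a is b", "a == b"):
        print("%-8s %.3fs" % (expr, bench(expr)))
```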
pacem in terris / mir / shanti / salaam / heiwa
Kevin R. Bullock