[PATCH] cleanup: use x in (a, b) instead of x == a or x == b
Matt Mackall
mpm at selenic.com
Thu Sep 23 11:41:22 CDT 2010
On Thu, 2010-09-23 at 13:09 +0200, Martin Geisler wrote:
> Masklinn <masklinn at masklinn.net> writes:
>
> > On 2010-09-23, at 13:24 , Martin Geisler wrote:
> >> Brodie Rao <brodie at bitheap.org> writes:
> >>> # HG changeset patch
> >>> # User Brodie Rao <brodie at bitheap.org>
> >>> # Date 1285218151 18000
> >>> # Node ID a9641ee7e37cf6da972ef07e9c8992316587c51a
> >>> # Parent f1e8d6f6e682a74a135ba6509b2de26cc8172046
> >>> cleanup: use x in (a, b) instead of x == a or x == b
> >>
> >> I believe I tested this at some point using timeit and found that the
> >> longer form is faster. Did you test is too?
> >
> > Unless the potentially containing sequence is cached, it probably
> > won't be faster. But in cases where performance on the order of a
> > fraction of a nanosecond per call[0] is not a very important matter, I
> > think the readability and DRY of the `in` form is preferable to the
> > double-eq. YMMV
>
> Yeah, I should not have focussed so much on the raw performance of this.
> I was just being silly, I'll push it in a second.
I should have chimed in last night.
First, I think this is basically negligible as a visual style
improvement.
Second, I think Masklinn's performance analysis is wrong because it was
run against literals. I've timed this in the last couple months, I think
in response to a patch from Nic. Remember, Python has basically no
optimization in its compiler and we can't really expect it to do well
with a change like:
- if p == a or p == b: # did we find a or b as a parent?
+ if p in (a, b): # did we find a or b as a parent?
(hey look, a completely pointless comment!)
The first the compiler translates into something approximately
equivalent to the C code (ok, really, it's interpreted bytecode and
completely different, but this is illustrative of what goes on in a
high-level language):
if (compareobjects(p, a) == 0 || compareobjects(p, b) == 0) {
while the second turns into something like:
t = buildtuple(2, a, b); # varargs func, trip to memory allocator
if (objectcontains(t, p)) { # double-indirect method lookup
...
deleteobject(t); # another trip to memory allocator
...
int tuplecontains(object *t, object *p)
{
int i;
for (i = 0; i < tuplelen(t); i++) # not a constant!
if (compareobjects(p, a))
return 1;
return 0;
}
In other words, this kind of construct is just making life hard for
Python. Wrapping something in parens is deceptively expensive.
And indeed this is quite visible in a simple test:
optimal:
$ python -m timeit -s 'a, b, c = 0, 0, 2' -c 'if a == b or a == c: pass'
10000000 loops, best of 3: 0.083 usec per loop
$ python -m timeit -s 'a, b, c = 0, 0, 2' -c 'if a == (b, c): pass'
1000000 loops, best of 3: 0.28 usec per loop
pessimal:
$ python -m timeit -s 'a, b, c = 0, 1, 2' -c 'if a == b or a == c: pass'
10000000 loops, best of 3: 0.134 usec per loop
$ python -m timeit -s 'a, b, c = 0, 1, 2' -c 'if a == (b, c): pass'
1000000 loops, best of 3: 0.29 usec per loop
Ouch! >2x to >3x.
I'll do a (linear) backout on this after breakfast.
--
Mathematics is the supreme nostalgia of our time.
More information about the Mercurial-devel
mailing list