[PATCH 3 of 4] copies: calculate 'bothnew' from manifestdict.filesnotin()
Durham Goode
durham at fb.com
Fri Feb 27 19:08:36 CST 2015
On 2/27/15, 2:46 PM, "Martin von Zweigbergk" <martinvonz at google.com> wrote:
># HG changeset patch
># User Martin von Zweigbergk <martinvonz at google.com>
># Date 1425074581 28800
># Fri Feb 27 14:03:01 2015 -0800
># Node ID b3bcf58446fdebf3672edbbc55c24509e549eb22
># Parent 89f810fb00184d3a1dd49412d0c3256a596ddca8
>copies: calculate 'bothnew' from manifestdict.filesnotin()
>
>In the same spirit as the previous change, let's now calculate the
>'bothnew' variable using manifestdict.filesnotin().
>
>diff -r 89f810fb0018 -r b3bcf58446fd mercurial/copies.py
>--- a/mercurial/copies.py Fri Feb 27 14:02:30 2015 -0800
>+++ b/mercurial/copies.py Fri Feb 27 14:03:01 2015 -0800
>@@ -302,7 +302,9 @@
> else:
> diverge2.update(fl) # reverse map for below
>
>- bothnew = sorted([d for d in m1 if d in m2 and d not in ma])
>+ addedinm1 = m1.filesnotin(ma)
>+ addedinm2 = m2.filesnotin(ma)
>+ bothnew = sorted(addedinm1 & addedinm2)
I was concerned about perf here (since we're constructing sets when we
used to not, and we're iterating over both m1 and m2 when we used to not),
but I ran the numbers and the new stuff is actually faster. To diff
million-file manifests with 5k files different in each, previously it was
0.36s and now it's 0.23s.
The script for posterity:
#!/bin/env python
import time

# Three "manifests" sharing 990k entries, each with 5k unique extras.
fm1 = list(xrange(0, 990000))
fm2 = list(xrange(0, 990000))
fma = list(xrange(0, 990000))
fm1.extend(range(1000000, 1005000))
fm2.extend(range(1005000, 1010000))
fma.extend(range(1010000, 1015000))
m1 = set(fm1)
m2 = set(fm2)
ma = set(fma)

# Old approach: one pass over m1 with membership tests against m2 and ma.
start = time.time()
old = sorted([d for d in fm1 if d in m2 and d not in ma])
print ("Old %s" % str(time.time() - start))

# New approach: stand-in for filesnotin(), then intersect the two results.
def doit(d1, d2):
    missing = set(d1)
    missing.difference_update(d2)
    return missing

start = time.time()
addedinm1 = doit(fm1, fma)
addedinm2 = doit(fm2, fma)
bothnew = sorted(addedinm1 & addedinm2)
print ("New %s" % str(time.time() - start))
~/local> ./testset.py
Old 0.343271970749
New 0.231050014496
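
A minimal sketch of why the two formulations give the same answer, using
plain Python sets in place of real manifests. The helper files_not_in() is
a hypothetical stand-in for manifestdict.filesnotin(), modeled on doit()
above; it is not the Mercurial implementation, and the file names are made
up for illustration.

```python
def files_not_in(d1, d2):
    """Hypothetical stand-in: entries present in d1 but absent from d2."""
    missing = set(d1)
    missing.difference_update(d2)
    return missing


def bothnew_old(m1, m2, ma):
    # Original form: scan m1, keep entries also in m2 but not in ma.
    return sorted([d for d in m1 if d in m2 and d not in ma])


def bothnew_new(m1, m2, ma):
    # Patched form: intersect "added in m1" with "added in m2".
    return sorted(files_not_in(m1, ma) & files_not_in(m2, ma))


# Tiny illustrative manifests (assumed data, not from the benchmark).
ma = {"a", "b"}
m1 = {"a", "b", "c", "d"}
m2 = {"a", "c", "e"}

result_old = bothnew_old(m1, m2, ma)
result_new = bothnew_new(m1, m2, ma)
print(result_old, result_new)
```

Only "c" is in both m1 and m2 while missing from ma, so both versions
return it. In the benchmark above, the extra ranges added to fm1 and fm2
do not overlap, so both versions return an empty list there.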