[PATCH 3 of 3 RFC] localrepo: use ctx.size comparisons to speed up status
Matt Mackall
mpm at selenic.com
Sun Jul 25 11:02:43 CDT 2010
On Sun, 2010-07-25 at 11:12 +0900, Nicolas Dumazet wrote:
> # HG changeset patch
> # User Nicolas Dumazet <nicdumz.commits at gmail.com>
> # Date 1278816052 -32400
> # Node ID 42b4ba8013abce794478c689201399ecf8294540
> # Parent dca39a137eaa3f107c6b6419540a0afca702d3eb
> localrepo: use ctx.size comparisons to speed up status
>
> Comparing sizes is cheaper than comparing file contents, as it does not
> involve reading the file on disk or from the filelog.
>
> It is however not always possible: some extensions, or encode filters,
> change data when extracting it to the working directory.
> _cancomparesize is meant to detect cases where such comparisons are not
> possible. A _cancomparesize() call is cheap, as _loadfilter is caching
> its results in filterpats.
>
> Unwrapping the complex inlined boolean comparisons produces longer code,
> but boolean logic has not been changed, except for the size check
> before ctx.cmp calls.
>
> diff --git a/hgext/keyword.py b/hgext/keyword.py
> --- a/hgext/keyword.py
> +++ b/hgext/keyword.py
> @@ -502,6 +502,11 @@
> False, True)
> return n
>
> +        def _cancomparesize(self):
> +            # keywords affect data size, comparing wdir and filelog size does
> +            # not make sense
> +            return False
> +
Somehow I think this would work out better as a helper function of some
sort that actually did the comparison. Possibly in filelog.
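
Roughly the shape such a helper could take, as a sketch only. It assumes nothing beyond what the patch already calls on file contexts (size(), data(), cmp()); the name differs() and the checksize flag are hypothetical, and the stub class stands in for a real file context so the snippet runs on its own:

```python
class StubFileCtx(object):
    """Hypothetical stand-in for a file context; not Mercurial's class."""

    def __init__(self, data):
        self._data = data
        self.reads = 0  # counts how often the content was fetched

    def size(self):
        # Assumed cheap: answered without fetching the content.
        return len(self._data)

    def data(self):
        self.reads += 1
        return self._data

    def cmp(self, text):
        # Same convention as in the patch: a true result means "differs".
        return self.data() != text


def differs(f1, f2, checksize=True):
    """Return True if f1 and f2 differ, checking sizes first when allowed."""
    if checksize and f1.size() != f2.size():
        return True  # sizes differ, so neither content has to be read
    return f1.cmp(f2.data())
```

With something like this, an extension such as keyword would pass checksize=False (or override the helper) instead of every caller having to ask _cancomparesize() first.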
> +
> +            if listclean:
> +                appendclean = clean.append
> +            else:
> +                def appendclean(fn): pass
> +            appendmodified = modified.append
> +
This is a bit too clever. Python function calls are slow:
$ python -m timeit -s 'a = []; x = False' -s 'def aa(x): pass' 'for i in xrange(1000000): aa(1)'
10 loops, best of 3: 239 msec per loop
$ python -m timeit -s 'a = []; x = False' -s 'aa = lambda x: None' 'for i in xrange(1000000): aa(1)'
10 loops, best of 3: 240 msec per loop
$ python -m timeit -s 'a = []; x = False; aa = a.append' 'for i in xrange(1000000):
  if x: aa(1)'
10 loops, best of 3: 62.6 msec per loop
$ python -m timeit -s 'a = []; x = True; aa = a.append' 'for i in xrange(1000000):
  if x: aa(1)'
10 loops, best of 3: 133 msec per loop
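
To rerun this kind of comparison from a script rather than the shell, here is a sketch using the timeit module. It is written for Python 3 (range instead of xrange); the function names, the helper measure() and the loop count are my own choices, and absolute numbers will differ from the ones above:

```python
import timeit

N = 100000  # iterations per run; smaller than above to keep it quick


def noop(fn):
    pass


def loop_call_noop():
    # Always pays for a function call, even when nothing is collected.
    action = noop
    for i in range(N):
        action(i)


def loop_test_flag(flag=False):
    # Tests a local boolean and only calls append when it is set.
    out = []
    append = out.append
    for i in range(N):
        if flag:
            append(i)
    return out


def measure(func, repeat=3):
    """Best-of-`repeat` wall time in seconds for one call of func."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


if __name__ == "__main__":
    print("no-op call: %.4f s" % measure(loop_call_noop))
    print("flag test:  %.4f s" % measure(loop_test_flag))
```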
>              for fn in mf2:
>                  if fn in mf1:
> -                    if (mf1.flags(fn) != mf2.flags(fn) or
> -                        (mf1[fn] != mf2[fn] and
> -                         (mf2[fn] or ctx1[fn].cmp(ctx2[fn].data())))):
> -                        modified.append(fn)
> -                    elif listclean:
> -                        clean.append(fn)
> +                    action = appendmodified
> +                    if mf1.flags(fn) == mf2.flags(fn):
> +                        if mf1[fn] == mf2[fn]:
> +                            action = appendclean
> +                        elif not mf2[fn]:
> +                            f1 = ctx1[fn]
> +                            f2 = ctx2[fn]
> +                            sizematch = not checksize or f1.size() == f2.size()
> +                            if sizematch and not f1.cmp(f2.data()):
> +                                action = appendclean
> +                    action(fn)
>                      del mf1[fn]
>                  else:
>                      added.append(fn)
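
The unwrapped logic does not need the no-op callable; it can keep a plain "elif listclean:" test. A sketch on toy data, not a drop-in for localrepo: manifests are plain dicts here, flags1/flags2 and the differs callback are hypothetical stand-ins for mf.flags() and the size-plus-cmp check, and a false node in mf2 means "unknown, compare contents" as in the hunk above:

```python
def classify(mf1, mf2, flags1, flags2, differs, listclean=False):
    """Split the files of mf2 into (modified, added, clean) relative to mf1.

    mf1/mf2 map filename -> node; a false node in mf2 means the node is
    unknown and contents must be compared. All names are illustrative.
    """
    mf1 = dict(mf1)  # the loop consumes mf1, so work on a copy
    modified, added, clean = [], [], []
    for fn in mf2:
        if fn not in mf1:
            added.append(fn)
            continue
        if flags1.get(fn, '') != flags2.get(fn, ''):
            ismodified = True
        elif mf1[fn] == mf2[fn]:
            ismodified = False
        elif mf2[fn]:
            ismodified = True  # both nodes known and different
        else:
            ismodified = differs(fn)  # fall back to size/content check
        if ismodified:
            modified.append(fn)
        elif listclean:
            clean.append(fn)
        del mf1[fn]
    return modified, added, clean
```

This keeps the same boolean logic as the removed lines, with the per-file cost when listclean is off reduced to one boolean test instead of one function call.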
--
Mathematics is the supreme nostalgia of our time.