[PATCH 2 of 2 v4] revset: use localrepo persistent branch name caching

Pierre-Yves David pierre-yves.david at ens-lyon.org
Thu Oct 16 02:49:43 CDT 2014



On 10/15/2014 06:48 PM, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski at unity3d.com>
> # Date 1413424007 -7200
> #      Thu Oct 16 03:46:47 2014 +0200
> # Node ID b0e40604be81ff0716e1254f7ce7884599457f71
> # Parent  d87009e720063a8d6d80afdbe6bb9323e2849030
> revset: use localrepo persistent branch name caching
>
> Branch name filtering in revsets was expensive. For every rev it created a
> changectx and called .branch(). (Using changelog.branchinfo() would make it a
> bit faster.)
>
> Instead, use the new localrepo caching branch lookup method.

Could you get ctx.branch() to use this cache too ? Would that benefit 
anything?

The branchmap code could use it too. and would prewarm it for you.

>
> On the small hg repo:
>    hg log --time -r 'branch(stable) & branch(default)'
> Before:
>    time: real 1.910 secs (user 1.890+0.000 sys 0.020+0.000)
> After:
>    time: real 1.240 secs (user 1.230+0.000 sys 0.010+0.000)
>    time: real 0.120 secs (user 0.110+0.000 sys 0.000+0.000)
>
> On mozilla-central with 210557:a280a03c9f3c :
>    hg --time log -r 'branch(mobile)' -T.
> Before:
>    time: real 10.450 secs (user 10.390+0.000 sys 0.060+0.000)
> After:
>    time: real 7.640 secs (user 7.480+0.000 sys 0.140+0.000)
>    time: real 0.520 secs (user 0.490+0.000 sys 0.030+0.000)
>
> First run is 35%/27% faster (primarily because the new code path uses
> changelog.branchinfo instead changectx.branch and we avoid messing with
> localrepo). Following runs will use the cache and are 16x/20x faster.

If you are benchmarking revset please use the perf extension to get your 
numbers.


>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -478,6 +478,7 @@ def branch(repo, subset, x):
>       a regular expression. To match a branch that actually starts with `re:`,
>       use the prefix `literal:`.
>       """
> +    branch = repo.revbranchcache.branch
>       try:
>           b = getstring(x, '')
>       except error.ParseError:
> @@ -489,16 +490,16 @@ def branch(repo, subset, x):
>               # note: falls through to the revspec case if no branch with
>               # this name exists
>               if pattern in repo.branchmap():
> -                return subset.filter(lambda r: matcher(repo[r].branch()))
> +                return subset.filter(lambda r: matcher(branch(r)))
>           else:
> -            return subset.filter(lambda r: matcher(repo[r].branch()))
> +            return subset.filter(lambda r: matcher(branch(r)))
>
>       s = getset(repo, spanset(repo), x)
>       b = set()
>       for r in s:
> -        b.add(repo[r].branch())
> +        b.add(branch(r))
>       c = s.__contains__
> -    return subset.filter(lambda r: c(r) or repo[r].branch() in b)
> +    return subset.filter(lambda r: c(r) or branch(r) in b)
>
>   def bumped(repo, subset, x):
>       """``bumped()``
> @@ -1431,7 +1432,7 @@ def matching(repo, subset, x):
>       getfieldfuncs = []
>       _funcs = {
>           'user': lambda r: repo[r].user(),
> -        'branch': lambda r: repo[r].branch(),
> +        'branch': repo.revbranchcache.branch,
>           'date': lambda r: repo[r].date(),
>           'description': lambda r: repo[r].description(),
>           'files': lambda r: repo[r].files(),
> @@ -1532,9 +1533,9 @@ def sort(repo, subset, x):
>               elif k == '-rev':
>                   e.append(-r)
>               elif k == 'branch':
> -                e.append(c.branch())
> +                e.append(repo.revbranchcache.branch(r))
>               elif k == '-branch':
> -                e.append(invert(c.branch()))
> +                e.append(invert(repo.revbranchcache.branch(r)))
>               elif k == 'desc':
>                   e.append(c.description())
>               elif k == '-desc':
> diff --git a/tests/test-revset.t b/tests/test-revset.t
> --- a/tests/test-revset.t
> +++ b/tests/test-revset.t
> @@ -744,6 +744,64 @@ issue2437
>     $ log 'ancestors(8) and (heads(branch("-a-b-c-")) or heads(branch(é)))'
>     4
>
> +branchname caching

Test for branch name cachine should not be in the revset test. They 
should be in some branch related file. test-revset.t should be revset 
behavior themself. not internal implementation details.

> + cache creation
> +  $ rm .hg/cache/branchnames
> +  $ log 'branch("-a-b-c-")'
> +  4
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  c31b7e7bdc4e1f69133685cf82550eac  .hg/cache/branchnames
> + handling of invalid cache file
> +  $ echo > .hg/cache/branchnames
> +  $ log 'branch("-a-b-c-")'
> +  4
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  c31b7e7bdc4e1f69133685cf82550eac  .hg/cache/branchnames
> + handling of extra trailing data
> +  $ echo >> .hg/cache/branchnames
> +  $ log 'branch("-a-b-c-")'
> +  4
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  c31b7e7bdc4e1f69133685cf82550eac  .hg/cache/branchnames
> + update after commit
> +  $ hg tag tag
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  c31b7e7bdc4e1f69133685cf82550eac  .hg/cache/branchnames
> +  $ log 'branch("-a-b-c-")'
> +  4
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  77d75afbf592f045b609c3a68986a908  .hg/cache/branchnames
> + $ update after rollback
> +  $ hg up -qr '.^'
> +  $ hg rollback -qf
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  77d75afbf592f045b609c3a68986a908  .hg/cache/branchnames
> +  $ log 'branch("-a-b-c-")'
> +  4
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  c31b7e7bdc4e1f69133685cf82550eac  .hg/cache/branchnames
> + handle non-tip node change (could be a problem with some cache invalidation schemes)
> +  $ hg bundle -q --all bu.hg
> +  $ hg --config extensions.strip= strip --no-b -qr 8
> +  $ hg up -q 7
> +  $ hg branch -q hacked
> +  $ hg ci --amend -qm 'hacked'
> +  $ hg pull -q bu.hg -r 24286f4ae135
> +  $ log 'branch("hacked")'
> +  7
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  7e7547be03cd3badb5053899aea5c532  .hg/cache/branchnames
> +  $ hg --config extensions.strip= strip --no-b -qr 7:
> +  $ hg pull -q bu.hg
> +  $ rm bu.hg
> +  $ hg up -qr tip
> +  $ log 'branch("-a-b-c-")'
> +  4
> +  $ "$TESTDIR/md5sum.py" .hg/cache/branchnames
> +  753e486275e544c562da095fd116a441  .hg/cache/branchnames
> +(the cache file still contains the now unused 'hacked' branch name)
> +
>   issue2654: report a parse error if the revset was not completely parsed
>
>     $ log '1 OR 2'
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list