[PATCH 2 of 3] hgtagsfnodescache: inherit fnode from parent when possible

Pierre-Yves David pierre-yves.david at ens-lyon.org
Thu May 2 17:48:41 EDT 2019



On 5/2/19 8:24 PM, Martin von Zweigbergk wrote:
> 
> 
> On Thu, May 2, 2019 at 9:37 AM Pierre-Yves David 
> <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>> 
> wrote:
> 
>     # HG changeset patch
>     # User Pierre-Yves David <pierre-yves.david at octobus.net
>     <mailto:pierre-yves.david at octobus.net>>
>     # Date 1552263020 -3600
>     #      Mon Mar 11 01:10:20 2019 +0100
>     # Node ID eac353183daaef0a503da8cd72b8df43f54d7fb8
>     # Parent  a753bc019c1ad7c5661a050adce49e4c3cd5a786
>     # EXP-Topic fnodecache
>     # Available At https://bitbucket.org/octobus/mercurial-devel/
>     #              hg pull
>     https://bitbucket.org/octobus/mercurial-devel/ -r eac353183daa
>     hgtagsfnodescache: inherit fnode from parent when possible
> 
>     If a changeset does not update the content of `.hgtags`, it means it
>     will use
>     the same file-node (for `.hgtags`) than its parents. In this such
>     case we can
>     directly reuse the parent's file-node.
> 
>     We use this property when updating the `hgtagsfnodescache` taking a
>     faster path
>     if we already have a cached value for the parent's of the node we
>     are looking
>     at.
> 
>     Doing so provide a large performance boost when looking at a lot of
>     fnode,
>     especially on repository with very large manifest:
> 
>     timing for `tagsmod.fnoderevs(ui, repo, repo.changelog.revs())`
> 
> 
> What end-user command does this correspond to? `hg tags` with no 
> .hg/cache/tags?

hg debugupdatecache

> 
> 
>     mercurial: (41907 revisions, 1923 files)
> 
>          before: 6.9 seconds
>          after:  2.7 seconds (-54%)
> 
>     pypy: (96266 revisions, 5198 files)
> 
>          before: 80 seconds
>          after:  20 seconds (-75%)
> 
>     mozilla-central: (463411 revisions, 272080 files)
> 
>          before: 7166.4 seconds
>          after:    47.8 seconds (-99%, x150 speedup)
> 
> 
> Nice improvements :) How did people work with these repos before?

This is the timing for compute the information for all nodes. To 
retrieve current tags name we only need this data for all heads.

Getting it for all heads is still very slow to compute initially. (that 
is why we exchange them during clone now).

To illustrate the slowness, I started a tags computation from cold 
cache… This was 3 hours ago…


So currently we only use (and exchange) entry for the repository heads.
However, the speedup rely on reusing data from the parent. So warming 
all entries during a `hg debugupdatecache` turns out to be more 
efficient (with the new code).

I guess the next step from here is to warm all entry in all cases (not 
just `hg debugupdatecache`) and efficiently exchange them over the wire.



> 
> 
>     On a copy of mozilla-try with about 35K heads ans 1.7M changesets,
>     this move
>     the computation from many hours to a couple of minutes. Making it more
>     interresting to do a full warm up of this cache before computing
>     tags (from a
>     cold cache).
> 
>     There seems to be other performance low hanging fruits, like avoid
>     the used of
>     changectx or a more revision centric logic. However, the new code is
>     fast enough
>     for my needs right now.
> 
>     diff --git a/mercurial/tags.py b/mercurial/tags.py
>     --- a/mercurial/tags.py
>     +++ b/mercurial/tags.py
>     @@ -18,6 +18,7 @@ from .node import (
>           bin,
>           hex,
>           nullid,
>     +    nullrev,
>           short,
>       )
>       from .i18n import _
>     @@ -718,12 +719,33 @@ class hgtagsfnodescache(object):
>               if not computemissing:
>                   return None
> 
>     -        # Populate missing entry.
>     -        try:
>     -            fnode = ctx.filenode('.hgtags')
>     -        except error.LookupError:
>     -            # No .hgtags file on this revision.
>     -            fnode = nullid
>     +        fnode = None
>     +        cl = self._repo.changelog
>     +        p1rev, p2rev = cl._uncheckedparentrevs(rev)
>     +        p1node = cl.node(p1rev)
>     +        p1fnode = self.getfnode(p1node, computemissing=False)
>     +        if p2rev != nullrev:
>     +            # There is some no-merge changeset where p1 is null and
>     p2 is set
>     +            # Processing them are merge is just slower, but still
>     give a good
>     +            # result.
> 
> 
> I think you're thinking of file copies, see 
> https://www.mercurial-scm.org/repo/hg/file/fdbeacb9d456/mercurial/localrepo.py#l2348

I am lost here. were are iterating over the changelog and the manifest 
here. This code deal with "malformed" changelog entry. Why are file 
copies relevant here?

> 
>     +            p2node = cl.node(p1rev)
>     +            p2fnode = self.getfnode(p2node, computemissing=False)
>     +            if p1fnode != p2fnode:
>     +                # we cannot rely on readfast because we don't know
>     against what
>     +                # parent the readfast delta is computed
>     +                p1fnode = None
>     +        if p1fnode is not None:
>     +            mctx = ctx.manifestctx()
>     +            fnode = mctx.readfast().get('.hgtags')
>     +            if fnode is None:
>     +                fnode = p1fnode
>     +        if fnode is None:
>     +            # Populate missing entry.
>     +            try:
>     +                fnode = ctx.filenode('.hgtags')
>     +            except error.LookupError:
>     +                # No .hgtags file on this revision.
>     +                fnode = nullid
> 
>               self._writeentry(offset, properprefix, fnode)
>               return fnode
>     diff --git a/tests/test-tags.t b/tests/test-tags.t
>     --- a/tests/test-tags.t
>     +++ b/tests/test-tags.t
>     @@ -145,7 +145,7 @@ Tag cache debug info written to blackbox
>         $ hg blackbox -l 6
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing 48 bytes
>     to cache/hgtagsfnodes1
>     -  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
>     (5000)> 0/1 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
>     (5000)> 0/2 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> blackbox -l 6
>     @@ -159,7 +159,7 @@ Failure to acquire lock results in no wr
>         $ hg blackbox -l 6
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> not writing
>     .hg/cache/hgtagsfnodes1 because lock cannot be acquired
>     -  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
>     (5000)> 0/1 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
>     (5000)> 0/2 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> blackbox -l 6
>     @@ -363,7 +363,7 @@ Extra junk data at the end should get ov
>         $ hg blackbox -l 6
>         1970/01/01 00:00:00 bob
>     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> tags
>         1970/01/01 00:00:00 bob
>     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> writing 24 bytes
>     to cache/hgtagsfnodes1
>     -  1970/01/01 00:00:00 bob @8dbfe60eff306a54259cfe007db9e330e7ecf866
>     (5000)> 2/3 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @8dbfe60eff306a54259cfe007db9e330e7ecf866
>     (5000)> 3/4 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> tags exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> blackbox -l 6
>     @@ -384,7 +384,7 @@ Errors writing to .hgtags fnodes cache a
>         $ hg blackbox -l 6
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> couldn't write
>     cache/hgtagsfnodes1: [Errno *] * (glob)
>     -  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
>     (5000)> 2/3 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
>     (5000)> 3/4 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> blackbox -l 6
>     @@ -399,7 +399,7 @@ Errors writing to .hgtags fnodes cache a
>         $ hg blackbox -l 6
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing 24 bytes
>     to cache/hgtagsfnodes1
>     -  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
>     (5000)> 2/3 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
>     (5000)> 3/4 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> blackbox -l 6
>     @@ -427,7 +427,7 @@ Stripping doesn't truncate the tags cach
> 
>         $ hg blackbox -l 5
>         1970/01/01 00:00:00 bob
>     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> writing 24 bytes
>     to cache/hgtagsfnodes1
>     -  1970/01/01 00:00:00 bob @0c192d7d5e6b78a714de54a2e9627952a877e25a
>     (5000)> 2/3 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @0c192d7d5e6b78a714de54a2e9627952a877e25a
>     (5000)> 2/4 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> tags exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> blackbox -l 5
>     @@ -445,7 +445,7 @@ Stripping doesn't truncate the tags cach
>         $ hg blackbox -l 6
>         1970/01/01 00:00:00 bob
>     @035f65efb448350f4772141702a81ab1df48c465 (5000)> tags
>         1970/01/01 00:00:00 bob
>     @035f65efb448350f4772141702a81ab1df48c465 (5000)> writing 24 bytes
>     to cache/hgtagsfnodes1
>     -  1970/01/01 00:00:00 bob @035f65efb448350f4772141702a81ab1df48c465
>     (5000)> 2/3 cache hits/lookups in * seconds (glob)
>     +  1970/01/01 00:00:00 bob @035f65efb448350f4772141702a81ab1df48c465
>     (5000)> 3/4 cache hits/lookups in * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @035f65efb448350f4772141702a81ab1df48c465 (5000)> writing
>     .hg/cache/tags2-visible with 1 tags
>         1970/01/01 00:00:00 bob
>     @035f65efb448350f4772141702a81ab1df48c465 (5000)> tags exited 0
>     after * seconds (glob)
>         1970/01/01 00:00:00 bob
>     @035f65efb448350f4772141702a81ab1df48c465 (5000)> blackbox -l 6
>     _______________________________________________
>     Mercurial-devel mailing list
>     Mercurial-devel at mercurial-scm.org
>     <mailto:Mercurial-devel at mercurial-scm.org>
>     https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> 

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list