[PATCH 2 of 3] hgtagsfnodescache: inherit fnode from parent when possible

Martin von Zweigbergk martinvonz at google.com
Thu May 2 17:55:55 EDT 2019


On Thu, May 2, 2019 at 2:48 PM Pierre-Yves David <
pierre-yves.david at ens-lyon.org> wrote:

>
>
> On 5/2/19 8:24 PM, Martin von Zweigbergk wrote:
> >
> >
> > On Thu, May 2, 2019 at 9:37 AM Pierre-Yves David
> > <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>>
>
> > wrote:
> >
> >     # HG changeset patch
> >     # User Pierre-Yves David <pierre-yves.david at octobus.net
> >     <mailto:pierre-yves.david at octobus.net>>
> >     # Date 1552263020 -3600
> >     #      Mon Mar 11 01:10:20 2019 +0100
> >     # Node ID eac353183daaef0a503da8cd72b8df43f54d7fb8
> >     # Parent  a753bc019c1ad7c5661a050adce49e4c3cd5a786
> >     # EXP-Topic fnodecache
> >     # Available At https://bitbucket.org/octobus/mercurial-devel/
> >     #              hg pull
> >     https://bitbucket.org/octobus/mercurial-devel/ -r eac353183daa
> >     hgtagsfnodescache: inherit fnode from parent when possible
> >
> >     If a changeset does not update the content of `.hgtags`, it means it
> >     will use
> >     the same file-node (for `.hgtags`) than its parents. In this such
> >     case we can
> >     directly reuse the parent's file-node.
> >
> >     We use this property when updating the `hgtagsfnodescache` taking a
> >     faster path
> >     if we already have a cached value for the parent's of the node we
> >     are looking
> >     at.
> >
> >     Doing so provide a large performance boost when looking at a lot of
> >     fnode,
> >     especially on repository with very large manifest:
> >
> >     timing for `tagsmod.fnoderevs(ui, repo, repo.changelog.revs())`
> >
> >
> > What end-user command does this correspond to? `hg tags` with no
> > .hg/cache/tags?
>
> hg debugupdatecache
>
> >
> >
> >     mercurial: (41907 revisions, 1923 files)
> >
> >          before: 6.9 seconds
> >          after:  2.7 seconds (-54%)
> >
> >     pypy: (96266 revisions, 5198 files)
> >
> >          before: 80 seconds
> >          after:  20 seconds (-75%)
> >
> >     mozilla-central: (463411 revisions, 272080 files)
> >
> >          before: 7166.4 seconds
> >          after:    47.8 seconds (-99%, x150 speedup)
> >
> >
> > Nice improvements :) How did people work with these repos before?
>
> This is the timing for compute the information for all nodes. To
> retrieve current tags name we only need this data for all heads.
>
> Getting it for all heads is still very slow to compute initially. (that
> is why we exchange them during clone now).
>
> To illustrate the slowness, I started a tags computation from cold
> cache… This was 3 hours ago…
>
>
> So currently we only use (and exchange) entry for the repository heads.
> However, the speedup rely on reusing data from the parent. So warming
> all entries during a `hg debugupdatecache` turns out to be more
> efficient (with the new code).
>
> I guess the next step from here is to warm all entry in all cases (not
> just `hg debugupdatecache`) and efficiently exchange them over the wire.
>
>
>
> >
> >
> >     On a copy of mozilla-try with about 35K heads ans 1.7M changesets,
> >     this move
> >     the computation from many hours to a couple of minutes. Making it
> more
> >     interresting to do a full warm up of this cache before computing
> >     tags (from a
> >     cold cache).
> >
> >     There seems to be other performance low hanging fruits, like avoid
> >     the used of
> >     changectx or a more revision centric logic. However, the new code is
> >     fast enough
> >     for my needs right now.
> >
> >     diff --git a/mercurial/tags.py b/mercurial/tags.py
> >     --- a/mercurial/tags.py
> >     +++ b/mercurial/tags.py
> >     @@ -18,6 +18,7 @@ from .node import (
> >           bin,
> >           hex,
> >           nullid,
> >     +    nullrev,
> >           short,
> >       )
> >       from .i18n import _
> >     @@ -718,12 +719,33 @@ class hgtagsfnodescache(object):
> >               if not computemissing:
> >                   return None
> >
> >     -        # Populate missing entry.
> >     -        try:
> >     -            fnode = ctx.filenode('.hgtags')
> >     -        except error.LookupError:
> >     -            # No .hgtags file on this revision.
> >     -            fnode = nullid
> >     +        fnode = None
> >     +        cl = self._repo.changelog
> >     +        p1rev, p2rev = cl._uncheckedparentrevs(rev)
> >     +        p1node = cl.node(p1rev)
> >     +        p1fnode = self.getfnode(p1node, computemissing=False)
> >     +        if p2rev != nullrev:
> >     +            # There is some no-merge changeset where p1 is null and
> >     p2 is set
> >     +            # Processing them are merge is just slower, but still
> >     give a good
> >     +            # result.
> >
> >
> > I think you're thinking of file copies, see
> >
> https://www.mercurial-scm.org/repo/hg/file/fdbeacb9d456/mercurial/localrepo.py#l2348
>
> I am lost here. were are iterating over the changelog and the manifest
> here. This code deal with "malformed" changelog entry. Why are file
> copies relevant here?
>

I don't think you're lost. I think I just misunderstood what this was
about. I was not aware that some repos have commits broken in that way. Any
idea how that happened?


>
> >
> >     +            p2node = cl.node(p1rev)
> >     +            p2fnode = self.getfnode(p2node, computemissing=False)
> >     +            if p1fnode != p2fnode:
> >     +                # we cannot rely on readfast because we don't know
> >     against what
> >     +                # parent the readfast delta is computed
> >     +                p1fnode = None
> >     +        if p1fnode is not None:
> >     +            mctx = ctx.manifestctx()
> >     +            fnode = mctx.readfast().get('.hgtags')
> >     +            if fnode is None:
> >     +                fnode = p1fnode
> >     +        if fnode is None:
> >     +            # Populate missing entry.
> >     +            try:
> >     +                fnode = ctx.filenode('.hgtags')
> >     +            except error.LookupError:
> >     +                # No .hgtags file on this revision.
> >     +                fnode = nullid
> >
> >               self._writeentry(offset, properprefix, fnode)
> >               return fnode
> >     diff --git a/tests/test-tags.t b/tests/test-tags.t
> >     --- a/tests/test-tags.t
> >     +++ b/tests/test-tags.t
> >     @@ -145,7 +145,7 @@ Tag cache debug info written to blackbox
> >         $ hg blackbox -l 6
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing 48 bytes
> >     to cache/hgtagsfnodes1
> >     -  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
> >     (5000)> 0/1 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
> >     (5000)> 0/2 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> blackbox -l 6
> >     @@ -159,7 +159,7 @@ Failure to acquire lock results in no wr
> >         $ hg blackbox -l 6
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> not writing
> >     .hg/cache/hgtagsfnodes1 because lock cannot be acquired
> >     -  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
> >     (5000)> 0/1 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07
> >     (5000)> 0/2 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> blackbox -l 6
> >     @@ -363,7 +363,7 @@ Extra junk data at the end should get ov
> >         $ hg blackbox -l 6
> >         1970/01/01 00:00:00 bob
> >     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> tags
> >         1970/01/01 00:00:00 bob
> >     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> writing 24 bytes
> >     to cache/hgtagsfnodes1
> >     -  1970/01/01 00:00:00 bob @8dbfe60eff306a54259cfe007db9e330e7ecf866
> >     (5000)> 2/3 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @8dbfe60eff306a54259cfe007db9e330e7ecf866
> >     (5000)> 3/4 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> tags exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> blackbox -l 6
> >     @@ -384,7 +384,7 @@ Errors writing to .hgtags fnodes cache a
> >         $ hg blackbox -l 6
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> couldn't write
> >     cache/hgtagsfnodes1: [Errno *] * (glob)
> >     -  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
> >     (5000)> 2/3 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
> >     (5000)> 3/4 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> blackbox -l 6
> >     @@ -399,7 +399,7 @@ Errors writing to .hgtags fnodes cache a
> >         $ hg blackbox -l 6
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing 24 bytes
> >     to cache/hgtagsfnodes1
> >     -  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
> >     (5000)> 2/3 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d
> >     (5000)> 3/4 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> blackbox -l 6
> >     @@ -427,7 +427,7 @@ Stripping doesn't truncate the tags cach
> >
> >         $ hg blackbox -l 5
> >         1970/01/01 00:00:00 bob
> >     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> writing 24 bytes
> >     to cache/hgtagsfnodes1
> >     -  1970/01/01 00:00:00 bob @0c192d7d5e6b78a714de54a2e9627952a877e25a
> >     (5000)> 2/3 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @0c192d7d5e6b78a714de54a2e9627952a877e25a
> >     (5000)> 2/4 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> tags exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> blackbox -l 5
> >     @@ -445,7 +445,7 @@ Stripping doesn't truncate the tags cach
> >         $ hg blackbox -l 6
> >         1970/01/01 00:00:00 bob
> >     @035f65efb448350f4772141702a81ab1df48c465 (5000)> tags
> >         1970/01/01 00:00:00 bob
> >     @035f65efb448350f4772141702a81ab1df48c465 (5000)> writing 24 bytes
> >     to cache/hgtagsfnodes1
> >     -  1970/01/01 00:00:00 bob @035f65efb448350f4772141702a81ab1df48c465
> >     (5000)> 2/3 cache hits/lookups in * seconds (glob)
> >     +  1970/01/01 00:00:00 bob @035f65efb448350f4772141702a81ab1df48c465
> >     (5000)> 3/4 cache hits/lookups in * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @035f65efb448350f4772141702a81ab1df48c465 (5000)> writing
> >     .hg/cache/tags2-visible with 1 tags
> >         1970/01/01 00:00:00 bob
> >     @035f65efb448350f4772141702a81ab1df48c465 (5000)> tags exited 0
> >     after * seconds (glob)
> >         1970/01/01 00:00:00 bob
> >     @035f65efb448350f4772141702a81ab1df48c465 (5000)> blackbox -l 6
> >     _______________________________________________
> >     Mercurial-devel mailing list
> >     Mercurial-devel at mercurial-scm.org
> >     <mailto:Mercurial-devel at mercurial-scm.org>
> >     https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> >
>
> --
> Pierre-Yves David
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20190502/cc0338e2/attachment.html>


More information about the Mercurial-devel mailing list