<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 2, 2019 at 2:48 PM Pierre-Yves David <<a href="mailto:pierre-yves.david@ens-lyon.org">pierre-yves.david@ens-lyon.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
On 5/2/19 8:24 PM, Martin von Zweigbergk wrote:<br>
> <br>
> <br>
> On Thu, May 2, 2019 at 9:37 AM Pierre-Yves David <br>
> <<a href="mailto:pierre-yves.david@ens-lyon.org" target="_blank">pierre-yves.david@ens-lyon.org</a>> <br>
> wrote:<br>
> <br>
> # HG changeset patch<br>
> # User Pierre-Yves David <<a href="mailto:pierre-yves.david@octobus.net" target="_blank">pierre-yves.david@octobus.net</a>><br>
> # Date 1552263020 -3600<br>
> # Mon Mar 11 01:10:20 2019 +0100<br>
> # Node ID eac353183daaef0a503da8cd72b8df43f54d7fb8<br>
> # Parent a753bc019c1ad7c5661a050adce49e4c3cd5a786<br>
> # EXP-Topic fnodecache<br>
> # Available At <a href="https://bitbucket.org/octobus/mercurial-devel/" rel="noreferrer" target="_blank">https://bitbucket.org/octobus/mercurial-devel/</a><br>
> # hg pull <a href="https://bitbucket.org/octobus/mercurial-devel/" rel="noreferrer" target="_blank">https://bitbucket.org/octobus/mercurial-devel/</a> -r eac353183daa<br>
> hgtagsfnodescache: inherit fnode from parent when possible<br>
> <br>
> If a changeset does not update the content of `.hgtags`, it uses the<br>
> same file-node (for `.hgtags`) as its parent. In such a case we can<br>
> directly reuse the parent's file-node.<br>
> <br>
> We use this property when updating the `hgtagsfnodescache`, taking a<br>
> faster path if we already have a cached value for the parents of the<br>
> node we are looking at.<br>
> <br>
> Doing so provides a large performance boost when looking up many<br>
> fnodes, especially on repositories with very large manifests:<br>
> <br>
> timing for `tagsmod.fnoderevs(ui, repo, repo.changelog.revs())`<br>
> <br>
> <br>
> What end-user command does this correspond to? `hg tags` with no <br>
> .hg/cache/tags?<br>
<br>
hg debugupdatecache<br>
<br>
> <br>
> <br>
> mercurial: (41907 revisions, 1923 files)<br>
> <br>
> before: 6.9 seconds<br>
> after: 2.7 seconds (-54%)<br>
> <br>
> pypy: (96266 revisions, 5198 files)<br>
> <br>
> before: 80 seconds<br>
> after: 20 seconds (-75%)<br>
> <br>
> mozilla-central: (463411 revisions, 272080 files)<br>
> <br>
> before: 7166.4 seconds<br>
> after: 47.8 seconds (-99%, x150 speedup)<br>
> <br>
> <br>
> Nice improvements :) How did people work with these repos before?<br>
<br>
This is the timing for computing the information for all nodes. To <br>
retrieve the current tag names we only need this data for all heads.<br>
<br>
Getting it for all heads is still very slow to compute initially (that <br>
is why we exchange them during clone now).<br>
<br>
To illustrate the slowness, I started a tags computation from cold <br>
cache… This was 3 hours ago…<br>
<br>
<br>
So currently we only use (and exchange) entries for the repository heads.<br>
However, the speedup relies on reusing data from the parent. So warming <br>
all entries during a `hg debugupdatecache` turns out to be more <br>
efficient (with the new code).<br>
<br>
I guess the next step from here is to warm all entries in all cases (not <br>
just `hg debugupdatecache`) and efficiently exchange them over the wire.<br>
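<br>
The parent-inheritance idea can be sketched with a toy model. This is only an<br>
illustration under stated assumptions, not the code in `mercurial/tags.py`:<br>
`ToyRepo`, `get_fnode`, `NULLREV` and `NULLID` are hypothetical names, and each<br>
revision's "delta" is assumed to list the files changed relative to p1.<br>

```python
# Toy model only: none of these names are Mercurial's real API.
NULLREV = -1       # stand-in for "no parent"
NULLID = "nullid"  # stand-in for "no .hgtags in this revision"


class ToyRepo:
    def __init__(self, revs):
        # revs[rev] = (p1, p2, changed); `changed` maps path -> file-node
        # for the files this revision touches relative to p1.
        self.revs = revs
        self.slow_lookups = 0

    def parents(self, rev):
        p1, p2, _changed = self.revs[rev]
        return p1, p2

    def delta(self, rev):
        return self.revs[rev][2]

    def full_filenode(self, rev, path):
        # Slow path: walk the p1 ancestry until the file is found.
        self.slow_lookups += 1
        while rev != NULLREV:
            p1, _p2, changed = self.revs[rev]
            if path in changed:
                return changed[path]
            rev = p1
        return NULLID


def get_fnode(repo, cache, rev, path=".hgtags"):
    if rev in cache:
        return cache[rev]
    fnode = None
    p1, p2 = repo.parents(rev)
    p1fnode = cache.get(p1)
    if p2 != NULLREV and p1fnode != cache.get(p2):
        # Parents disagree (or p2 is not cached): do not trust the delta.
        p1fnode = None
    if p1fnode is not None:
        # Fast path: use the delta if it touches the file, else inherit.
        fnode = repo.delta(rev).get(path, p1fnode)
    if fnode is None:
        fnode = repo.full_filenode(rev, path)
    cache[rev] = fnode
    return fnode


repo = ToyRepo([
    (NULLREV, NULLREV, {".hgtags": "t0"}),
    (0, NULLREV, {"a": "a1"}),
    (1, NULLREV, {}),
    (2, NULLREV, {".hgtags": "t1"}),
    (3, NULLREV, {"a": "a2"}),
])
cache = {}
fnodes = [get_fnode(repo, cache, rev) for rev in range(5)]
print(fnodes, repo.slow_lookups)
```

In this toy history, warming the cache in revision order sends only the root<br>
revision through the slow path; every later revision either reads its own<br>
delta or inherits from its parent.<br>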
<br>
<br>
<br>
> <br>
> <br>
> On a copy of mozilla-try with about 35K heads and 1.7M changesets,<br>
> this moves the computation from many hours to a couple of minutes,<br>
> making it more interesting to do a full warm-up of this cache before<br>
> computing tags (from a cold cache).<br>
> <br>
> There seem to be other low-hanging performance fruits, like avoiding<br>
> the use of changectx or using a more revision-centric logic. However,<br>
> the new code is fast enough for my needs right now.<br>
> <br>
> diff --git a/mercurial/tags.py b/mercurial/tags.py<br>
> --- a/mercurial/tags.py<br>
> +++ b/mercurial/tags.py<br>
> @@ -18,6 +18,7 @@ from .node import (<br>
> bin,<br>
> hex,<br>
> nullid,<br>
> + nullrev,<br>
> short,<br>
> )<br>
> from .i18n import _<br>
> @@ -718,12 +719,33 @@ class hgtagsfnodescache(object):<br>
> if not computemissing:<br>
> return None<br>
> <br>
> - # Populate missing entry.<br>
> - try:<br>
> - fnode = ctx.filenode('.hgtags')<br>
> - except error.LookupError:<br>
> - # No .hgtags file on this revision.<br>
> - fnode = nullid<br>
> + fnode = None<br>
> + cl = self._repo.changelog<br>
> + p1rev, p2rev = cl._uncheckedparentrevs(rev)<br>
> + p1node = cl.node(p1rev)<br>
> + p1fnode = self.getfnode(p1node, computemissing=False)<br>
> + if p2rev != nullrev:<br>
> + # There are some non-merge changesets where p1 is null and p2 is set.<br>
> + # Processing them as merges is just slower, but still gives a good<br>
> + # result.<br>
> <br>
> <br>
> I think you're thinking of file copies, see <br>
> <a href="https://www.mercurial-scm.org/repo/hg/file/fdbeacb9d456/mercurial/localrepo.py#l2348" rel="noreferrer" target="_blank">https://www.mercurial-scm.org/repo/hg/file/fdbeacb9d456/mercurial/localrepo.py#l2348</a><br>
<br>
I am lost here. We are iterating over the changelog and the manifest <br>
here. This code deals with "malformed" changelog entries. Why are file <br>
copies relevant here?<br></blockquote><div><br></div><div>I don't think you're lost. I think I just misunderstood what this was about. I was not aware that some repos have commits broken in that way. Any idea how that happened?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> <br>
> + p2node = cl.node(p2rev)<br>
> + p2fnode = self.getfnode(p2node, computemissing=False)<br>
> + if p1fnode != p2fnode:<br>
> + # we cannot rely on readfast because we don't know against<br>
> + # which parent the readfast delta is computed<br>
> + p1fnode = None<br>
> + if p1fnode is not None:<br>
> + mctx = ctx.manifestctx()<br>
> + fnode = mctx.readfast().get('.hgtags')<br>
> + if fnode is None:<br>
> + fnode = p1fnode<br>
> + if fnode is None:<br>
> + # Populate missing entry.<br>
> + try:<br>
> + fnode = ctx.filenode('.hgtags')<br>
> + except error.LookupError:<br>
> + # No .hgtags file on this revision.<br>
> + fnode = nullid<br>
> <br>
> self._writeentry(offset, properprefix, fnode)<br>
> return fnode<br>
> diff --git a/tests/test-tags.t b/tests/test-tags.t<br>
> --- a/tests/test-tags.t<br>
> +++ b/tests/test-tags.t<br>
> @@ -145,7 +145,7 @@ Tag cache debug info written to blackbox<br>
> $ hg blackbox -l 6<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing 48 bytes<br>
> to cache/hgtagsfnodes1<br>
> - 1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07<br>
> (5000)> 0/1 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07<br>
> (5000)> 0/2 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> blackbox -l 6<br>
> @@ -159,7 +159,7 @@ Failure to acquire lock results in no wr<br>
> $ hg blackbox -l 6<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> not writing<br>
> .hg/cache/hgtagsfnodes1 because lock cannot be acquired<br>
> - 1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07<br>
> (5000)> 0/1 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @b9154636be938d3d431e75a7c906504a079bfe07<br>
> (5000)> 0/2 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> identify exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b9154636be938d3d431e75a7c906504a079bfe07 (5000)> blackbox -l 6<br>
> @@ -363,7 +363,7 @@ Extra junk data at the end should get ov<br>
> $ hg blackbox -l 6<br>
> 1970/01/01 00:00:00 bob<br>
> @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> tags<br>
> 1970/01/01 00:00:00 bob<br>
> @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> writing 24 bytes<br>
> to cache/hgtagsfnodes1<br>
> - 1970/01/01 00:00:00 bob @8dbfe60eff306a54259cfe007db9e330e7ecf866<br>
> (5000)> 2/3 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @8dbfe60eff306a54259cfe007db9e330e7ecf866<br>
> (5000)> 3/4 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> tags exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @8dbfe60eff306a54259cfe007db9e330e7ecf866 (5000)> blackbox -l 6<br>
> @@ -384,7 +384,7 @@ Errors writing to .hgtags fnodes cache a<br>
> $ hg blackbox -l 6<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> couldn't write<br>
> cache/hgtagsfnodes1: [Errno *] * (glob)<br>
> - 1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d<br>
> (5000)> 2/3 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d<br>
> (5000)> 3/4 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> blackbox -l 6<br>
> @@ -399,7 +399,7 @@ Errors writing to .hgtags fnodes cache a<br>
> $ hg blackbox -l 6<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing 24 bytes<br>
> to cache/hgtagsfnodes1<br>
> - 1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d<br>
> (5000)> 2/3 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d<br>
> (5000)> 3/4 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> tags exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @b968051b5cf3f624b771779c6d5f84f1d4c3fb5d (5000)> blackbox -l 6<br>
> @@ -427,7 +427,7 @@ Stripping doesn't truncate the tags cach<br>
> <br>
> $ hg blackbox -l 5<br>
> 1970/01/01 00:00:00 bob<br>
> @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> writing 24 bytes<br>
> to cache/hgtagsfnodes1<br>
> - 1970/01/01 00:00:00 bob @0c192d7d5e6b78a714de54a2e9627952a877e25a<br>
> (5000)> 2/3 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @0c192d7d5e6b78a714de54a2e9627952a877e25a<br>
> (5000)> 2/4 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> tags exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @0c192d7d5e6b78a714de54a2e9627952a877e25a (5000)> blackbox -l 5<br>
> @@ -445,7 +445,7 @@ Stripping doesn't truncate the tags cach<br>
> $ hg blackbox -l 6<br>
> 1970/01/01 00:00:00 bob<br>
> @035f65efb448350f4772141702a81ab1df48c465 (5000)> tags<br>
> 1970/01/01 00:00:00 bob<br>
> @035f65efb448350f4772141702a81ab1df48c465 (5000)> writing 24 bytes<br>
> to cache/hgtagsfnodes1<br>
> - 1970/01/01 00:00:00 bob @035f65efb448350f4772141702a81ab1df48c465<br>
> (5000)> 2/3 cache hits/lookups in * seconds (glob)<br>
> + 1970/01/01 00:00:00 bob @035f65efb448350f4772141702a81ab1df48c465<br>
> (5000)> 3/4 cache hits/lookups in * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @035f65efb448350f4772141702a81ab1df48c465 (5000)> writing<br>
> .hg/cache/tags2-visible with 1 tags<br>
> 1970/01/01 00:00:00 bob<br>
> @035f65efb448350f4772141702a81ab1df48c465 (5000)> tags exited 0<br>
> after * seconds (glob)<br>
> 1970/01/01 00:00:00 bob<br>
> @035f65efb448350f4772141702a81ab1df48c465 (5000)> blackbox -l 6<br>
> <br>
<br>
-- <br>
Pierre-Yves David<br>
</blockquote></div></div>