Our performance test suite worker detected a regression in perfchangegroupchangelog (http://perf.octobus.net/#regressions?sort=3&dir=desc): others.PerfTestSuite.track_changegroupchangelog('netbeans-2018-08-01') 2018-08-09 18:28 913ca175..b9f94d67, d99468d2..e793e11e 1.75x 8.75 15.3 others.PerfTestSuite.track_changegroupchangelog('pypy-2018-08-01') 2018-08-09 18:28 913ca175..b9f94d67, d99468d2..e793e11e 1.66x 2.75 4.58 others.PerfTestSuite.track_changegroupchangelog('mercurial-2018-08-01') 2018-08-09 18:28 913ca175..b9f94d67, d99468d2..e793e11e 1.66x 1.08 1.79 others.PerfTestSuite.track_changegroupchangelog('mozilla-central-2018-08-01') 2018-08-09 02:03 913ca175..b9f94d67 1.15x 12.5 14.4 The bisect tool identified two responsible commits: - https://www.mercurial-scm.org/repo/hg/rev/23d582ca - https://www.mercurial-scm.org/repo/hg/rev/e793e11e
This regression is somewhat expected (the commit message noted the overhead). I will try to claw this back before 4.8 is released. FWIW the regression is proportional to the number of revisions being added to a changegroup. This disproportionately affects changelog revisions because changelog revisions are small and any overhead in changegroup generation code is felt more significantly. The overhead is still there for manifest and file revisions. But it's percentage of execution time is lower because generation of this data is often dwarfed by data retrieval and diffing.
Here are the numbers we could gather on a private repo with ~1M changesets: # Perfchangegroupchangelog with 4.6.2 HGRCPATH= hg --config profiling.time-track=real --config extensions.perf=.../contrib/perf.py perfchangegroupchangelog --profile ! wall 110.213339 comb 114.640000 user 108.800000 sys 5.840000 (best of 3) | 99.9% dispatch.py: _callcatch line 41: dispatch.run() | 99.9% scmutil.py: callcatch line 359: return scmutil.callcatch(ui... | 99.9% dispatch.py: _runcatchfunc line 160: return func() | 99.9% hg.py: repository line 341: return _dispatch(req) | 99.3% perf.py: _timer line 642: timer(d) | 99.3% perf.py: d line 244: r = func() | 99.2% changegroup.py: group line 638: for chunk in bundler.group(... \ 57.1% perf.py: lookup line 581: linknode = lookup(revlog.no... | 56.8% changelog.py: read line 634: cl.read(node) | 55.9% changelog.py: __new__ line 486: c = changelogrevision(self.... \ 47.8% revlog.py: _chunks line 1787: bins = self._chunks(chain, ... | 44.6% revlog.py: decompress line 1706: ladd(decomp(buffer(data, ch... \ 5.7% revlog.py: checkhash line 1802: self.checkhash(text, node, ... | 5.3% revlog.py: hash line 1885: if node != self.hash(text, ... | 5.3% revlog.py: hash line 1812: return hash(text, p1, p2) \ 39.7% changegroup.py: revchunk line 582: for c in self.revchunk(revl... | 35.1% revlog.py: revdiff line 808: delta = revlog.revdiff(base... | 34.3% revlog.py: _chunk line 1731: return bytes(self._chunk(re... | 34.0% revlog.py: decompress line 1655: return self.decompress(self... --- Sample count: 264912 Total time: 346.570000 seconds # Perfchangegroupchangelog with 6268fed317d0 HGRCPATH= hg --config profiling.time-track=real --config extensions.perf=.../contrib/perf.py perfchangegroupchangelog --profile ! wall 150.886031 comb 156.070000 user 148.570000 sys 7.500000 (best of 3) | 100.0% dispatch.py: _callcatch line 41: dispatch.run() | 100.0% scmutil.py: callcatch line 367: return scmutil.callcatch(ui... | 100.0% dispatch.py: _runcatchfunc line 164: return func() | 100.0% extensions.py: loadall line 350: return _dispatch(req) | 99.3% perf.py: _timer line 678: timer(d) | 99.2% perf.py: d line 269: r = func() | 99.2% changegroup.py: deltagroup line 671: for chunk in chunks: \ 65.5% changegroup.py: lookupcl line 708: linknode = lookup(node) | 65.2% changelog.py: changelogrevision line 955: c = cl.changelogrevision(x) | 65.2% changelog.py: __new__ line 491: return changelogrevision(se... | 59.6% revlog.py: _chunks line 1623: bins = self._chunks(chain, ... | 56.4% revlog.py: _getsegmentforrevs line 1518: ladd(decomp(buffer(data, ch... \ 31.2% revlog.py: emitrevisiondeltas line 749: for i, delta in enumerate(s... \ 16.3% revlog.py: revision line 2335: delta = self.revdiff(basere... | 16.1% revlog.py: _chunk line 1562: return bytes(self._chunk(re... | 16.1% revlog.py: _getsegmentforrevs line 1466: return self.decompress(self... \ 12.5% revlog.py: revision line 2333: revision = self.revision(no... | 10.9% revlog.py: _chunks line 1623: bins = self._chunks(chain, ... | 10.0% revlog.py: _getsegmentforrevs line 1518: ladd(decomp(buffer(data, ch... --- Sample count: 323902 Total time: 474.410000 seconds (459.620000 wall) # Bundling 1000 changesets with 4.6.2 HGRCPATH= hg --config profiling.time-track=real --config progress.disable=True --profile bundle -t none-v2 --base "all() - last(all(), 1000)" test.bundle > /dev/null | 100.0% dispatch.py: _callcatch line 41: dispatch.run() | 100.0% scmutil.py: callcatch line 359: return scmutil.callcatch(ui... | 100.0% dispatch.py: _runcatchfunc line 160: return func() | 100.0% hg.py: repository line 341: return _dispatch(req) \ 49.8% bundle2.py: writenewbundle line 1278: compopts=compopts) | 49.4% changegroup.py: writechunks line 1574: return changegroup.writechu... | 48.2% bundle2.py: getchunks line 86: for c in chunks: | 48.2% bundle2.py: _getcorechunk line 673: self._compopts): | 48.2% bundle2.py: getchunks line 694: for chunk in part.getchunks... | 48.2% bundle2.py: _payloadchunks line 1087: for chunk in self._payloadc... | 48.1% util.py: read line 1131: chunk = buff.read(preferedc... | 48.1% util.py: splitbig line 2235: for chunk in self.iter: | 48.0% changegroup.py: getchunks line 2207: for chunk in chunks: | 48.0% changegroup.py: getchunk line 219: chunk = getchunk(self) \ 41.4% util.py: readexactly line 49: d = readexactly(stream, 4) | 41.4% changegroup.py: read line 3711: s = stream.read(n) | 41.4% util.py: read line 142: return self._stream.read(l) | 41.3% util.py: splitbig line 2235: for chunk in self.iter: | 41.3% changegroup.py: generate line 2207: for chunk in chunks: \ 25.4% changegroup.py: generatefiles line 675: source): | 21.3% localrepo.py: file line 759: filerevlog = repo.file(fname) | 21.3% repoview.py: __getattr__ line 1115: return filelog.filelog(self... | 21.3% revlog.py: __init__ line 24: censorable=True) | 14.9% revlog.py: _indexfp line 663: with self._indexfp() as f: | 14.9% store.py: __call__ line 730: return self.opener(self.ind... | 14.9% vfs.py: __call__ line 495: return self.vfs(self.encode... \ 15.6% changegroup.py: generatemanifestsline 658: fastpathlinkrev, mfs, fnode... | 15.2% changegroup.py: _packmanifests line 743: makelookupmflinknode(dir, n... | 15.2% changegroup.py: group line 598: lookuplinknode, units=_('ma... | 14.8% changegroup.py: revchunk line 582: for c in self.revchunk(revl... | 14.8% revlog.py: revdiff line 808: delta = revlog.revdiff(base... | 11.7% revlog.py: revision line 1733: return mdiff.textdiff(self.... | 11.1% revlog.py: _chunks line 1787: bins = self._chunks(chain, ... | 6.2% revlog.py: _getsegmentforrevsline 1694: offset, data = self._getseg... | 6.2% revlog.py: _getsegment line 1644: return start, self._getsegm... | 6.2% revlog.py: _readsegment line 1610: return self._readsegment(of... \ 6.6% util.py: readexactly line 55: return readexactly(stream, ... | 6.6% changegroup.py: read line 3711: s = stream.read(n) | 6.5% util.py: read line 142: return self._stream.read(l) | 6.5% util.py: splitbig line 2235: for chunk in self.iter: | 6.5% changegroup.py: generate line 2207: for chunk in chunks: \ 39.6% context.py: node line 1228: common = [repo[rev].node() ... | 27.9% context.py: __init__ line 839: return context.changectx(se... | 20.3% repoview.py: changelog line 415: self._node = repo.changelog... | 5.8% localrepo.py: __get__ line 216: unfichangelog = unfi.changelog \ 7.6% util.py: __get__ line 1243: if not outgoing.missing: | 7.6% discovery.py: missing line 1437: result = self.func(obj) | 7.6% discovery.py: _computecommonmissing line 122: self._computecommonmissing() | 7.6% revlog.py: findcommonmissing line 110: self.missingheads) | 5.4% changelog.py: rev line 1033: common = [self.rev(n) for n... --- Sample count: 16277 Total time: 39.820000 seconds # Bundling 1000 changesets with 6268fed317d0 | 100.0% dispatch.py: _callcatch line 41: dispatch.run() | 100.0% scmutil.py: callcatch line 367: return scmutil.callcatch(ui... | 100.0% dispatch.py: _runcatchfunc line 164: return func() | 100.0% commands.py: bundle line 350: return _dispatch(req) \ 62.8% bundle2.py: writenewbundle line 1286: compopts=compopts) | 62.5% changegroup.py: writechunks line 1575: return changegroup.writechu... | 61.6% bundle2.py: getchunks line 93: for c in chunks: | 61.6% bundle2.py: _getcorechunk line 674: self._compopts): | 61.6% bundle2.py: getchunks line 695: for chunk in part.getchunks... | 61.6% bundle2.py: _payloadchunks line 1088: for chunk in self._payloadc... | 61.0% util.py: read line 1132: chunk = buff.read(preferedc... | 61.0% util.py: splitbig line 2214: for chunk in self.iter: | 61.0% changegroup.py: getchunks line 2186: for chunk in chunks: | 61.0% changegroup.py: getchunk line 226: chunk = getchunk(self) \ 52.4% util.py: readexactly line 52: d = readexactly(stream, 4) | 52.4% changegroup.py: read line 3802: s = stream.read(n) | 52.4% util.py: read line 149: return self._stream.read(l) | 52.3% util.py: splitbig line 2214: for chunk in self.iter: | 52.3% changegroup.py: generate line 2186: for chunk in chunks: \ 37.4% changegroup.py: generatefiles line 916: for path, deltas in it: | 35.1% localrepo.py: file line 1144: filerevlog = repo.file(fname) | 35.0% repoview.py: __getattr__ line 1143: return filelog.filelog(self... | 35.0% revlog.py: __init__ line 24: censorable=True) | 17.7% revlog.py: _indexfp line 446: with self._indexfp() as f: | 17.7% store.py: __call__ line 513: return self.opener(self.ind... | 17.7% vfs.py: __call__ line 507: return self.vfs(encoded, mo... \ 14.1% changegroup.py: deltagroup line 891: for delta in deltas: | 13.9% revlog.py: emitrevisiondeltasline 749: for i, delta in enumerate(s... | 13.9% revlog.py: revdiff line 2335: delta = self.revdiff(basere... | 10.9% revlog.py: revision line 1564: return mdiff.textdiff(self.... | 10.4% revlog.py: _chunks line 1623: bins = self._chunks(chain, ... \ 5.3% revlog.py: decompress line 1518: ladd(decomp(buffer(data, ch... \ 5.1% revlog.py: _getsegmentforrevsline 1506: offset, data = self._getseg... | 5.1% revlog.py: _getsegmentline 1455: return start, self._getsegm... | 5.1% revlog.py: _readsegmentline 1421: return self._readsegment(of... \ 8.6% util.py: readexactly line 58: return readexactly(stream, ... | 8.6% changegroup.py: read line 3802: s = stream.read(n) | 8.6% util.py: read line 149: return self._stream.read(l) | 8.5% util.py: splitbig line 2214: for chunk in self.iter: | 8.5% changegroup.py: generate line 2186: for chunk in chunks: \ 28.5% localrepo.py: __getitem__ line 1236: common = [repo[rev].node() ... | 22.5% context.py: __init__ line 870: return context.changectx(se... | 18.8% repoview.py: changelog line 395: self._node = repo.changelog... | 7.1% localrepo.py: __get__ line 213: unfichangelog = unfi.changelog \ 5.9% util.py: __get__ line 1251: if not outgoing.missing: | 5.9% discovery.py: missing line 1416: result = self.func(obj) | 5.9% discovery.py: _computecommonmissing line 122: self._computecommonmissing() | 5.9% revlog.py: findcommonmissing line 110: self.missingheads) --- Sample count: 21350 Total time: 40.520000 seconds (39.640000 wall) Unfortunately I can't paste the raw output of `hg debugrevlog -c`. Here is a redacted version of the revision size information: uncompressed data size (min/max/avg) : ~100 / ~5M / ~5K If you need more information from debugrevlog, please tell me. I will see if I can paste a redacted version of it
I have a patch series deep in my unpublished queue that will claw back the perf. It does so by replacing emitrevisiondeltas() with a replacement storage layer API that doesn't require the (as-it-turns-out moderately expensive) process of building a list of request objects: you just feed it nodes and parameters to influence delta generation and it does its thing. On my mozilla-unified clone: $ hg perfchangegroupchangelog 4.7: ! wall 15.824165 comb 15.830000 user 15.760000 sys 0.070000 (best of 3) @: ! wall 24.348077 comb 24.330000 user 24.140000 sys 0.190000 (best of 3) patch: ! wall 18.245911 comb 18.240000 user 18.100000 sys 0.140000 (best of 3) The remaining overhead versus 4.7 is probably due to 23d582ca. I'm not sure if we can recapture that perf while supporting generic storage backends.
Fixed by https://mercurial-scm.org/repo/hg/rev/31b7e8e7132e Gregory Szorc <gregory.szorc@gmail.com> changegroup: port to emitrevisions() (issue5976) We now have a unified API for emitting revision data from a storage backend. It handles sorting nodes and the complicated delta versus revision decisions for us. This commit ports changegroup to that API. There should be no behavior changes for changegroups not using ellipsis. And lack of test changes seems to confirm that. There are some changes for ellipsis mode, however. Before, when sending an ellipsis revision, we would always send a fulltext revision (as opposed to a delta). There was a TODO tracking this open item. One of the things the emitrevisions() API does for us is figure out whether we can safely emit a delta. So, it is now possible for ellipsis revisions to be sent as deltas! (It does this by not assuming parent/ancestor revisions are available and tracking which revisions have been sent out.) Because we eliminated the list of revision delta request objects, performance has improved substantially: $ hg perfchangegroupchangelog before: ! wall 24.348077 comb 24.330000 user 24.140000 sys 0.190000 (best of 3) after: ! wall 18.245911 comb 18.240000 user 18.100000 sys 0.140000 (best of 3) That's a lot of overhead for creating a few hundred thousand Python objects! This is still a little slower than 4.7. Probably due to 23d582ca introducing a type for the revision/delta results. There is potentially room to optimize. But at some point we need to abstract storage in order to support alternate storage backends. Unfortunately that means using a Python data structure to represent results. And unfortunately there is overhead with every new Python object created. Differential Revision: https://phab.mercurial-scm.org/D4725 (please test the fix)
Bug was set to TESTING for 7 days, resolving