Bug 6001 - Performance regression on bundle in 4.7
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial
Version: default branch
Hardware: PC Linux
Importance: urgent bug
Assignee: Bugzilla
URL:
Keywords: perfregression, releaseblocker
Depends on:
Blocks:
 
Reported: 2018-10-16 11:56 UTC by Boris Feld
Modified: 2018-11-08 00:00 UTC

See Also:
Python Version: ---


Attachments
Profile before the regression (3.11 KB, text/plain)
2018-10-17 06:00 UTC, Boris Feld
Profile after the regression (3.11 KB, text/plain)
2018-10-17 06:00 UTC, Boris Feld
Profile after the regression and after the conversion (3.98 KB, text/plain)
2018-10-17 06:00 UTC, Boris Feld

Comment 1 Gregory Szorc 2018-10-16 12:09 UTC
The commit message of https://www.mercurial-scm.org/repo/hg/rev/db5501d9 is very detailed about the expected performance implications, including regressions in some cases.

What concerns me most is the degree of perf regression. e.g. on http://perf.octobus.net/#basic_commands.PushPullTimeSuite.time_push?branch=default&commits=bce1c1af-41fcdfe3&p-repo='netbeans-2018-08-01'&p-repo_type='local'&p-strip='last-thousand'&p-revset=None we go from ~1.65s to ~12.0s, over a 6x increase!

I would *really* like to know where those extra ~10s of CPU are being spent. I'm assuming it is on the producer end, as my measurements showed that the DAG ordered changegroups were *faster* to apply, not slower. But with e.g. the test measuring push performance, it would be really nice to confirm that.

I know the Netbeans repo has a lot of merges and it is possible the non-linear "shape" of it is causing problems for various algorithms - possibly even the DAG sorting itself.

I'd love to see `hg --profile` output from before and after showing which functions the regression is in...
Comment 2 Gregory Szorc 2018-10-16 12:10 UTC
If this is pinned on DAG sorting being the slowdown, we should definitely get a `hg perfdagsort` or something to isolate that operation.
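
To make the idea of isolating the sort concrete, here is a minimal sketch of a standalone timing harness. It is not Mercurial's code and not an existing `hg perf*` command: `make_dag` and `dfs_order` are hypothetical helpers, the DAG is synthetic, and the ordering is a generic stack-based topological sort standing in for whatever the real DAG sort does.

```python
import random
import time


def make_dag(n, merge_ratio=0.3, seed=0):
    """Build a synthetic DAG: parents[rev] lists parents with smaller numbers."""
    rng = random.Random(seed)
    parents = [[]]  # revision 0 is the only root
    for rev in range(1, n):
        ps = [rng.randrange(max(0, rev - 50), rev)]
        # Turn a share of the revisions into merges with a second, distinct parent.
        if rev > 1 and rng.random() < merge_ratio:
            other = rng.randrange(max(0, rev - 50), rev)
            if other != ps[0]:
                ps.append(other)
        parents.append(ps)
    return parents


def dfs_order(parents):
    """Return a topological order (parents first) using a stack, so the
    walk tends to stay on one line of history before switching."""
    children = [[] for _ in parents]
    for rev, ps in enumerate(parents):
        for p in ps:
            children[p].append(rev)
    pending = [len(ps) for ps in parents]  # parents not yet emitted
    order = []
    stack = [rev for rev, ps in enumerate(parents) if not ps]
    while stack:
        rev = stack.pop()
        order.append(rev)
        for child in reversed(children[rev]):
            pending[child] -= 1
            if pending[child] == 0:
                stack.append(child)
    return order


# Time only the sort, separately from everything else a bundle does.
parents = make_dag(50_000)
start = time.perf_counter()
order = dfs_order(parents)
print(f"sorted {len(order)} revisions in {time.perf_counter() - start:.3f}s")
```

Raising `merge_ratio` gives a rough feel for how a merge-heavy shape affects the sort alone. Any conclusion about the real regression would still need the actual repository and the actual implementation.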
Comment 3 Boris Feld 2018-10-17 03:33 UTC
This repository contains everything needed to reproduce those issues: https://bitbucket.org/octobus/bighgperf/src/default/

You can use the exact same repository that we use for the test using the make file: https://bitbucket.org/octobus/bighgperf/src/b8cb829cb993f2081983127d01e705b1733847f2/repos.make#lines-27.

You can also download the repository using this command line:
    
    curl https://static.octobus.net/asv/netbeans-2018-08-01-reference.tar | tar x; hg -R netbeans-2018-08-01-reference update tip

The easiest way to reproduce is trying to launch a bundle operation:

    HGRCPATH= hg bundle --config profiling.time_track=real --base ":-1000" /tmp/bundle.bundle --profile
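
To compare two Mercurial builds with the same command, the invocation above could be wrapped in a small timing script. This is only a sketch: the function names and the `hg` and `repo` parameters are hypothetical, and the arguments are copied from the command line above.

```python
import os
import subprocess
import time


def bundle_command(hg="hg", base=":-1000", out="/tmp/bundle.bundle"):
    """Build the argv for the bundle command shown above."""
    return [hg, "bundle", "--config", "profiling.time_track=real",
            "--base", base, out, "--profile"]


def clean_env():
    """Copy the environment with HGRCPATH emptied, like the `HGRCPATH=` prefix."""
    env = dict(os.environ)
    env["HGRCPATH"] = ""
    return env


def time_bundle(repo, hg="hg"):
    """Run the bundle inside `repo` and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(bundle_command(hg), cwd=repo, env=clean_env(), check=True)
    return time.perf_counter() - start
```

Calling `time_bundle("netbeans-2018-08-01-reference", hg=...)` once per build, with `hg` pointing at each executable in turn, would give comparable numbers, assuming both runs use the same checkout and a similarly warm disk cache.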

I profiled the bundle command before and after the regression and will attach both profiles to the issue.

I've seen the performance notes in the commit message, and I launched a full bundle overnight (with `hg bundle -t none-v2 -a`). It took 2.31 hours to finish.

I unbundled it into a new repository, and performance there is back to baseline.

Here are the numbers for reference:
  - 7.18s before
  - 37.77s after
  - 7.53s after conversion
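
For scale, the three timings above can be turned into ratios. This is a small sketch using only the numbers listed; the helper names are illustrative.

```python
# Wall-clock times reported above for the same bundle command, in seconds.
timings = {"before": 7.18, "after": 37.77, "after conversion": 7.53}


def slowdown(baseline, measured):
    """How many times longer `measured` is than `baseline`."""
    return measured / baseline


def percent_slower(baseline, measured):
    """Extra time relative to `baseline`, as a percentage."""
    return (measured - baseline) / baseline * 100


base = timings["before"]
for label in ("after", "after conversion"):
    t = timings[label]
    print(f"{label}: {slowdown(base, t):.2f}x, {percent_slower(base, t):+.0f}%")
```

The regression is roughly a 5.3x slowdown on this command, while the converted repository lands within about 5% of the original baseline.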

We're still checking the impact on some of our customers' clients. What would be the possible options if we don't want to ship this performance regression in 4.8?
Comment 4 Boris Feld 2018-10-17 06:00 UTC
Created attachment 2022
Profile before the regression
Comment 5 Boris Feld 2018-10-17 06:00 UTC
Created attachment 2023
Profile after the regression
Comment 6 Boris Feld 2018-10-17 06:00 UTC
Created attachment 2024
Profile after the regression and after the conversion
Comment 7 Boris Feld 2018-10-29 16:13 UTC
The regression impacts both push and pull on the netbeans repository:

Pushing 100 revisions is 20% slower (+~200ms).
Pulling 100 revisions is ~20% slower (+~500ms).
Pushing 1000 revisions is 600% slower (+~10s). 
Pulling 1000 revisions is ~640% slower (+~13s).

We have identified two issues:
- The base rev selection algorithm is less effective (~40% of the regression)
- We don't linearize the emitted revisions by default anymore (~40% of the regression)

Due to the high impact of the regression, I would like to propose tagging this issue as a release blocker.
Comment 8 Augie Fackler 2018-10-30 11:40 UTC
(marking as confirmed because I have no reason to distrust this detailed analysis - I have not done my own investigation)
Comment 9 HG Bot 2018-10-31 18:15 UTC
Fixed by https://mercurial-scm.org/repo/hg/rev/634b45317459
Boris Feld <boris.feld@octobus.net>
changegroup: restore default node ordering (issue6001)

Changeset db5501d9 changed the default node ordering from "storage" to
"linearize".

While the new API is more explicit and cleaner, the "linearize" order is
problematic on certain repositories like netbeans where it makes bundling
slower the more nodes we bundle.

Pushing and pulling 100 changesets was ~20% slower and pushing and pulling
1000 changesets was ~600% slower.

A very quick analysis of profile traces showed that the pull operation was
taking more time creating the delta.

Putting back the old default order seems to be the safe option. With more time
during the next cycle, we can understand better the impact of sorting with the
DAG order by default, the source of the regression and how to mitigate it.

/!\ We are still waiting for the full performance impact but with this patch,
bundling and pulling locally (not on the performance workstation) 1000
changesets on the netbeans repository is as fast as before the regression.

Differential Revision: https://phab.mercurial-scm.org/D5196

(please test the fix)
Comment 10 Bugzilla 2018-11-08 00:00 UTC
Bug was set to TESTING for 7 days, resolving