[PATCH STABLE] revlog: add an experimental option to mitigated delta issues (issue5480)
Gábor STEFANIK
Gabor.STEFANIK at nng.com
Thu Jun 29 05:23:19 EDT 2017
> -----Original Message-----
> From: Pierre-Yves David [mailto:pierre-yves.david at ens-lyon.org]
> Sent: Wednesday, June 28, 2017 10:48 PM
> To: Gábor STEFANIK <Gabor.STEFANIK at nng.com>; mercurial-
> devel at mercurial-scm.org
> Subject: Re: [PATCH STABLE] revlog: add an experimental option to mitigated
> delta issues (issue5480)
>
>
>
> On 06/28/2017 02:26 PM, Gábor STEFANIK wrote:
> >> -----Original Message-----
> >> From: Mercurial-devel [mailto:mercurial-devel-bounces at mercurial-
> scm.org]
> >> On Behalf Of Pierre-Yves David
> >> Sent: Tuesday, June 27, 2017 1:16 PM
> >> To: mercurial-devel at mercurial-scm.org
> >> Subject: [PATCH STABLE] revlog: add an experimental option to mitigated
> >> delta issues (issue5480)
> >>
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> >> # Date 1498218574 -7200
> >> # Fri Jun 23 13:49:34 2017 +0200
> >> # Branch stable
> >> # Node ID 33998dea4a10b09502bf458e458daca273a3f29a
> >> # Parent 231690dba9b4d31b5ad2c93284e454135f2763ca
> >> # EXP-Topic manifest
> >> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
> >> # hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r 33998dea4a10
> >> revlog: add an experimental option to mitigated delta issues (issue5480)
> >>
> >> The general delta heuristic used to select a delta does not scale with the
> >> number of branches. The delta base is frequently too far away to reuse a
> >> chain according to the "distance" criterion. This leads to the insertion of
> >> larger deltas (or even full texts), which themselves push the bases for the
> >> next deltas further away, leading to more large deltas and full texts. These
> >> full texts and frequent recomputations throw Mercurial performance into
> >> disarray.
> >>
> >> For example, on a slightly large repository:
> >>
> >>     280 000 files (2 150 000 versions)
> >>     430 000 changesets (10 000 topological heads)
> >>
> >> The numbers below compare the repository with and without the distance
> >> criterion:
> >>
> >>     manifest size:
> >>         with:    21.4 GB
> >>         without:  0.3 GB
> >>
> >>     store size:
> >>         with:    28.7 GB
> >>         without:  7.4 GB
> >>
> >>     bundle last 15 000 revisions:
> >>         with:    800 seconds
> >>                  971 MB
> >>         without:  50 seconds
> >>                   73 MB
> >>
> >>     unbundle time (of the last 15K revisions):
> >>         with:    1150 seconds (~19 minutes)
> >>         without:   35 seconds
> >>
> >> Similar issues have been observed in other repositories.
> >>
> >>
> >> Adding a new option or "feature" on stable is uncommon. However, given that
> >> this issue is making Mercurial practically unusable, I'm exceptionally
> >> targeting this patch for stable.
> >>
> >> What is actually needed is a full rework of the delta building and reading
> >> logic. However, that will be a longer process, with churn not suitable for
> >> stable.
> >>
> >> In the meantime, we introduce a quick and dirty mitigation of this in the
> >> 'experimental' config space. The new option introduces a way to set the
> >> maximum amount of memory usable to store a diff in memory. This extends
> >> Mercurial's ability to create chains without removing all safeguards
> >> regarding memory access. The option should be phased out when core has a
> >> more proper solution available.
> >>
> >> Setting the limit to '0' removes all limits; setting it to '-1' uses the
> >> default limit (textsize x 4).
> >>
> >> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> >> --- a/mercurial/localrepo.py
> >> +++ b/mercurial/localrepo.py
> >> @@ -432,6 +432,9 @@ class localrepository(object):
> >> 'aggressivemergedeltas', False)
> >> self.svfs.options['aggressivemergedeltas'] = aggressivemergedeltas
> >> self.svfs.options['lazydeltabase'] = not scmutil.gddeltaconfig(self.ui)
> >> + chainspan = self.ui.configbytes('experimental', 'maxdeltachainspan', -1)
> >> + if 0 <= chainspan:
> >> + self.svfs.options['maxdeltachainspan'] = chainspan
> >>
> >> for r in self.requirements:
> >> if r.startswith('exp-compression-'):
> >> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> >> --- a/mercurial/revlog.py
> >> +++ b/mercurial/revlog.py
> >> @@ -282,6 +282,7 @@ class revlog(object):
> >> self._nodecache = {nullid: nullrev}
> >> self._nodepos = None
> >> self._compengine = 'zlib'
> >> + self._maxdeltachainspan = -1
> >>
> >> v = REVLOG_DEFAULT_VERSION
> >> opts = getattr(opener, 'options', None)
> >> @@ -300,6 +301,8 @@ class revlog(object):
> >> self._lazydeltabase = bool(opts.get('lazydeltabase', False))
> >> if 'compengine' in opts:
> >> self._compengine = opts['compengine']
> >> + if 'maxdeltachainspan' in opts:
> >> + self._maxdeltachainspan = opts['maxdeltachainspan']
> >>
> >> if self._chunkcachesize <= 0:
> >> raise RevlogError(_('revlog chunk cache size %r is not greater '
> >> @@ -1596,7 +1599,13 @@ class revlog(object):
> >> # - 'compresseddeltalen' is the sum of the total size of deltas we need
> >> # to apply -- bounding it limits the amount of CPU we consume.
> >> dist, l, data, base, chainbase, chainlen, compresseddeltalen = d
> >> - if (dist > textlen * 4 or l > textlen or
> >> +
> >> + defaultmax = textlen * 4
> >> + maxdist = self._maxdeltachainspan
> >> + if not maxdist:
> >> + maxdist = dist # ensure the conditional pass
> >> + maxdist = max(maxdist, defaultmax)
> >> + if (dist > maxdist or l > textlen or
> >
> > Perhaps it would be cleaner if we could configure the multiplier directly,
> > rather than setting a fixed span limit.
>
> I played a bit with that approach before, but it gave worse results in
> several respects (both in compression and in usability):
>
> - it is harder to tune,
> - it is less explicit about the memory Mercurial might actually spike to,
> - as the limit applies to all revlogs, it can get out of control for larger
> files (not the manifest) (e.g. x100 can be appropriate for the manifest, but
> not for a 10MB file)
>
> Overall, having a simple way to specify a safe boundary for Mercurial's
> memory usage is simpler to deal with for a simple mitigation.
>
> In any case, we need to rework the reading and the heuristic to solve this
> properly, so that factor is unlikely to survive in the long run.
I was under the impression that this limit is not about memory usage, but rather about I/O amplification and the resulting loss of speed.
ACK if the limit is needed primarily to bound memory usage.
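
For reference, the limit resolution in the quoted revlog.py hunk can be sketched as standalone Python. This is only an illustration of how the hunk reads, not the patch itself: the function names effective_max_dist, span_ok and multiplier_max_dist are made up here, only the 'dist' part of the condition is modeled, and the multiplier variant is the hypothetical alternative discussed above, not something in the patch.

```python
def effective_max_dist(textlen, dist, maxdeltachainspan=-1):
    """Resolve the span limit the way the quoted hunk does.

    -1 (the default) falls back to textlen * 4, 0 disables the check,
    and any positive value can only raise the limit above textlen * 4.
    """
    defaultmax = textlen * 4
    maxdist = maxdeltachainspan
    if not maxdist:
        # 0 means "no limit": reuse the candidate's own distance so
        # that the comparison below always passes.
        maxdist = dist
    return max(maxdist, defaultmax)


def span_ok(textlen, dist, maxdeltachainspan=-1):
    """Return True if the 'dist' criterion alone would accept the delta."""
    return dist <= effective_max_dist(textlen, dist, maxdeltachainspan)


def multiplier_max_dist(textlen, factor):
    """Hypothetical alternative: a configurable multiplier, no fixed span."""
    return textlen * factor


if __name__ == "__main__":
    small = 58                    # illustrative small text, in bytes
    big = 10 * 1024 * 1024        # the "10MB file" case from the thread

    print(span_ok(small, 2800))                 # default limit: 4 * 58 bytes
    print(span_ok(small, 2800, 2800))           # fixed span of 2800 bytes
    print(span_ok(small, 10 ** 9, 0))           # 0: no limit at all
    print(multiplier_max_dist(big, 100))        # x100 on a 10MB text
```

With a fixed span the bound is the same number of bytes for every revlog (unless textlen * 4 is already larger), whereas a x100 multiplier that is harmless on a small manifest text lets the span of a 10MB file grow to roughly 1GB.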
>
>
> >
> >> compresseddeltalen > textlen * 2 or
> >> (self._maxchainlen and chainlen > self._maxchainlen)):
> >> return False
> >> diff --git a/tests/test-generaldelta.t b/tests/test-generaldelta.t
> >> --- a/tests/test-generaldelta.t
> >> +++ b/tests/test-generaldelta.t
> >> @@ -159,3 +159,191 @@ Test that strip bundle use bundle2
> >> 1c5d4dc9a8b8d6e1750966d343e94db665e7a1e9
> >>
> >> $ cd ..
> >> +
> >> +test maxdeltachainspan
> >> +
> >> + $ hg init source-repo
> >> + $ cd source-repo
> >> + $ hg debugbuilddag --new-file '.+5:brancha$.+11:branchb$.+30:branchc<brancha+2<branchb+2'
> >> + $ cd ..
> >> + $ hg -R source-repo debugindex -m
> >> + rev offset length delta linkrev nodeid p1 p2
> >> + 0 0 46 -1 0 19deeef41503 000000000000 000000000000
> >> + 1 46 57 0 1 fffc37b38c40 19deeef41503 000000000000
> >> + 2 103 57 1 2 5822d75c83d9 fffc37b38c40 000000000000
> >> + 3 160 57 2 3 19cf2273e601 5822d75c83d9 000000000000
> >> + 4 217 57 3 4 d45ead487afe 19cf2273e601 000000000000
> >> + 5 274 57 4 5 96e0c2ce55ed d45ead487afe 000000000000
> >> + 6 331 46 -1 6 0c2ea5222c74 000000000000 000000000000
> >> + 7 377 57 6 7 4ca08a89134d 0c2ea5222c74 000000000000
> >> + 8 434 57 7 8 c973dbfd30ac 4ca08a89134d 000000000000
> >> + 9 491 57 8 9 d81d878ff2cd c973dbfd30ac 000000000000
> >> + 10 548 58 9 10 dbee7f0dd760 d81d878ff2cd 000000000000
> >> + 11 606 58 10 11 474be9f1fd4e dbee7f0dd760 000000000000
> >> + 12 664 58 11 12 594a27502c85 474be9f1fd4e 000000000000
> >> + 13 722 58 12 13 a7d25307d6a9 594a27502c85 000000000000
> >> + 14 780 58 13 14 3eb53082272e a7d25307d6a9 000000000000
> >> + 15 838 58 14 15 d1e94c85caf6 3eb53082272e 000000000000
> >> + 16 896 58 15 16 8933d9629788 d1e94c85caf6 000000000000
> >> + 17 954 58 16 17 a33416e52d91 8933d9629788 000000000000
> >> + 18 1012 47 -1 18 4ccbf31021ed 000000000000 000000000000
> >> + 19 1059 58 18 19 dcad7a25656c 4ccbf31021ed 000000000000
> >> + 20 1117 58 19 20 617c4f8be75f dcad7a25656c 000000000000
> >> + 21 1175 58 20 21 975b9c1d75bb 617c4f8be75f 000000000000
> >> + 22 1233 58 21 22 74f09cd33b70 975b9c1d75bb 000000000000
> >> + 23 1291 58 22 23 54e79bfa7ef1 74f09cd33b70 000000000000
> >> + 24 1349 58 23 24 c556e7ff90af 54e79bfa7ef1 000000000000
> >> + 25 1407 58 24 25 42daedfe9c6b c556e7ff90af 000000000000
> >> + 26 1465 58 25 26 f302566947c7 42daedfe9c6b 000000000000
> >> + 27 1523 58 26 27 2346959851cb f302566947c7 000000000000
> >> + 28 1581 58 27 28 ca8d867106b4 2346959851cb 000000000000
> >> + 29 1639 58 28 29 fd9152decab2 ca8d867106b4 000000000000
> >> + 30 1697 58 29 30 3fe34080a79b fd9152decab2 000000000000
> >> + 31 1755 58 30 31 bce61a95078e 3fe34080a79b 000000000000
> >> + 32 1813 58 31 32 1dd9ba54ba15 bce61a95078e 000000000000
> >> + 33 1871 58 32 33 3cd9b90a9972 1dd9ba54ba15 000000000000
> >> + 34 1929 58 33 34 5db8c9754ef5 3cd9b90a9972 000000000000
> >> + 35 1987 58 34 35 ee4a240cc16c 5db8c9754ef5 000000000000
> >> + 36 2045 58 35 36 9e1d38725343 ee4a240cc16c 000000000000
> >> + 37 2103 58 36 37 3463f73086a8 9e1d38725343 000000000000
> >> + 38 2161 58 37 38 88af72fab449 3463f73086a8 000000000000
> >> + 39 2219 58 38 39 472f5ce73785 88af72fab449 000000000000
> >> + 40 2277 58 39 40 c91b8351e5b8 472f5ce73785 000000000000
> >> + 41 2335 58 40 41 9c8289c5c5c0 c91b8351e5b8 000000000000
> >> + 42 2393 58 41 42 a13fd4a09d76 9c8289c5c5c0 000000000000
> >> + 43 2451 58 42 43 2ec2c81cafe0 a13fd4a09d76 000000000000
> >> + 44 2509 58 43 44 f27fdd174392 2ec2c81cafe0 000000000000
> >> + 45 2567 58 44 45 a539ec59fe41 f27fdd174392 000000000000
> >> + 46 2625 58 45 46 5e98b9ecb738 a539ec59fe41 000000000000
> >> + 47 2683 58 46 47 31e6b47899d0 5e98b9ecb738 000000000000
> >> + 48 2741 58 47 48 2cf25d6636bd 31e6b47899d0 000000000000
> >> + 49 2799 197 -1 49 9fff62ea0624 96e0c2ce55ed 000000000000
> >> + 50 2996 58 49 50 467f8e30a066 9fff62ea0624 000000000000
> >> + 51 3054 356 50 51 346db97283df a33416e52d91 000000000000
> >> + 52 3410 58 51 52 4e003fd4d5cd 346db97283df 000000000000
> >> + $ hg clone --pull source-repo --config experimental.maxdeltachainspan=2800 relax-chain --config format.generaldelta=yes
> >> + requesting all changes
> >> + adding changesets
> >> + adding manifests
> >> + adding file changes
> >> + added 53 changesets with 53 changes to 53 files (+2 heads)
> >> + updating to branch default
> >> + 14 files updated, 0 files merged, 0 files removed, 0 files unresolved
> >> + $ hg -R relax-chain debugindex -m
> >> + rev offset length delta linkrev nodeid p1 p2
> >> + 0 0 46 -1 0 19deeef41503 000000000000 000000000000
> >> + 1 46 57 0 1 fffc37b38c40 19deeef41503 000000000000
> >> + 2 103 57 1 2 5822d75c83d9 fffc37b38c40 000000000000
> >> + 3 160 57 2 3 19cf2273e601 5822d75c83d9 000000000000
> >> + 4 217 57 3 4 d45ead487afe 19cf2273e601 000000000000
> >> + 5 274 57 4 5 96e0c2ce55ed d45ead487afe 000000000000
> >> + 6 331 46 -1 6 0c2ea5222c74 000000000000 000000000000
> >> + 7 377 57 6 7 4ca08a89134d 0c2ea5222c74 000000000000
> >> + 8 434 57 7 8 c973dbfd30ac 4ca08a89134d 000000000000
> >> + 9 491 57 8 9 d81d878ff2cd c973dbfd30ac 000000000000
> >> + 10 548 58 9 10 dbee7f0dd760 d81d878ff2cd 000000000000
> >> + 11 606 58 10 11 474be9f1fd4e dbee7f0dd760 000000000000
> >> + 12 664 58 11 12 594a27502c85 474be9f1fd4e 000000000000
> >> + 13 722 58 12 13 a7d25307d6a9 594a27502c85 000000000000
> >> + 14 780 58 13 14 3eb53082272e a7d25307d6a9 000000000000
> >> + 15 838 58 14 15 d1e94c85caf6 3eb53082272e 000000000000
> >> + 16 896 58 15 16 8933d9629788 d1e94c85caf6 000000000000
> >> + 17 954 58 16 17 a33416e52d91 8933d9629788 000000000000
> >> + 18 1012 47 -1 18 4ccbf31021ed 000000000000 000000000000
> >> + 19 1059 58 18 19 dcad7a25656c 4ccbf31021ed 000000000000
> >> + 20 1117 58 19 20 617c4f8be75f dcad7a25656c 000000000000
> >> + 21 1175 58 20 21 975b9c1d75bb 617c4f8be75f 000000000000
> >> + 22 1233 58 21 22 74f09cd33b70 975b9c1d75bb 000000000000
> >> + 23 1291 58 22 23 54e79bfa7ef1 74f09cd33b70 000000000000
> >> + 24 1349 58 23 24 c556e7ff90af 54e79bfa7ef1 000000000000
> >> + 25 1407 58 24 25 42daedfe9c6b c556e7ff90af 000000000000
> >> + 26 1465 58 25 26 f302566947c7 42daedfe9c6b 000000000000
> >> + 27 1523 58 26 27 2346959851cb f302566947c7 000000000000
> >> + 28 1581 58 27 28 ca8d867106b4 2346959851cb 000000000000
> >> + 29 1639 58 28 29 fd9152decab2 ca8d867106b4 000000000000
> >> + 30 1697 58 29 30 3fe34080a79b fd9152decab2 000000000000
> >> + 31 1755 58 30 31 bce61a95078e 3fe34080a79b 000000000000
> >> + 32 1813 58 31 32 1dd9ba54ba15 bce61a95078e 000000000000
> >> + 33 1871 58 32 33 3cd9b90a9972 1dd9ba54ba15 000000000000
> >> + 34 1929 58 33 34 5db8c9754ef5 3cd9b90a9972 000000000000
> >> + 35 1987 58 34 35 ee4a240cc16c 5db8c9754ef5 000000000000
> >> + 36 2045 58 35 36 9e1d38725343 ee4a240cc16c 000000000000
> >> + 37 2103 58 36 37 3463f73086a8 9e1d38725343 000000000000
> >> + 38 2161 58 37 38 88af72fab449 3463f73086a8 000000000000
> >> + 39 2219 58 38 39 472f5ce73785 88af72fab449 000000000000
> >> + 40 2277 58 39 40 c91b8351e5b8 472f5ce73785 000000000000
> >> + 41 2335 58 40 41 9c8289c5c5c0 c91b8351e5b8 000000000000
> >> + 42 2393 58 41 42 a13fd4a09d76 9c8289c5c5c0 000000000000
> >> + 43 2451 58 42 43 2ec2c81cafe0 a13fd4a09d76 000000000000
> >> + 44 2509 58 43 44 f27fdd174392 2ec2c81cafe0 000000000000
> >> + 45 2567 58 44 45 a539ec59fe41 f27fdd174392 000000000000
> >> + 46 2625 58 45 46 5e98b9ecb738 a539ec59fe41 000000000000
> >> + 47 2683 58 46 47 31e6b47899d0 5e98b9ecb738 000000000000
> >> + 48 2741 58 47 48 2cf25d6636bd 31e6b47899d0 000000000000
> >> + 49 2799 197 -1 49 9fff62ea0624 96e0c2ce55ed 000000000000
> >> + 50 2996 58 49 50 467f8e30a066 9fff62ea0624 000000000000
> >> + 51 3054 58 17 51 346db97283df a33416e52d91 000000000000
> >> + 52 3112 369 -1 52 4e003fd4d5cd 346db97283df 000000000000
> >> + $ hg clone --pull source-repo --config experimental.maxdeltachainspan=0 noconst-chain --config format.generaldelta=yes
> >> + requesting all changes
> >> + adding changesets
> >> + adding manifests
> >> + adding file changes
> >> + added 53 changesets with 53 changes to 53 files (+2 heads)
> >> + updating to branch default
> >> + 14 files updated, 0 files merged, 0 files removed, 0 files unresolved
> >> + $ hg -R noconst-chain debugindex -m
> >> + rev offset length delta linkrev nodeid p1 p2
> >> + 0 0 46 -1 0 19deeef41503 000000000000 000000000000
> >> + 1 46 57 0 1 fffc37b38c40 19deeef41503 000000000000
> >> + 2 103 57 1 2 5822d75c83d9 fffc37b38c40 000000000000
> >> + 3 160 57 2 3 19cf2273e601 5822d75c83d9 000000000000
> >> + 4 217 57 3 4 d45ead487afe 19cf2273e601 000000000000
> >> + 5 274 57 4 5 96e0c2ce55ed d45ead487afe 000000000000
> >> + 6 331 46 -1 6 0c2ea5222c74 000000000000 000000000000
> >> + 7 377 57 6 7 4ca08a89134d 0c2ea5222c74 000000000000
> >> + 8 434 57 7 8 c973dbfd30ac 4ca08a89134d 000000000000
> >> + 9 491 57 8 9 d81d878ff2cd c973dbfd30ac 000000000000
> >> + 10 548 58 9 10 dbee7f0dd760 d81d878ff2cd 000000000000
> >> + 11 606 58 10 11 474be9f1fd4e dbee7f0dd760 000000000000
> >> + 12 664 58 11 12 594a27502c85 474be9f1fd4e 000000000000
> >> + 13 722 58 12 13 a7d25307d6a9 594a27502c85 000000000000
> >> + 14 780 58 13 14 3eb53082272e a7d25307d6a9 000000000000
> >> + 15 838 58 14 15 d1e94c85caf6 3eb53082272e 000000000000
> >> + 16 896 58 15 16 8933d9629788 d1e94c85caf6 000000000000
> >> + 17 954 58 16 17 a33416e52d91 8933d9629788 000000000000
> >> + 18 1012 47 -1 18 4ccbf31021ed 000000000000 000000000000
> >> + 19 1059 58 18 19 dcad7a25656c 4ccbf31021ed 000000000000
> >> + 20 1117 58 19 20 617c4f8be75f dcad7a25656c 000000000000
> >> + 21 1175 58 20 21 975b9c1d75bb 617c4f8be75f 000000000000
> >> + 22 1233 58 21 22 74f09cd33b70 975b9c1d75bb 000000000000
> >> + 23 1291 58 22 23 54e79bfa7ef1 74f09cd33b70 000000000000
> >> + 24 1349 58 23 24 c556e7ff90af 54e79bfa7ef1 000000000000
> >> + 25 1407 58 24 25 42daedfe9c6b c556e7ff90af 000000000000
> >> + 26 1465 58 25 26 f302566947c7 42daedfe9c6b 000000000000
> >> + 27 1523 58 26 27 2346959851cb f302566947c7 000000000000
> >> + 28 1581 58 27 28 ca8d867106b4 2346959851cb 000000000000
> >> + 29 1639 58 28 29 fd9152decab2 ca8d867106b4 000000000000
> >> + 30 1697 58 29 30 3fe34080a79b fd9152decab2 000000000000
> >> + 31 1755 58 30 31 bce61a95078e 3fe34080a79b 000000000000
> >> + 32 1813 58 31 32 1dd9ba54ba15 bce61a95078e 000000000000
> >> + 33 1871 58 32 33 3cd9b90a9972 1dd9ba54ba15 000000000000
> >> + 34 1929 58 33 34 5db8c9754ef5 3cd9b90a9972 000000000000
> >> + 35 1987 58 34 35 ee4a240cc16c 5db8c9754ef5 000000000000
> >> + 36 2045 58 35 36 9e1d38725343 ee4a240cc16c 000000000000
> >> + 37 2103 58 36 37 3463f73086a8 9e1d38725343 000000000000
> >> + 38 2161 58 37 38 88af72fab449 3463f73086a8 000000000000
> >> + 39 2219 58 38 39 472f5ce73785 88af72fab449 000000000000
> >> + 40 2277 58 39 40 c91b8351e5b8 472f5ce73785 000000000000
> >> + 41 2335 58 40 41 9c8289c5c5c0 c91b8351e5b8 000000000000
> >> + 42 2393 58 41 42 a13fd4a09d76 9c8289c5c5c0 000000000000
> >> + 43 2451 58 42 43 2ec2c81cafe0 a13fd4a09d76 000000000000
> >> + 44 2509 58 43 44 f27fdd174392 2ec2c81cafe0 000000000000
> >> + 45 2567 58 44 45 a539ec59fe41 f27fdd174392 000000000000
> >> + 46 2625 58 45 46 5e98b9ecb738 a539ec59fe41 000000000000
> >> + 47 2683 58 46 47 31e6b47899d0 5e98b9ecb738 000000000000
> >> + 48 2741 58 47 48 2cf25d6636bd 31e6b47899d0 000000000000
> >> + 49 2799 58 5 49 9fff62ea0624 96e0c2ce55ed 000000000000
> >> + 50 2857 58 49 50 467f8e30a066 9fff62ea0624 000000000000
> >> + 51 2915 58 17 51 346db97283df a33416e52d91 000000000000
> >> + 52 2973 58 51 52 4e003fd4d5cd 346db97283df 000000000000
> >> _______________________________________________
> >> Mercurial-devel mailing list
> >> Mercurial-devel at mercurial-scm.org
> >> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> > ________________________________
> > This message, including its attachments, is confidential and the property of
> NNG Llc. For more information please read NNG's email policy here:
> > http://www.nng.com/emailpolicy/
> > By responding to this email you accept the email policy.
> >
>
> --
> Pierre-Yves David