[PATCH STABLE] revlog: add an experimental option to mitigated delta issues (issue5480)

Gábor STEFANIK Gabor.STEFANIK at nng.com
Thu Jun 29 05:23:19 EDT 2017


> -----Original Message-----
> From: Pierre-Yves David [mailto:pierre-yves.david at ens-lyon.org]
> Sent: Wednesday, June 28, 2017 10:48 PM
> To: Gábor STEFANIK <Gabor.STEFANIK at nng.com>; mercurial-
> devel at mercurial-scm.org
> Subject: Re: [PATCH STABLE] revlog: add an experimental option to mitigated
> delta issues (issue5480)
>
>
>
> On 06/28/2017 02:26 PM, Gábor STEFANIK wrote:
> >> -----Original Message-----
> >> From: Mercurial-devel [mailto:mercurial-devel-bounces at mercurial-
> scm.org]
> >> On Behalf Of Pierre-Yves David
> >> Sent: Tuesday, June 27, 2017 1:16 PM
> >> To: mercurial-devel at mercurial-scm.org
> >> Subject: [PATCH STABLE] revlog: add an experimental option to mitigated
> >> delta issues (issue5480)
> >>
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> >> # Date 1498218574 -7200
> >> #      Fri Jun 23 13:49:34 2017 +0200
> >> # Branch stable
> >> # Node ID 33998dea4a10b09502bf458e458daca273a3f29a
> >> # Parent  231690dba9b4d31b5ad2c93284e454135f2763ca
> >> # EXP-Topic manifest
> >> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
> >> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r 33998dea4a10
> >> revlog: add an experimental option to mitigated delta issues (issue5480)
> >>
> >> The general delta heuristic to select a delta does not scale with the
> >> number of branches. The delta base is frequently too far away to be able
> >> to reuse a chain according to the "distance" criteria. This leads to the
> >> insertion of larger deltas (or even full texts) that themselves push the
> >> bases for the next deltas further away, leading to more large deltas and
> >> full texts. These full texts and frequent recomputations throw Mercurial
> >> performance into disarray.
> >>
> >> For example, on a slightly large repository:
> >>
> >>    280 000 files (2 150 000 versions)
> >>    430 000 changesets (10 000 topological heads)
> >>
> >> The numbers below compare the repository with and without the distance
> >> criteria:
> >>
> >> manifest size:
> >>      with:    21.4 GB
> >>      without:  0.3 GB
> >>
> >> store size:
> >>      with:    28.7 GB
> >>      without:  7.4 GB
> >>
> >> bundle last 15 000 revisions:
> >>      with:    800 seconds
> >>               971 MB
> >>      without:  50 seconds
> >>                73 MB
> >>
> >> unbundle time (of the last 15K revisions):
> >>      with:    1150 seconds (~19 minutes)
> >>      without:   35 seconds
> >>
> >> Similar issues have been observed in other repositories.
> >>
> >>
> >> Adding a new option or "feature" on stable is uncommon. However, given
> >> that this issue is making Mercurial practically unusable, I'm
> >> exceptionally targeting this patch for stable.
> >>
> >> What is actually needed is a full rework of the delta building and reading
> >> logic. However, that will be a longer process, with churn not suitable for
> >> stable.
> >>
> >> In the meantime, we introduce a quick and dirty mitigation of this in the
> >> 'experimental' config space. The new option introduces a way to set the
> >> maximum amount of memory usable to store a diff in memory. This extends
> >> the ability for Mercurial to create chains without removing all safeguards
> >> regarding memory access. The option should be phased out when core has a
> >> more proper solution available.
> >>
> >> Setting the limit to '0' removes all limits; setting it to '-1' uses the
> >> default limit (textsize x 4).
> >>
> >> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> >> --- a/mercurial/localrepo.py
> >> +++ b/mercurial/localrepo.py
> >> @@ -432,6 +432,9 @@ class localrepository(object):
> >>               'aggressivemergedeltas', False)
> >>           self.svfs.options['aggressivemergedeltas'] = aggressivemergedeltas
> >>           self.svfs.options['lazydeltabase'] = not scmutil.gddeltaconfig(self.ui)
> >> +        chainspan = self.ui.configbytes('experimental', 'maxdeltachainspan', -1)
> >> +        if 0 <= chainspan:
> >> +            self.svfs.options['maxdeltachainspan'] = chainspan
> >>
> >>           for r in self.requirements:
> >>               if r.startswith('exp-compression-'):
> >> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> >> --- a/mercurial/revlog.py
> >> +++ b/mercurial/revlog.py
> >> @@ -282,6 +282,7 @@ class revlog(object):
> >>           self._nodecache = {nullid: nullrev}
> >>           self._nodepos = None
> >>           self._compengine = 'zlib'
> >> +        self._maxdeltachainspan = -1
> >>
> >>           v = REVLOG_DEFAULT_VERSION
> >>           opts = getattr(opener, 'options', None)
> >> @@ -300,6 +301,8 @@ class revlog(object):
> >>               self._lazydeltabase = bool(opts.get('lazydeltabase', False))
> >>               if 'compengine' in opts:
> >>                   self._compengine = opts['compengine']
> >> +            if 'maxdeltachainspan' in opts:
> >> +                self._maxdeltachainspan = opts['maxdeltachainspan']
> >>
> >>           if self._chunkcachesize <= 0:
> >>               raise RevlogError(_('revlog chunk cache size %r is not greater '
> >> @@ -1596,7 +1599,13 @@ class revlog(object):
> >>           # - 'compresseddeltalen' is the sum of the total size of deltas we need
> >>           #   to apply -- bounding it limits the amount of CPU we consume.
> >>           dist, l, data, base, chainbase, chainlen, compresseddeltalen = d
> >> -        if (dist > textlen * 4 or l > textlen or
> >> +
> >> +        defaultmax = textlen * 4
> >> +        maxdist = self._maxdeltachainspan
> >> +        if not maxdist:
> >> +            maxdist = dist # ensure the conditional pass
> >> +        maxdist = max(maxdist, defaultmax)
> >> +        if (dist > maxdist or l > textlen or
> >
> > Perhaps it would be cleaner if we could configure the multiplier directly,
> > rather than setting a fixed span limit.
>
> I played a bit with that approach before, but it gave worse results in
> various aspects (both in compression and in usability):
>
> - it is harder to tune,
> - it is less explicit about the memory Mercurial might actually spike to,
> - as the limit applies to all revlogs, it can get out of control for larger
> files (not manifests) (e.g. x100 can be appropriate for the manifest, but
> not for a 10MB file).
>
> Overall, having a simple way to specify a safe boundary for Mercurial
> memory usage is simpler to deal with for a simple mitigation.
>
> Overall, we need to rework the reading and the heuristic to solve this
> properly, so that factor is unlikely to survive in the long run.

I was under the impression that this limit is not about memory usage, but rather I/O amplification and resulting loss of speed.

ACK if we need this limit primarily to limit memory usage.
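
For reference, the distance part of the check in the hunk above can be sketched in isolation as below. This is only a minimal sketch under stated assumptions, not the actual revlog code: the function name span_is_acceptable is made up, the text length of 500 bytes is a hypothetical value, and the two distances (2781 and 2857) are derived from the debugindex output in the test, assuming 'dist' is measured from the start of the chain base to the end of the new delta.

```python
def span_is_acceptable(dist, textlen, maxdeltachainspan=-1):
    """Distance part of the delta check, mirroring the quoted hunk.

    dist: bytes spanned on disk by the candidate delta chain
    textlen: size of the revision's full text
    maxdeltachainspan: -1 = default limit only, 0 = no limit,
                       otherwise an explicit limit in bytes
    """
    defaultmax = textlen * 4
    maxdist = maxdeltachainspan
    if not maxdist:
        # A limit of 0 means "no limit": make the comparison always pass.
        maxdist = dist
    # The configured limit can only relax the default, never tighten it.
    maxdist = max(maxdist, defaultmax)
    return dist <= maxdist


TEXTLEN = 500  # hypothetical manifest text size, default limit is 2000

# Candidate for rev 51 (delta against rev 17, chain base rev 6):
# 3054 + 58 - 331 = 2781 bytes.
print(span_is_acceptable(2781, TEXTLEN, 2800))  # fits under 2800
# Candidate for rev 49 (delta against rev 5, chain base rev 0):
# 2799 + 58 - 0 = 2857 bytes.
print(span_is_acceptable(2857, TEXTLEN, 2800))  # over 2800
print(span_is_acceptable(2857, TEXTLEN, 0))     # no limit
print(span_is_acceptable(2857, TEXTLEN, -1))    # default limit only
```

Under these assumptions the sketch agrees with the test output: with a span of 2800, rev 51 is stored as a delta against rev 17 while rev 49 falls back to a full text, and with a span of 0 both are stored as deltas. Note that, as written, a configured value below textlen * 4 has no effect, since the larger of the two limits wins.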

>
>
> >
> >>               compresseddeltalen > textlen * 2 or
> >>               (self._maxchainlen and chainlen > self._maxchainlen)):
> >>               return False
> >> diff --git a/tests/test-generaldelta.t b/tests/test-generaldelta.t
> >> --- a/tests/test-generaldelta.t
> >> +++ b/tests/test-generaldelta.t
> >> @@ -159,3 +159,191 @@ Test that strip bundle use bundle2
> >>         1c5d4dc9a8b8d6e1750966d343e94db665e7a1e9
> >>
> >>     $ cd ..
> >> +
> >> +test maxdeltachainspan
> >> +
> >> +  $ hg init source-repo
> >> +  $ cd source-repo
> >> +  $ hg debugbuilddag --new-file '.+5:brancha$.+11:branchb$.+30:branchc<brancha+2<branchb+2'
> >> +  $ cd ..
> >> +  $ hg -R source-repo debugindex -m
> >> +     rev    offset  length  delta linkrev nodeid       p1           p2
> >> +       0         0      46     -1       0 19deeef41503 000000000000 000000000000
> >> +       1        46      57      0       1 fffc37b38c40 19deeef41503 000000000000
> >> +       2       103      57      1       2 5822d75c83d9 fffc37b38c40 000000000000
> >> +       3       160      57      2       3 19cf2273e601 5822d75c83d9 000000000000
> >> +       4       217      57      3       4 d45ead487afe 19cf2273e601 000000000000
> >> +       5       274      57      4       5 96e0c2ce55ed d45ead487afe 000000000000
> >> +       6       331      46     -1       6 0c2ea5222c74 000000000000 000000000000
> >> +       7       377      57      6       7 4ca08a89134d 0c2ea5222c74 000000000000
> >> +       8       434      57      7       8 c973dbfd30ac 4ca08a89134d 000000000000
> >> +       9       491      57      8       9 d81d878ff2cd c973dbfd30ac 000000000000
> >> +      10       548      58      9      10 dbee7f0dd760 d81d878ff2cd 000000000000
> >> +      11       606      58     10      11 474be9f1fd4e dbee7f0dd760 000000000000
> >> +      12       664      58     11      12 594a27502c85 474be9f1fd4e 000000000000
> >> +      13       722      58     12      13 a7d25307d6a9 594a27502c85 000000000000
> >> +      14       780      58     13      14 3eb53082272e a7d25307d6a9 000000000000
> >> +      15       838      58     14      15 d1e94c85caf6 3eb53082272e 000000000000
> >> +      16       896      58     15      16 8933d9629788 d1e94c85caf6 000000000000
> >> +      17       954      58     16      17 a33416e52d91 8933d9629788 000000000000
> >> +      18      1012      47     -1      18 4ccbf31021ed 000000000000 000000000000
> >> +      19      1059      58     18      19 dcad7a25656c 4ccbf31021ed 000000000000
> >> +      20      1117      58     19      20 617c4f8be75f dcad7a25656c 000000000000
> >> +      21      1175      58     20      21 975b9c1d75bb 617c4f8be75f 000000000000
> >> +      22      1233      58     21      22 74f09cd33b70 975b9c1d75bb 000000000000
> >> +      23      1291      58     22      23 54e79bfa7ef1 74f09cd33b70 000000000000
> >> +      24      1349      58     23      24 c556e7ff90af 54e79bfa7ef1 000000000000
> >> +      25      1407      58     24      25 42daedfe9c6b c556e7ff90af 000000000000
> >> +      26      1465      58     25      26 f302566947c7 42daedfe9c6b 000000000000
> >> +      27      1523      58     26      27 2346959851cb f302566947c7 000000000000
> >> +      28      1581      58     27      28 ca8d867106b4 2346959851cb 000000000000
> >> +      29      1639      58     28      29 fd9152decab2 ca8d867106b4 000000000000
> >> +      30      1697      58     29      30 3fe34080a79b fd9152decab2 000000000000
> >> +      31      1755      58     30      31 bce61a95078e 3fe34080a79b 000000000000
> >> +      32      1813      58     31      32 1dd9ba54ba15 bce61a95078e 000000000000
> >> +      33      1871      58     32      33 3cd9b90a9972 1dd9ba54ba15 000000000000
> >> +      34      1929      58     33      34 5db8c9754ef5 3cd9b90a9972 000000000000
> >> +      35      1987      58     34      35 ee4a240cc16c 5db8c9754ef5 000000000000
> >> +      36      2045      58     35      36 9e1d38725343 ee4a240cc16c 000000000000
> >> +      37      2103      58     36      37 3463f73086a8 9e1d38725343 000000000000
> >> +      38      2161      58     37      38 88af72fab449 3463f73086a8 000000000000
> >> +      39      2219      58     38      39 472f5ce73785 88af72fab449 000000000000
> >> +      40      2277      58     39      40 c91b8351e5b8 472f5ce73785 000000000000
> >> +      41      2335      58     40      41 9c8289c5c5c0 c91b8351e5b8 000000000000
> >> +      42      2393      58     41      42 a13fd4a09d76 9c8289c5c5c0 000000000000
> >> +      43      2451      58     42      43 2ec2c81cafe0 a13fd4a09d76 000000000000
> >> +      44      2509      58     43      44 f27fdd174392 2ec2c81cafe0 000000000000
> >> +      45      2567      58     44      45 a539ec59fe41 f27fdd174392 000000000000
> >> +      46      2625      58     45      46 5e98b9ecb738 a539ec59fe41 000000000000
> >> +      47      2683      58     46      47 31e6b47899d0 5e98b9ecb738 000000000000
> >> +      48      2741      58     47      48 2cf25d6636bd 31e6b47899d0 000000000000
> >> +      49      2799     197     -1      49 9fff62ea0624 96e0c2ce55ed 000000000000
> >> +      50      2996      58     49      50 467f8e30a066 9fff62ea0624 000000000000
> >> +      51      3054     356     50      51 346db97283df a33416e52d91 000000000000
> >> +      52      3410      58     51      52 4e003fd4d5cd 346db97283df 000000000000
> >> +  $ hg clone --pull source-repo --config experimental.maxdeltachainspan=2800 relax-chain --config format.generaldelta=yes
> >> +  requesting all changes
> >> +  adding changesets
> >> +  adding manifests
> >> +  adding file changes
> >> +  added 53 changesets with 53 changes to 53 files (+2 heads)
> >> +  updating to branch default
> >> +  14 files updated, 0 files merged, 0 files removed, 0 files unresolved
> >> +  $ hg -R relax-chain debugindex -m
> >> +     rev    offset  length  delta linkrev nodeid       p1           p2
> >> +       0         0      46     -1       0 19deeef41503 000000000000 000000000000
> >> +       1        46      57      0       1 fffc37b38c40 19deeef41503 000000000000
> >> +       2       103      57      1       2 5822d75c83d9 fffc37b38c40 000000000000
> >> +       3       160      57      2       3 19cf2273e601 5822d75c83d9 000000000000
> >> +       4       217      57      3       4 d45ead487afe 19cf2273e601 000000000000
> >> +       5       274      57      4       5 96e0c2ce55ed d45ead487afe 000000000000
> >> +       6       331      46     -1       6 0c2ea5222c74 000000000000 000000000000
> >> +       7       377      57      6       7 4ca08a89134d 0c2ea5222c74 000000000000
> >> +       8       434      57      7       8 c973dbfd30ac 4ca08a89134d 000000000000
> >> +       9       491      57      8       9 d81d878ff2cd c973dbfd30ac 000000000000
> >> +      10       548      58      9      10 dbee7f0dd760 d81d878ff2cd 000000000000
> >> +      11       606      58     10      11 474be9f1fd4e dbee7f0dd760 000000000000
> >> +      12       664      58     11      12 594a27502c85 474be9f1fd4e 000000000000
> >> +      13       722      58     12      13 a7d25307d6a9 594a27502c85 000000000000
> >> +      14       780      58     13      14 3eb53082272e a7d25307d6a9 000000000000
> >> +      15       838      58     14      15 d1e94c85caf6 3eb53082272e 000000000000
> >> +      16       896      58     15      16 8933d9629788 d1e94c85caf6 000000000000
> >> +      17       954      58     16      17 a33416e52d91 8933d9629788 000000000000
> >> +      18      1012      47     -1      18 4ccbf31021ed 000000000000 000000000000
> >> +      19      1059      58     18      19 dcad7a25656c 4ccbf31021ed 000000000000
> >> +      20      1117      58     19      20 617c4f8be75f dcad7a25656c 000000000000
> >> +      21      1175      58     20      21 975b9c1d75bb 617c4f8be75f 000000000000
> >> +      22      1233      58     21      22 74f09cd33b70 975b9c1d75bb 000000000000
> >> +      23      1291      58     22      23 54e79bfa7ef1 74f09cd33b70 000000000000
> >> +      24      1349      58     23      24 c556e7ff90af 54e79bfa7ef1 000000000000
> >> +      25      1407      58     24      25 42daedfe9c6b c556e7ff90af 000000000000
> >> +      26      1465      58     25      26 f302566947c7 42daedfe9c6b 000000000000
> >> +      27      1523      58     26      27 2346959851cb f302566947c7 000000000000
> >> +      28      1581      58     27      28 ca8d867106b4 2346959851cb 000000000000
> >> +      29      1639      58     28      29 fd9152decab2 ca8d867106b4 000000000000
> >> +      30      1697      58     29      30 3fe34080a79b fd9152decab2 000000000000
> >> +      31      1755      58     30      31 bce61a95078e 3fe34080a79b 000000000000
> >> +      32      1813      58     31      32 1dd9ba54ba15 bce61a95078e 000000000000
> >> +      33      1871      58     32      33 3cd9b90a9972 1dd9ba54ba15 000000000000
> >> +      34      1929      58     33      34 5db8c9754ef5 3cd9b90a9972 000000000000
> >> +      35      1987      58     34      35 ee4a240cc16c 5db8c9754ef5 000000000000
> >> +      36      2045      58     35      36 9e1d38725343 ee4a240cc16c 000000000000
> >> +      37      2103      58     36      37 3463f73086a8 9e1d38725343 000000000000
> >> +      38      2161      58     37      38 88af72fab449 3463f73086a8 000000000000
> >> +      39      2219      58     38      39 472f5ce73785 88af72fab449 000000000000
> >> +      40      2277      58     39      40 c91b8351e5b8 472f5ce73785 000000000000
> >> +      41      2335      58     40      41 9c8289c5c5c0 c91b8351e5b8 000000000000
> >> +      42      2393      58     41      42 a13fd4a09d76 9c8289c5c5c0 000000000000
> >> +      43      2451      58     42      43 2ec2c81cafe0 a13fd4a09d76 000000000000
> >> +      44      2509      58     43      44 f27fdd174392 2ec2c81cafe0 000000000000
> >> +      45      2567      58     44      45 a539ec59fe41 f27fdd174392 000000000000
> >> +      46      2625      58     45      46 5e98b9ecb738 a539ec59fe41 000000000000
> >> +      47      2683      58     46      47 31e6b47899d0 5e98b9ecb738 000000000000
> >> +      48      2741      58     47      48 2cf25d6636bd 31e6b47899d0 000000000000
> >> +      49      2799     197     -1      49 9fff62ea0624 96e0c2ce55ed 000000000000
> >> +      50      2996      58     49      50 467f8e30a066 9fff62ea0624 000000000000
> >> +      51      3054      58     17      51 346db97283df a33416e52d91 000000000000
> >> +      52      3112     369     -1      52 4e003fd4d5cd 346db97283df 000000000000
> >> +  $ hg clone --pull source-repo --config experimental.maxdeltachainspan=0 noconst-chain --config format.generaldelta=yes
> >> +  requesting all changes
> >> +  adding changesets
> >> +  adding manifests
> >> +  adding file changes
> >> +  added 53 changesets with 53 changes to 53 files (+2 heads)
> >> +  updating to branch default
> >> +  14 files updated, 0 files merged, 0 files removed, 0 files unresolved
> >> +  $ hg -R noconst-chain debugindex -m
> >> +     rev    offset  length  delta linkrev nodeid       p1           p2
> >> +       0         0      46     -1       0 19deeef41503 000000000000 000000000000
> >> +       1        46      57      0       1 fffc37b38c40 19deeef41503 000000000000
> >> +       2       103      57      1       2 5822d75c83d9 fffc37b38c40 000000000000
> >> +       3       160      57      2       3 19cf2273e601 5822d75c83d9 000000000000
> >> +       4       217      57      3       4 d45ead487afe 19cf2273e601 000000000000
> >> +       5       274      57      4       5 96e0c2ce55ed d45ead487afe 000000000000
> >> +       6       331      46     -1       6 0c2ea5222c74 000000000000 000000000000
> >> +       7       377      57      6       7 4ca08a89134d 0c2ea5222c74 000000000000
> >> +       8       434      57      7       8 c973dbfd30ac 4ca08a89134d 000000000000
> >> +       9       491      57      8       9 d81d878ff2cd c973dbfd30ac 000000000000
> >> +      10       548      58      9      10 dbee7f0dd760 d81d878ff2cd 000000000000
> >> +      11       606      58     10      11 474be9f1fd4e dbee7f0dd760 000000000000
> >> +      12       664      58     11      12 594a27502c85 474be9f1fd4e 000000000000
> >> +      13       722      58     12      13 a7d25307d6a9 594a27502c85 000000000000
> >> +      14       780      58     13      14 3eb53082272e a7d25307d6a9 000000000000
> >> +      15       838      58     14      15 d1e94c85caf6 3eb53082272e 000000000000
> >> +      16       896      58     15      16 8933d9629788 d1e94c85caf6 000000000000
> >> +      17       954      58     16      17 a33416e52d91 8933d9629788 000000000000
> >> +      18      1012      47     -1      18 4ccbf31021ed 000000000000 000000000000
> >> +      19      1059      58     18      19 dcad7a25656c 4ccbf31021ed 000000000000
> >> +      20      1117      58     19      20 617c4f8be75f dcad7a25656c 000000000000
> >> +      21      1175      58     20      21 975b9c1d75bb 617c4f8be75f 000000000000
> >> +      22      1233      58     21      22 74f09cd33b70 975b9c1d75bb 000000000000
> >> +      23      1291      58     22      23 54e79bfa7ef1 74f09cd33b70 000000000000
> >> +      24      1349      58     23      24 c556e7ff90af 54e79bfa7ef1 000000000000
> >> +      25      1407      58     24      25 42daedfe9c6b c556e7ff90af 000000000000
> >> +      26      1465      58     25      26 f302566947c7 42daedfe9c6b 000000000000
> >> +      27      1523      58     26      27 2346959851cb f302566947c7 000000000000
> >> +      28      1581      58     27      28 ca8d867106b4 2346959851cb 000000000000
> >> +      29      1639      58     28      29 fd9152decab2 ca8d867106b4 000000000000
> >> +      30      1697      58     29      30 3fe34080a79b fd9152decab2 000000000000
> >> +      31      1755      58     30      31 bce61a95078e 3fe34080a79b 000000000000
> >> +      32      1813      58     31      32 1dd9ba54ba15 bce61a95078e 000000000000
> >> +      33      1871      58     32      33 3cd9b90a9972 1dd9ba54ba15 000000000000
> >> +      34      1929      58     33      34 5db8c9754ef5 3cd9b90a9972 000000000000
> >> +      35      1987      58     34      35 ee4a240cc16c 5db8c9754ef5 000000000000
> >> +      36      2045      58     35      36 9e1d38725343 ee4a240cc16c 000000000000
> >> +      37      2103      58     36      37 3463f73086a8 9e1d38725343 000000000000
> >> +      38      2161      58     37      38 88af72fab449 3463f73086a8 000000000000
> >> +      39      2219      58     38      39 472f5ce73785 88af72fab449 000000000000
> >> +      40      2277      58     39      40 c91b8351e5b8 472f5ce73785 000000000000
> >> +      41      2335      58     40      41 9c8289c5c5c0 c91b8351e5b8 000000000000
> >> +      42      2393      58     41      42 a13fd4a09d76 9c8289c5c5c0 000000000000
> >> +      43      2451      58     42      43 2ec2c81cafe0 a13fd4a09d76 000000000000
> >> +      44      2509      58     43      44 f27fdd174392 2ec2c81cafe0 000000000000
> >> +      45      2567      58     44      45 a539ec59fe41 f27fdd174392 000000000000
> >> +      46      2625      58     45      46 5e98b9ecb738 a539ec59fe41 000000000000
> >> +      47      2683      58     46      47 31e6b47899d0 5e98b9ecb738 000000000000
> >> +      48      2741      58     47      48 2cf25d6636bd 31e6b47899d0 000000000000
> >> +      49      2799      58      5      49 9fff62ea0624 96e0c2ce55ed 000000000000
> >> +      50      2857      58     49      50 467f8e30a066 9fff62ea0624 000000000000
> >> +      51      2915      58     17      51 346db97283df a33416e52d91 000000000000
> >> +      52      2973      58     51      52 4e003fd4d5cd 346db97283df 000000000000
> >> _______________________________________________
> >> Mercurial-devel mailing list
> >> Mercurial-devel at mercurial-scm.org
> >> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> > ________________________________
> >   This message, including its attachments, is confidential and the property of
> NNG Llc. For more information please read NNG's email policy here:
> > http://www.nng.com/emailpolicy/
> > By responding to this email you accept the email policy.
> >
>
> --
> Pierre-Yves David

