[PATCH 4 of 8] compression: introduce a `storage.revlog.zstd.level` configuration

Gregory Szorc gregory.szorc at gmail.com
Tue Apr 2 17:56:37 UTC 2019


On Sun, Mar 31, 2019 at 8:39 AM Pierre-Yves David <
pierre-yves.david at ens-lyon.org> wrote:

> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> # Date 1553708159 -3600
> #      Wed Mar 27 18:35:59 2019 +0100
> # Node ID bcc4ba4c53b44dc6013b89f8c85b0f1967dfaebb
> # Parent  df7c537a8d07d6c1d4e7aa7604af30a57717bcf6
> # EXP-Topic zstd-revlog
> # Available At https://bitbucket.org/octobus/mercurial-devel/
> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r
> bcc4ba4c53b4
> compression: introduce a `storage.revlog.zstd.level` configuration
>

> This option controls the zstd compression level used when compressing revlog
> chunks. The usage of zstd for revlog compression has not graduated from
> experimental yet, but we intend to fix that soon.
>
> The option name for the compression level is more straightforward to pick,
> so this changeset comes first. Having a dedicated option for each compression
> engine is useful because they don't support the same range of values.
>
> I ran the same measurements as for the zlib compression level (in the parent
> changesets). The variation in repository size stays mostly in the same
> (small) range. The "read/write" performance sees smallish variation, but is
> overall much better than zlib. Write performance shows the same trend of
> improving when reaching high-end compression levels.
>
> Again, we don't intend to change the default zstd compression level
> (currently: 3) in this series. However, this is worth investigating in the
> future.
>
> The performance comparison of zlib vs zstd is quite impressive. The
> repository size stays in the same range, but the performance is much better
> in all situations.
>
> Comparison summary
> ==================
>
> We are looking at:
> - performance range for zlib
> - performance range for zstd
> - comparison of default zstd (level 3) to default zlib (level 6)
> - comparison of the slowest zstd time to the fastest zlib time
>
> Read performance:
> -----------------
>           |           zlib          |           zstd          | cmp | f2s
> mercurial |   0.170159 -   0.189219 |   0.144127 -   0.149624 | 80% | 88%
> pypy      |   2.679217 -   2.768691 |   1.532317 -   1.705044 | 60% | 63%
> netbeans  | 122.477027 - 141.620281 |  72.996346 -  89.731560 | 58% | 73%
> mozilla   | 147.867662 - 170.572118 |  91.700995 - 105.853099 | 56% | 71%
>
> Write performance:
> ------------------
>           |           zlib          |           zstd          | cmp | f2s
> mercurial |  53.250304 -  56.293612 |  40.877025 -  45.677286 | 75% | 86%
> pypy      | 460.721984 - 476.589918 | 270.545409 - 301.002219 | 63% | 65%
> netbeans  | 520.560316 - 715.930400 | 370.356311 - 428.329652 | 55% | 82%
> mozilla   | 739.803002 - 987.056093 | 505.152906 - 591.930683 | 57% | 80%
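
For readers checking the two ratio columns: the sketch below recomputes them for the mercurial read timings, taking the numbers from the raw data in this message. It assumes "cmp" is default zstd (level 3) divided by default zlib (level 6), and "f2s" is the slowest zstd time divided by the fastest zlib time, as the list above describes. The helper name `ratio_percent` is made up for the example.

```python
def ratio_percent(numerator, denominator):
    """Return numerator / denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Read timings (seconds) for the mercurial repository, keyed by level.
zlib_read = {1: 0.170159, 6: 0.182820, 9: 0.189219}
zstd_read = {
    1: 0.144127, 3: 0.146376, 5: 0.149624, 10: 0.145185,
    15: 0.147686, 20: 0.145789, 22: 0.143996,
}

# "cmp": default zstd level against default zlib level.
cmp_pct = ratio_percent(zstd_read[3], zlib_read[6])

# "f2s": slowest zstd run against fastest zlib run.
f2s_pct = ratio_percent(max(zstd_read.values()), min(zlib_read.values()))

print(cmp_pct, f2s_pct)
```

Both values agree with the mercurial row of the read table (80% and 88%).
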
>
> Raw data
> --------
>
> repo      alg lvl  .hg/store size  00manifest.d read       write
>
> mercurial zlib  1      49,402,813     5,963,475   0.170159  53.250304
> mercurial zlib  6      47,197,397     5,875,730   0.182820  56.264320
> mercurial zlib  9      47,121,596     5,849,781   0.189219  56.293612
>
> mercurial zstd  1      49,737,084     5,966,355   0.144127  40.877025
> mercurial zstd  3      48,961,867     5,895,208   0.146376  42.268142
> mercurial zstd  5      48,200,592     5,938,676   0.149624  43.162875
> mercurial zstd 10      47,833,520     5,913,353   0.145185  44.012489
> mercurial zstd 15      47,314,604     5,728,679   0.147686  45.677286
> mercurial zstd 20      47,330,502     5,830,539   0.145789  45.025407
> mercurial zstd 22      47,330,076     5,830,539   0.143996  44.690460
>
>
> pypy      zlib  1     370,830,572    28,462,425   2.679217 460.721984
> pypy      zlib  6     340,112,317    27,648,747   2.768691 467.537158
> pypy      zlib  9     338,360,736    27,639,003   2.763495 476.589918
>
> pypy      zstd  1     362,377,479    27,916,214   1.532317 270.545409
> pypy      zstd  3     354,137,693    27,905,988   1.686718 294.951509
> pypy      zstd  5     342,640,043    27,655,774   1.705044 301.002219
> pypy      zstd 10     334,224,327    27,164,493   1.567287 285.186239
> pypy      zstd 15     329,000,363    26,645,965   1.637729 299.561332
> pypy      zstd 20     324,534,039    26,199,547   1.526813 302.149827
> pypy      zstd 22     324,530,595    26,198,932   1.525718 307.821218
>
>
> netbeans  zlib  1   1,281,847,810   165,495,457 122.477027 520.560316
> netbeans  zlib  6   1,205,284,353   159,161,207 139.876147 715.930400
> netbeans  zlib  9   1,197,135,671   155,034,586 141.620281 678.297064
>
> netbeans  zstd  1   1,259,581,737   160,840,613  72.996346 370.356311
> netbeans  zstd  3   1,232,978,122   157,691,551  81.622317 396.733087
> netbeans  zstd  5   1,208,034,075   160,246,880  83.080549 364.342626
> netbeans  zstd 10   1,188,624,176   156,083,417  79.323935 403.594602
> netbeans  zstd 15   1,176,973,589   153,859,477  89.731560 428.329652
> netbeans  zstd 20   1,162,958,258   151,147,535  82.842667 392.335349
> netbeans  zstd 22   1,162,707,029   151,150,220  82.565695 402.840655
>
>
> mozilla   zlib  1   2,775,497,186   298,527,987 147.867662 751.263721
> mozilla   zlib  6   2,596,856,420   286,597,671 170.572118 987.056093
> mozilla   zlib  9   2,587,542,494   287,018,264 163.622338 739.803002
>
> mozilla   zstd  1   2,723,159,348   286,617,532  91.700995 570.042751
> mozilla   zstd  3   2,665,055,001   286,152,013  95.240155 561.412805
> mozilla   zstd  5   2,607,819,817   288,060,030 101.978048 505.152906
> mozilla   zstd 10   2,558,761,085   283,967,648 104.113481 497.771202
> mozilla   zstd 15   2,526,216,060   275,581,300 105.853099 591.930683
> mozilla   zstd 20   2,485,114,806   266,478,859  95.268795 576.515389
> mozilla   zstd 22   2,484,869,080   266,456,505  94.429282 572.785537
>
> diff --git a/mercurial/configitems.py b/mercurial/configitems.py
> --- a/mercurial/configitems.py
> +++ b/mercurial/configitems.py
> @@ -995,6 +995,9 @@ coreconfigitem('storage', 'revlog.reuse-
>  coreconfigitem('storage', 'revlog.zlib.level',
>      default=None,
>  )
> +coreconfigitem('storage', 'revlog.zstd.level',
> +    default=None,
> +)
>  coreconfigitem('server', 'bookmarks-pushkey-compat',
>      default=True,
>  )
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1886,6 +1886,12 @@ category impact performance and reposito
>      Value range from 1 (lowest compression) to 9 (highest compression). Zlib
>      default value is 6.
>
> +
> +``revlog.zstd.level``
> +    zstd compression level used when storing data into the repository. Accepted
> +    Value range from 1 (lowest compression) to 22 (highest compression).
> +    (default 3)
> +
>  ``server``
>  ----------
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -802,6 +802,11 @@ def resolverevlogstorevfsoptions(ui, req
>          if not (0 <= options[b'zlib.level'] <= 9):
>              msg = _('invalid value for `storage.revlog.zlib.level` config: %d')
>              raise error.Abort(msg % options[b'zlib.level'])
> +    options[b'zstd.level'] = ui.configint(b'storage', b'revlog.zstd.level')
> +    if options[b'zstd.level'] is not None:
> +        if not (0 <= options[b'zstd.level'] <= 22):
> +            msg = _('invalid value for `storage.revlog.zstd.level` config: %d')
> +            raise error.Abort(msg % options[b'zstd.level'])
>

I'm probably going to queue this. However, zstd supports negative
compression levels. When you go negative, zstd approaches lz4's performance.

Instead of trying to screen for the allowed levels here, I would catch the
ValueError raised when constructing the ZstdCompressor and turn it into an
error.Abort. This can be done as a follow-up.
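
A rough sketch of what that follow-up could look like, assuming the compressor constructor raises ValueError for a level it rejects. `Abort`, `make_compressor` and `fake_factory` are stand-ins invented for this example, not Mercurial or zstd code, and the range accepted by `fake_factory` is arbitrary:

```python
class Abort(Exception):
    """Stand-in for mercurial.error.Abort (hypothetical, for illustration)."""


def make_compressor(factory, level):
    """Build a compressor, turning a rejected level into an Abort.

    `factory` plays the role of the real compressor constructor; the only
    assumption is that it raises ValueError for a level it does not support.
    """
    try:
        return factory(level=level)
    except ValueError as exc:
        msg = 'invalid value for `storage.revlog.zstd.level` config: %d (%s)'
        raise Abort(msg % (level, exc))


def fake_factory(level):
    # Toy constructor for the demo: accepts negative levels, rejects
    # anything above 22. The lower bound here is made up.
    if not (-5 <= level <= 22):
        raise ValueError('level out of range')
    return ('compressor', level)


print(make_compressor(fake_factory, -1))
try:
    make_compressor(fake_factory, 42)
except Abort as exc:
    print('abort: %s' % exc)
```

The point of the design is that the range check lives in one place (the compression library), so negative levels work without the config layer having to know the exact bounds.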



>      if repository.NARROW_REQUIREMENT in requirements:
>          options[b'enableellipsis'] = True
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -419,6 +419,8 @@ class revlog(object):
>              self._compengine = opts['compengine']
>          if 'zlib.level' in opts:
>              self._compengineopts['zlib.level'] = opts['zlib.level']
> +        if 'zstd.level' in opts:
> +            self._compengineopts['zstd.level'] = opts['zstd.level']
>          if 'maxdeltachainspan' in opts:
>              self._maxdeltachainspan = opts['maxdeltachainspan']
>          if self._mmaplargeindex and 'mmapindexthreshold' in opts:
> diff --git a/mercurial/utils/compression.py b/mercurial/utils/compression.py
> --- a/mercurial/utils/compression.py
> +++ b/mercurial/utils/compression.py
> @@ -721,8 +721,12 @@ class _zstdengine(compressionengine):
>
>      def revlogcompressor(self, opts=None):
>          opts = opts or {}
> -        return self.zstdrevlogcompressor(self._module,
> -                                         level=opts.get('level', 3))
> +        level = opts.get('zstd.level')
> +        if level is None:
> +            level = opts.get('level')
> +        if level is None:
> +            level = 3
> +        return self.zstdrevlogcompressor(self._module, level=level)
>
>  compengines.register(_zstdengine())
>
> diff --git a/tests/test-repo-compengines.t b/tests/test-repo-compengines.t
> --- a/tests/test-repo-compengines.t
> +++ b/tests/test-repo-compengines.t
> @@ -138,3 +138,58 @@ Test error cases
>    abort: invalid value for `storage.revlog.zlib.level` config: 42
>    [255]
>
> +checking zstd options
> +=====================
> +
> +  $ hg init zstd-level-default --config experimental.format.compression=zstd
> +  $ hg init zstd-level-1 --config experimental.format.compression=zstd
> +  $ cat << EOF >> zstd-level-1/.hg/hgrc
> +  > [storage]
> +  > revlog.zstd.level=1
> +  > EOF
> +  $ hg init zstd-level-22 --config experimental.format.compression=zstd
> +  $ cat << EOF >> zstd-level-22/.hg/hgrc
> +  > [storage]
> +  > revlog.zstd.level=22
> +  > EOF
> +
> +
> +  $ commitone() {
> +  >    repo=$1
> +  >    cp $RUNTESTDIR/bundles/issue4438-r1.hg $repo/a
> +  >    hg -R $repo add $repo/a
> +  >    hg -R $repo commit -m some-commit
> +  > }
> +
> +  $ for repo in zstd-level-default zstd-level-1 zstd-level-22; do
> +  >     commitone $repo
> +  > done
> +
> +  $ $RUNTESTDIR/f -s zstd-*/.hg/store/data/*
> +  zstd-level-1/.hg/store/data/a.i: size=4097
> +  zstd-level-22/.hg/store/data/a.i: size=4091
> +  zstd-level-default/.hg/store/data/a.i: size=4094
> +
> +Test error cases
> +
> +  $ hg init zstd-level-invalid --config experimental.format.compression=zstd
> +  $ cat << EOF >> zstd-level-invalid/.hg/hgrc
> +  > [storage]
> +  > revlog.zstd.level=foobar
> +  > EOF
> +  $ commitone zstd-level-invalid
> +  abort: storage.revlog.zstd.level is not a valid integer ('foobar')
> +  abort: storage.revlog.zstd.level is not a valid integer ('foobar')
> +  [255]
> +
> +  $ hg init zstd-level-out-of-range --config experimental.format.compression=zstd
> +  $ cat << EOF >> zstd-level-out-of-range/.hg/hgrc
> +  > [storage]
> +  > revlog.zstd.level=42
> +  > EOF
> +
> +  $ commitone zstd-level-out-of-range
> +  abort: invalid value for `storage.revlog.zstd.level` config: 42
> +  abort: invalid value for `storage.revlog.zstd.level` config: 42
> +  [255]
> +
>