Differences between revisions 3 and 11 (spanning 8 versions)
Revision 3 as of 2015-03-08 04:24:45
Size: 3211
Editor: KevinBullock
Comment: add to CategoryInternals
Revision 11 as of 2018-01-26 14:12:50
Size: 3419
Comment:

The original Mercurial compression format has a particular weakness in storing and transmitting deltas for branches that are heavily interleaved. In some instances, this can make the size of the manifest data (stored in '''00manifest.d''') balloon by 10x or more. The generaldelta option is an effort to mitigate that, while still maintaining Mercurial's O(1)-bounded performance.
Line 17: Line 13:
The generaldelta feature is enabled by default in Mercurial 3.7
Line 18: Line 15:
The generaldelta feature can be enabled for new clones with:
For older releases, it can be enabled for new clones with:
Line 24: Line 22:
This will actually enable three features:
Line 25: Line 24:
This will actually enable two features:

 * generaldelta
 * generaldelta storage
 * recomputation of delta on pull (to be stored as "optimised" general delta)
Mercurial's bundle protocol doesn't yet fully support generaldelta. This creates two barriers to making this Mercurial's default format:
Mercurial
Line 85: Line 79:
 * pulling from a generaldelta repo uses more server CPU as the server has to recalculate some deltas
 * more than an optimal amount of bandwidth is still used due to sending old-style deltas

We intend to eventually address this by updating Mercurial's bundle protocol with [[BundleFormat2]], after which the generaldelta feature will be enabled by default on new clones.
 * (./) Support for exchanging general delta over the wire (Mercurial 3.5 and above),
 * (./) Support for storing general delta changegroup on disk (Mercurial 3.6 and above),
 * (./) Support for pulling from an old server without triggering a recomputation of all delta ((future) Mercurial 3.7 and above),
 * {X} Client side warning when the exchange is sub-optimal,
 * {X} gathering data about the general delta efficiency and re-computation cost,
 * {X} provide user with a way to do on site upgrade,
 * {X} switch in core to disallow non bundle2 pull/push.
[[CategoryInternals]] CategoryInternals CategoryOldFeatures

GeneralDelta

Using the generaldelta compression option.

1. Introduction

The original Mercurial compression format has a particular weakness in storing and transmitting deltas for branches that are heavily interleaved. In some instances, this can make the size of the manifest data (stored in 00manifest.d) balloon by 10x or more. The generaldelta option is an effort to mitigate that, while still maintaining Mercurial's O(1)-bounded performance.

The generaldelta feature is available in Mercurial 1.9 and later.
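
To make the interleaving weakness concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: a manifest is modeled as a set of entries, the cost of a delta is the number of entries that differ, and all function names are hypothetical. It is not Mercurial's revlog code. It compares storing each revision as a delta against the previous revision in storage order with storing it as a delta against its parent, which is what generaldelta makes possible.

```python
def delta_size(a, b):
    """Crude delta cost: number of entries that differ between two manifests."""
    return len(a ^ b)


def build_interleaved_history(steps):
    """Return (revisions, parents) for two branches committed alternately.

    Each revision is a frozenset of manifest entries; parents[i] is the
    index of revision i's parent (None for the root revision).
    """
    root = frozenset(f"file{i}" for i in range(100))
    revisions = [root]
    parents = [None]
    tips = {"a": 0, "b": 0}  # index of the current head of each branch
    for step in range(steps):
        branch = "a" if step % 2 == 0 else "b"
        parent = tips[branch]
        # Each commit adds one entry that exists only on its own branch.
        revisions.append(revisions[parent] | {f"{branch}-{step}"})
        parents.append(parent)
        tips[branch] = len(revisions) - 1
    return revisions, parents


def total_delta_cost(revisions, bases):
    """Sum the delta cost of every revision against its chosen base."""
    return sum(
        delta_size(revisions[rev], revisions[base])
        for rev, base in enumerate(bases)
        if base is not None
    )


revisions, parents = build_interleaved_history(20)
# Toy stand-in for the original format: always delta against the previous revision.
prev_bases = [None] + list(range(len(revisions) - 1))
cost_prev = total_delta_cost(revisions, prev_bases)
# Toy stand-in for generaldelta: delta against the parent.
cost_parent = total_delta_cost(revisions, parents)
print("delta against prev:  ", cost_prev)
print("delta against parent:", cost_parent)
```

In this model every delta against the previous revision has to undo the other branch's changes and redo its own, so the cost grows with the distance between the branches, while deltas against the parent stay at one entry each.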

2. Enabling generaldelta

The generaldelta feature is enabled by default in Mercurial 3.7 and later.

For older releases, it can be enabled for new clones with:

[format]
generaldelta = true

This will actually enable three features:

  • generaldelta storage
  • recomputation of delta on pull (to be stored as "optimised" general delta)
  • delta reordering on pulls when this is enabled on the server side

The last of these features lets clients without generaldelta enabled experience some of the disk space and bandwidth benefits.
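
To check whether an existing repository already uses the format, one option is to look at its requirements file. The Python sketch below assumes the usual layout in which .hg/requires lists one requirement per line and a generaldelta repository includes a "generaldelta" entry; the helper names are hypothetical and this is not part of Mercurial's API.

```python
from pathlib import Path


def repo_requirements(repo_root):
    """Return the set of requirement names listed in <repo_root>/.hg/requires.

    Returns an empty set if the file does not exist.
    """
    requires = Path(repo_root) / ".hg" / "requires"
    if not requires.is_file():
        return set()
    return {
        line.strip()
        for line in requires.read_text().splitlines()
        if line.strip()
    }


def uses_generaldelta(repo_root):
    """True if the repository lists generaldelta among its requirements."""
    return "generaldelta" in repo_requirements(repo_root)
```

For example, uses_generaldelta("project") would be expected to return False for a repository created before the option was enabled and True for a clone made with format.generaldelta set.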

3. Converting a repo to generaldelta

This is as simple as:

$ hg clone -U --config format.generaldelta=1 --pull project project-generaldelta

The aforementioned reordering can also marginally improve compression for generaldelta clients, which can be tried with a second pass:

$ hg clone -U --config format.generaldelta=1 --pull project-generaldelta project-generaldelta-pass2

Detailed compression statistics for the manifest can be checked with debugrevlog:

$ hg debugrevlog -m
format : 1
flags  : generaldelta

revisions     :   14932
    merges    :    1763 (11.81%)
    normal    :   13169 (88.19%)
revisions     :   14932
    full      :      61 ( 0.41%)
    deltas    :   14871 (99.59%)
revision size : 3197528
    full      :  744577 (23.29%)
    deltas    : 2452951 (76.71%)

avg chain length  : 172
compression ratio : 229

uncompressed data size (min/max/avg) : 125 / 80917 / 49156
full revision size (min/max/avg)     : 113 / 37284 / 12206
delta size (min/max/avg)             : 0 / 27029 / 164

deltas against prev  : 13770 (92.60%)
    where prev = p1  : 13707     (99.54%)
    where prev = p2  :     8     ( 0.06%)
    other            :    55     ( 0.40%)
deltas against p1    :  1097 ( 7.38%)
deltas against p2    :     4 ( 0.03%)
deltas against other :     0 ( 0.00%)

Of particular interest are the number of full revisions and the average delta size.
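
When comparing several conversion passes, it can help to pull these figures out of the output automatically. The following Python sketch parses text laid out like the sample above; the function name is hypothetical, and the regular expressions assume exactly this layout, which may differ between Mercurial versions.

```python
import re

# Trimmed copy of the sample output shown above.
SAMPLE = """\
revisions     :   14932
    full      :      61 ( 0.41%)
    deltas    :   14871 (99.59%)
revision size : 3197528
    full      :  744577 (23.29%)
    deltas    : 2452951 (76.71%)

avg chain length  : 172
compression ratio : 229

delta size (min/max/avg)             : 0 / 27029 / 164
"""


def summarize_debugrevlog(text):
    """Extract a few headline numbers from debugrevlog-style output."""
    summary = {}
    # "full" appears twice: first as a revision count, then as a byte size.
    # re.search returns the first occurrence, which is the count.
    match = re.search(r"^[ \t]+full[ \t]*:[ \t]*(\d+)", text, re.MULTILINE)
    if match:
        summary["full_revisions"] = int(match.group(1))
    match = re.search(r"^avg chain length[ \t]*:[ \t]*(\d+)", text, re.MULTILINE)
    if match:
        summary["avg_chain_length"] = int(match.group(1))
    match = re.search(
        r"^delta size \(min/max/avg\)[ \t]*:[ \t]*(\d+) / (\d+) / (\d+)",
        text,
        re.MULTILINE,
    )
    if match:
        summary["avg_delta_size"] = int(match.group(3))
    return summary


print(summarize_debugrevlog(SAMPLE))
```

Running the same function on the output from each pass gives a quick side-by-side view of the full revision count and the average delta size.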

4. Further work

Mercurial

  • (./) Support for exchanging general delta over the wire (Mercurial 3.5 and above),

  • (./) Support for storing general delta changegroups on disk (Mercurial 3.6 and above),

  • (./) Support for pulling from an old server without triggering a recomputation of all deltas (Mercurial 3.7 and above),

  • {X} Client-side warning when the exchange is sub-optimal,

  • {X} gathering data about general delta efficiency and re-computation cost,

  • {X} providing users with a way to upgrade a repository in place,

  • {X} a switch in core to disallow non-bundle2 pull/push.

5. See also


CategoryInternals CategoryOldFeatures

GeneralDelta (last edited 2018-01-26 14:12:50 by JoergSonnenberger)