[PATCH 2 of 2] revlog: establish default max chain length of 10,000

Gregory Szorc gregory.szorc at gmail.com
Mon Dec 21 02:15:56 CST 2015


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1450678715 28800
#      Sun Dec 20 22:18:35 2015 -0800
# Node ID d8d5ea387a5f13adbf35c38ba304c17abc8a96aa
# Parent  f020dc8028046ac0b433c1e8f11bfc3781cdd8d3
revlog: establish default max chain length of 10,000

Delta chain length can have a significant impact on performance. Before
this patch, the default behavior was to create new delta chains after
1 of the following:

* Total size of entries in delta chain > 2.00x size of fulltext of
  current entry
* Linear distance between first entry and last entry > 4.00x size of
  fulltext
* New entry larger than fulltext
* format.deltachainlen is defined and delta chain length exceeds it

For the overwhelming majority of repositories, this default heuristic
is sufficient. However, it can break down and produce very long,
performance sucking delta chains.

Long delta chains are likely only encountered on:

* manifests
* files that are large, heavily modified, and that diff well

Performance measuring demonstrates that the time to resolve the
fulltext for a revision increases linearly with delta chain length
Furthermore, decompression of the raw revlog entries holds steady at
70-85% of wall time for manifests in mozilla-central (nearly all of
this time spent in zlib decompression) - at least on my machine.

Further performance measurement reveals that zlib decompression time
increases linearly with the size of the input. On my machine, each
zlib decompression request for a typical 100-500 byte revlog entry
takes 3-8us, or around 50-60MB/s on average. This is on a i7-6700K
running at 4.00 GHz, which is one of the fastest consumer CPUs
available today.

On a generaldelta clone of mozilla-central with aggressive merge deltas
and no delta chain length cap, average delta size is 432 bytes. And,
2/3 of revlog entries are actually zlib compressed (some revlog
entries are stored uncompressed because the compressed bdiff is larger
than its raw form). If we assume the typical zlib compressed revlog
entry takes 7us to decompress:

(WARNING: hypothetical numbers)

chain length    compressed count    zlib time
         100                  66       0.46ms
         500                 333       2.33ms
       1,000                 666       4.66ms
       2,000               1,333       9.33ms
       5,000               3,333      23.33ms
      10,000               6,666      46.66ms
      15,000              10,000      70.00ms
      20,000              13,333      93.33ms
      25,000              16,666     116.66ms
      30,000              20,000     140.00ms

These times exclude:

* Decompress time for the base revision
* Python and garbage collection overhead
* Time to apply bdiffs

When you measure *actual* end-to-end revision resolution times for
manifest delta chains on mozilla-central, we find:

chain length    time
           1    50ms
       2,500    78ms
       5,000   104ms
       7,500   117ms
      10,000   142ms
      12,500   163ms
      15,000   164ms
      20,000   199ms
      25,000   233ms
      30,000   278ms
      35,000   320ms
      40,000   363ms
      45,000   407ms

These were taken from the same delta chain and the difference in
fulltext size between the initial and find entry is 4,252,003 bytes
(8,591,371 to 12,843,374 bytes). The final manifest has 130,558
entries. If you plot the chain depth against time, the result is
mostly linear with a slight positive first derivative corresponding
to the increasing fulltext size. zlib decompression accounts for
most of the total revision resolution time.

This repository is capable of generating delta chains longer than
55,000. At those lengths, we're looking at >0.5s/revision to
decode! That's significant and is enough for humans to feel that
an `hg` command is slow.

Limiting the length of delta chains keeps our resolve times in check.
But what does it do for revlog size? On this mozilla-central
repository, the 00manifest.d file has the following sizes with
different settings:

 gd?  aggressive merge deltas?   maxchainlen      size
   n                         n             -      473M
   y                         n             -      336M
   y                         n        10,000      355M
   y                         y             -      129M
   y                         y         2,500      393M
   y                         y         5,000      245M
   y                         y        10,000      190M
   y                         y        15,000      158M

This patch establishes a default max chain length of 10,000. This
value produces revlogs that are not super slow due to unchecked
delta chain length explosion but are still significantly smaller
than non-generaldelta revlogs. In this case "not super slow" means
"manifest resolving on mozilla-central doesn't take more than 100ms
on average."

The shorter delta chains do lose some very impressive compression wins
of generaldelta (mostly attributes to aggressive merge deltas). However,
the size reduction comes at the cost of read time (assuming I/O isn't a
bottleneck). Given how cheap and fast I/O is these days, I think this
is a fair trade off. Of course, wire size will still be small, as
changegroups consist of a single delta chain. i.e. this patch should have
no impact on wire size.

I was tempted to make the default max chain length a little shorter -
something like 7,500. However, I'm purposefully trying to be a bit
conservative because the current limit is infinity and I'd rather be
closer to that than 0.

It's worth noting that the delta chain limit established by this patch
will likely never be hit in the majority of repositories in the wild.
So, debate over it should likely be limited to "very large
repositories." For comparison, an aggressive merge deltas conversion
of the hg repository itself produced a max chain length of only 1,014.

The limit established in this patch may have adverse consequences for
manifests even larger than 130k entries - notably larger revlog size.
However, at or after this scale, it is arguably better to ditch zlib for
lz4 (see lz4revlog) or (long term) switch to tree manifests. Speaking
of lz4revlog, it will likely want to set a higher max chain length -
probably at least 4x this value. This is because delta chain limiting
is primarily limiting the impact of zlib decompression. And once you
remove zlib, the bottleneck changes completely.

Finally, I've been experimenting with a cache of decompressed revlog
chunks. If and when that lands, we may revisit the default max delta
chain length value. However, caching is only beneficial on repeated
reads: for random `hg` commands that access a single revision,
intra-process caching won't help much if at all. So I think
establishing a max chain length regardless of future cache work is
justified.

diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -227,17 +227,18 @@ class revlog(object):
         self._cache = None
         # 2-tuple of (rev, baserev) defining the base revision the delta chain
         # begins at for a revision.
         self._basecache = None
         # 2-tuple of (offset, data) of raw data from the revlog at an offset.
         self._chunkcache = (0, '')
         # How much data to read and cache into the raw revlog data cache.
         self._chunkcachesize = 65536
-        self._maxchainlen = None
+        # Cap delta chains at this length.
+        self._maxchainlen = 10000
         self._aggressivemergedeltas = False
         self.index = []
         # Mapping of partial identifiers to full nodes.
         self._pcache = {}
         # Mapping of revision integer to full node.
         self._nodecache = {nullid: nullrev}
         self._nodepos = None
 


More information about the Mercurial-devel mailing list