[PATCH 04 of 10] deltas: skip if projected delta size does not match text size constraint
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Thu Jun 13 09:22:59 EDT 2019
# HG changeset patch
# User Valentin Gatien-Baron <vgatien-baron at janestreet.com>, Pierre-Yves David <pierre-yves.david at octobus.net>
# Date 1556224214 -7200
# Thu Apr 25 22:30:14 2019 +0200
# Node ID 614e5f26fcffcd74c601bb51b54bfb2378135f61
# Parent 057f04f3d9aee2818b41a540152e48cbd60a34ec
# EXP-Topic delta-extra
# Available At https://bitbucket.org/octobus/mercurial-devel/
# hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 614e5f26fcff
deltas: skip if projected delta size does not match text size constraint
Before computing any delta, we make a basic estimate of the delta size we can
expect and of its compressed size. We then check this projected size
against the ½ⁿ size constraint. This allows us to exclude potential base
candidates before doing any expensive computation.
This only applies to the intermediate-snapshot case, since this constraint only
applies to intermediate snapshots.
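As an illustration only, the arithmetic behind the check can be sketched as a standalone function. The function and argument names here are hypothetical; in the diff the corresponding values come from `revlog.rawsize(rev)`, `revlog.snapshotdepth(rev)` and `revlog.upperboundcomp`.

```python
def should_skip_base(textlen, base_rawsize, snapshot_depth, max_comp):
    """Return True when a snapshot base can be rejected before diffing.

    Hypothetical helper mirroring the arithmetic of the patch; it is not
    part of Mercurial.
    """
    # If the new text is larger than the base, the uncompressed delta is
    # expected to be at least as large as the size difference.
    raw_size_distance = max(textlen - base_rawsize, 0)
    # Best case: the delta compresses by the assumed upper-bound ratio.
    lowest_realistic_delta_len = raw_size_distance // max_comp
    # Size limit for the delta: textlen / 2**snapshot_depth.
    snapshot_limit = textlen >> snapshot_depth
    # Skip when even the best-case delta would exceed the limit.
    return snapshot_limit < lowest_realistic_delta_len


# 1,000,000-byte text, 100,000-byte base, assumed 10x compression bound:
# the projected delta is 90,000 bytes.
print(should_skip_base(1_000_000, 100_000, 3, 10))  # limit 125,000: False
print(should_skip_base(1_000_000, 100_000, 4, 10))  # limit 62,500: True
```

With these made-up numbers the base is kept at depth 3 and rejected at depth 4, without any diff or compression having been computed.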
In practice we only perform this new check for the manifestlog. The manifest log
combines two properties: it is likely to have delta chain issues and its
diffing/compression is fairly predictable.
The initial author of this changeset is Valentin Gatien-Baron, who provided the
initial idea and initial testing. Pierre-Yves David later consolidated the code
in the right location and ran more extensive testing.
diff --git a/mercurial/revlogutils/deltas.py b/mercurial/revlogutils/deltas.py
--- a/mercurial/revlogutils/deltas.py
+++ b/mercurial/revlogutils/deltas.py
@@ -679,6 +679,25 @@ def _candidategroups(revlog, textlen, p1
             # if chain already have too much data, skip base
             if deltas_limit < chainsize:
                 continue
+            if sparse and revlog.upperboundcomp is not None:
+                maxcomp = revlog.upperboundcomp
+                basenotsnap = (p1, p2, nullrev)
+                if rev not in basenotsnap and revlog.issnapshot(rev):
+                    snapshotdepth = revlog.snapshotdepth(rev)
+                    # If text is significantly larger than the base, we can
+                    # expect the resulting delta to be proportional to the size
+                    # difference
+                    revsize = revlog.rawsize(rev)
+                    rawsizedistance = max(textlen - revsize, 0)
+                    # use an estimate of the compression upper bound.
+                    lowestrealisticdeltalen = rawsizedistance // maxcomp
+
+                    # check the absolute constraint on the delta size
+                    snapshotlimit = textlen >> snapshotdepth
+                    if snapshotlimit < lowestrealisticdeltalen:
+                        # delta lower bound is larger than accepted upper bound
+                        continue
+
             group.append(rev)
         if group:
             # XXX: in the sparse revlog case, group can become large,