[PATCH 1 of 3 V3] changegroup: document deltaparent's choice of previous revision

Gregory Szorc gregory.szorc at gmail.com
Thu Oct 13 19:44:48 UTC 2016


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1476355787 -7200
#      Thu Oct 13 12:49:47 2016 +0200
# Node ID 4c98fccd5b70d2f89fe48bdb81288448daed758c
# Parent  c0410814002f467c24ef07ce73850ba15b306f8e
changegroup: document deltaparent's choice of previous revision

As part of debugging low-level changegroup generation, I came across
what I initially thought was a weird behavior: changegroup v2 is
choosing the previous revision in the changegroup as a delta base
instead of p1. I was tempted to rewrite this to use p1, as p1
will delta better than prev in the common case. However, I realized
that taking p1 as the base would potentially require resolving a
revision fulltext and thus require more CPU for e.g. server-side
processing of getbundle requests.

This patch tweaks the code comment to note the choice of behavior.
It also notes there is room for a flag or config option to tweak
this behavior later: using p1 as the delta base would likely make
changegroups smaller at the expense of more CPU, which could be
beneficial for things like clone bundles.

diff --git a/mercurial/changegroup.py b/mercurial/changegroup.py
--- a/mercurial/changegroup.py
+++ b/mercurial/changegroup.py
@@ -817,10 +817,17 @@ class cg2packer(cg1packer):
             self._reorder = False
 
     def deltaparent(self, revlog, rev, p1, p2, prev):
         dp = revlog.deltaparent(rev)
-        # avoid storing full revisions; pick prev in those cases
-        # also pick prev when we can't be sure remote has dp
+        # Avoid sending full revisions when delta parent is null. Pick
+        # prev in that case. It's tempting to pick p1 in this case, as p1
+        # will be smaller in the common case. However, computing a delta
+        # against p1 may require resolving the raw text of p1, which could
+        # be expensive. The revlog caches should have prev cached, meaning
+        # less CPU for changegroup generation. There is likely room to add
+        # a flag and/or config option to control this behavior.
+        #
+        # Pick prev when we can't be sure remote has the base revision.
         if dp == nullrev or (dp != p1 and dp != p2 and dp != prev):
             return prev
         return dp
 


More information about the Mercurial-devel mailing list