[PATCH 1 of 2] bundle: don't send too many changesets (Issue1704)
peter.arrenbrecht at gmail.com
Sat Nov 7 07:55:13 CST 2009
On Saturday, November 7, 2009, Peter Arrenbrecht
<peter.arrenbrecht at gmail.com> wrote:
> On Sat, Nov 7, 2009 at 12:08 PM, Benoit Boissinot <bboissin at gmail.com> wrote:
>> On Sat, Nov 07, 2009 at 09:59:13AM +0100, Peter Arrenbrecht wrote:
>>> # HG changeset patch
>>> # User Peter Arrenbrecht <peter.arrenbrecht at gmail.com>
>>> # Date 1257583794 -3600
>>> bundle: don't send too many changesets (Issue1704)
>>> The fast path in changegroupsubset can send too many csets. This happens
>>> because it uses the parents of all bases as common nodes and then goes
>>> forward from this again. If a base has a parent that has another child,
>>> which is -not- a base, then this other child will nevertheless end up in
>>> the changegroup.
>>> The fix is to not use findmissing(), but use nodesbetween() instead, as
>>> do the slow path and incoming/outgoing.
>> If that's the case, then findmissing() should be fixed, since it does
>> exactly what you are describing (and since prepush() might be affected
>> From your description it's unclear if it's a bug in the "common" nodes
>> computation or a bug in findmissing() (ie. how does common look like
>> when calling _changegroup() for the fast-path).
> Hmm. Then we need to be more clever when computing the set of common
> nodes. Just taking all the parents of the desired base nodes is not
> good enough. Example:
> a -> (b, c -> d -> e) -> f
> Now if we want the changegroup with bases [b] and heads [f], we'd have
> to be smart enough to realize that common should be [a, e]. Meaning we
> have to scan children of the common nodes we added (here: [b,c]) and
> for all such children not in the original bases (here: c), we have to
> add their local heads (here: e) with within the subgraph formed by the
> original bases and heads to the set of common nodes.
> This seems rather intricate when nodesbetween already exists and does
> the job nicely.
In the end it seemed to me we would be very nearly reimplementing
nodesbetween to get the common set right.
>>> The change to test-notify.out is correct, because it actually hits this
>>> bug, as can be seen by glog'ing the two repos:
>>> @ 22c88
>>> | o 0a184
>>> | |
>>> o | 0647d
>>> o cb9a9
>>> o 0647d
>>> @ cb9a9
>>> It used to pull 0647d again, which is unnecessary.
>> Indeed, so the question is: was common composed of 0647d, or cb9a9 or
>> According to the docstring in findmissing() if common includes 0647d,
>> then findmissing() should return only [0a184, 22c88].
>> I'd actually like to have our internals based on "common" nodes instead
>> of "bases" nodes since it's a lot easier to reason about them (that's
>> why I introduced findmissing()). Ideally a new version of the discovery
>> algorithm should compute the "common", and send it to a new
>> remote changegroupcommon().
>>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>>> --- a/mercurial/localrepo.py
>>> +++ b/mercurial/localrepo.py
>>> @@ -1622,19 +1623,22 @@
>>> the linkrev.
>>> + # Set up some initial variables
>>> + # Make it easy to refer to self.changelog
>>> + cl = self.changelog
>>> + # msng is short for missing - compute the list of changesets in this
>>> + # changegroup.
>>> + if not bases:
>>> + bases = [nullid]
>> not needed
> Not so. Without it, I get a failure with test-clone-cgi (which
> admittedly calls changegroup in a slightly special way).
>>> + msng_cl_lst, bases, heads = cl.nodesbetween(bases, heads)
>>> if extranodes is None:
>>> # can we go through the fast path ?
>>> allheads = self.heads()
>>> if heads == allheads:
>>> - common = 
>>> - # parents of bases are known from both sides
>>> - for n in bases:
>>> - for p in self.changelog.parents(n):
>>> - if p != nullid:
>>> - common.append(p)
>>> - return self._changegroup(common, source)
>>> + return self._changegroup(msng_cl_lst, source)
>>> self.hook('preoutgoing', throw=True, source=source)
>> Can't you remove the nodesbetween() call that's right after this, since
>> you already computed it?
> Doh. I meant to move this bit of code, not duplicate it. Shall fix if
> you give me the OK for the general approach.
More information about the Mercurial-devel