[PATCH 1 of 2] bundle: don't send too many changesets (Issue1704)
peter.arrenbrecht at gmail.com
Sat Nov 7 05:25:49 CST 2009
On Sat, Nov 7, 2009 at 12:08 PM, Benoit Boissinot <bboissin at gmail.com> wrote:
> On Sat, Nov 07, 2009 at 09:59:13AM +0100, Peter Arrenbrecht wrote:
>> # HG changeset patch
>> # User Peter Arrenbrecht <peter.arrenbrecht at gmail.com>
>> # Date 1257583794 -3600
>> bundle: don't send too many changesets (Issue1704)
>> The fast path in changegroupsubset can send too many csets. This happens
>> because it uses the parents of all bases as common nodes and then goes
>> forward from this again. If a base has a parent that has another child,
>> which is -not- a base, then this other child will nevertheless end up in
>> the changegroup.
>> The fix is to not use findmissing(), but use nodesbetween() instead, as
>> do the slow path and incoming/outgoing.
> If that's the case, then findmissing() should be fixed, since it does
> exactly what you are describing (and since prepush() might be affected
> From your description it's unclear if it's a bug in the "common" nodes
> computation or a bug in findmissing() (ie. how does common look like
> when calling _changegroup() for the fast-path).
Hmm. Then we need to be more clever when computing the set of common
nodes. Just taking all the parents of the desired base nodes is not
good enough. Example:
a -> (b, c -> d -> e) -> f
Now if we want the changegroup with bases [b] and heads [f], we'd have
to be smart enough to realize that common should be [a, e]. Meaning we
have to scan children of the common nodes we added (here: [b,c]) and
for all such children not in the original bases (here: c), we have to
add their local heads (here: e) with within the subgraph formed by the
original bases and heads to the set of common nodes.
This seems rather intricate when nodesbetween already exists and does
the job nicely.
>> The change to test-notify.out is correct, because it actually hits this
>> bug, as can be seen by glog'ing the two repos:
>> @ 22c88
>> | o 0a184
>> | |
>> o | 0647d
>> o cb9a9
>> o 0647d
>> @ cb9a9
>> It used to pull 0647d again, which is unnecessary.
> Indeed, so the question is: was common composed of 0647d, or cb9a9 or
> According to the docstring in findmissing() if common includes 0647d,
> then findmissing() should return only [0a184, 22c88].
> I'd actually like to have our internals based on "common" nodes instead
> of "bases" nodes since it's a lot easier to reason about them (that's
> why I introduced findmissing()). Ideally a new version of the discovery
> algorithm should compute the "common", and send it to a new
> remote changegroupcommon().
>> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
>> --- a/mercurial/localrepo.py
>> +++ b/mercurial/localrepo.py
>> @@ -1622,19 +1623,22 @@
>> the linkrev.
>> + # Set up some initial variables
>> + # Make it easy to refer to self.changelog
>> + cl = self.changelog
>> + # msng is short for missing - compute the list of changesets in this
>> + # changegroup.
>> + if not bases:
>> + bases = [nullid]
> not needed
Not so. Without it, I get a failure with test-clone-cgi (which
admittedly calls changegroup in a slightly special way).
>> + msng_cl_lst, bases, heads = cl.nodesbetween(bases, heads)
>> if extranodes is None:
>> # can we go through the fast path ?
>> allheads = self.heads()
>> if heads == allheads:
>> - common = 
>> - # parents of bases are known from both sides
>> - for n in bases:
>> - for p in self.changelog.parents(n):
>> - if p != nullid:
>> - common.append(p)
>> - return self._changegroup(common, source)
>> + return self._changegroup(msng_cl_lst, source)
>> self.hook('preoutgoing', throw=True, source=source)
> Can't you remove the nodesbetween() call that's right after this, since
> you already computed it?
Doh. I meant to move this bit of code, not duplicate it. Shall fix if
you give me the OK for the general approach.
More information about the Mercurial-devel