[PATCH 1 of 2 v8] graft: support grafting across move/copy (issue4028)

Gábor STEFANIK Gabor.STEFANIK at nng.com
Mon Sep 12 13:25:45 EDT 2016


>


--------------------------------------------------------------------------
This message, including its attachments, is confidential. For more information please read NNG's email policy here:
http://www.nng.com/emailpolicy/
By responding to this email you accept the email policy.


-----Original Message-----
> From: Yuya Nishihara [mailto:youjah at gmail.com] On Behalf Of Yuya
> Nishihara
> Sent: Monday, September 12, 2016 5:49 PM
> To: Gábor STEFANIK <Gabor.STEFANIK at nng.com>
> Cc: mercurial-devel at mercurial-scm.org
> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy
> (issue4028)
>
> On Mon, 12 Sep 2016 10:13:20 +0000, Gábor STEFANIK wrote:
> > > > +    # In certain scenarios (e.g. graft, update or rebase), ca can
> > > > + be
> > > overridden
> > > > +    # We still need to know a real common ancestor in this case
> > > > +    # We can't just compute _c1.ancestor(_c2) and compare it to
> > > > + ca,
> > > because
> > > > +    # there can be multiple common ancestors, e.g. in case of bidmerge.
> > > > +    cta = ca
> > > > +    # ca.descendant(wc) and ca.descendant(ca) are False, work around
> that
> > > > +    _c1 = c1.p1() if c1.rev() is None else c1
> > > > +    _c2 = c2.p1() if c2.rev() is None else c2
> > > > +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
> > > > +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
> > > > +    graft = dirty_c1 or dirty_c2
> > > > +    if graft:
> > > > +        cta = _c1.ancestor(_c2)
> > >
> > > Can we know if we're doing graft-type merge beforehand? I think
> > > ca.descendant() isn't fast so it should be avoided for normal merges.
> >
> > This has been argued repeatedly. Basically the only way you can know in
> advance that your merge is going to be graftlike is by  doing a few
> descendant() calls yourself. So, with the exception of the "hg merge"
> command (which is guaranteed to yield ungraftlike merges), all commands
> wishing to do a merge will have to run through this whole descendant game.
> >
> > In the first few versions of the patches, I actually did graftlikeness
> detection in mergemod.graft(), but it was a nightmare to get it to work
> properly, and as it turns out, graft() isn't the only thing doing a graftlike
> merge. I was informed that calling descendant() once in a command is fine,
> it's only too slow for calling in a loop.
>
> I thought callers could compute ca and cta by themselves since they explicitly
> pass a pseudo ca, but maybe I'm wrong. If that's already been discussed,
> there's no reason to complain. Sorry for the noise.

In certain "hg update" scenarios, mergemod.update may be called with ancestor=None,
and we can still end up with a fake ca (in a completely sane way). Also, there are quite a few extensions
both in and out of tree, that call mergemod.update with or without an explicit ca, which may or may not
be a real common ancestor.

Then there is also the case when "ca" is a common ancestor of both c1 and c2, but ca != c1.ancestor(c2)
This case is an ungraftlike merge, and should not use the graft logic. It can happen e.g. due to bidmerge.

Finally, there are cases of "hg graft" where ca==cta, and thus a graft command can involve an ungraftlike
merge. There is no way for the caller to detect this, except by calling descendant().

>
> > > According to Matt's comment, we need two copy traces split at 'ca',
> > > but we use ones split at 'cta' (and somewhat 'ca' is taken into
> > > account?), because it wouldn't be easy to track copies backwards.
> > >
> > > https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-
> > > August/086915.html
> > >
> > > I guess that are addressed these "incomplete" dicts and "if"s in
> > > checkcopies(), but I wonder if there are unhandled cases such as
> > > non-linear DAG to be rotated, which might include "ypoc". I'm not sure,
> though.
> >
> > See the tests. Virtually every imaginable case is tested.
> >
> > Matt actually wanted 3 copy traces, one between "cta" and "ca", one from
> one parent to "ca", and one from the other parent to "cta". The problem
> with this approach is that checkcopies can't just stop after going "behind"
> some cutoff revision, since it's operating in a low-level way in which "behind"
> doesn't really make sense. This is presumably for perf reasons. As a result,
> the checkcopies pass going from a parent to "ca" will actually go back to "cta",
> and find spurious copies. We could perhaps identify and remove those
> spurious copies by comparing the output to that of the ca->cta pass, but then
> we would need post-processing as complex as what this patch has to
> accomplish that, so we win nothing by going 3-pass.
>
> My vague concern is that we always walk graph from top to bottom keeping
> linear data for fixing up rotated copy trace.
>
> I'm not sure if this is a relevant example, but given the following history, I got
> different results by a) "graft 6 to 10" and b) "graft 7 to 10", which seems
> suspicious.
>
> @  10: a
> |
> | +  9: c
> | |
> | +  8: d->()
> | |
> | | +  7: d
> | |/
> | | +  6: c
> | |/
> | +    5: c,d
> | |\
> | | +  4: b->d
> | | |
> | + |  3: b->c
> | |/
> | +  2: a->b
> |/
> o  1: a
> |
> o  0: ()->a
>
>
> hg init foo
> cd foo
>
> echo 0 >> a
> hg add a
> hg ci -m '()->a'
>
> echo 1 >> a
> hg ci -m 'a'
>
> hg mv a b
> echo 2 >> b
> hg ci -m 'a->b'
>
> hg mv b c
> echo 3 >> c
> hg ci -m 'b->c'
>
> hg up 2
> hg mv b d
> echo 4 >> d
> hg ci -m 'b->d'
>
> hg up 3
> hg merge 4
> echo 5 >> c
> echo 5 >> d
> hg ci -m 'c,d'
>
> echo 6 >> c
> hg ci -m 'c'
>
> hg up 5
> echo 7 >> d
> hg ci -m 'd'
>
> hg up 5
> hg rm d
> hg ci -m 'd->()'
>
> echo 9 >> c
> hg ci -m 'c'
>
> hg up 1
> echo 10 >> a
> hg ci -m 'a'
>
> hg log -G -T '{rev}: {desc|firstline}\n'
>
>
> # a) c->a (but there is also d->a)
> hg graft 6
> other [graft] changed c which local [local] deleted
>
> # b) d->a (but there is also c->a)
> hg graft 7
> merging a and d to a
>
> # c) c->a (d was deleted)
> hg graft 9
> merging a and c to a


This looks like a separate (though related) issue to me. Turning the DAG around causes "a" to be both "moved from c" and "moved from d", a ypoc-like scenario. The problem is, mergecopies' callers expect the "copy" return value to be a many-to-one mapping of targets to sources, when it's actually many-to-many in this scenario. Fixable, but I would rather handle this in a followup issue (that I intend to do still in the 4.0 timeframe), since it's not a regression, and the patch has already grown quite big.
In fact, this may be something we want to support explicitly in the tree as well, to handle refactors that cause two files to be combined into one (e.g. allowing "module.h" to be moved from both "module_pub.h" and "module_priv.h"), but that's yet another issue, probably beyond 4.0.


More information about the Mercurial-devel mailing list