[PATCH 1 of 2 v8] graft: support grafting across move/copy (issue4028)

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Sep 13 05:42:25 EDT 2016



On 09/12/2016 12:13 PM, Gábor STEFANIK wrote:
> -----Original Message-----
>> From: Yuya Nishihara [mailto:youjah at gmail.com] On Behalf Of Yuya
>> Nishihara
>> Sent: Sunday, September 11, 2016 6:29 PM
>> To: Gábor STEFANIK <Gabor.STEFANIK at nng.com>
>> Cc: mercurial-devel at mercurial-scm.org
>> Subject: Re: [PATCH 1 of 2 v8] graft: support grafting across move/copy
>> (issue4028)
>>
>> On Fri, 26 Aug 2016 11:16:31 -0500, Gábor Stefanik wrote:
>>> # HG changeset patch
>>> # User Gábor Stefanik <gabor.stefanik at nng.com>
>>> # Date 1472225958 -7200
>>> #      Fri Aug 26 17:39:18 2016 +0200
>>> # Node ID f32aa28298164aa38830e83369c57c9553c6ff08
>>> # Parent  318e2b600b80e4ed3c6f37df46ec7544f60d4c0b
>>> graft: support grafting across move/copy (issue4028)
>>
>> Sorry for the late reply. I don't yet fully understand how this patch
>> handles reverse copy tracking correctly, but the weekend is over, so I just
>> wrote some random comments.
>>
>>> +    # In certain scenarios (e.g. graft, update or rebase), ca can be overridden
>>> +    # We still need to know a real common ancestor in this case
>>> +    # We can't just compute _c1.ancestor(_c2) and compare it to ca, because
>>> +    # there can be multiple common ancestors, e.g. in case of bidmerge.
>>> +    cta = ca
>>> +    # ca.descendant(wc) and ca.descendant(ca) are False, work around that
>>> +    _c1 = c1.p1() if c1.rev() is None else c1
>>> +    _c2 = c2.p1() if c2.rev() is None else c2
>>> +    dirty_c1 = not (ca == _c1 or ca.descendant(_c1))
>>> +    dirty_c2 = not (ca == _c2 or ca.descendant(_c2))
>>> +    graft = dirty_c1 or dirty_c2
>>> +    if graft:
>>> +        cta = _c1.ancestor(_c2)
>>
>> Can we know if we're doing graft-type merge beforehand? I think
>> ca.descendant() isn't fast so it should be avoided for normal merges.
>
> This has been argued repeatedly. Basically the only way you can know in advance that your merge is going to be graftlike is by doing a few descendant() calls yourself. So, with the exception of the "hg merge" command (which is guaranteed to yield ungraftlike merges), all commands wishing to do a merge will have to run through this whole descendant game.
>
> In the first few versions of the patches, I actually did graftlikeness detection in mergemod.graft(), but it was a nightmare to get it to work properly, and as it turns out, graft() isn't the only thing doing a graftlike merge. I was informed that calling descendant() once in a command is fine, it's only too slow for calling in a loop.
>
> I don't think complicating the code, potentially missing some edge cases of unusual commands doing graftlike merges, and breaking extension compatibility is warranted for a few ms speedup of "hg merge".

Given how often you have had to explain this, it is probably worth 
spelling out the reasoning in the inline comment.
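
To make the point concrete, here is a minimal sketch of the check quoted
above, written against a toy DAG rather than Mercurial's context objects.
PARENTS, is_ancestor_or_self() and is_graftlike() are hypothetical names
used only for illustration; the real patch relies on ca.descendant().

```python
# Toy DAG: each revision maps to the list of its parents.
# Hypothetical data, not Mercurial's changelog API.
PARENTS = {
    "base": [],
    "a1": ["base"],
    "a2": ["a1"],
    "b1": ["base"],
}


def is_ancestor_or_self(parents, anc, rev):
    """Return True if `anc` is `rev` itself or one of its ancestors."""
    stack, seen = [rev], set()
    while stack:
        cur = stack.pop()
        if cur == anc:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(parents[cur])
    return False


def is_graftlike(parents, ca, c1, c2):
    """A merge is graftlike when the merge base `ca` is not a real
    common ancestor of both sides (the "dirty" flags in the patch)."""
    dirty_c1 = not is_ancestor_or_self(parents, ca, c1)
    dirty_c2 = not is_ancestor_or_self(parents, ca, c2)
    return dirty_c1 or dirty_c2


# Normal merge of a2 and b1: "base" really is a common ancestor.
normal = is_graftlike(PARENTS, "base", "a2", "b1")
# Grafting a2 onto b1 uses a2's parent a1 as the merge base,
# and a1 is not an ancestor of b1.
grafted = is_graftlike(PARENTS, "a1", "b1", "a2")
```

The ancestry walk is the only way to tell the two cases apart here, which
is why the check cannot simply be skipped for commands other than
"hg merge".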

>>>      # find interesting file sets from manifests
>>> +    if graft:
>>> +        repo.ui.debug("  computing unmatched files in rotated DAG\n")
>>>      addedinm1 = m1.filesnotin(ma)
>>>      addedinm2 = m2.filesnotin(ma)
>>> -    u1, u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
>>> +    _u1, _u2 = _computenonoverlap(repo, c1, c2, addedinm1, addedinm2)
>>> +    if not graft:
>>> +        u1, u2 = _u1, _u2
>>> +    else: # need to recompute this for directory move handling when grafting
>>> +        repo.ui.debug("  computing unmatched files in unrotated DAG\n")
>>> +        u1, u2 = _computenonoverlap(repo, c1, c2, m1.filesnotin(mta),
>>> +                                                  m2.filesnotin(mta))
>>> +
>>>      bothnew = sorted(addedinm1 & addedinm2)
>>>
>>>      for f in u1:
>>> -        checkcopies(c1, f, m1, m2, ca, limit, diverge, copy1, fullcopy1)
>>> +        checkcopies(c1, f, m1, m2, ca, cta, dirty_c1, limit, diverge, copy1,
>>> +                    fullcopy1, incomplete1, incompletediverge)
>>>
>>>      for f in u2:
>>> -        checkcopies(c2, f, m2, m1, ca, limit, diverge, copy2, fullcopy2)
>>> +        checkcopies(c2, f, m2, m1, ca, cta, dirty_c2, limit, diverge, copy2,
>>> +                    fullcopy2, incomplete2, incompletediverge)
>>
>> [snip]
>>
>>> +    # combine partial copy paths discovered in the previous step
>>> +    copyfrom, copyto = incomplete1, incomplete2
>>> +    if dirty_c1:
>>> +        copyfrom, copyto = incomplete2, incomplete1
>>> +    for f in copyfrom:
>>> +        if f in copyto:
>>> +            copy[copyto[f]] = copyfrom[f]
>>> +            del copyto[f]
>>> +    for f in incompletediverge:
>>> +        assert f not in diverge
>>> +        ic = incompletediverge[f]
>>> +        if ic[0] in copyto:
>>> +            diverge[f] = [copyto[ic[0]], ic[1]]
>>
>> According to Matt's comment, we need two copy traces split at 'ca', but we
>> use ones split at 'cta' (and 'ca' is somehow taken into account?), because it
>> wouldn't be easy to track copies backwards.
>>
>> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/086915.html
>>
>> I guess those cases are addressed by these "incomplete" dicts and "if"s in
>> checkcopies(), but I wonder if there are unhandled cases, such as a
>> non-linear DAG to be rotated, which might include "ypoc". I'm not sure, though.
>
> See the tests. Virtually every imaginable case is tested.
>
> Matt actually wanted 3 copy traces, one between "cta" and "ca", one from one parent to "ca", and one from the other parent to "cta". The problem with this approach is that checkcopies can't just stop after going "behind" some cutoff revision, since it's operating in a low-level way in which "behind" doesn't really make sense. This is presumably for perf reasons. As a result, the checkcopies pass going from a parent to "ca" will actually go back to "cta", and find spurious copies. We could perhaps identify and remove those spurious copies by comparing the output to that of the ca->cta pass, but then we would need post-processing as complex as what this patch has to accomplish that, so we win nothing by going 3-pass.

Okay, this is the first time I have read about this 'behind' thing. I need 
to investigate it, but it should probably be extensively documented as a 
comment.
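
For reference while investigating, the "combine partial copy paths" loop
quoted above can be sketched with plain dicts. This is only an
illustration: combine_incomplete() is a hypothetical helper, and it
assumes both "incomplete" dicts are keyed by the same intermediate file
name, which is an assumption about what checkcopies() records.

```python
def combine_incomplete(incomplete1, incomplete2, dirty_c1):
    """Join two partial copy paths that meet at the same key.

    Mirrors the loop in the quoted hunk, but works on copies so the
    caller's dicts are left untouched. Returns the combined copy map
    and the entries of the "to" side that found no partner.
    """
    copy = {}
    copyfrom, copyto = dict(incomplete1), dict(incomplete2)
    if dirty_c1:
        # The roles of the two sides are swapped when side 1 is the
        # one that does not descend from the merge base.
        copyfrom, copyto = copyto, copyfrom
    for f in copyfrom:
        if f in copyto:
            copy[copyto[f]] = copyfrom[f]
            del copyto[f]
    return copy, copyto


# Hypothetical file names: both sides recorded a partial path through
# "mid.txt", so the two halves can be joined.
side1 = {"mid.txt": "old.txt"}
side2 = {"mid.txt": "new.txt", "other.txt": "unrelated.txt"}

copy, leftover = combine_incomplete(side1, side2, dirty_c1=False)
swapped, _ = combine_incomplete(side1, side2, dirty_c1=True)
```

With dirty_c1 unset, the result maps the name from the second dict to the
name from the first; setting dirty_c1 reverses the direction. Entries with
no matching key on the other side stay incomplete.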

-- 
Pierre-Yves David

