[PATCH 4 of 5 v3] rebase: refactor computerebase to case-by-case handling

Thu Dec 4 18:26:11 CST 2014

On 12/03/2014 07:12 PM, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski at unity3d.com>
> # Date 1416364731 -3600
> #      Wed Nov 19 03:38:51 2014 +0100
> # Node ID 742988c7a05cb1bd49f9be0f057a9c8ef4e20d95
> # Parent  e031c5201577b359af981396957a0b9a9a5f5300
> rebase: refactor computerebase to case-by-case handling
>
> This makes it simpler to review and tweak handling of different cases.

What was wrong with the previous implementation, what makes yours better?

> This implementation was created using the old implementation as black box. It
> started out with how I think it should be, and was tweaked until it matched how
> it actually is. Further tweaking can take us close to where we want to be, step
> by step.

Now that we have the full list of case, are we sure they are all 
properly tested? I would makes me more comfortable in such refactoring.

> The only case where the test suite coverage gives different result from
> computerebase is in test-rebase-obsolete.t in a case where None as ancestor
> would give the same ancestor as the one previously explicitly specified.

Why does it change? What does it mean? Is that bad? Why don't we get it 
right in the first place?

>
> diff --git a/hgext/rebase.py b/hgext/rebase.py
> --- a/hgext/rebase.py
> +++ b/hgext/rebase.py
> @@ -25,6 +25,7 @@ import os, errno
>
>   nullmerge = -2
>   revignored = -3
> +outside = -4 # in computerebase: not in state, thus outside rebase set
>
>   cmdtable = {}
>   command = cmdutil.command(cmdtable)
> @@ -586,88 +587,93 @@ def computerebase(repo, rev, target, sta
>       '''Return the merge revisions and new parent relationship of rev rebased
>       to target:
>           merge parent 1, merge parent 2, ancestor, new parent 1, new parent 2
> +
> +    The following table shows how it is - not necessarily how it should be.
> +
> +    Rebasing of rev with parents p1 and p2 onto target as rev' is written:
> +        m1,m2|a
> +        p1',p2'
> +
> +        \  p2     null         ancestor        rebased        outside
> +      p1 \
> +    null       target,rev|  target,rev|None  p2',rev|p2    target,rev|p2
> +                 target         target           p2'          target
> +
> +    ancestor target,rev|Non target,rev|None  p2',rev|p2   target,rev|None
> +                 target         target           p2'         target,p2
> +
> +    rebased    p1',rev|p1     p1',rev|p1    target,rev|p1   p1',rev|p1
> +                   p1'            p1'          p1',p2         p1',p2
> +
> +    outside   target,rev|p1 target,rev|None  p2',rev|p2       abort
> +                 target        target,p1       p2',p1       3 parents

This table is fairly hard to read can we more to a version with:

1) clearer boundary
2) fixed size, aligned value. There is various option for this I think 
my currently preferred is using T (target), R (rev), 1 (p1), 2 (p2) and 
- (no value, applies for None)

> -        # In case of merge, we need to pick the right parent as merge base.
> -        #
> -        # Imagine we have:
> -        # - M: currently rebase revision in this step
> -        # - A: one parent of M
> -        # - B: second parent of M
> -        # - D: destination of this merge step (p1 var)
> -        #
> -        # If we are rebasing on D, D is the successors of A or B. The right
> -        # merge base is the one D succeed to. We pretend it is B for the rest
> -        # of this comment
> -        #
> -        # If we pick B as the base, the merge involves:
> -        # - changes from B to M (actual changeset payload)
> -        # - changes from B to D (induced by rebase) as D is a rebased
> -        #   version of B)
> -        # Which exactly represent the rebase operation.
> -        #
> -        # If we pick the A as the base, the merge involves
> -        # - changes from A to M (actual changeset payload)
> -        # - changes from A to D (with include changes between unrelated A and B
> -        #   plus changes induced by rebase)
> -        # Which does not represent anything sensible and creates a lot of
> -        # conflicts.

This comment is very useful at explaining the rebase semantic and 
internal. It should not be dropped.

> -        for p in repo[rev].parents():
> -            if state.get(p.rev()) == p1:
> -                base = p.rev()
> -                break
> -        else: # fallback when base not found
> -            base = None
> -
> -            # Raise because this function is called wrong (see issue 4106)
> -            raise AssertionError('no base found to rebase on '
> -                                 '(computerebase called wrong)')
> -    return p1, rev, base, p1, p2
> +    if p1 == nullrev:
> +        if p2 == nullrev:
> +            # Currently not possible because 'nothing to rebase from ...'
> +            repo.ui.warn(' p1=nullrev, p2=nullrev - untested\n')
> +            return target, rev, nullrev, target, nullrev

-1 for using inline returns in such hairy code. Lets make the logic flow 
explicit with if/elif/else usage.

I'd trouble to compare the results of all the cases to the table value 
(mostly because the table is hard to read). I'm a bit concerned by the 
lack of factorization and mistake that can emerge from it, but lets wait 
for the next version.

-- 
Pierre-Yves David