[PATCH 3 of 3] rebase: use matcher to optimize manifestmerge

Tue Mar 21 10:36:20 EDT 2017

On Mon, 20 Mar 2017 19:10:08 -0700, Durham Goode wrote:
> On 3/20/17 6:29 PM, Yuya Nishihara wrote:
> > On Mon, 20 Mar 2017 17:14:38 -0700, Durham Goode wrote:
> >> On 3/20/17 1:14 AM, Yuya Nishihara wrote:
> >>> On Sun, 19 Mar 2017 12:00:58 -0700, Durham Goode wrote:
> >>>> # HG changeset patch
> >>>> # User Durham Goode <durham at fb.com>
> >>>> # Date 1489949694 25200
> >>>> #      Sun Mar 19 11:54:54 2017 -0700
> >>>> # Node ID 800c452bf1a44f9f817174c69443121f4ed4c3b8
> >>>> # Parent  d598e42fa629195ecf43f438b71603df9fb66d6d
> >>>> rebase: use matcher to optimize manifestmerge
> >>>>
> >>>> The old merge code would call manifestmerge and calculate the complete diff
> >>>> between the source and the destination. In many cases, like rebase, the vast
> >>>> majority of differences between the source and destination are irrelevant
> >>>> because they are differences between the destination and the common ancestor
> >>>> only, and therefore don't affect the merge. Since most actions are 'keep', all
> >>>> the effort to compute them is wasted.
> >>>>
> >>>> Instead, let's compute the difference between the source and the common ancestor
> >>>> and only perform the diff of those files against the merge destination. When
> >>>> using treemanifest, this lets us avoid loading almost the entire tree when
> >>>> rebasing from a very old ancestor. This speeds up rebase of an old stack of 27
> >>>> commits by 20x.
> >>>
> >>> Looks generally good to me, but this needs more eyes.
> >>>
> >>>> @@ -819,6 +819,27 @@ def manifestmerge(repo, wctx, p2, pa, br
> >>>>          if any(wctx.sub(s).dirty() for s in wctx.substate):
> >>>>              m1['.hgsubstate'] = modifiednodeid
> >>>>
> >>>> +    # Don't use m2-vs-ma optimization if:
> >>>> +    # - ma is the same as m1 or m2, which we're just going to diff again later
> >>>> +    # - The matcher is set already, so we can't override it
> >>>> +    # - The caller specifically asks for a full diff, which is useful during bid
> >>>> +    #   merge.
> >>>> +    if (pa not in ([wctx, p2] + wctx.parents()) and
> >>>> +        matcher is None and not forcefulldiff):
> >>>
> >>> Is this optimization better for normal merge where m2 might be far from m1?
> >>
> >> I'm not sure what you mean by 'normal merge'.  You mean like an 'hg
> >> merge'?  Or like an hg update?  Any merge where you are merging in a
> >> small branch will benefit from this (like hg up @ && hg merge
> >> myfeaturebranch).
> >
> > 'hg merge'. What I had in mind was 'hg up stable && hg merge default' where
> > pa::p2 would be likely to be as large as p1:p2.
> 
> In that case, it may not save any time, but the time would still be 
> O(number of incoming changes).

Right, that wouldn't be a reason to avoid the m2-vs-ma optimization.
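
For readers following the thread, the idea in the quoted commit message can be
illustrated with a toy model. This is only a sketch, not the code from the
patch: manifests are modelled as plain dicts mapping path to node id, and the
names diff() and narrowed_diff() are hypothetical helpers, not Mercurial APIs.
It assumes m1 is the merge destination, m2 the source, and ma the common
ancestor, as in the thread.

```python
def diff(a, b, paths=None):
    """Return {path: (a_node, b_node)} for paths whose entries differ.

    Hypothetical helper: manifests are plain dicts (path -> node id).
    If paths is given, only those paths are compared.
    """
    if paths is None:
        paths = set(a) | set(b)
    return {p: (a.get(p), b.get(p)) for p in paths if a.get(p) != b.get(p)}


def narrowed_diff(m1, m2, ma):
    """Diff m1 against m2, restricted to files that m2 changed relative to ma."""
    changed = set(diff(ma, m2))  # files touched on the source side
    return diff(m1, m2, paths=changed)


# Toy data: the destination (m1) changed 'a' and added 'd';
# the source (m2) only changed 'b'.
ma = {"a": 1, "b": 1, "c": 1}
m1 = {"a": 2, "b": 1, "c": 1, "d": 5}
m2 = {"a": 1, "b": 3, "c": 1}

full = diff(m1, m2)                # also reports 'a' and 'd', which end up as 'keep'
narrow = narrowed_diff(m1, m2, ma)  # only reports 'b'
```

In this model the second diff only visits paths that differ between ma and m2,
so its cost scales with the number of incoming changes rather than with the
distance between m1 and m2. The toy diff(ma, m2) still walks every path; per
the commit message, the real saving comes from treemanifest being able to
avoid loading most of the tree.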