Adding merge --ancestor option?

Matt Mackall mpm at selenic.com
Thu Mar 22 10:21:29 CDT 2012


On Thu, 2012-03-22 at 15:44 +0100, Angel Ezquerra Moreu wrote:
> On Thu, Mar 22, 2012 at 2:37 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Wed, 2012-03-21 at 20:48 -0400, Greg Ward wrote:
> >> On 21 March 2012, Matt Mackall said:
> >> > > 1) Alice and Bob are working concurrently from the same changeset on
> >> > >    branch 1.0
> >> > > 2) Alice commits on 1.0
> >> > > 3) Alice merges to 1.1
> >> > > 4) Alice merges to default
> >> > > 5) Bob commits on 1.0
> >> > > 6) Bob merges to 1.1, gets a conflict, resolves it
> >> > > 7) Bob merges to default
> >> > > 8) Alice pushes and goes home: she's done her day's work
> >> > > 9) Bob attempts to push and fails: "push creates remote heads"
> >> > > 10) Bob pulls
> >> > > 11) Bob merges with Alice on 1.0, 1.1, and trunk
> >> > > 12) Bob pushes and goes home: he's done his day's work
> >> > > 13) Carl starts work at the tip of branch 1.0 (Bob's merge with Alice)
> >> > > 14) Carl merges 1.0 to 1.1: FAIL: he gets Bob's conflict!
> >> >
> >> > This is yet another case where we can't do any meaningful
> >> > differentiation between possible ancestors (the commits in (2) and (5)
> >> > in this case). We could perhaps walk the graph and notice that (5) has a
> >> > descendant merge with a conflict, and thus score it higher, but it'll
> >> > still be trivial to create scenarios with ties.
> >>
> >> I was confused at first by how you can detect conflict after-the-fact.
> >
> > Simple. A merge without conflicts will have no files listed in the
> > changeset. In this scheme, we'd try to pick the merge path that had the
> > most conflicts already resolved. So we'd notice that one of the choices
> > of ancestor implied merge 'legs' including Bob's conflict resolution
> > from (6) and choose it over the one with no resolutions in its legs.
> >
> > This tweak is much more work than it's worth, though, as it nibbles only
> > a small chunk off the ambiguous domain.
> >
> >> > So there are two ways we can go:
> >> >
> >> > - allow manual ancestor selection (restricted to heads(::x and ::y))?
> >> > - invent a merge operator that's well-defined for multiple ancestors
> >> >
> >> > It's not too hard to see how the latter might work, if we ignore
> >> > renames.
> >>
> >> That would indeed be nifty. I'll have to screw on the old thinking cap
> >> and cogitate over this a bit.
> >
> > I'm starting to write up some design notes for this idea, which I'm
> > calling "consensus merge".
> >
> > A quick measurement on the Mercurial repo shows:
> >
> > 1911 merges
> > 83 with two or more merge ancestors
> > 1 with three
> 
> Matt,
> 
> is there a simple way (e.g. revset) to repeat that measurement? I
> suspect that mercurial's history is probably more linear than most,
> given the patch based workflow, the excellent review process and the
> high commit quality standards. The fact that there are only 2 named
> branches probably contributes to that as well.
> 
> I could repeat those measurements on some of our repos to give you
> another measurement point.

I did this:

hg log --template '{rev}\n' -r 'merge()' > merges
for f in $(cat merges); do
  printf '%s: ' "$f"
  hg log -r "heads(::p1($f) and ::p2($f))" --template "{rev} "
  echo
done > merge-ancestors
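
The loop above only writes the merge-ancestors file; the counts still have to be tallied from it. A minimal sketch of that step, assuming each line has the form "REV: ANC1 ANC2 " as produced by the loop (the function name tally_merge_ancestors is made up for illustration):

```python
from collections import Counter


def tally_merge_ancestors(lines):
    """Map 'number of candidate ancestors' -> 'number of merges'.

    Assumes each non-empty line looks like "1234: 1200 1210 ".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Everything after the first colon is the space-separated ancestor list.
        _rev, _, rest = line.partition(":")
        counts[len(rest.split())] += 1
    return counts


# Invented sample lines, not real output from any repository.
sample = ["10: 5 ", "11: 6 7 ", "12: 3 4 9 "]
counts = tally_merge_ancestors(sample)
total = sum(counts.values())
multi = sum(n for k, n in counts.items() if k >= 2)
print(total, "merges,", multi, "with two or more candidate ancestors")
```

To run it on the real data, pass open("merge-ancestors") instead of the sample list.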

You can also do something like this:

$ hg dbsh
loaded repo : /home/mpm/hg
using source: /home/mpm/hg/mercurial
>>> d = {}
>>> for m in repo.revs("merge()"):
...   d[m] = repo.revs('heads(::p1(%d) and ::p2(%d))', m, m)
... 
>>> len(d)
1911
>>> len([x for x in d if len(d[x]) >= 2])
83

It's actually not clear from this measurement that any of these merges
were 'ambiguous' based on the current algorithm, which picks the first
common ancestor furthest from root.
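
To make the tie case concrete, here is a toy sketch of what heads(::x and ::y) computes and how a "furthest from root" rule can fail to separate the candidates. It is not Mercurial's code: the graph is a hand-written dict of rev -> parents, and "distance from root" is assumed to mean the longest path to a root.

```python
def ancestors(parents, rev):
    """Set of ancestors of rev, including rev itself (like '::rev')."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def candidate_ancestors(parents, a, b):
    """Toy equivalent of heads(::a and ::b)."""
    common = ancestors(parents, a) & ancestors(parents, b)
    # A common ancestor is a head unless it is the parent of another one.
    non_heads = {p for r in common for p in parents[r]}
    return common - non_heads


def depth(parents, rev):
    """Longest distance from rev to a root (assumed meaning of 'furthest')."""
    ps = parents[rev]
    return 0 if not ps else 1 + max(depth(parents, p) for p in ps)


# Criss-cross: 1 and 2 both branch from 0; 3 and 4 each merge 1 with 2.
parents = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [1, 2], 5: [3, 4]}

cands = candidate_ancestors(parents, 3, 4)
print(sorted(cands))                               # [1, 2]
print({r: depth(parents, r) for r in sorted(cands)})  # both at depth 1: a tie
```

Merging 3 with 4 has two equally deep candidates, so depth alone cannot choose between them. A merge would only count as ambiguous in this sense if its candidates tie like this.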

-- 
Mathematics is the supreme nostalgia of our time.



