Adding merge --ancestor option?

Mathias De Maré mathias.demare at gmail.com
Fri Aug 3 02:48:24 CDT 2012


On Thu, Mar 22, 2012 at 5:33 PM, Angel Ezquerra Moreu <angel.ezquerra at gmail.com> wrote:

> On Thu, Mar 22, 2012 at 4:21 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Thu, 2012-03-22 at 15:44 +0100, Angel Ezquerra Moreu wrote:
> >> On Thu, Mar 22, 2012 at 2:37 PM, Matt Mackall <mpm at selenic.com> wrote:
> >> > On Wed, 2012-03-21 at 20:48 -0400, Greg Ward wrote:
> >> >> On 21 March 2012, Matt Mackall said:
> >> >> > > 1) Alice and Bob are working concurrently from the same changeset on
> >> >> > >    branch 1.0
> >> >> > > 2) Alice commits on 1.0
> >> >> > > 3) Alice merges to 1.1
> >> >> > > 4) Alice merges to default
> >> >> > > 5) Bob commits on 1.0
> >> >> > > 6) Bob merges to 1.1, gets a conflict, resolves it
> >> >> > > 7) Bob merges to default
> >> >> > > 8) Alice pushes and goes home: she's done her day's work
> >> >> > > 9) Bob attempts to push and fails: "push creates remote heads"
> >> >> > > 10) Bob pulls
> >> >> > > 11) Bob merges with Alice on 1.0, 1.1, and trunk
> >> >> > > 12) Bob pushes and goes home: he's done his day's work
> >> >> > > 13) Carl starts work at the tip of branch 1.0 (Bob's merge with Alice)
> >> >> > > 14) Carl merges 1.0 to 1.1: FAIL: he gets Bob's conflict!
> >> >> >
> >> >> > This is yet another case where we can't do any meaningful
> >> >> > differentiation between possible ancestors (the commits in (2) and (5)
> >> >> > in this case). We could perhaps walk the graph and notice that (5) has a
> >> >> > descendant merge with a conflict, and thus score it higher, but it'll
> >> >> > still be trivial to create scenarios with ties.
> >> >>
> >> >> I was confused at first by how you can detect conflict after-the-fact.
> >> >
> >> > Simple. A merge without conflicts will have no files listed in the
> >> > changeset. In this scheme, we'd try to pick the merge path that had the
> >> > most conflicts already resolved. So we'd notice that one of the choices
> >> > of ancestor implied merge 'legs' including Bob's conflict resolution
> >> > from (6) and choose it over the one with no resolutions in its legs.
> >> >
> >> > This tweak is much more work than it's worth, though, as it nibbles only
> >> > a small chunk off the ambiguous domain.
> >> >
> >> >> > So there are two ways we can go:
> >> >> >
> >> >> > - allow manual ancestor selection (restricted to heads(::x and ::y))?
> >> >> > - invent a merge operator that's well-defined for multiple ancestors
> >> >> >
> >> >> > It's not too hard to see how the latter might work, if we ignore
> >> >> > renames.
> >> >>
> >> >> That would indeed be nifty. I'll have to screw on the old thinking cap
> >> >> and cogitate over this a bit.
> >> >
> >> > I'm starting to write up some design notes for this idea, which I'm
> >> > calling "consensus merge".
> >> >
> >> > A quick measurement on the Mercurial repo shows:
> >> >
> >> > 1911 merges
> >> > 83 with two or more merge ancestors
> >> > 1 with three
> >>
> >> Matt,
> >>
> >> is there a simple way (e.g. revset) to repeat that measurement? I
> >> suspect that mercurial's history is probably more linear than most,
> >> given the patch based workflow, the excellent review process and the
> >> high commit quality standards. The fact that there are only 2 named
> >> branches probably contributes to that as well.
> >>
> >> I could repeat those measurements on some of our repos to give you
> >> another measurement point.
> >
> > I did this:
> >
> > hg log --template '{rev}\n' -r 'merge()' > merges
> > for f in `cat merges`; do echo -n "$f: "; hg log -r "heads(::p1($f) and ::p2($f))" --template "{rev} "; echo; done > merge-ancestors
> >
> > You can also do something like this:
> >
> > $ hg dbsh
> > loaded repo : /home/mpm/hg
> > using source: /home/mpm/hg/mercurial
> > >>> d = {}
> > >>> for m in repo.revs("merge()"):
> > ...   d[m] = repo.revs('heads(::p1(%d) and ::p2(%d))', m, m)
> > ...
> > >>> len(d)
> > 1911
> > >>> len([x for x in d if len(d[x]) >= 2])
> > 83
> >
> > It's actually not clear from this measurement that any of these merges
> > were 'ambiguous' based on the current algorithm, which picks the first
> > common ancestor furthest from root.
>
> Umm, I am a bit surprised. I tried this on 3 of our repos. Looking at
> the 3 corresponding merge-ancestors files, none of them has a line
> showing more than 1 possible ancestor (if I understood what you did
> properly, in cases where there is more than 1 ancestor I should get a
> line such as "148: 124 131", right?)
>
> In particular, this is the data I got:
>
> - Repo 1: 1270 revisions, 158 merges, 27 branches (16 inactive, 4 closed)
> - Repo 2: 1054 revisions,  82 merges, 22 branches (11 inactive, 3 closed)
> - Repo 3: 513 revisions,  41 merges, 10 branches (2 inactive, 1 closed)
>
> In all cases the number of merges that may be ambiguous is 0.
>

We are sometimes seeing merge issues because of multiple common ancestors.
On our repository with 1453 revisions, we see the following numbers:
1 common ancestor: 397 merges
2 common ancestors: 15 merges
3 common ancestors: 1 merge
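
For anyone who wants to produce the same kind of breakdown, here is a minimal sketch in Python of tallying a merge-ancestors file like the one written by the shell loop quoted above. It assumes each line has the form "148: 124 131" (the merge revision, a colon, then its candidate ancestors). The function name tally_ancestors and the sample data are made up for illustration and do not come from any real repository.

```python
from collections import Counter


def tally_ancestors(lines):
    """Count merges by how many candidate common ancestors each one has.

    Each line is assumed to look like "148: 124 131": the merge revision,
    a colon, then the revisions printed for heads(::p1 and ::p2).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _merge_rev, _, ancestors = line.partition(":")
        counts[len(ancestors.split())] += 1
    return counts


# Made-up sample lines (hypothetical revision numbers).
sample = ["10: 7 ", "148: 124 131", "200: 150 160 170", "12: 9 "]
for n, merges in sorted(tally_ancestors(sample).items()):
    print("%d common ancestor(s): %d merge(s)" % (n, merges))
```

With a real file, something like tally_ancestors(open("merge-ancestors")) should work, since the function only iterates over lines.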

We have noticed that this can be worked around by merging the correct
intermediate changesets first, but doing so is quite complicated (we also
have a number of users who are quite new to Mercurial and find it very
hard to understand).
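
To illustrate where the multiple ancestors come from, here is a minimal sketch in plain Python (it does not use the Mercurial API) of what the revset heads(::x and ::y) computes, run on a toy criss-cross history. The graph and the helper names ancestors and common_ancestor_heads are hypothetical and only meant to show the idea.

```python
def ancestors(parents, rev):
    """Return rev together with all of its ancestors (like the revset ::rev)."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def common_ancestor_heads(parents, x, y):
    """Toy equivalent of heads(::x and ::y)."""
    common = ancestors(parents, x) & ancestors(parents, y)
    # A head of the set is a member that is not a parent of another member.
    non_heads = {p for r in common for p in parents[r]}
    return sorted(common - non_heads)


# Hypothetical criss-cross history: 1 and 2 both descend from 0,
# then each side merges the other, giving merges 3 and 4.
parents = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [2, 1]}
print(common_ancestor_heads(parents, 3, 4))  # prints [1, 2]
```

Merging 3 with 4 in this toy graph has two equally good candidate ancestors (1 and 2), which is the ambiguous situation discussed in this thread.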

I saw there is a wiki page with a proposal on resolving these merges:
http://mercurial.selenic.com/wiki/ConsensusMerge
Is someone looking at this already? We would be glad to help test such a
change.

Greetings,
Mathias


> Cheers,
>
> Angel