[PATCH] merge: add automatic tag merge algorithm

Angel Ezquerra angel.ezquerra at gmail.com
Tue Feb 25 14:09:21 CST 2014


On Sun, Feb 23, 2014 at 7:10 PM, Matt Mackall <mpm at selenic.com> wrote:
> On Wed, 2014-02-19 at 00:28 +0100, Angel Ezquerra wrote:
>> # HG changeset patch
>> # User Angel Ezquerra <angel.ezquerra at gmail.com>
>> # Date 1392597815 -3600
>> #      Mon Feb 17 01:43:35 2014 +0100
>> # Node ID f3eb8304d9bb59e78b50b42a9341a2063e1cb451
>> # Parent  7648e9aef6eeab00a0946e877690e94fb12d389b
>> merge: add automatic tag merge algorithm
>
> FYI, this is perhaps the third time a patch to do this has been
> presented. Have you looked at the previous attempts? How does this
> compare?

Thanks for the heads up. This led me on a nice archaeological
expedition through Mercurial's history. It is quite interesting to see
how we got to the current global tag resolution algorithm. I think
I now understand it better, and I appreciate the subtle corner cases
that must be taken into account when calculating the global tag set.

Probably my search-fu failed me, but I did not find any actual patches
proposing a concrete .hgtags merge algorithm. However, I found a lot of
discussions on how such an algorithm should behave. I also read a lot
about the different ways in which a potential tag merge algorithm
could fail.

To answer your original question, the algorithm I propose has the
advantage that it does not try to resolve all possible tag merge
conflict scenarios. Instead it focuses on automatically merging a
couple of conflict types which have obvious solutions (i.e. ones that
a user could resolve mechanically by following a simple recipe). One of
them is IMHO the most common type of tag merge conflict, so fixing it
will give a lot of bang for the buck. In particular, the proposed
algorithm limits itself to automatically handling the following two
scenarios:

1. Two (or more) _different_ tags are added (and potentially also
moved / removed) on top of the existing tags in the two different
topological branches being merged. Currently this _always_ results in
a silly merge conflict, one that a user can mechanically resolve. The
algorithm just detects that particular scenario and makes the obvious
decision, which is to put these new "tag histories" at the end of the
.hgtags file.

2. The same tag is added / removed / moved back and forth in the two
branches that are being merged. However, the set of operations on that
particular tag in one branch is a subset of the operations done on the
other branch. In that case the algorithm will also do the obvious
thing, which is to keep the larger set of tag ops.

In all other cases in which we detect a conflict we revert to a
regular text merge (i.e. the current behavior). In particular, I
explicitly avoided handling the case in which a given tag diverges
between the two branches being merged. That is the case that seems to
pop up most often in all past discussions on this matter. I have a
couple of ideas on how this could be handled in some particular cases
(e.g. we could take the base .hgtags file into account and/or we could
try to make sure that the merged .hgtags is a subset of the "global
tags" that are calculated in tags.py). However, that can be left for
another time.
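
To make the two scenarios and the fallback concrete, here is a rough
sketch in plain Python. It is not the code from the patch: the function
names are made up for illustration, it ignores the base .hgtags file,
and it reads "subset of the operations" as "one tag history is a prefix
of the other", which is an assumption on my part. It also groups each
tag's lines together in the output, which a real implementation may not
want to do.

```python
from collections import OrderedDict


def read_tag_histories(text):
    """Map each tag name to the ordered list of nodes recorded for it.

    Each non-empty line is assumed to look like "<hex node> <tag name>".
    """
    histories = OrderedDict()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        node, name = line.split(" ", 1)
        histories.setdefault(name, []).append(node)
    return histories


def is_prefix(short, long):
    """True if the history `short` is the beginning of the history `long`."""
    return long[:len(short)] == short


def merge_tags(local_text, other_text):
    """Return merged .hgtags text, or None to request a regular text merge."""
    local = read_tag_histories(local_text)
    other = read_tag_histories(other_text)
    merged = OrderedDict()
    for name, hist in local.items():
        if name not in other:
            merged[name] = hist          # scenario 1: tag only on this side
        elif is_prefix(hist, other[name]):
            merged[name] = other[name]   # scenario 2: other side has more ops
        elif is_prefix(other[name], hist):
            merged[name] = hist          # scenario 2: this side has more ops
        else:
            return None                  # tag diverged: fall back to text merge
    for name, hist in other.items():
        if name not in merged:
            merged[name] = hist          # scenario 1: new tags go at the end
    return "".join("%s %s\n" % (node, name)
                   for name, hist in merged.items() for node in hist)
```

For example, merge_tags("aaa v1\nbbb v2\n", "aaa v1\nccc v3\n") keeps
all three tags, while two sides that move v1 to different nodes get
None back and go through the usual text merge.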

Note that, contrary to what my V1 patch says, the algorithm does not
depend at all on changing the ordering of the .hgtags file. My V1
patch does sort the tags alphabetically, but that was only because
tags._readtags does not return the tag info in order. I am preparing
V2 of my patch, which does not need to do this anymore (by using a
sortdict rather than a regular dict).

Cheers,

Angel

