[PATCH 3 of 3 V2] merge: add automatic tag merge algorithm

Sun Mar 30 08:14:20 CDT 2014

On Fri, Mar 28, 2014 at 11:12 PM, Angel Ezquerra <ezquerra at gmail.com> wrote:
> On Wed, Mar 26, 2014 at 7:46 PM, Matt Mackall <mpm at selenic.com> wrote:
>> On Mon, 2014-03-24 at 19:31 +0100, Angel Ezquerra wrote:
>>> On Mon, Mar 24, 2014 at 4:42 PM, Mads Kiilerich <mads at kiilerich.com> wrote:
>>> > On 03/24/2014 02:29 PM, Angel Ezquerra wrote:
>>> >>
>>> >> On Mon, Mar 24, 2014 at 1:52 PM, Mads Kiilerich <mads at kiilerich.com>
>>> >> wrote:
>>> >>>>
>>> >>>> 2. When the hgtagmerge algorithm fails we want to revert to a regular,
>>> >>>> old-style text merge, so that the user can use his merge tool of
>>> >>>> choice to do the merge. I don't think (but I may be wrong) that it is
>>> >>>> possible to configure mercurial to call a second merge tool if the
>>> >>>> first one that you specified via a merge pattern fails.
>>> >>>
>>> >>>
>>> >>> I would say that the tool should be so good that we think it can solve
>>> >>> all
>>> >>> common and not-so-common cases.
>>> >>>
>>> >>> In the rare case where it fails I think it would be fine if it failed
>>> >>> with
>>> >>> "error: .hgtags cannot be merged automatically - use 'hg merge --config
>>> >>> merge-patterns..hgtags=!' to resolve with your normal merge tool" ... or
>>> >>> suggest 'hg merge --tool kdiff3 .hgtags'. Users who end up here are
>>> doing
>>> >>> weird stuff. Let's give them the power and responsibility for working it
>>> >>> out
>>> >>> instead of adding more conceptual complexity.
>>> >>
>>> >> The thing is that users don't need to do very weird stuff for a fully
>>> >> automated merge to be impossible. Two different users may decide that
>>> >> a different revision must be tagged "A". That is an actual conflict
>>> >> that no automated merge tool can resolve.
>>> >
>>> >
>>> > Right. That is a common case I haven't considered/mentioned.
>>> >
>>> > The tool could ask "tag X was 01234, (l)ocal changed to 12345, (o)ther
>>> > changed to 23456 - pick one" (similar to largefile merging) (and similar
>>> > prompts for the 2 change+delete cases and add+add).
>>> >
>>> > Your tool has the necessary information ... and I think it would be
>>> > reasonable to expect it to handle these common cases too.
>>>
>>> I thought about this a bit before going for a simpler algorithm as a first
>>> step.
>>>
>>> It is true that the algorithm has the necessary information. However it is
>>> not as simple as just choosing one tag or the other. We must also combine
>>> the histories of the tags that are in conflict (or at least preserve what
>>> the tags.findglobaltags algorithm calls the tag "rank").
>>
>>> For example, the local revision may have the following entries in it's
>>> .hgtags file:
>>>
>>> HASH_A TAG_A
>>> HASH_B TAG_A
>>> HASH_C TAG_A
>>>
>>> While the remote .hgtags may be:
>>>
>>> HASH_A TAG_A
>>> HASH_D TAG_A
>>> HASH_E TAG_A
>>> HASH_F TAG_A
>>>
>>> In theory the merge algorithm should pick HASH_F as the hash of the
>>> revision pointed to by TAG_A since it is the tag with the longest history
>>> (i.e. the highest tag "rank"). However, what should the merged tag history
>>> be? It seems that we could just keep the tag history of the tag with the
>>> highest rank, i.e.:
>>>
>>> HASH_A TAG_A
>>> HASH_D TAG_A
>>> HASH_E TAG_A
>>> HASH_F TAG_A
>>>
>>> Would that be OK? It seems so, but we'd be "losing" the TAG_A history from
>>> the local revision branch... It does not seem to really be a problem and in
>>> fact there is no way around that I think.
>>
>> I don't think this is the right answer. I think instead we should keep
>> all of them as each on is an ordering assertion. We should probably
>> interleave them in order of distance from tip, so either:
>>
>> A < B < C        <- closer to tip, so B trumps D
>> A < D < E < F
>> -------------
>> A < D < B < E < C < F
>>
>> or
>>
>> A < D < E < F
>> A < B < C
>> -------------
>> A < B < D < C < E < F
>>
>> This basically says "at the time of merge, this is how things stood in
>> the tag ranking". As a side effect, the "rank" of F gets increased a
>> bunch. I think this is fine, as it's primarily a tie-breaker that says
>> "the tag with the most history wins".

Matt, sorry for coming back on this, but after thinking about your
suggestion a bit more I am not sure I understand it.
Which of the following are you suggesting?

1. Interleave the rank items, taking one from each head
alternativelly, (i.e. one from one head, then the next one from the
other head, and so on)
2. Put all the rank items (from both heads) on a single list, and
reorder them according to their distance from tip?

I am not sure #2 would be the right thing to do. For example, if  the
user ever moved a tag backwards we would now be reordering the tags
involved in that backwards move.

#1 seems better, but I think that interleaving the rank items (whether
it is using #1 or #2) would make it unnecessarily harder for users to
do a manual tag merge with the result of this merge later (for example
if they were using an older mercurial version).

Instead I think it would be better to do, for each conflicting tag:
1. Place the rank entries common to both heads first
2. Place the remaining rank entries of the lowest rank head
3. Place the remaining rank entries of the highest rank head

I think this would be simpler to implement, would avoid the tag
reordering issues and make interop with older mercurial versions
easier.

In the end I don't think this is a big deal, because we do not really
use the content of the rank history, just the rank itself, and as you
said people who end up in this scenario are already doing something
very wrong. That is why I'd rather go for the simpler solution that I
suggest or for #1 if you think that is better.

Cheers,

Angel