[PATCH 3 of 3] copies: do not track backward copies, only renames (issue3739)

Matt Mackall mpm at selenic.com
Fri Dec 21 17:42:56 CST 2012


On Fri, 2012-12-21 at 13:08 -0800, Siddharth Agarwal wrote:
> # HG changeset patch
> # User Siddharth Agarwal <sid0 at fb.com>
> # Date 1356123985 28800
> # Node ID af618ad15aa529d53062f7cf73d45f539f72ebfd
> # Parent  554d6cba20701d6b44870a79389cc89812173aed
> copies: do not track backward copies, only renames (issue3739)
> 
> The inverse of a rename is a rename back, but the inverse of a copy is a
> remove, not a "copy back".

Strictly speaking, it's not this simple. Insofar as a rename is "simply"
a copy+remove, there is a real sense that a copy does have an inverse. 

For instance, consider this scenario:

changeset 1:
 create a
changeset 2:
 copy a to b
changeset 3:
 delete a

Now we can clearly say what the following diffs should look like:

 1->2: show a copy
 2->3: show a delete
 1->3 = 1->2 + 2->3: show a rename

And we can even say what some of the reverse diffs should look like:

 3->1: show a rename(!)
 3->2: show a file add

But what should 2->1 look like? If it doesn't mention some kind of
relation between a and b, we've lost information relative to the
forward diffs, and we can't construct 3->2 + 2->1 = 3->1. So we've got
some sort of relation here (let's call it a 'ypoc') such that 'add' +
'ypoc' = 'rename' in the same way that 'copy' + 'remove' = 'rename'. 

And as it happens, ypocs are actually a real thing in Mercurial, though
we admittedly don't handle them very well, nor do we have any way to
display them. A ypoc exists in the following not entirely uncommon
situation:

1-2-4
 \ /
  3

1: create a
2: copy a to b
3: modify a
4: merge

Here, merge should merge the changes of a into b and record both a and b
as ancestors for the result[1], giving us a 2:1 relationship that's the
inverse of a typical 1:n copy. If we then say "we don't really want a"
and commit that removal with the merge, we've got a genuine ypoc. 

In other words, a ypoc is a cross-file merge where the 'other' file
doesn't exist afterwards. And that actually suggests that copy itself is
not really the fundamental relationship, because you can have all of the
following topologies:

copy   ypoc   rename  rename  ???   ???

 a-a   a-a    a          a    a-a   a-a
  \     /      \        /      /     \
   b   b        b      b      b-b   b-b

(the ??? cases come up naturally in merges involving copies as we've
seen above)

So really, what we've got is a 'cross-filename inheritance relation' and
depending on the topology, we can think of it as a copy, rename, ypoc,
etc.
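
To make the topologies above concrete, here is a minimal sketch, not
Mercurial code. It assumes a diff can be modelled as the set of file
names on each side plus a set of cross-name links (src, dst). The names
Diff, make, classify, inverse and compose are all hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Diff(NamedTuple):
    """Hypothetical model of a diff; not a Mercurial data structure."""
    before: FrozenSet[str]              # file names on the left side
    after: FrozenSet[str]               # file names on the right side
    links: FrozenSet[Tuple[str, str]]   # cross-name (src, dst) inheritance


def make(before, after, links=()):
    return Diff(frozenset(before), frozenset(after), frozenset(links))


def classify(diff: Diff) -> List[str]:
    """Name each cross-name link according to its topology."""
    labels = []
    for src, dst in sorted(diff.links):
        dst_existed = dst in diff.before   # did dst already exist?
        src_survives = src in diff.after   # is src still there afterwards?
        if not dst_existed and src_survives:
            kind = "copy"
        elif not dst_existed:
            kind = "rename"
        elif not src_survives:
            kind = "ypoc"
        else:
            kind = "???"
        labels.append("%s %s -> %s" % (kind, src, dst))
    return labels


def inverse(diff: Diff) -> Diff:
    """Read the same diff in the other direction."""
    return Diff(diff.after, diff.before,
                frozenset((dst, src) for src, dst in diff.links))


def _all_links(diff: Diff):
    # A file present under the same name on both sides inherits from itself.
    return {(f, f) for f in diff.before & diff.after} | set(diff.links)


def compose(first: Diff, second: Diff) -> Diff:
    """Join two diffs that share a middle state."""
    if first.after != second.before:
        raise ValueError("diffs do not share a middle state")
    joined = {(s, d)
              for s, m in _all_links(first)
              for m2, d in _all_links(second) if m == m2}
    return Diff(first.before, second.after,
                frozenset((s, d) for s, d in joined if s != d))


copy_1_2 = make({"a"}, {"a", "b"}, {("a", "b")})   # changeset 1 -> 2
remove_2_3 = make({"a", "b"}, {"b"})               # changeset 2 -> 3

print(classify(copy_1_2))                            # 1->2
print(classify(compose(copy_1_2, remove_2_3)))       # 1->3
print(classify(inverse(copy_1_2)))                   # 2->1
print(classify(compose(inverse(remove_2_3),
                       inverse(copy_1_2))))          # 3->1
```

In this model a copy read backwards comes out as a ypoc, and both
'copy' + 'remove' and 'add' + 'ypoc' join up into a rename, purely from
which names exist on each side of the diff.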

Conceptually, pathcopies cares about this. Graft and rebase care about a
topology that looks like this:

 o-o-o-o-x-o-o-o-o-a
          \
           o-o-b

where pathcopies(a,b) = pathcopies(a,x) + pathcopies(x,b). If there's an
'add' on the a->x path (aka a removal on x->a) and a 'ypoc' on the x->b
path, then we may have the equivalent of a rename on the joined path.
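
A rough sketch of joining the two legs, assuming each leg is a plain
mapping {dst: src}. chain_copies and its arguments are hypothetical
names for illustration; this is not the code in copies.py.

```python
def chain_copies(first, second, src_files, dst_files):
    """Join {mid: src} and {dst: mid} mappings into one {dst: src} mapping.

    src_files / dst_files are the file names present at the two endpoints;
    entries whose source or destination no longer exists are dropped.
    """
    chained = {}
    # Follow each link of the second leg back through the first leg.
    for dst, mid in second.items():
        chained[dst] = first.get(mid, mid)
    # Keep first-leg links for names the second leg left alone.
    for mid, src in first.items():
        chained.setdefault(mid, src)
    return {dst: src for dst, src in chained.items()
            if src in src_files and dst in dst_files and src != dst}


# Leg a->x adds p (so x has p and q); leg x->b folds q into p and drops q.
a_files, b_files = {"q"}, {"p"}
a_to_x = {}              # a plain add carries no cross-name link
x_to_b = {"p": "q"}      # stands in for the ypoc; p's own history is lost here
joined = chain_copies(a_to_x, x_to_b, a_files, b_files)
print(joined)

# q is gone at b and p did not exist at a, so the joined path looks like
# a rename of q to p.
looks_like_rename = all(src not in b_files and dst not in a_files
                        for dst, src in joined.items())
print(looks_like_rename)
```

Note that the ypoc only fits in the flat mapping because the sketch
throws away the fact that p also descends from itself.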

But again, we handle these things very badly at present, which is not
surprising given how mind-warping it is:

 - we can't display ypocs or ???s usefully
   (though they matter for annotate!)
 - we probably don't/can't record some of these cases properly
 - pathcopies and mergecopies are not capable of representing them
 - we get confused by them when doing graft/rebase

As you've seen, pathcopies uses a dict, so it can only represent 1:n
relationships, and not n:1 ones, nor n:m, which can also exist!
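
As an illustration of that limit, assuming a mapping keyed by
destination ({dst: src}): one source can appear under many keys, but a
second source for the same destination overwrites the first. The names
below are hypothetical, and the set-valued variant is only one possible
alternative, not something Mercurial does.

```python
from collections import defaultdict

# 1:n is fine: a is the source of both b and c.
flat = {}
flat["b"] = "a"
flat["c"] = "a"
print(flat)

# n:1 is not: b inherits from both a and c, but only the last one survives.
lossy = {}
lossy["b"] = "a"
lossy["b"] = "c"
print(lossy)

# A mapping from destination to a *set* of sources can hold n:1 and n:m.
multi = defaultdict(set)
for src, dst in [("a", "b"), ("c", "b"), ("a", "d")]:
    multi[dst].add(src)
print({dst: sorted(srcs) for dst, srcs in multi.items()})
```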

But as you've also seen, we actually want to get some of this sort of
information out of at least mergecopies, hence the multiple dictionaries
that are returned.

For now, you're probably right that whatever the correct and complete
semantics should be, what _backwardsrename is doing is wrong-ish as a
ypoc is definitely not a copy, nor do we say we're returning 'cross-file
inheritance relations'.

You also may have discovered that mergecopies does not yet properly
handle the idea that the given ancestor may not actually be a
topological ancestor (as needed by rebase/graft).

At some point, I want to rationalize things so that mergecopies is
defined in terms of pathcopies so that it handles backwards legs, and
maybe eventually have a well-defined theory for what should happen with
the reverse-copy-like cases beyond 'punt'.

All of this is a long way of saying that I don't think your commit
comment is correct.

[1] See
http://www.selenic.com/hg/file/98687cdddcb1/mercurial/localrepo.py#l1026

-- 
Mathematics is the supreme nostalgia of our time.