[PATCH V2] checkcopies: don't lose origin of file during merge (issue4748)

Jeremy Parente jeremy.parente at oneaccess-net.com
Fri Jul 17 03:39:58 CDT 2015

Ok, I understand.

Thank you for the explanation and the time spent.


On 07/17/2015 02:20 AM, Matt Mackall wrote:
> On Thu, 2015-07-16 at 10:33 +0200, Jeremy Parente wrote:
>> # HG changeset patch
>> # User Jeremy Parente <jeremy.parente at oneaccess-net.com>
>> # Date 1437035066 -7200
>> #      Thu Jul 16 10:24:26 2015 +0200
>> # Branch stable
>> # Node ID abd4cab8a1bac17d149ec44c36e9f556670c14b1
>> # Parent  540cd0ddac49c1125b2e013aa2ff18ecbd4dd954
>> checkcopies: don't lose origin of file during merge (issue4748)
>
> Ok, I've spent most of today thinking about this and I've decided I'm
> going to have to reject it. It's a lovely patch and you did in fact find
> the right place to make the change, and the test changes look good too,
> but I'm afraid it bumps up against deeper theoretical concerns.
>
> Let's imagine we've got a file named a that gets renamed to b, and then
> later a merge+commit happens. The DAG of that file's history today looks
> like this:
>
> a->b
>
> With your patch, it looks like:
>
> a->b->b
>   \___/
>
> (FYI, you can see this with debugindex and debugrename)
>
> ..which gives us a superfluous second revision of b that's unchanged,
> except that it's been renamed from a.. which we already knew and
> recorded. It also says a is both a parent and grandparent of b, which is
> false (and generally bad form, even for computers).
>
> And the resulting extra DAG entry is actually quite undesirable, because
> it now looks like "a change" and will fool later merges into thinking
> something interesting happened and cause bad merge decisions to happen.
> Also, it's going to generate tons of redundant file nodes on branchy
> projects.
>
> The first, simpler graph more accurately reflects the history. In one
> branch we did a rename.. and in the other nothing happened, so nothing
> was recorded. This is distinct from the other case you mentioned, which
> looks like this:
>
> a->b->b'
>   \    / <- this edge is not technically a rename or copy[1]
>    a'--
>
> ..where the new node is not redundant and a is not both parent and
> grandparent of b'. So it's perfectly kosher.
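
To make the two graphs above concrete, here is a minimal Python sketch.
It is a toy model, not Mercurial's filelog code: FileNode, ancestors and
is_redundant are hypothetical names invented for illustration, and the
redundancy rule is only one reading of the argument above (a new node is
superfluous when it repeats a parent's name and content and its other
parents are already ancestors of that parent).

```python
# Toy model of one file's history graph. This is NOT Mercurial's API;
# every name here is hypothetical and exists only for illustration.
from dataclasses import dataclass


@dataclass(frozen=True)
class FileNode:
    name: str            # path of the file at this file revision
    content: str         # file data at this file revision
    parents: tuple = ()  # parent FileNodes in the file-level DAG


def ancestors(node):
    """Return every strict ancestor of node (parents, grandparents, ...)."""
    seen = []
    stack = list(node.parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(current.parents)
    return seen


def is_redundant(node):
    """True if node repeats one parent exactly and adds no new ancestry."""
    for parent in node.parents:
        if parent.name == node.name and parent.content == node.content:
            others = [p for p in node.parents if p is not parent]
            if all(p in ancestors(parent) for p in others):
                return True
    return False


# Case 1: a is renamed to b, then merged with a branch that never touched a.
a = FileNode("a", "data")
b = FileNode("b", "data", (a,))
b_again = FileNode("b", "data", (b, a))  # the extra node: a->b->b with a->b

# Case 2: the other branch modified a (a'), so the merge result b' is new.
a_prime = FileNode("a", "data, edited", (a,))
b_prime = FileNode("b", "data, edited", (b, a_prime))

print(is_redundant(b_again))  # a is both parent and grandparent here
print(is_redundant(b_prime))  # a' is not an ancestor of b
```

In the first case a shows up both as a direct parent of the new node and
as an ancestor of its other parent, which is the "parent and grandparent"
situation described above. In the second case neither parent matches the
merged node, so it carries real information.
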
>
> Now you may be thinking "but the diff.." Yes, the diff is unhelpful, but
> that's just another instance of the classic Diffs Of Merges Are
> Basically Meaningless Because It's The Wrong Tool For Job problem:
>
> https://mercurial.selenic.com/wiki/MergeDiffs
>
> However, in this particular instance, we could make diff slightly
> smarter.. by giving it less information. If you try this in Git, it'll
> work, but only because Git never actually stores any sort of rename
> metadata in history. So it literally guesses where renames are every
> time by comparing file contents without any reference to their actual
> history. In the future, we could supplement our diff (and merge)
> algorithms with this sort of heuristic.. when real rename data isn't
> present.
>
> [1] we call it a 'ypoc', because it looks like a time-reversed copy. And
> diff has no concept of ypocs because it's unfit for merges.
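
The content-comparison heuristic mentioned above could look roughly like
the following Python sketch. It is an illustration only and is neither
Git's nor Mercurial's implementation: guess_renames is a hypothetical
helper, the similarity measure is Python's difflib.SequenceMatcher, and
the 0.5 threshold is an arbitrary assumption.

```python
# Guess renames purely from file contents, ignoring recorded history.
# Hypothetical sketch; not the algorithm used by Git or Mercurial.
from difflib import SequenceMatcher


def guess_renames(removed, added, threshold=0.5):
    """Map each added path to the most similar removed path, if any.

    removed and added are dicts of path -> file content. A pair is only
    reported when its similarity ratio reaches the (assumed) threshold.
    """
    renames = {}
    for new_path, new_data in added.items():
        best_path, best_score = None, threshold
        for old_path, old_data in removed.items():
            score = SequenceMatcher(None, old_data, new_data).ratio()
            if score >= best_score:
                best_path, best_score = old_path, score
        if best_path is not None:
            renames[new_path] = best_path
    return renames


removed = {"a": "line one\nline two\nline three\n"}
added = {
    "b": "line one\nline two\nline three\nline four\n",  # a, plus one line
    "c": "qqq\nzzz\n",                                    # unrelated file
}
guesses = guess_renames(removed, added)
print(guesses)
```

Here b is paired with a because most of its content is shared, while c is
left unpaired. Such a guess is only a fallback for when no recorded rename
data is available, as the quoted message suggests.
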


R&D Engineer
Tel : +33 (0)493 001 641
Fax : +33 (0)493 001 661
OneAccess, 2455 Route des Dolines, BP 355,
F-06906 Sophia Antipolis, France



