Bug 6163 - rebasing when copy source is unrelated is broken
Summary: rebasing when copy source is unrelated is broken
Status: RESOLVED ARCHIVED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: rebase (show other bugs)
Version: 4.8
Hardware: PC Linux
: normal bug
Assignee: Bugzilla
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-06-25 13:58 UTC by Martin von Zweigbergk
Modified: 2021-06-05 00:00 UTC (History)
2 users (show)

See Also:
Python Version: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin von Zweigbergk 2019-06-25 13:58 UTC
Consider this history:

  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  | o  2 'modify x'
  | |
  | o  1 'add x'
  |/
  o  0 'base'

Assume 1 and 3 have the same content so there is no conflict. We now do `hg rebase -r 2 -d 4`. We would expect that to merge the changed file 'x' from commit 2 into file 'y'. However, we instead get a moidfy/delete conflict since 57203e0210f8 (copies: calculate mergecopies() based on
pathcopies(), 2019-04-11). The issue is that pathcopies() considers 'x' in commit 1 and 'x' in commit 3 to be unrelated since it chains the copies in 0->1 with the copies in 0->4. It's technically correct that they are not related, but this case happens quite frequently in practice, so we need to handle it better.

Also consider this similar case:


  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  o  2 'remove x'
  |
  | o  1 'modify x'
  |/
  o  0 'add x'


The 'x' in commit 0 and the 'x' in commit 3 are technically not related here either. However, pathcopies() doesn't handle that case so it still considers them related. I consider it a bug in pathcopies(). I've meant to fix that bug, but we clearly need to fix the bug above first.
Comment 1 Martin von Zweigbergk 2019-06-25 14:06 UTC
Oh, the third similar case is this:

  @  4 'rename x to y'
  |
  | o  3 'modify x'
  | |
  | o  2 'add x again'
  | |
  | o  1 'remove x'
  |/
  o  0 'add x'

That "suffers" from the same bug in pathcopies() that makes this bug not happen there.
Comment 2 Martin von Zweigbergk 2019-06-26 13:43 UTC
Also compare these two cases:


  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  o  2 'remove x'
  |
  | o  1 'modify x'
  |/
  o  0 'add x'


  @  4 'add y again'
  |
  o  3 'remove y'
  |
  o  2 'rename x to y'
  |
  | o  1 'modify x'
  |/
  o  0 'add x'


Even before my recent changes, `hg rebase -r 1 -d 4` would succeed in the first case but not in the second case. I can't think of a good reason it should be that way. Note that rebasing commit 1 first to 2 and then to 4 would make it succeed in the second case. IMO, that suggests that we're doing it wrong there.
Comment 3 Martin von Zweigbergk 2019-06-28 18:34 UTC
Let me clarify my position a bit. I initially thought it was a bug that pathcopies() considers file 'x' in the examples related. However, if we don't consider them related, I can't think of a good reason that the rebase operations would succeed, and I'm pretty sure we do want them to succeed. So I think we should instead make it so file 'x' is considered related in all these cases. I'm working on patches for that.
Comment 4 Martin von Zweigbergk 2019-06-28 18:38 UTC
I forgot to say that I'm pretty sure the reason that cases in comment 2 behave differently is simply that it is expensive to continue tracing the file in the second case. I don't think it was ever a desired behavior.
Comment 5 Martin von Zweigbergk 2019-07-01 19:29 UTC
More examples:


  @  4 'rename x to y'
  |  
  o  3 'add x with same content as in rev 2'
  |  
  | o  2 'modify x'
  | |  
  | o  1 'add x'
  |/   x
  o  0 'base'
     a

  @  5 'rename x to y'
  |  
  o  4 'modify x to same content as in rev 2'
  |  
  o  3 'add empty x'
  |  
  | o  2 'modify x'
  | |  
  | o  1 'add x'
  |/   x
  o  0 'base'
     a

Before my recent changes to mergecopies(), `hg rebase -r 2 -d .` would succeed in the first case but not in the second. That is because x's at @^ has the same file nodeid as at 2 in the first case, which leads the "is related" check to succeed.

I'm getting more and more convinced that at least the current check (well, before my changes) for relatedness does more harm than good. If you disagree, please speak up now before I remove it.
Comment 6 Martin von Zweigbergk 2019-07-02 13:09 UTC
I'll just fix the regression for now. I may get back to the more general problem of how to treat unrelated files later.
Comment 7 HG Bot 2019-07-11 20:05 UTC
Fixed by https://mercurial-scm.org/repo/hg/rev/d013099c551b
Martin von Zweigbergk <martinvonz@google.com>
copies: filter invalid copies only at end of pathcopies() (issue6163)

copies._filter() filters out copies whose source file does not exist
in the start commit or whose target file does not exist in the end
commit. We do that after chaining copies with dirstate copies or
backward renames from another branch. We also do at the end of the
changeset-centric copy tracing. The filtering means that we will
remove copies to/from files that did not exist in some intermediate
commit. That is inconsistent with what we do if a file has been
deleted and then re-added (we allow updating across that).

Copying the two first examples from issue6163:

  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  o  2 'remove x'
  |
  | o  1 'modify x'
  |/
  o  0 'add x'

  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  | o  2 'modify x'
  | |
  | o  1 'add x'
  |/
  o  0 'base'

When doing `hg rebase -r 1 -d 4` in the first case, it succeeds, but
`hg rebase -r 2 -d 4` in the second case does not. That's because we
chain and filter via commit 0, which does not have file 'x' in the
second case. IMO, that's clearly inconsistent. So this patch removes
the filtering step so it only happens at the end. If a file was
temporarily removed, whether via a merge base or not, it will now
still be considered the same file. That fixes issue6163 for the
changeset-centric case.

Differential Revision: https://phab.mercurial-scm.org/D6603

(please test the fix)
Comment 8 HG Bot 2019-07-11 20:10 UTC
Fixed by https://mercurial-scm.org/repo/hg/rev/819712deac69
Martin von Zweigbergk <martinvonz@google.com>
copies: follow copies across merge base without source file (issue6163)

As in the previous patch, consider these two histories:

  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  o  2 'remove x'
  |
  | o  1 'modify x'
  |/
  o  0 'add x'

  @  4 'rename x to y'
  |
  o  3 'add x again'
  |
  | o  2 'modify x'
  | |
  | o  1 'add x'
  |/
  o  0 'base'

We trace copies from the 'modify x' commit to commit 4 by going via
the merge base (commit 0). When tracing file 'y' (_tracefile()) in the
first case, we immediately find the rename from 'x'. We check to see
if 'x' exists in the merge base, which it does, so we consider it a
valid copy. In the second case, 'x' does not exist in the merge base,
so it's not considered a valid copy. As a workaround, this patch makes
it so we also attempt the check in mergecopies's base commit (commit 1
in the second case). That feels pretty ugly to me, but I don't have
any better ideas.

Note that we actually also check not only that the filename matches,
but also that the file's nodeid matches. I don't know why we do that,
but it was like that already before I rewrote mergecopies(). That
means that the rebase will still fail in cases like this (again, it
already failed before my rewrite):


  @  4 'rename x to y'
  |
  o  3 'add x again with content X2'
  |
  o  2 'remove x'
  |
  | o  1 'modify x to content X2'
  |/
  o  1 'modify x to content X1'
  |
  o  0 'add x with content X0'

Differential Revision: https://phab.mercurial-scm.org/D6604

(please test the fix)
Comment 9 Bugzilla 2019-07-19 00:00 UTC
Bug was set to TESTING for 7 days, resolving
Comment 10 Martin von Zweigbergk 2019-09-12 16:25 UTC
I'm still not sure what the right behavior is here, so I'll reopen it.
Comment 11 Bugzilla 2020-03-01 00:02 UTC
Bug was inactive for 150 days, archiving
Comment 12 Bugzilla 2020-08-01 00:01 UTC
Bug was inactive for 151 days, archiving
Comment 13 Bugzilla 2020-12-30 00:00 UTC
Bug was inactive for 151 days, archiving
Comment 14 Bugzilla 2021-06-05 00:00 UTC
Bug was inactive for 150 days, archiving