Bug 3084 - Files can co-exist as largefiles and regular files due to merge (and probably rebase)
Summary: Files can co-exist as largefiles and regular files due to merge (and probably...
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial (show other bugs)
Version: unspecified
Hardware: All All
: normal bug
Assignee: Bugzilla
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-04 09:56 UTC by Na'Tosha Bard
Modified: 2015-01-22 15:04 UTC (History)
9 users (show)

See Also:
Python Version: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Na'Tosha Bard 2011-11-04 09:56 UTC
We ran into a very interesting bug here and I'm not sure what the proper fix
should be.  Test case:

$ hg init foo
$ cd foo/
$ echo "n1" > n1
$ hg add
adding n1
$ hg commit -m "Add as normal file"
$ echo "b1" > b1
$ hg add --large b1
$ hg commit -m "Add as largefile"
$ hg update -r 0
0 files updated, 0 files merged, 1 files removed, 0 files unresolved
getting changed largefiles
0 largefiles updated, 1 removed
$ echo "b2" > b1
$ hg add
adding b1
$ hg commit -m "Add as normal file"
created new head
$ hg merge
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)
getting changed largefiles
1 largefiles updated, 0 removed
$ hg commit -m "Merge two heads"

Now we have "b1" in the history, both as a regular file AND as a largefile.
 Of course this happens because Mercurial is tracking the standin, and not
the actual largefile, but *logically* they should be the same file.

Also, if I make a clone of foo, b1 is immediately modified according to
"status":

$ hg clone foo bar
updating to branch default
3 files updated, 0 files merged, 0 files removed, 0 files unresolved
getting changed largefiles
0 largefiles updated, 0 removed
$ cd bar
$ hg status
M b1

What should we do in this case?  Should we look for a name collision between
the files the standins represent and and regular files on the two sides of
the merge and abort the merge?
Comment 1 Matt Mackall 2011-11-04 11:58 UTC
Adding some clueful people to nosy
Comment 2 Na'Tosha Bard 2011-11-08 06:58 UTC
I think it makes the most sense to fix this by checking for name collisions
between largefiles and regular files in both heads, and aborting the merge
in the event of a name collision.  We can provide an error message
explaining that the user needs to fix one of the heads before he can merge.

Does this seem reasonable?
Comment 3 Greg Ward 2011-11-11 20:02 UTC
It's a merge conflict, so it makes sense to detect at merge time. I'd say
it's a bit more serious than divergent renames, so deserves more than a
warning. It should probably be treated like "local modified file which
remote deleted" -- demand an answer from the user before continuing.

Aborting and insisting that the user fix the conflict with another commit
would be less than optimal IMHO. I can't think of another case where
Mercurial makes you commit to fix a merge conflict.

Also, I suspect it's easy to cook this up with a working dir merge. Thought
experiment:

  # in clone 1:
  hg update
  echo large > a
  hg add --large
  hg commit -m"add a as a largefile"

  # in clone 2:
  hg update
  echo notlarge > a
  hg pull ../clone1
  hg update

That case should be handled the same as merging two heads where a is large
in one and normal in the other.
Comment 4 Martin Geisler 2011-11-30 05:10 UTC
When Na'Tosha told me about this bug, my initial thought was to treat it 
transparently as if both files had been largefiles from the beginning.

But now that I've tested it, I see that largefiles are not file-merged, 
instead you're asked

  largefile very-large has a merge conflict
  keep (l)ocal or take (o)ther?

But would it still make sense to "upgrade" the non-largefile to a largefile 
and then just prompt like that?
Comment 5 Na'Tosha Bard 2011-11-30 10:01 UTC
I think this would be an improvement over the current behavior.  Certainly I
would like to see this implemented rather than leaving it like it is.

I guess ideally, it would be nicest to ask the user if they want the
non-largefile file to be a largefile or if they want the largefile to be
downgraded to a non-largefile file (not likely to happen, but not
impossible), rather than assuming.  Then ask keep local or take other (or
maybe get fancy and detect if they are text then fire up the merge tools).

In any case, this is a really bad bug, and I think an initial patch with
your proposed solution would be a welcome improvement.
Comment 6 Martin Geisler 2011-12-05 08:22 UTC
The logic in mercurial.copies.copies is messed up because of largefiles and 
directory renames are also not detected correctly. I'm adding this here 
since I'm pretty sure it's related.

My test case for directory renames is below. Adding --debug to the second 
merge shows that it works more by chance than by design (I think) since it 
just treats the foo/b1 and foo/b2 largefiles as unknown and untracked files 
in the working copy.

  $ echo "[extensions]" >> $HGRCPATH
  $ echo "largefiles =" >> $HGRCPATH

Create the repository outside of $HOME since largefiles write to
$HOME/.cache/largefiles.

  $ hg init test
  $ cd test
  $ mkdir foo
  $ echo "n1" > foo/n1
  $ echo "b1" > foo/b1
  $ hg add foo/n1
  $ hg add --large foo/b1
  $ hg commit -m "Add files"

Rename

  $ hg rename foo bar   
  moving foo/n1 to bar/n1
  moving .hglf/foo/b1 to .hglf/bar/b1
  $ hg commit -m "Rename"

Create normal file under old directory name

  $ hg update 0
  2 files updated, 0 files merged, 2 files removed, 0 files unresolved
  getting changed largefiles
  1 largefiles updated, 0 removed
  $ echo "n2" > foo/n2
  $ hg add foo/n2
  $ hg commit -m "Add another normal file"
  created new head

  $ hg merge 1
  2 files updated, 0 files merged, 2 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  getting changed largefiles
  0 largefiles updated, 1 removed

  $ ls bar
  b1
  n1
  n2

Create big file under old directory name

  $ hg update 0 -C
  2 files updated, 0 files merged, 3 files removed, 0 files unresolved
  getting changed largefiles
  1 largefiles updated, 1 removed

  $ echo "b2" > foo/b2
  $ hg add foo/b2
  $ hg commit -m "Add another largefile"
  created new head

  $ hg merge 1
  4 files updated, 0 files merged, 2 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  getting changed largefiles
  0 largefiles updated, 0 removed

  $ ls bar
  b1
  b2
  n1
Comment 7 Martin Geisler 2011-12-16 04:18 UTC
I've sent a patch to the mailinglist and it's queued by Matt -- it should 
appear in the stable branch any time now.
Comment 8 HG Bot 2011-12-16 20:00 UTC
Fixed by http://selenic.com/repo/hg/rev/9036c7d106bf
Martin Geisler <mg@aragost.com>
largefiles: handle merges between normal files and largefiles (issue3084)

(please test the fix)
Comment 9 Bugzilla 2012-05-12 09:24 UTC

--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:24 EDT  ---

This bug was previously known as _bug_ 3084 at http://mercurial.selenic.com/bts/issue3084
Comment 10 HG Bot 2014-08-15 13:00 UTC
Fixed by http://selenic.com/repo/hg/rev/1dad76c0afb7
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
largefiles: add test for large/normal conflict at linear merging

Before this patch, there is no explicit test for it: test-issue3084.t
seems to test such conflict only at branch merging.

This patch uses "[debug] dirstate.delaywrite" feature for the tests
expecting "M" status of largefiles, to confirm certainly whether files
are marked unexpectedly as "clean".

(please test the fix)
Comment 11 HG Bot 2014-12-01 19:46 UTC
Fixed by http://selenic.com/repo/hg/rev/442bb30171db
Mads Kiilerich <madski@unity3d.com>
tests: clean-up of largefiles tests in test-issue3084.t

Prepare for adding more test cases to the systematic testing, moving the test
from ac3b3a2d976d to another section.

(please test the fix)
Comment 12 HG Bot 2014-12-01 19:46 UTC
Fixed by http://selenic.com/repo/hg/rev/c90d9ab6777a
Mads Kiilerich <madski@unity3d.com>
tests: add test-issue3084.t cases for 'changed but same' as for 'unchanged'

Use suffix -same for cases where file changed but content is the same - that is
the case where manifestmerge doesn't detect that a file is unchanged.

(The suffix -id is already used for cases where the file didn't change - that
is the trivial case where manifestmerge detects that the file is unchanged.)

These new tests are good but the results are bad. There shouldn't be any merge
conflicts or prompts when one side didn't change.

(please test the fix)
Comment 13 HG Bot 2014-12-10 18:31 UTC
Fixed by http://selenic.com/repo/hg/rev/38e55e55ae4d
Martin von Zweigbergk <martinvonz@google.com>
largefiles: rewrite merge code using dictionary with entry per file

In overridecalculateupdates(), we currently only deal with conflicts
that result in a 'g' action for either the largefile or a standin. We
will soon want to deal cases with 'cd' and 'dc' actions here. It will
be easier to reason about such cases if we rewrite it using a dict
from filename to action.

A side-effect of this change is that the output can only have one
action per file (which should be a good change). Before this change,
when one of the tests in test-issue3084 received this input (the 'a'
in the input was a result of 'cd' conflict resolved in favor of the
modified file):

  'g': [('.hglf/f', ('',), 'remote created')],
  'a': [('f', None, 'prompt keep')],

and the user chose to keep the local largefile, it produced this
output:


  'g': [('.hglf/f', ('',), 'remote created')],
  'r': [('f', None, 'replaced by standin')],
  'a': [('f', None, 'prompt keep')],

Although 'a' actions are processed after 'r' actions by
recordupdates(), it still worked because 'a' actions have no effect on
merges (only on updates). After this change, the output is:

  'g': [('.hglf/f', ('',), 'remote created')],
  'r': [('f', None, 'replaced by standin')],

Similarly, there are several tests in test-largefiles-update that get
inputs like:

  'a': [('.hglf/large2', None, 'prompt keep')],
  'g': [('large2', ('',), 'remote created')],

and when the user chooses to keep the local largefile, they produce
this output:

  'a': [('.hglf/large2', None, 'prompt keep'),
        ('.hglf/large2', None, 'keep standin')],
  'lfmr': [('large2', None, 'forget non-standin largefile')],

In this case, it was not a merge but an update, so the 'a' action does
have an effect. However, since dirstate.add() is idempotent, it still
has no obserable effect.

After this change, the output is:

  'a': [('.hglf/large2', None, 'keep standin')],
  'lfmr': [('large2', None, 'forget non-standin largefile')],

(please test the fix)
Comment 14 Matt Mackall 2015-01-22 15:04 UTC
Bulk testing -> fixed