[issue1456] Permission changes cause corrupted dirstate with update

Tom Karzes mercurial-bugs at selenic.com
Tue Jan 6 18:09:50 CST 2009


New submission from Tom Karzes <Tom.Karzes at magnumsemi.com>:

This is a pretty serious bug which can cause lost work.  We've noticed the
effects for some time, and I finally managed to identify the cause.  This bug
exists in versions 1.1.2 and earlier (I tested as far back as 1.0.1), but it
became harder to reproduce starting with 1.1 due to a change that was made to
manifestmerge().

The symptom of the bug is that Mercurial loses track of the fact that a working
copy of a file has been modified, resulting in "hg status" failing to show the
change, "hg commit" failing to commit it, etc.  This is due to the dirstate file
being incorrectly updated.

The following scenario will reproduce the bug in versions 1.0.1 through 1.1.2 of
Mercurial.  This example was run on Red Hat Linux, and my umask value was 022.

First, create a repository, and cd to it:

    % hg init testrepo
    % cd testrepo

Now create a file foo.txt, and add and commit it.  Its protection is 644:

    % echo foo > foo.txt
    % ls -l foo.txt
    -rw-r--r--  1 tkarzes vpgrp 4 Jan  6 15:10 foo.txt
    % hg add foo.txt
    % hg ci -m zzz

Now change the protection to 755 and commit it as a new change set:

    % chmod 755 foo.txt
    % ls -l foo.txt
    -rwxr-xr-x  1 tkarzes vpgrp 4 Jan  6 15:10 foo.txt
    % hg ci -m zzz

Now update back to the previous revision:

    % hg update -r 0
    0 files updated, 0 files merged, 0 files removed, 0 files unresolved
    % ls -l foo.txt
    -rw-r--r--  1 tkarzes vpgrp 4 Jan  6 15:10 foo.txt
    %

Now make a genuine content change to the file, but don't commit it:

    % echo yyy >> foo.txt

Let the change age, so that Mercurial trusts its cached dirstate info:

    % sleep 10

At this point the working directory is at rev 0, and the contents of foo.txt
are modified.  Everything is still OK:

    % ls -l foo.txt
    -rw-r--r--  1 tkarzes vpgrp 8 Jan  6 15:11 foo.txt
    % hg status
    M foo.txt
    %

Now we hit the bug by doing an "hg update".  The update should change the
protection of foo.txt, but Mercurial should still know that its contents are
modified:

    % hg update
    0 files updated, 0 files merged, 0 files removed, 0 files unresolved
    % ls -l foo.txt
    -rwxr-xr-x  1 tkarzes vpgrp 8 Jan  6 15:11 foo.txt
    %

But dirstate has been corrupted, and it has lost track of the fact that foo.txt
has been changed:

    % hg status
    %

The change itself has not been lost.  The "yyy" line is still in the file:

    % cat foo.txt
    foo
    yyy
    %

If you change the file's modification time, Mercurial stops trusting the
dirstate info and notices the change again:

    % touch foo.txt
    % hg status
    M foo.txt
    %
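
To illustrate why "touch" brings the change back, here is a minimal Python
sketch of a stat-based cache check.  It is a toy model, not Mercurial's code:
the Entry and Stat tuples, the status() function and the numbers are all made
up for illustration, and it assumes that the bad update re-records foo.txt as
clean using the size and mtime of the already-modified file.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins; not Mercurial's real data structures.
Entry = namedtuple("Entry", "state size mtime")   # what the dirstate cached
Stat = namedtuple("Stat", "size mtime")           # what is on disk now


def status(entry, st, content_differs):
    """Return "M" (modified) or "C" (clean) for one tracked file.

    content_differs is a callable doing the expensive content comparison.
    It is only called when the cached stat data does not match the file.
    """
    if entry.state == "n" and (entry.size, entry.mtime) == (st.size, st.mtime):
        return "C"  # cache matches: assumed clean without reading the file
    return "M" if content_differs() else "C"


def differs():
    return True  # the "yyy" line really is in the working copy


# Assumed effect of the bad update: the entry is re-recorded as clean using
# the stat data of the already-modified file (8 bytes, made-up mtime 1000).
entry = Entry("n", 8, 1000)

print(status(entry, Stat(8, 1000), differs))  # prints C: change is missed
print(status(entry, Stat(8, 1060), differs))  # prints M: after "touch"
```

In this model the content comparison is skipped whenever the cached size and
mtime match, so a wrongly recorded "clean" entry hides the change until the
mtime moves.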

I believe the problem is due to some faulty logic in manifestmerge() in
merge.py, where it compares manifests.  Here is an excerpt from an if-else chain
where it tries to determine if a file has changed:

    # are files different?
    if n != m2[f]:
        a = ma.get(f, nullid)
        # are we clobbering?
        if overwrite:
            act("clobbering", "g", f, rflags)
        # or are we going back in time and clean?
        elif backwards:
            if not n[20:] or not p2[f].cmp(p1[f].data()):
                act("reverting", "g", f, rflags)
        # are both different from the ancestor?
        elif n != a and m2[f] != a:
            act("versions differ", "m", f, f, f, rflags, False)
        # is remote's version newer?
        elif m2[f] != a:
            act("remote is newer", "g", f, rflags)
        # local is newer, not overwrite, check mode bits
        elif m1.flags(f) != rflags:
            act("update permissions", "e", f, rflags)
    # contents same, check mode bits
    elif m1.flags(f) != rflags:
        act("update permissions", "e", f, rflags)

In release 1.1, the "backwards" case was changed to intercept all backwards
cases, which appears to prevent the bug when updating backwards (which includes
updating to the parent revision), but the forward case still has the bug.  In
the forward case, the code reaches the "local is newer, not overwrite, check
mode bits" branch.  If the mode bits have changed, it chooses the "e" action
without considering the contents of the working copy.  This fails if both the
mode and the contents have changed: Mercurial updates the mode and then
disregards the content change, leaving the dirstate corrupt.
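
The path through the chain can be traced with a small stand-alone Python
sketch.  This is a toy version of the excerpt above, not the real
manifestmerge(): choose_action() and its parameters are hypothetical, nodes are
plain strings, and a trailing "+" marks a locally modified working copy (my
reading of the n[20:] test).  It also assumes that rev 0 and rev 1 share the
same file node "X", since only the mode changed between them, and that rflags
comes out as "x" for this update.

```python
def choose_action(n, remote, ancestor, local_flags, rflags,
                  overwrite=False, backwards=False):
    """Toy version of the quoted branch chain.

    Returns a (reason, action) tuple, or None if no action is chosen.
    """
    if n != remote:
        if overwrite:
            return ("clobbering", "g")
        elif backwards:
            # Simplification: only the "working copy is clean" half of the
            # original test is modelled here.
            if not n.endswith("+"):
                return ("reverting", "g")
        elif n != ancestor and remote != ancestor:
            return ("versions differ", "m")
        elif remote != ancestor:
            return ("remote is newer", "g")
        elif local_flags != rflags:
            # Reached with a modified working copy; contents are not checked.
            return ("update permissions", "e")
    elif local_flags != rflags:
        return ("update permissions", "e")
    return None


# The scenario from this report: working copy of rev 0 with local edits
# ("X+"), updating forward to rev 1, which has the same node with mode 755.
print(choose_action("X+", "X", "X", local_flags="", rflags="x"))
# prints ('update permissions', 'e')
```

The modified working copy ends up with the same "e" action as an unmodified
file whose mode changed, which is consistent with the content change being
dropped afterwards.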

Part of the difficulty with this bug is that it's not clear to me what the
design intent is for handling permission changes to files.  If you clone a
repository, then change the permission of a file, "hg status" shows the file as
modified.  However, if you then "hg update", the file permission change is
silently reverted.  This appears to be intentional, but I find it unintuitive.
Regardless, Mercurial should never lose track of the fact that the contents of
the working copy of a file have been changed, which is what's happening here.

This bug occurs quite often for us because we have people editing Unix files
from their Windows PCs, which in many cases causes mode bit changes as a side
effect.

----------
messages: 8340
nosy: tkarzes
priority: bug
status: unread
title: Permission changes cause corrupted dirstate with update
topic: merge, update

____________________________________________________
Mercurial issue tracker <mercurial-bugs at selenic.com>
<http://www.selenic.com/mercurial/bts/issue1456>
____________________________________________________


