Corruption issue from filesystem exception.

Sune Foldager cryo at cyanite.org
Fri Feb 13 04:34:36 CST 2009


Some follow-up thoughts.

The problem could well be a virus checker that watches all newly created
files and so picks up the tmpfoo file that mkstemp creates. This prevents
the subsequent unlink, since the checker still has the file open. One
possible fix is to use mktemp instead, which only picks a name and doesn't
create the file. I don't see what is gained from creating the file. If one
is worried about the file names colliding, we could add handling for that
instead.
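
To make the mktemp idea concrete, here is a rough sketch, not the actual
util.rename. It is written for Python 3, and the function name
rename_via_unused_name and the attempts parameter are made up for
illustration. It assumes that a rename onto an existing file raises
FileExistsError, which is what Python 3 does on Windows. Note that
tempfile.mktemp is deprecated because another process can grab the name
before we use it; the retry loop is the "handling for that" mentioned above.

```python
import errno
import os
import tempfile


def rename_via_unused_name(src, dst, attempts=10):
    """Rename src onto dst, moving an existing dst out of the way first.

    Hypothetical helper for illustration only.
    """
    try:
        os.rename(src, dst)
        return
    except FileExistsError:
        # Assumed Windows behaviour: dst exists, so take the slow path.
        pass

    dirname = os.path.dirname(dst) or '.'
    for _ in range(attempts):
        # mktemp only picks a name; unlike mkstemp it creates no file,
        # so there is nothing for a file watcher to open.
        temp = tempfile.mktemp(dir=dirname)
        try:
            os.rename(dst, temp)
        except FileExistsError:
            continue  # the name was taken in the meantime, pick another
        break
    else:
        raise OSError(errno.EEXIST, "no unused temporary name found", dst)

    os.unlink(temp)
    os.rename(src, dst)
```

On platforms where os.rename silently replaces the target, the first call
succeeds and the slow path is never taken.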

-- Sune.


------------------------------------

A little update on my search for the cause of the corruption/inconvenience
problems we encounter in testing on Windows.

The problem stems from util.rename, which has some special code to handle
the fact that Windows can't rename a file onto an existing one. Here it is,
without comments:

    try:
        os.rename(src, dst)
    except OSError, err:
        fd, temp = tempfile.mkstemp(dir=os.path.dirname(dst) or '.')
        os.close(fd)
        os.unlink(temp)
        os.rename(dst, temp)
        os.unlink(temp)
        os.rename(src, dst)

The initial rename always fails on Windows (except when the target file
doesn't exist). The problematic rename is the second. From time to time (but
actually quite often; about _once per day_ currently, in our medium-scale
testing) it fails. This causes an "abort: Cannot create a file when that
file already exists.". It happens with commit and pull (often) and rebase
(only twice in total). Only the rebase scenario causes corruption (since it
involves .i and .d files). For commit and pull it's always the dirstate file
that it fails to rename. The .hg dir then contains:

    .dirstate-foo
    dirstate

and no tmpfoo. This clearly points to the second rename, os.rename(dst,
temp), as the culprit. I don't know what the problem is; I can only
speculate that the os.unlink is somehow delayed by a few ms or so,
preventing the rename from going through. I propose wrapping that rename in
a try-except that retries after a short sleep, perhaps, to remove the
problem. I am planning to test that myself shortly.
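
Roughly what I have in mind for the retry, as an untested sketch in Python 3
syntax. The name rename_with_retry and the attempts/delay defaults are made
up; the right values would have to come out of testing. It assumes the
failure shows up as FileExistsError, which matches the "Cannot create a file
when that file already exists" message.

```python
import os
import time


def rename_with_retry(src, dst, attempts=5, delay=0.01):
    """Call os.rename, retrying briefly while dst still appears to exist.

    Hypothetical helper for illustration only.
    """
    for attempt in range(attempts):
        try:
            os.rename(src, dst)
            return
        except FileExistsError:
            # Give a delayed unlink a moment to take effect, but
            # re-raise once the last attempt has failed too.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```

This would replace the plain os.rename(dst, temp) call in the except branch;
any other kind of OSError still propagates immediately.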

-- 
Sune Foldager



