Corrupted repositories on NFS

Nicolas Dumazet nicdumz at gmail.com
Mon Nov 29 04:50:08 CST 2010


2010/11/26 Jesper Noehr <jesper at noehr.org>:
> I'm chiming in here as I'm kind of in the dark whether this is an
> actual bug in Mercurial, and whether my fix is actually "good."
>
> Any comments appreciated.

It's not the first time I've heard about NFS bugs.

In fact, the second most common troubleshooting question on IRC for
strange bugs, after "do you use inotify?", is usually "do you access
your repo over NFS, or use any other unusual filesystem?" ;)


1) The first patch looks OK. Returning a dummy value (-1? 'race'?) will
make testlock hit a ValueError, which effectively raises LockHeld and
forces us to retry creating the lock.

2) I'm not an NFS expert, but a bit of googling suggests that the atomic
way to delete a file is indeed rename+unlink, so the second patch (a
cleaner version of it ;) makes sense to me. As a note, lock.release
swallows OSErrors after an unlink; we probably want to include this in
util.unlock?
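
For reference, the rename+unlink idea could be sketched like this. It
is only an illustration, not the patch under discussion: the function
name and the temporary-name scheme are made up, and it assumes the
rename is the step that makes the original name disappear atomically.
The OSError on the final unlink is swallowed, mirroring what
lock.release does.

```python
import os
import uuid


def rename_then_unlink(path):
    """Remove path by renaming it to a unique name, then unlinking that."""
    # Hypothetical naming scheme: a random suffix in the same directory,
    # so the rename never crosses a filesystem boundary.
    temp = "%s.%s.deleted" % (path, uuid.uuid4().hex)
    # After this call the original name is gone; errors here propagate.
    os.rename(path, temp)
    try:
        os.unlink(temp)
    except OSError:
        # Best effort: the original name is already free, so a failure
        # to remove the renamed file is ignored.
        pass
```

The trade-off is that a failed unlink leaves a stray ".deleted" file
behind instead of a lock that looks like it is still held.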

The trace you're seeing, however, does not look too good. Supposedly,
the "wctx = self[None]; merge = len(wctx.parents()) > 1" sequence
happens while we hold the wlock, so we should be the only ones operating
on the repo...
Can you try to log more, to see what's happening?

>
>
> Jesper



-- 
Nicolas Dumazet — NicDumZ

