Corrupted repositories on NFS

Jesper Noehr jesper at noehr.org
Sun Nov 28 18:30:34 CST 2010


On Mon, Nov 29, 2010 at 11:05 AM, Mads Kiilerich <mads at kiilerich.com> wrote:
> Jesper Noehr wrote, On 11/26/2010 06:01 AM:
>>
>> I modified
>> http://bitbucket.org/mirror/mercurial-crew/src/tip/mercurial/util.py#cl-593
>> (util.readlock) to return a dummy string when os.readlink raises
>> errno.ENOENT, which triggers Mercurial's error.LockHeld and seems to
>> have fixed that race condition.
>>
>> Secondly, unlinking on NFS is not atomic. The recommended way to go
>> about it is to 1. rename the file (which is atomic), and 2. unlink it.
>> Then you get the same guarantees you can get from a normal filesystem.
>> I've modified Mercurial to rename, then unlink, in cases where it
>> deals with lockfiles. That fixes the other race.
>
> ...
>>
>> I'm chiming in here as I'm kind of in the dark about whether this is
>> an actual bug in Mercurial, and whether my fix is actually "good."
>>
>> Any comments appreciated.
>
> Show us the patches ;-)

Okay:

diff -uar mercurial-crew/mercurial/lock.py /home/jnoehr/env/lib/python2.6/site-packages/mercurial/lock.py
--- mercurial-crew/mercurial/lock.py    2010-11-15 16:55:36.000000000 -0600
+++ /home/jnoehr/env/lib/python2.6/site-packages/mercurial/lock.py    2010-11-25 17:52:56.508009888 -0600
@@ -113,7 +113,8 @@
         # held, or can race and break valid lock.
         try:
             l = lock(self.f + '.break', timeout=0)
-            os.unlink(self.f)
+            #os.unlink(self.f)
+            util.unlock(self.f)
             l.release()
         except error.LockError:
             return locker
@@ -126,7 +127,8 @@
             if self.releasefn:
                 self.releasefn()
             try:
-                os.unlink(self.f)
+                util.unlock(self.f)
+#                os.unlink(self.f)
             except OSError:
                 pass


diff -uar mercurial-crew/mercurial/util.py /home/jnoehr/env/lib/python2.6/site-packages/mercurial/util.py
--- mercurial-crew/mercurial/util.py    2010-11-15 16:55:36.000000000 -0600
+++ /home/jnoehr/env/lib/python2.6/site-packages/mercurial/util.py    2010-11-25 19:10:20.987009938 -0600
@@ -19,6 +19,11 @@
 import os, stat, time, calendar, textwrap, unicodedata, signal
 import imp, socket

+def unlock(f):
+    tmp = '%s.fancylock' % f
+    os.rename(f, tmp)
+    os.unlink(tmp)
+
 # Python compatibility

 def sha1(s):
@@ -594,7 +599,9 @@
     try:
         return os.readlink(pathname)
     except OSError, why:
-        if why.errno not in (errno.EINVAL, errno.ENOSYS):
+        if why.errno == errno.ENOENT:
+            return 'dummy'
+        elif why.errno not in (errno.EINVAL, errno.ENOSYS):
             raise
     except AttributeError: # no symlink in os
         pass
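
Read outside the diff, the two util.py changes amount to roughly the
following. This is only a minimal standalone sketch in Python 3 syntax,
not the patch itself: the names unlock and readlock, the '.fancylock'
suffix and the 'dummy' return value come from the hunks above, while
the plain-file fallback at the end of readlock is an assumption about
what the unpatched function does after the except clauses.

```python
import errno
import os


def unlock(path):
    """Remove a lock file by renaming it first, then unlinking the new name.

    Mirrors the unlock() helper added in the util.py hunk: the rename is
    the step the message relies on being atomic.
    """
    tmp = '%s.fancylock' % path
    os.rename(path, tmp)
    os.unlink(tmp)


def readlock(path):
    """Return the lock's symlink target, or 'dummy' if the lock vanished."""
    try:
        return os.readlink(path)
    except OSError as why:
        if why.errno == errno.ENOENT:
            # The lock disappeared between the existence check and the
            # readlink call; report a placeholder holder instead of raising.
            return 'dummy'
        if why.errno not in (errno.EINVAL, errno.ENOSYS):
            raise
    except AttributeError:  # no symlink support in os
        pass
    # Assumption: the lock is a regular file whose content names the holder.
    with open(path) as fp:
        return fp.read()
```

If two processes call unlock() on the same lock at once, the second
os.rename raises OSError (ENOENT), which the release path in lock.py
already swallows with its "except OSError: pass".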


Jesper

