[PATCH 7 of 7] context: avoid writing outdated dirstate out (issue5584)

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Fri Jun 9 00:08:27 EDT 2017


# HG changeset patch
# User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
# Date 1496981269 -32400
#      Fri Jun 09 13:07:49 2017 +0900
# Node ID 664e3e012535c9d161f0ecb95c6dc0bfdfe9c672
# Parent  9b4ff10f9db00ae7d91a28067d6b50655f5f957d
context: avoid writing outdated dirstate out (issue5584)

Before this patch, workingctx.status() may cause writing outdated
dirstate out, if:

  - .hg/dirstate is changed simultaneously after last loading it,
  - there is any file, which should be dirstate.normal()-ed

Typical issue case is:

  - the working directory is updated by "hg update"
  - .hg/dirstate is updated in background (e.g. fsmonitor)

This patch compares identities of dirstate before and after
acquisition of wlock, and avoids writing outdated dirstate out, if
change of .hg/dirstate is detected.

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -1763,19 +1763,30 @@ class workingctx(committablectx):
         # update dirstate for files that are actually clean
         if fixup:
             try:
+                oldid = self._repo.dirstate.identity()
+
                 # updating the dirstate is optional
                 # so we don't wait on the lock
                 # wlock can invalidate the dirstate, so cache normal _after_
                 # taking the lock
                 with self._repo.wlock(False):
-                    normal = self._repo.dirstate.normal
-                    for f in fixup:
-                        normal(f)
-                    # write changes out explicitly, because nesting
-                    # wlock at runtime may prevent 'wlock.release()'
-                    # after this block from doing so for subsequent
-                    # changing files
-                    self._repo.dirstate.write(self._repo.currenttransaction())
+                    if self._repo.dirstate.identity() == oldid:
+                        normal = self._repo.dirstate.normal
+                        for f in fixup:
+                            normal(f)
+                        # write changes out explicitly, because nesting
+                        # wlock at runtime may prevent 'wlock.release()'
+                        # after this block from doing so for subsequent
+                        # changing files
+                        tr = self._repo.currenttransaction()
+                        self._repo.dirstate.write(tr)
+                    else:
+                        # in this case, writing changes out breaks
+                        # consistency, because .hg/dirstate was
+                        # already changed simultaneously after last
+                        # caching (see also issue5584 for detail)
+                        self._repo.ui.debug('skip updating dirstate: '
+                                            'identity mismatch\n')
             except error.LockError:
                 pass
         return modified, deleted, fixup
diff --git a/tests/test-dirstate-race.t b/tests/test-dirstate-race.t
--- a/tests/test-dirstate-race.t
+++ b/tests/test-dirstate-race.t
@@ -95,3 +95,65 @@ anyway.
   ! d
   ! dir1/c
   ! e
+
+  $ rmdir d e
+  $ hg update -C -q .
+
+Test that dirstate changes aren't written out at the end of "hg
+status", if .hg/dirstate is already changed simultaneously before
+acquisition of wlock in workingctx._checklookup().
+
+This avoidance is important to keep consistency of dirstate in race
+condition (see issue5584 for detail).
+
+  $ hg parents -q
+  1:* (glob)
+
+  $ hg debugrebuilddirstate
+  $ hg debugdirstate
+  n   0         -1 unset               a
+  n   0         -1 unset               b
+  n   0         -1 unset               d
+  n   0         -1 unset               dir1/c
+  n   0         -1 unset               e
+
+  $ cat > $TESTTMP/dirstaterace.sh <<EOF
+  > # This script assumes timetable of typical issue5584 case below:
+  > #
+  > # 1. "hg status" loads .hg/dirstate
+  > # 2. "hg status" confirms clean-ness of FILE
+  > # 3. "hg update -C 0" updates the working directory simultaneously
+  > #    (FILE is removed, and FILE is dropped from .hg/dirstate)
+  > # 4. "hg status" acquires wlock
+  > #    (.hg/dirstate is re-loaded = no FILE entry in dirstate)
+  > # 5. "hg status" marks FILE in dirstate as clean
+  > #    (FILE entry is added to in-memory dirstate)
+  > # 6. "hg status" writes dirstate changes into .hg/dirstate
+  > #    (FILE entry is written into .hg/dirstate)
+  > #
+  > # To reproduce similar situation easily and certainly, #2 and #3
+  > # are swapped.  "hg cat" below ensures #2 on "hg status" side.
+  > 
+  > hg update -q -C 0
+  > hg cat -r 1 b > b
+  > EOF
+
+"hg status" below should excludes "e", of which exec flag is set, for
+portability of test scenario, because unsure but missing "e" is
+treated differently in _checklookup() according to runtime platform.
+
+- "missing(!)" on POSIX, "pctx[f].cmp(self[f])" raises ENOENT
+- "modified(M)" on Windows, "self.flags(f) != pctx.flags(f)" is True
+
+  $ hg status --config extensions.dirstaterace=$TESTTMP/dirstaterace.py --debug -X path:e
+  skip updating dirstate: identity mismatch
+  M a
+  ! d
+  ! dir1/c
+
+  $ hg parents -q
+  0:* (glob)
+  $ hg files
+  a
+  $ hg debugdirstate
+  n * * * a (glob)


More information about the Mercurial-devel mailing list