[PATCH 1 of 2] transplant: avoid a dirstate race when transplanting multiple changesets

Sat Jan 29 15:11:46 CST 2011

# HG changeset patch
# User Greg Ward <greg-hg at gerg.ca>
# Date 1295964871 18000
# Branch stable
# Node ID 10d88a8557f9eaa7e681bba661d49cf6081b5e5b
# Parent  8dc488dfcdb4e23d929d12cfab4037c77d2e227f
transplant: avoid a dirstate race when transplanting multiple changesets
(issue2264, issue2516)

The race happens when adjacent transplanted changesets change the same
file but keep its size the same.  Each changeset is applied to the
working dir and then re-committed; when the second patch rewrites the
file, its mtime is almost certainly the same as the one dirstate
recorded for the first, so with size and mtime both unchanged dirstate
fails to notice the change.  The result depends on circumstances: if
the transplanted changesets affect only one file, we get a big noisy
crash: "RuntimeError: nothing committed after transplant".  But if
other files are included, we get subtle data loss: the file whose size
does not change is silently dropped from the second transplant.
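
To illustrate the race outside Mercurial, here is a minimal sketch in
plain Python.  It is not dirstate's code: snapshot() and
looks_unchanged() are hypothetical helpers that model a cache trusting
only size and whole-second mtime, and os.utime() stands in for two
writes that land in the same second.

```python
import os
import tempfile

FIXED_MTIME = 1295964871  # arbitrary timestamp, pinned to make the demo deterministic


def snapshot(path):
    """Record the (size, whole-second mtime) pair a stat-based cache might keep."""
    st = os.stat(path)
    return (st.st_size, int(st.st_mtime))


def looks_unchanged(path, recorded):
    """Cheap check: compare only size and mtime, never the file contents."""
    return snapshot(path) == recorded


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bugfix")

    # First patch is applied and "committed": the cache records its stat data.
    with open(path, "w") as f:
        f.write("fix1\n")
    os.utime(path, (FIXED_MTIME, FIXED_MTIME))
    recorded = snapshot(path)

    # Second patch rewrites the file: same size, same second, new content.
    with open(path, "w") as f:
        f.write("fix2\n")
    os.utime(path, (FIXED_MTIME, FIXED_MTIME))

    with open(path) as f:
        content_changed = f.read() != "fix1\n"
    missed = looks_unchanged(path, recorded)

print("content changed:", content_changed)
print("cheap check thinks file is unchanged:", missed)
```

Both values come out true: the content differs, yet a check based only
on size and mtime reports the file as clean.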

I couldn't think of a way to fix this in dirstate, which means other
extensions or scripts that do multiple commits in rapid succession
could suffer the same problem.  But it's not too hard to fix in
transplant: after each commit, mark every file touched by the patch as
"normal lookup" via dirstate.normallookup(), so that repo.status()
compares file contents instead of trusting size and mtime when
committing the next transplant in the series.
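
The idea behind the fix can be sketched with a toy cache.  This is an
assumption-laden model, not Mercurial's implementation: ToyDirstate
and demo() are hypothetical, and only the method name normallookup()
is borrowed from the patch.  An entry flagged for lookup skips the
stat shortcut and always falls back to comparing contents.

```python
import os
import tempfile

FIXED_MTIME = 1295964871  # arbitrary timestamp, pinned to make the demo deterministic


class ToyDirstate:
    """Hypothetical stand-in for a stat cache; not Mercurial's dirstate."""

    def __init__(self):
        self.entries = {}    # path -> (size, mtime), or None meaning "must compare contents"
        self.committed = {}  # path -> bytes as of the last commit

    def commit(self, path):
        """Remember the committed contents and the stat data seen at commit time."""
        with open(path, "rb") as f:
            self.committed[path] = f.read()
        st = os.stat(path)
        self.entries[path] = (st.st_size, int(st.st_mtime))

    def normallookup(self, path):
        """Drop the cached stat data so the next check reads the file."""
        self.entries[path] = None

    def is_modified(self, path):
        entry = self.entries[path]
        if entry is not None:
            st = os.stat(path)
            if (st.st_size, int(st.st_mtime)) == entry:
                return False  # stat shortcut: trusted without reading the file
        with open(path, "rb") as f:
            return f.read() != self.committed[path]


def write_pinned(path, data):
    """Write data and pin the mtime, simulating writes within the same second."""
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (FIXED_MTIME, FIXED_MTIME))


def demo(force_lookup):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bugfix")
        ds = ToyDirstate()
        write_pinned(path, b"fix1\n")
        ds.commit(path)
        if force_lookup:
            ds.normallookup(path)
        write_pinned(path, b"fix2\n")  # same size, same mtime, new content
        return ds.is_modified(path)


print("without lookup flag:", demo(False))
print("with lookup flag:   ", demo(True))
```

Without the flag the second change goes unnoticed; with it, the
content comparison catches the change at the cost of reading the file.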

diff --git a/hgext/transplant.py b/hgext/transplant.py
--- a/hgext/transplant.py
+++ b/hgext/transplant.py
@@ -177,6 +177,16 @@
             lock.release()
             wlock.release()
 
+        # Extra after-the-fact check for one of the symptoms of
+        # issue2264: if this is raised, it's too late and we've already
+        # committed bad changesets.  Not sure if it's worth the overhead
+        # of a status() call.
+        (modified, added, removed) = repo.status()[0:3]
+        if modified or added or removed:
+            raise RuntimeError('uncommitted changes left in working dir '
+                               'after transplanting %d changesets\n'
+                               % len(revs))
+
     def filter(self, filter, changelog, patchfile):
         '''arbitrarily rewrite changeset before applying it'''
 
@@ -252,6 +262,12 @@
             m = match.exact(repo.root, '', files)
 
         n = repo.commit(message, user, date, extra=extra, match=m)
+
+        # force repo.status() to look harder at each file in this patch
+        # when committing the next patch (avoids a dirstate race)
+        for fn in files:
+            repo.dirstate.normallookup(fn)
+
         if not n:
             # Crash here to prevent an unclear crash later, in
             # transplants.write().  This can happen if patch.patch()
diff --git a/tests/test-transplant-multiple.t b/tests/test-transplant-multiple.t
new file mode 100644
--- /dev/null
+++ b/tests/test-transplant-multiple.t
@@ -0,0 +1,80 @@
+# reproduce issue2264, issue2516 (thanks to issue2516 for the original
+# script)
+
+create test repo
+  $ cat <<EOF >> $HGRCPATH
+  > [extensions]
+  > transplant =
+  > graphlog =
+  > EOF
+  $ hg init repo
+  $ cd repo
+  $ template="{rev}  {desc|firstline}  [{branches}]\n"
+
+# we need to start out with two changesets on the default branch
+# in order to avoid the cute little optimization where transplant
+# pulls rather than transplants
+add initial changesets
+  $ echo feature1 > file1
+  $ hg ci -Am"feature 1"
+  adding file1
+  $ echo feature2 >> file2
+  $ hg ci -Am"feature 2"
+  adding file2
+
+# The changes to 'bugfix' are enough to show the bug: in fact, with only
+# those changes, it's a very noisy crash ("RuntimeError: nothing
+# committed after transplant").  But if we modify a second file in the
+# transplanted changesets, the bug is much more subtle: transplant
+# silently drops the second change to 'bugfix' on the floor, and we only
+# see it when we run 'hg status' after transplanting.  Subtle data loss
+# bugs are worse than crashes, so reproduce the subtle case here.
+commit bug fixes on bug fix branch
+  $ hg branch fixes
+  marked working directory as branch fixes
+  $ echo fix1 > bugfix
+  $ echo fix1 >> file1
+  $ hg ci -Am"fix 1"
+  adding bugfix
+  $ echo fix2 > bugfix
+  $ echo fix2 >> file1
+  $ hg ci -Am"fix 2"
+  $ hg glog --template="$template"
+  @  3  fix 2  [fixes]
+  |
+  o  2  fix 1  [fixes]
+  |
+  o  1  feature 2  []
+  |
+  o  0  feature 1  []
+  
+transplant bug fixes onto release branch
+  $ hg update 0
+  1 files updated, 0 files merged, 2 files removed, 0 files unresolved
+  $ hg branch release
+  marked working directory as branch release
+  $ hg transplant 2 3
+  applying [0-9a-f]{12} (re)
+  [0-9a-f]{12} transplanted to [0-9a-f]{12} (re)
+  applying [0-9a-f]{12} (re)
+  [0-9a-f]{12} transplanted to [0-9a-f]{12} (re)
+  $ hg glog --template="$template"
+  @  5  fix 2  [release]
+  |
+  o  4  fix 1  [release]
+  |
+  | o  3  fix 2  [fixes]
+  | |
+  | o  2  fix 1  [fixes]
+  | |
+  | o  1  feature 2  []
+  |/
+  o  0  feature 1  []
+  
+  $ hg status
+  $ hg status --rev 0:4
+  M file1
+  A bugfix
+  $ hg status --rev 4:5
+  M bugfix
+  M file1