[PATCH 2 of 2] branchmap: recover from out of sync revs and names cache (issue5058)

Gregory Szorc gregory.szorc at gmail.com
Sat Mar 12 20:13:26 EST 2016


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1457831565 28800
#      Sat Mar 12 17:12:45 2016 -0800
# Node ID 0f9104a35c8614eaff672dc9e10badb318db09f6
# Parent  cd89ed3be2b6f5db1d716bc0ffea4ee5480619ac
branchmap: recover from out of sync revs and names cache (issue5058)

There was a bug report of an IndexError during self._names[branchidx].
This could happen if the revs cache contained a reference to a name
at an index that doesn't exist in the names cache. Whether this can
happen reliably or whether it requires filesystem or similar
corruption, I'm not sure.

This patch catches the IndexError and clears the cache.

It feels like blowing away both caches is appropriate because if
one entry in the revs cache points to an invalid entry in the names
cache, then there is no telling how out of sync the two caches are.
It's possible the names cache is completely unrelated to the revs
cache (as unlikely as this hopefully is).

The added test reproduces the stack reported on the bug without the
corresponding code change.

diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
--- a/mercurial/branchmap.py
+++ b/mercurial/branchmap.py
@@ -403,17 +403,32 @@ class revbranchcache(object):
         cachenode, branchidx = unpack(
             _rbcrecfmt, buffer(self._rbcrevs, rbcrevidx, _rbcrecsize))
         close = bool(branchidx & _rbccloseflag)
         if close:
             branchidx &= _rbcbranchidxmask
         if cachenode == '\0\0\0\0':
             pass
         elif cachenode == reponode:
-            return self._names[branchidx], close
+            try:
+                return self._names[branchidx], close
+            except IndexError:
+                # If the names list is out of sync with the revs cache,
+                # all bets are off. Blow away both caches.
+                self._repo.ui.debug('revbranch cache references unknown name '
+                                    'index; resetting\n')
+                self._names = []
+                self._rbcnamescount = 0
+                self._rbcsnameslen = 0
+                self._namesreverse = {}
+                self._rbcrevs = array('c')
+                self._rbcrevs.extend('\0' * (rev * _rbcrecsize))
+                self._rbcrevslen = 0
+
+                return self._branchinfo(rev)
         else:
             # rev/node map has changed, invalidate the cache from here up
             truncate = rbcrevidx + _rbcrecsize
             del self._rbcrevs[truncate:]
             self._rbcrevslen = min(self._rbcrevslen, truncate)
 
         # fall back to slow path and make sure it will be written to disk
         return self._branchinfo(rev)
diff --git a/tests/test-branches.t b/tests/test-branches.t
--- a/tests/test-branches.t
+++ b/tests/test-branches.t
@@ -618,15 +618,43 @@ contain the branch name even though it n
   0000: 19 70 9c 5a 00 00 00 00 dd 6b 44 0d 00 00 00 01 |.p.Z.....kD.....|
   0010: 88 1f e2 b9 00 00 00 01 ac 22 03 33 00 00 00 02 |.........".3....|
   0020: ae e3 9c d1 00 00 00 02 d8 cb c6 1d 00 00 00 01 |................|
   0030: 58 97 36 a2 00 00 00 03 10 ff 58 95 00 00 00 04 |X.6.......X.....|
   0040: ee bb 94 44 00 00 00 02 5f 40 61 bb 00 00 00 02 |...D...._ at a.....|
   0050: bf be 84 1b 00 00 00 02 d3 f1 63 45 80 00 00 02 |..........cE....|
   0060: e3 d4 9c 05 80 00 00 02 e2 3b 55 05 00 00 00 02 |.........;U.....|
   0070: f8 94 c2 56 80 00 00 03                         |...V....|
+
+recovery when rbc-names cache is missing entries
+
+  $ echo "a" > .hg/cache/rbc-names-v1
+
+  $ hg log -r 'branch(a)' -T '{rev}\n' --debug
+  revbranch cache references unknown name index; resetting
+  1
+  2
+  5
+  truncating cache/rbc-revs-v1 to 0
+  $ f --size --hexdump .hg/cache/rbc-*
+  .hg/cache/rbc-names-v1: size=79
+  0000: 61 00 62 00 63 00 61 20 62 72 61 6e 63 68 20 6e |a.b.c.a branch n|
+  0010: 61 6d 65 20 6d 75 63 68 20 6c 6f 6e 67 65 72 20 |ame much longer |
+  0020: 74 68 61 6e 20 74 68 65 20 64 65 66 61 75 6c 74 |than the default|
+  0030: 20 6a 75 73 74 69 66 69 63 61 74 69 6f 6e 20 75 | justification u|
+  0040: 73 65 64 20 62 79 20 62 72 61 6e 63 68 65 73    |sed by branches|
+  .hg/cache/rbc-revs-v1: size=120
+  0000: 00 00 00 00 00 00 00 00 dd 6b 44 0d 00 00 00 00 |.........kD.....|
+  0010: 88 1f e2 b9 00 00 00 00 ac 22 03 33 00 00 00 01 |.........".3....|
+  0020: ae e3 9c d1 00 00 00 01 d8 cb c6 1d 00 00 00 00 |................|
+  0030: 58 97 36 a2 00 00 00 02 10 ff 58 95 00 00 00 03 |X.6.......X.....|
+  0040: ee bb 94 44 00 00 00 01 5f 40 61 bb 00 00 00 01 |...D...._ at a.....|
+  0050: bf be 84 1b 00 00 00 01 d3 f1 63 45 80 00 00 01 |..........cE....|
+  0060: e3 d4 9c 05 80 00 00 01 e2 3b 55 05 00 00 00 01 |.........;U.....|
+  0070: f8 94 c2 56 80 00 00 02                         |...V....|
+
 cache is updated/truncated when stripping - it is thus very hard to get in a
 situation where the cache is out of sync and the hash check detects it
   $ hg --config extensions.strip= strip -r tip --nob
   $ f --size .hg/cache/rbc-revs*
   .hg/cache/rbc-revs-v1: size=112
 
   $ cd ..


More information about the Mercurial-devel mailing list