[PATCH 2 of 2 v2 stable] rbc: fix superfluous rebuilding from scratch - don't abuse self._rbcnamescount

Augie Fackler raf at durin42.com
Wed Jul 20 11:19:56 EDT 2016


On Wed, Jul 20, 2016 at 12:20:05AM +0200, Mads Kiilerich wrote:
> # HG changeset patch
> # User Mads Kiilerich <madski at unity3d.com>
> # Date 1468873509 -7200
> #      Mon Jul 18 22:25:09 2016 +0200
> # Branch stable
> # Node ID db137cbc98afc3a640d31a05f5d2ead333956922
> # Parent  33a9257663527f333478cc9a9e12bcdfa68e1b85
> rbc: fix superfluous rebuilding from scratch - don't abuse self._rbcnamescount

queued these for stable, thanks

>
> The code used self._rbcnamescount as if it was the length of self._names ...
> but actually it is just the number of good entries on disk. This caused the
> cache to be populated inefficiently. In some cases very inefficiently.
>
> Instead of checking the length before lookup, just try a lookup in self._names
> - that is also in most cases faster.
>
> Comments and debug messages are tweaked to help understanding the issue
> and the fix.
>
> diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
> --- a/mercurial/branchmap.py
> +++ b/mercurial/branchmap.py
> @@ -358,7 +358,7 @@ class revbranchcache(object):
>          self._repo = repo
>          self._names = [] # branch names in local encoding with static index
>          self._rbcrevs = array('c') # structs of type _rbcrecfmt
> -        self._rbcsnameslen = 0
> +        self._rbcsnameslen = 0 # length of names read at _rbcsnameslen
>          try:
>              bndata = repo.vfs.read(_rbcnames)
>              self._rbcsnameslen = len(bndata) # for verification before writing
> @@ -380,7 +380,8 @@ class revbranchcache(object):
>                                 len(repo.changelog))
>          if self._rbcrevslen == 0:
>              self._names = []
> -        self._rbcnamescount = len(self._names) # number of good names on disk
> +        self._rbcnamescount = len(self._names) # number of names read at
> +                                               # _rbcsnameslen
>          self._namesreverse = dict((b, r) for r, b in enumerate(self._names))
>
>      def _clear(self):
> @@ -416,13 +417,17 @@ class revbranchcache(object):
>          if cachenode == '\0\0\0\0':
>              pass
>          elif cachenode == reponode:
> -            if branchidx < self._rbcnamescount:
> +            try:
>                  return self._names[branchidx], close
> -            # referenced branch doesn't exist - rebuild is expensive but needed
> -            self._repo.ui.debug("rebuilding corrupted revision branch cache\n")
> -            self._clear()
> +            except IndexError:
> +                # recover from invalid reference to unknown branch
> +                self._repo.ui.debug("referenced branch names not found"
> +                    " - rebuilding revision branch cache from scratch\n")
> +                self._clear()
>          else:
>              # rev/node map has changed, invalidate the cache from here up
> +            self._repo.ui.debug("history modification detected - truncating "
> +                "revision branch cache to revision %s\n" % rev)
>              truncate = rbcrevidx + _rbcrecsize
>              del self._rbcrevs[truncate:]
>              self._rbcrevslen = min(self._rbcrevslen, truncate)
> diff --git a/tests/test-branches.t b/tests/test-branches.t
> --- a/tests/test-branches.t
> +++ b/tests/test-branches.t
> @@ -587,6 +587,8 @@ recovery from invalid cache file with so
>    $ f --size .hg/cache/rbc-revs*
>    .hg/cache/rbc-revs-v1: size=120
>    $ hg log -r 'branch(.)' -T '{rev} ' --debug
> +  history modification detected - truncating revision branch cache to revision 13
> +  history modification detected - truncating revision branch cache to revision 1
>    3 4 8 9 10 11 12 13 truncating cache/rbc-revs-v1 to 8
>    $ rm -f .hg/cache/branch* && hg head a -T '{rev}\n' --debug
>    5
> @@ -632,7 +634,7 @@ situation where the cache is out of sync
>  cache is rebuilt when corruption is detected
>    $ echo > .hg/cache/rbc-names-v1
>    $ hg log -r '5:&branch(.)' -T '{rev} ' --debug
> -  rebuilding corrupted revision branch cache
> +  referenced branch names not found - rebuilding revision branch cache from scratch
>    8 9 10 11 12 13 truncating cache/rbc-revs-v1 to 40
>    $ f --size --hexdump .hg/cache/rbc-*
>    .hg/cache/rbc-names-v1: size=79
> @@ -688,16 +690,13 @@ Test for multiple incorrect branch cache
>    0010: 56 46 78 69 00 00 00 01                         |VFxi....|
>    $ : > .hg/cache/rbc-revs-v1
>
> +No superfluous rebuilding of cache:
>    $ hg log -r "branch(null)&branch(branch)" --debug
> -  rebuilding corrupted revision branch cache
> -  rebuilding corrupted revision branch cache
> -  truncating cache/rbc-revs-v1 to 8
> -BUG: the cache was declared corrupt multiple times and not fully rebuilt:
>    $ f --size --hexdump .hg/cache/rbc-*
>    .hg/cache/rbc-names-v1: size=14
>    0000: 64 65 66 61 75 6c 74 00 62 72 61 6e 63 68       |default.branch|
>    .hg/cache/rbc-revs-v1: size=24
> -  0000: 00 00 00 00 00 00 00 00 fa 4c 04 e5 00 00 00 00 |.........L......|
> +  0000: 66 e5 f5 aa 00 00 00 00 fa 4c 04 e5 00 00 00 00 |f........L......|
>    0010: 56 46 78 69 00 00 00 01                         |VFxi....|
>
>    $ cd ..
> diff --git a/tests/test-rebase-conflicts.t b/tests/test-rebase-conflicts.t
> --- a/tests/test-rebase-conflicts.t
> +++ b/tests/test-rebase-conflicts.t
> @@ -302,6 +302,7 @@ Check that the right ancestors is used w
>    bundle2-input-part: total payload size 1713
>    bundle2-input-bundle: 0 parts total
>    invalid branchheads cache (served): tip differs
> +  history modification detected - truncating revision branch cache to revision 9
>    rebase completed
>    updating the branch cache
>    truncating cache/rbc-revs-v1 to 72
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list