[PATCH 2 of 2 LOG-SPEED-FIX] repoview: cache filtered changelog

Pierre-Yves David pierre-yves.david at ens-lyon.org
Fri Jan 18 16:49:57 CST 2013


# HG changeset patch
# User Pierre-Yves David <pierre-yves.david at logilab.fr>
# Date 1358549012 -3600
# Node ID 8de68237ff6885eac13b2c1d64eb3ad8c4cae562
# Parent  b6f71d62f8f7a1fb027891ee0acee2e3a015d8b9
repoview: cache filtered changelog

Creating a new changelog object for each access is costly and prevents efficient
caching on the changelog side. This introduced a 5x performance regression in log
because chunks read from disk were never reused. We were jumping from about 100
disk reads to about 20 000.

This changeset introduces a simple cache mechanism that keeps the last changelog
object created by a repoview. The changelog is reused until the changelog or the
filtering changes.

The cache invalidation is much more complicated than it should be, but the MQ
tests show a strange cache desync. I was unable to track down the source of this
desync in decent time, so I'm not sure whether the issue is in MQ or core.
However, given the proximity to the 2.5 freeze, I'm choosing the inelegant but
safe route that makes the cache mechanism safer.

diff --git a/mercurial/repoview.py b/mercurial/repoview.py
--- a/mercurial/repoview.py
+++ b/mercurial/repoview.py
@@ -153,20 +153,43 @@ class repoview(object):
     """
 
     def __init__(self, repo, filtername):
         object.__setattr__(self, '_unfilteredrepo', repo)
         object.__setattr__(self, 'filtername', filtername)
+        object.__setattr__(self, '_clcachekey', None)
+        object.__setattr__(self, '_clcache', None)
 
     # not a cacheproperty on purpose we shall implement a proper cache later
     @property
     def changelog(self):
         """return a filtered version of the changeset
 
         this changelog must not be used for writing"""
         # some cache may be implemented later
-        cl = copy.copy(self._unfilteredrepo.changelog)
-        cl.filteredrevs = filterrevs(self._unfilteredrepo, self.filtername)
+        unfi = self._unfilteredrepo
+        unfichangelog = unfi.changelog
+        revs = filterrevs(unfi, self.filtername)
+        cl = self._clcache
+        newkey = (len(unfichangelog), unfichangelog.tip(), hash(revs))
+        if cl is not None:
+            # we need to check curkey too for some obscure reason.
+            # MQ tests show a corruption of the underlying repo (in _clcache)
+            # without any change in the cachekey.
+            oldfilter = cl.filteredrevs
+            try:
+                cl.filteredrevs = ()  # disable filtering for tip
+                curkey = (len(cl), cl.tip(), hash(oldfilter))
+            finally:
+                cl.filteredrevs = oldfilter
+            if newkey != self._clcachekey or newkey != curkey:
+                cl = None
+        # could have been made None by the previous if
+        if cl is None:
+            cl = copy.copy(unfichangelog)
+            cl.filteredrevs = revs
+            object.__setattr__(self, '_clcache', cl)
+            object.__setattr__(self, '_clcachekey', newkey)
         return cl
 
     def unfiltered(self):
         """Return an unfiltered version of a repo"""
         return self._unfilteredrepo
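
For readers who want to see the idea in isolation, the caching scheme described
in the commit message can be sketched roughly as follows. This is a minimal
standalone illustration, not Mercurial code: FakeChangelog, CachedView,
computefiltered and the copies counter are hypothetical names made up for the
example, and the sketch leaves out the extra curkey check that the patch adds
to work around the MQ desync.

```python
import copy


class FakeChangelog(object):
    """Hypothetical stand-in for a changelog: just a list of node ids."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.filteredrevs = frozenset()

    def __len__(self):
        return len(self.nodes)

    def tip(self):
        return self.nodes[-1] if self.nodes else None


class CachedView(object):
    """Hypothetical view that reuses its filtered copy while the key matches."""

    def __init__(self, unfiltered, computefiltered):
        self._unfiltered = unfiltered
        # assumption: a callable returning a frozenset of filtered revisions
        self._computefiltered = computefiltered
        self._cache = None
        self._cachekey = None
        self.copies = 0  # number of times a fresh copy had to be made

    @property
    def changelog(self):
        unfi = self._unfiltered
        revs = self._computefiltered(unfi)
        # same shape of key as in the patch: length, tip, hash of the filter
        newkey = (len(unfi), unfi.tip(), hash(revs))
        if self._cache is None or newkey != self._cachekey:
            cl = copy.copy(unfi)  # shallow copy, as in the patch
            cl.filteredrevs = revs
            self._cache = cl
            self._cachekey = newkey
            self.copies += 1
        return self._cache


hidden = set()
base = FakeChangelog(['n0', 'n1'])
view = CachedView(base, lambda cl: frozenset(hidden))

first = view.changelog
assert view.changelog is first   # nothing changed: the cached object is reused

base.nodes.append('n2')          # the underlying changelog grows
assert view.changelog is not first

hidden.add(0)                    # the filtering changes
assert view.changelog.filteredrevs == frozenset([0])
```

The key is cheap to compute compared with copying the changelog, so repeated
accesses with no intervening change return the same object and can benefit from
whatever that object has already cached.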

