[PATCH 1 of 6 V2] obsstore: add a 'cachekey' method

Pierre-Yves David pierre-yves.david at ens-lyon.org
Sat May 20 15:30:15 UTC 2017


# HG changeset patch
# User Pierre-Yves David <pierre-yves.david at octobus.net>
# Date 1495191830 -7200
#      Fri May 19 13:03:50 2017 +0200
# Node ID 221be1ef98902fa695a709371f75e63f9b3e950a
# Parent  566cfe9cbbb9b163bb58c8666759a634badacdd7
# EXP-Topic obscache
# Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/
#              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ -r 221be1ef9890
obsstore: add a 'cachekey' method

Parsing the full obsstore is slow, so caches that depend on obsstore content
need a way to know whether the obsstore changed, and whether that change was
append-only.

For this purpose we introduce an official cachekey for the obsstore. This cache
key works in a way similar to the '(tiprev, tipnode)' pair used for the
changelog. We use the size of the obsstore file and the SHA-1 hash of its tail
(up to 200 bytes). That way, we can check whether the obsstore grew and whether
the content we knew is still present in the obsstore.

This will be used in later changesets by caches related to the obsolete
property.
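
As an illustration of the intended use, here is a minimal standalone sketch of
the same idea, not the code from this patch. It assumes Python 3 and a plain
file path instead of 'self.svfs'. The names 'KEYSIZE', 'NULLID' and the
'classify' helper are hypothetical and exist only for this example. A consumer
stores the '(size, key)' pair next to its cached data, then later recomputes
the key at the stored size to tell an unchanged file from an append-only
change or a rewrite.

```python
import hashlib
import os

KEYSIZE = 200        # mirrors '_obskeysize' from the patch
NULLID = b"\0" * 20  # same value as mercurial.node.nullid


def cachekey(path, index=None):
    """Return (current-size, key) for the file at 'path'.

    'key' is the SHA-1 digest of the (up to) KEYSIZE bytes ending at
    'index', NULLID when there is no data, and None when 'index' lies
    beyond the end of the file.
    """
    try:
        with open(path, "rb") as f:
            f.seek(0, os.SEEK_END)
            size = f.tell()
            if index is None:
                index = size
            elif size < index:
                return size, None
            length = min(index, KEYSIZE)
            f.seek(index - length)
            data = f.read(length)
    except FileNotFoundError:
        # a missing file is treated like an empty one
        return 0, NULLID
    key = hashlib.sha1(data).digest() if data else NULLID
    return size, key


def classify(path, cached):
    """Compare a stored (size, key) pair with the file on disk.

    Hypothetical helper: returns 'valid', 'append-only' or 'invalid'.
    """
    oldsize, oldkey = cached
    size, key = cachekey(path, index=oldsize)
    if key != oldkey:
        return "invalid"      # truncated or rewritten: rebuild the cache
    if size == oldsize:
        return "valid"        # nothing changed
    return "append-only"      # only need to process bytes after 'oldsize'
```

Because only the tail is hashed, a rewrite that leaves those bytes untouched
would go unnoticed. This is the same trade-off as the '(tiprev, tipnode)'
check on the changelog.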

diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
--- a/mercurial/obsolete.py
+++ b/mercurial/obsolete.py
@@ -70,6 +70,7 @@ comment associated with each format for 
 from __future__ import absolute_import
 
 import errno
+import hashlib
 import struct
 
 from .i18n import _
@@ -547,6 +548,8 @@ class obsstore(object):
     # parents: (tuple of nodeid) or None, parents of precursors
     #          None is used when no data has been recorded
 
+    _obskeysize = 200
+
     def __init__(self, svfs, defaultformat=_fm1version, readonly=False):
         # caches for various obsolescence related cache
         self.caches = {}
@@ -574,6 +577,46 @@ class obsstore(object):
 
     __bool__ = __nonzero__
 
+    def cachekey(self, index=None):
+        """return (current-length, cachekey)
+
+        'current-length' is the current length of the obsstore storage file,
+        'cachekey' is the hash of the (up to) 200 bytes ending at 'index'.
+
+        If 'index' is unspecified, current obsstore length is used.
+        'cachekey' will be set to nullid if the obsstore is empty.
+        'current-length' is -always- the current obsstore length, regardless of
+        the 'index' value.
+
+        If the index specified is higher than the current obsstore file
+        length, cachekey will be set to None."""
+        # default value
+        obsstoresize = 0
+        keydata = ''
+        # try to get actual data from the obsstore
+        try:
+            with self.svfs('obsstore') as obsfile:
+                obsfile.seek(0, 2)
+                obsstoresize = obsfile.tell()
+                if index is None:
+                    index = obsstoresize
+                elif obsstoresize < index:
+                    return obsstoresize, None
+                actualsize = min(index, self._obskeysize)
+                if actualsize:
+                    obsfile.seek(index - actualsize, 0)
+                    keydata = obsfile.read(actualsize)
+        except (OSError, IOError) as e:
+            if e.errno != errno.ENOENT:
+                raise
+        if keydata:
+            key = hashlib.sha1(keydata).digest()
+        else:
+            # reusing an existing "empty" value makes it easier to define a
+            # default cachekey for 'no data'.
+            key = node.nullid
+        return obsstoresize, key
+
     @property
     def readonly(self):
         """True if marker creation is disabled

