[PATCH 2 of 2] scmutil: add filecache, a smart property-like decorator that compares stat info

Adrian Buehlmann adrian at cadifra.com
Wed Jul 20 10:40:04 CDT 2011


On 2011-07-20 16:55, Idan Kamara wrote:
> # HG changeset patch
> # User Idan Kamara <idankk86 at gmail.com>
> # Date 1310227619 -10800
> # Node ID 50e3e6d522fd1545c77017c6b5220df0dce9d088
> # Parent  f0d36f770ceb2a335fb8643c3687da8fefaf1bf4
> scmutil: add filecache, a smart property-like decorator that compares stat info
> 
> The idea is being able to associate a file with a property, and watch
> that file stat info for modifications when we decide it's important for it to
> be up-to-date. Once it changes, we recreate the object.
> 
> As a consequence, localrepo.invalidate() will become much less expensive in the
> case where nothing changed on-disk.
> 
> diff -r f0d36f770ceb -r 50e3e6d522fd mercurial/scmutil.py
> --- a/mercurial/scmutil.py	Wed Jul 20 17:47:37 2011 +0300
> +++ b/mercurial/scmutil.py	Sat Jul 09 19:06:59 2011 +0300
> @@ -709,3 +709,49 @@
>          raise error.RequirementError(_("unknown repository format: "
>              "requires features '%s' (upgrade Mercurial)") % "', '".join(missings))
>      return requirements
> +
> +class filecache(object):
> +    '''A property like decorator that tracks a file under .hg/ for updates.
> +
> +    Records stat info when called in _invalidatecache.
> +
> +    On subsequent calls, compares old stat info with new info, and recreates
> +    the object when needed, updating the new stat info in _invalidatecache.
> +
> +    The underlying filesystem must provide either st_ino or subsecond precision
> +    to ensure the cache is reliable. Otherwise, we always reread the file.'''
> +    def __init__(self, path, instore=False):
> +        self.path = path
> +        self.instore = instore
> +
> +    def __call__(self, func):
> +        self.func = func
> +        self.name = func.__name__
> +        return self
> +
> +    def __get__(self, obj, type=None):
> +        path = self.instore and obj.sjoin(self.path) or obj.join(self.path)
> +
> +        if self.name in obj._invalidatecache:
> +            cacheentry = obj._invalidatecache[self.name]
> +            stat = util.stat(path)
> +
> +            if stat != cacheentry[1] or not self._cacheable(stat):

I think if we have seen _cacheable returning False a single time, we
should never try again for this filecache object and always call
self.func(obj) for the rest of the lifetime of self.

Also, this 'if' clause looks defective to me. If stat != cacheentry[1]
then this code caches the func call without calling and checking
_cacheable. But we should check _cacheable before making a cache entry
and we shouldn't make a cache entry if _cacheable returns False.

> +                cacheentry[1] = stat
> +                result = cacheentry[0] = self.func(obj)
> +            else:
> +                result = cacheentry[0]
> +        else:
> +            # stat -before- reading so our cache doesn't lie if someone
> +            # modifies between the time we read+stat it
> +            stat = util.stat(path)
> +            result = self.func(obj)
> +            obj._invalidatecache[self.name] = [result, stat, path]
> +
> +        setattr(obj, self.name, result)
> +        return result
> +
> +    def _cacheable(self, st):
> +        '''Check if we have inode info or subsecond precision. This is necessary
> +        to ensure that our cache is reliable.'''
> +        return not st or (st.st_ino or st.st_mtime != int(st.st_mtime))

Perhaps _cacheable should be a non-member function and reside in util,
so we have a chance to have split posix / windows implementations (that
is: platform specific implementations).



More information about the Mercurial-devel mailing list