[PATCH 02 of 11] scmutil: add filecache, a smart property-like decorator that compares stat info

Matt Mackall mpm at selenic.com
Mon Jul 18 16:29:48 CDT 2011


On Mon, 2011-07-18 at 22:32 +0200, Adrian Buehlmann wrote:
> On 2011-07-18 22:12, Matt Mackall wrote:
> > On Sat, 2011-07-16 at 18:03 +0200, Adrian Buehlmann wrote:
> >> On 2011-07-16 16:34, Idan Kamara wrote:
> >>> # HG changeset patch
> >>> # User Idan Kamara <idankk86 at gmail.com>
> >>> # Date 1310227619 -10800
> >>> # Node ID b99305dd59279aec962e23da2a362e0d8b785965
> >>> # Parent  d36f5aec2f9e4214fafe048bccd0bb47ac5f9c16
> >>> scmutil: add filecache, a smart property-like decorator that compares stat info
> >>>
> >>> The idea is being able to associate a file with a property, and watch
> >>> that file stat info for modifications when we decide it's important for it to
> >>> be up-to-date. Once it changes, we recreate the object.
> >>>
> >>> As a consequence, localrepo.invalidate() will become much less expensive in the
> >>> case where nothing changed on-disk.
> >>>
> >>> diff -r d36f5aec2f9e -r b99305dd5927 mercurial/scmutil.py
> >>> --- a/mercurial/scmutil.py	Sat Jul 16 15:30:43 2011 +0300
> >>> +++ b/mercurial/scmutil.py	Sat Jul 09 19:06:59 2011 +0300
> >>> @@ -709,3 +709,41 @@
> >>>          raise error.RequirementError(_("unknown repository format: "
> >>>              "requires features '%s' (upgrade Mercurial)") % "', '".join(missings))
> >>>      return requirements
> >>> +
> >>> +class filecache(object):
> >>> +    '''A property-like decorator that tracks a file under .hg/ for updates.
> >>> +
> >>> +    Records stat info when called in _invalidatecache.
> >>> +
> >>> +    On subsequent calls, compares old stat info with new info, and recreates
> >>> +    the object when needed, updating the new stat info in _invalidatecache.'''
> >>> +    def __init__(self, path, instore=False):
> >>> +        self.path = path
> >>> +        self.instore = instore
> >>> +
> >>> +    def __call__(self, func):
> >>> +        self.func = func
> >>> +        self.name = func.__name__
> >>> +        return self
> >>> +
> >>> +    def __get__(self, obj, type=None):
> >>> +        path = self.instore and obj.sjoin(self.path) or obj.join(self.path)
> >>> +
> >>> +        if self.name in obj._invalidatecache:
> >>> +            cacheentry = obj._invalidatecache[self.name]
> >>> +            stat = util.stat(path)
> >>> +
> >>> +            if stat != cacheentry[1]:
> >>> +                cacheentry[1] = stat
> >>> +                result = cacheentry[0] = self.func(obj)
> >>> +            else:
> >>> +                result = cacheentry[0]
> >>> +        else:
> >>> +            # stat -before- reading so our cache doesn't lie if someone
> >>> +            # modifies between the time we read+stat it
> >>> +            stat = util.stat(path)
> >>> +            result = self.func(obj)
> >>> +            obj._invalidatecache[self.name] = [result, stat, path]
> >>> +
> >>> +        setattr(obj, self.name, result)
> >>> +        return result
> >>
> >> What happens if the file changes its contents without changing mtime or
> >> size?
> > 
> > Excellent question. Answer: we lose.
> 
> ..
> 
> > We need to cache and compare -the whole stat result-. There's absolutely
> > no reason not to here.
> 
> How does that solve the problem of missing a file change that changes
> file contents without changing size or mtime? (and thus failing to call
> func again)

We've got three buckets we can dump filesystems into:

have subsecond timestamps (e.g. NTFS, Btrfs, ext4, ...):
  changes are detected by comparing timestamps
have inodes (e.g. ext3, HFS+):
  non-append changes are made by atomic rename,
  which results in timestamp changes
neither (e.g. VFAT):
  similar issues (and solutions) to dirstate apply
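
To make the second bucket concrete, here is a rough sketch of why comparing
more than size and mtime helps. It is not the code from the patch: statkey,
atomicwrite and demo are hypothetical names, and the set of stat fields
compared is an assumption chosen for illustration. The file is rewritten with
same-size contents via atomic rename, its mtime is forced back to the old
value to simulate a coarse timestamp, and the comparison still notices the
change because a different inode now sits behind the path.

```python
import os
import tempfile


def statkey(path):
    """Return the stat fields we compare, or None if the file is missing.

    The field choice (size, mtime, ctime, inode) is an assumption made
    for this sketch, not what the patch under review records.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_size, st.st_mtime_ns, st.st_ctime_ns, st.st_ino)


def atomicwrite(path, data):
    """Write data to a temp file in the same directory, then rename over path."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)  # atomic on POSIX when both are on one filesystem


def demo():
    """Rewrite a file with same-size contents and an unchanged mtime."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "bookmarks")
        atomicwrite(path, b"aaaa")
        old = os.stat(path)
        before = statkey(path)

        atomicwrite(path, b"bbbb")
        # Simulate a coarse timestamp: put the old mtime back on the new file.
        os.utime(path, ns=(old.st_atime_ns, old.st_mtime_ns))
        after = statkey(path)
        return before, after


if __name__ == "__main__":
    before, after = demo()
    print("size and mtime equal:", before[:2] == after[:2])
    print("change detected:", before != after)
```

A cache keyed only on (size, mtime) would treat the two states above as
identical; keeping the inode (and ctime) in the comparison is what catches
the rename. On a filesystem without real inodes this buys nothing, which is
the third bucket.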

-- 
Mathematics is the supreme nostalgia of our time.



