[PATCH 4 of 7] manifest: use specific opener to avoid file stat ambiguity around truncation

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Sep 20 12:14:32 EDT 2016



On 09/16/2016 10:51 PM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1474057496 -32400
> #      Sat Sep 17 05:24:56 2016 +0900
> # Node ID 55bd90d63ad3a76fd4d27d5eabb3fe9a7fa0642b
> # Parent  71b6b49f8a7ab6c894028b9153290f4bbf0f54f6
> manifest: use specific opener to avoid file stat ambiguity around truncation
>
> If steps below occurs at "the same time in sec", all of mtime, ctime
> and size are same between (1) and (3).
>
>   1. append data to 00manifest.i (and close transaction)
>   2. discard appended data by truncation (strip or rollback)
>   3. append same size but different data to 00manifest.i again
>
> Therefore, cache validation doesn't work after (3) as expected.
>
> To avoid such file stat ambiguity around truncation, this patch uses
> specific opener, which forces checkambig=True at writing 00manifest.i
> changes out.
>
> Hooking vfs.__call__() is the only way to centralize changes into
> manifest.py, because all logic to actually write in-memory changes out
> is implemented in revlog layer.
>
> In fact, reusing already wrapped self.opener for dirlogcache entries
> like as below is a little redundant, because storecache-ed properties
> of localrepository doesn't refer file stat of non-root 00manifest.i
> files (= ambiguity of such files doesn't cause any issue).
>
>         if dir not in self._dirlogcache:
>             self._dirlogcache[dir] = manifestrevlog(self.opener, dir,
>                                                     self._dirlogcache)
>
> But wrapped opener never checks ambiguity of file stat for non-root
> 00manifest.i files, because "path == '00manifest.i'" condition in
> wrapper.__call__() is always False for such files. Therefore, this
> patch simply reuses wrapped self.opener for non-root 00manifest.i
> files.
>
> Even after this patch, avoiding file stat ambiguity of 00manifest.i
> around truncation isn't yet completed, because truncation side isn't
> aware of this issue.
>
> This is a part of ExactCacheValidationPlan.
>
>     https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
>
> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
> --- a/mercurial/manifest.py
> +++ b/mercurial/manifest.py
> @@ -892,6 +892,28 @@ class treemanifest(object):
>                  subp1, subp2 = subp2, subp1
>              writesubtree(subm, subp1, subp2)
>
> +def _checkambigopener(opener):
> +    """build an opener to add checkambig=True at changing 00manifest.i
> +    """
> +    class wrapper(opener.__class__):
> +        def __call__(self, path, mode="r", text=False, atomictemp=False,
> +                     notindexed=False, backgroundclose=False, checkambig=False,
> +                      **kwargs):
> +            if path == '00manifest.i' and mode not in ('r', 'rb'):
> +                # check file stat ambiguity at closing forcibly
> +                checkambig = True
> +            supercall = super(wrapper, self).__call__
> +            return supercall(path, mode,
> +                             text=text,
> +                             atomictemp=atomictemp,
> +                             notindexed=notindexed,
> +                             backgroundclose=backgroundclose,
> +                             checkambig=checkambig,
> +                             **kwargs)
> +    opener.__class__ = wrapper

This wrapper logic is a bit scary here. Could we  had the necessary 
minimal hooks in revlog to allow manifest and changelog to hooks in ? 
That would seems more robust/clean than wrapping core from core.

> +
> +    return opener
> +
>  class manifestrevlog(revlog.revlog):
>      '''A revlog that stores manifest texts. This is responsible for caching the
>      full-text manifest contents.
> @@ -927,6 +949,13 @@ class manifestrevlog(revlog.revlog):
>          else:
>              self._dirlogcache = {'': self}
>
> +            # If "dir" isn't empty, this "opener" is reused one via
> +            # "self.opener" of another manifestrevlog. Therefore,
> +            # wrapping opener is needed only at once for root
> +            # 00manifest.i file, to avoid multiple (= redundant)
> +            # wrapping.
> +            opener = _checkambigopener(opener)
> +
>          super(manifestrevlog, self).__init__(opener, indexfile)
>
>      @property
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list