[PATCH 12 of 14] vfs: add the possibility to have a "ward" to check vfs usage

Yuya Nishihara yuya at tcha.org
Sat Jul 8 03:36:17 EDT 2017


On Fri, 07 Jul 2017 18:48:44 +0200, Boris Feld wrote:
> On Fri, 2017-07-07 at 21:48 +0900, Yuya Nishihara wrote:
> > On Thu, 06 Jul 2017 20:53:50 +0200, Boris Feld wrote:
> > > On Wed, 2017-07-05 at 23:55 +0900, Yuya Nishihara wrote:
> > > > On Sun, 02 Jul 2017 04:56:37 +0200, Pierre-Yves David wrote:
> > > > > # HG changeset patch
> > > > > # User Pierre-Yves David <pierre-yves.david at ens-lyon.org>
> > > > > # Date 1470323266 -7200
> > > > > #      Thu Aug 04 17:07:46 2016 +0200
> > > > > # Node ID ebf81efdd6d7ff15c64683933635616c00475688
> > > > > # Parent  34b8be7f0420b07d0f7b71379c6055e5b26223d5
> > > > > # EXP-Topic vfs.ward
> > > > > # Available At https://www.mercurial-scm.org/repo/users/marmout
> > > > > e/me
> > > > > rcurial/
> > > > > #              hg pull https://www.mercurial-scm.org/repo/users
> > > > > /mar
> > > > > moute/mercurial/ -r ebf81efdd6d7
> > > > > vfs: add the possibility to have a "ward" to check vfs usage
> > > > > The feature has some similarity with the 'pathauditor', but
> > > > > further
> > > > > digging
> > > > > show it make more sense to keep them separated. The path
> > > > > auditor is
> > > > > fairly
> > > > > autonomous (create from vfs.__init__ without any extra
> > > > > information), and check
> > > > > very generic details. On the other hand, the wards will have
> > > > > deeper
> > > > > knownledge
> > > > > of the Mercurial logic and is created at the local repository
> > > > > level. Mixing the
> > > > > two would add much complexity for no real gains.
> > > > 
> > > > My gut feeling is that vfs isn't the right place if they require
> > > > deeper
> > > > knowledge of repository/dirstate logic. And the whole weakref
> > > > business
> > > > floating around wards, hooks and transaction seems like just
> > > > getting
> > > > around
> > > > layering violation.
> > > 
> > > We pondered on this choice a long time and couldn't find any other
> > > layer that is used by both Mercurial core and the extensions to
> > > access
> > > the file system. It seems to be the highest abstraction layer we
> > > could
> > > hook into without introducing a new one and updating all callers.
> > > 
> > > Our reflexion is, as the vfs classes comes from the scmutil file,
> > > it
> > > seems okay to have scm related logic in that layer.
> > 
> > IIRC, one possible alternative was to move lock itself to a vfs as
> > the
> > whole vfs is theoretically covered by the lock.
> 
> The practice divert a bit from the theory. For example, the
> ".hg/bookmark" file is in theory covered by the 'wlock'. In practice it
> is now covered by the transaction and therefor the 'lock'. On the other
> hand there are multiple files handled by the vfs ('hgrc', 'dirstate',
> 'store/', etc) that are not covered by the wlock.
> 
> Right now, all these exceptions are stored on the repository, and
> extension can update them. To us, the repository layers seems the right
> layer for logic regarding individual file semantic. We also want to add
> some logic regarding open transaction to the ward, something unrelated
> to the vfs.
> 
> The initial motivation for this series (beside catching bug) is to make
> it safer to introduce new vfs. For example, the share extensions
> suffers from lack of cache sharing, each share has its own '.hg/cache/'
> directory. Introducing a 'repo.cvfs' dedicated to cache could fix that,
> but we wants to warn people still accessing 'cache/' through
> 'repo.vfs'.
>
> To push that logic further, "share" actually needs a better distinction
> between the working copy specific piece and the more generic pieces. An
> extra split of "repo.vfs" could take care of that. However, some of the
> file, like bookmark, can be shared or not depending on some config and
> share option. So the ward needs access to that logic (already living on
> the repository) too.

It's probably okay to handle simple repo-related business by hooking vfs
calls (via auditor/ward), but the future story sounds pretty complicated.
I don't think it's great idea to introduce many-consumers <-> many-vfs
relation mediated by a low-level hook, "ward".

> We need to eventually rework the repository format to a more race free
> data structure. This will likely involve more vfs to access data
> dispatched in various directory (eg: append-only vs full update). The
> ward should help a lot here but it need to be aware of many high level
> logic regarding file semantic.

At this point, I think we'll need a proper storage layer which shadows
low-level vfs operation.

> We plan to add more vfs-s and the current locking design seems better
> than having one lock per vfs, as multiple vfs will be covered by the
> same lock. If we move the locks on the vfs layer, we either would need
> to update it again later or introduce much more complex code to manage
> the interaction between vfs-s locks.

Yeah, it's annoying vfs itself has physical layering violation.

> > > > Do you have any idea how to expand this ward() callback to the
> > > > other
> > > > vfs
> > > > operations, which have different set of parameters?
> > > 
> > > All other operations (move, delete, copy, etc) could be
> > > "translated" to
> > > 'read' or 'write' mode. An alternative would be to abstract the
> > > mode
> > > with a couple of symbolic constant "create, update, append, delete,
> > > …".
> > > What do you think?
> > 
> > So it will be quite similar to audit(path) if it had a mode flag?
> 
> In the end likely, but updating vfs code to call audit for each
> operation and the audit functions source code right now would requires
> too much works for this series.

Really? IIUC, it's just to add mode parameter to auditor and wrap vfs.audit
by a ward function.

FWIW, a ward object could manage lock, wlock and transaction to get rid of
the weakref.


More information about the Mercurial-devel mailing list