[PATCH 12 of 14] vfs: add the possibility to have a "ward" to check vfs usage

Boris Feld boris.feld at octobus.net
Tue Jul 11 10:32:46 EDT 2017


On Sat, 2017-07-08 at 16:36 +0900, Yuya Nishihara wrote:
> On Fri, 07 Jul 2017 18:48:44 +0200, Boris Feld wrote:
> > On Fri, 2017-07-07 at 21:48 +0900, Yuya Nishihara wrote:
> > > On Thu, 06 Jul 2017 20:53:50 +0200, Boris Feld wrote:
> > > > On Wed, 2017-07-05 at 23:55 +0900, Yuya Nishihara wrote:
> > > > > On Sun, 02 Jul 2017 04:56:37 +0200, Pierre-Yves David wrote:
> > > > > > # HG changeset patch
> > > > > > # User Pierre-Yves David <pierre-yves.david at ens-lyon.org>
> > > > > > # Date 1470323266 -7200
> > > > > > #      Thu Aug 04 17:07:46 2016 +0200
> > > > > > # Node ID ebf81efdd6d7ff15c64683933635616c00475688
> > > > > > # Parent  34b8be7f0420b07d0f7b71379c6055e5b26223d5
> > > > > > # EXP-Topic vfs.ward
> > > > > > # Available At https://www.mercurial-scm.org/repo/users/mar
> > > > > > mout
> > > > > > e/me
> > > > > > rcurial/
> > > > > > #              hg pull https://www.mercurial-scm.org/repo/u
> > > > > > sers
> > > > > > /mar
> > > > > > moute/mercurial/ -r ebf81efdd6d7
> > > > > > vfs: add the possibility to have a "ward" to check vfs
> > > > > > usage
> > > > > > The feature has some similarity with the 'pathauditor', but
> > > > > > further
> > > > > > digging
> > > > > > show it make more sense to keep them separated. The path
> > > > > > auditor is
> > > > > > fairly
> > > > > > autonomous (create from vfs.__init__ without any extra
> > > > > > information), and check
> > > > > > very generic details. On the other hand, the wards will
> > > > > > have
> > > > > > deeper
> > > > > > knownledge
> > > > > > of the Mercurial logic and is created at the local
> > > > > > repository
> > > > > > level. Mixing the
> > > > > > two would add much complexity for no real gains.
> > > > > 
> > > > > My gut feeling is that vfs isn't the right place if they
> > > > > require
> > > > > deeper
> > > > > knowledge of repository/dirstate logic. And the whole weakref
> > > > > business
> > > > > floating around wards, hooks and transaction seems like just
> > > > > getting
> > > > > around
> > > > > layering violation.
> > > > 
> > > > We pondered on this choice a long time and couldn't find any
> > > > other
> > > > layer that is used by both Mercurial core and the extensions to
> > > > access
> > > > the file system. It seems to be the highest abstraction layer
> > > > we
> > > > could
> > > > hook into without introducing a new one and updating all
> > > > callers.
> > > > 
> > > > Our reflexion is, as the vfs classes comes from the scmutil
> > > > file,
> > > > it
> > > > seems okay to have scm related logic in that layer.
> > > 
> > > IIRC, one possible alternative was to move lock itself to a vfs
> > > as
> > > the
> > > whole vfs is theoretically covered by the lock.
> > 
> > The practice divert a bit from the theory. For example, the
> > ".hg/bookmark" file is in theory covered by the 'wlock'. In
> > practice it
> > is now covered by the transaction and therefor the 'lock'. On the
> > other
> > hand there are multiple files handled by the vfs ('hgrc',
> > 'dirstate',
> > 'store/', etc) that are not covered by the wlock.
> > 
> > Right now, all these exceptions are stored on the repository, and
> > extension can update them. To us, the repository layers seems the
> > right
> > layer for logic regarding individual file semantic. We also want to
> > add
> > some logic regarding open transaction to the ward, something
> > unrelated
> > to the vfs.
> > 
> > The initial motivation for this series (beside catching bug) is to
> > make
> > it safer to introduce new vfs. For example, the share extensions
> > suffers from lack of cache sharing, each share has its own
> > '.hg/cache/'
> > directory. Introducing a 'repo.cvfs' dedicated to cache could fix
> > that,
> > but we wants to warn people still accessing 'cache/' through
> > 'repo.vfs'.
> > 
> > To push that logic further, "share" actually needs a better
> > distinction
> > between the working copy specific piece and the more generic
> > pieces. An
> > extra split of "repo.vfs" could take care of that. However, some of
> > the
> > file, like bookmark, can be shared or not depending on some config
> > and
> > share option. So the ward needs access to that logic (already
> > living on
> > the repository) too.
> 
> It's probably okay to handle simple repo-related business by hooking
> vfs
> calls (via auditor/ward), but the future story sounds pretty
> complicated.
> I don't think it's great idea to introduce many-consumers <-> many-
> vfs
> relation mediated by a low-level hook, "ward".
> 
> > We need to eventually rework the repository format to a more race
> > free
> > data structure. This will likely involve more vfs to access data
> > dispatched in various directory (eg: append-only vs full update).
> > The
> > ward should help a lot here but it need to be aware of many high
> > level
> > logic regarding file semantic.
> 
> At this point, I think we'll need a proper storage layer which
> shadows
> low-level vfs operation.
> 
> > We plan to add more vfs-s and the current locking design seems
> > better
> > than having one lock per vfs, as multiple vfs will be covered by
> > the
> > same lock. If we move the locks on the vfs layer, we either would
> > need
> > to update it again later or introduce much more complex code to
> > manage
> > the interaction between vfs-s locks.
> 
> Yeah, it's annoying vfs itself has physical layering violation.
> 
> > > > > Do you have any idea how to expand this ward() callback to
> > > > > the
> > > > > other
> > > > > vfs
> > > > > operations, which have different set of parameters?
> > > > 
> > > > All other operations (move, delete, copy, etc) could be
> > > > "translated" to
> > > > 'read' or 'write' mode. An alternative would be to abstract the
> > > > mode
> > > > with a couple of symbolic constant "create, update, append,
> > > > delete,
> > > > …".
> > > > What do you think?
> > > 
> > > So it will be quite similar to audit(path) if it had a mode flag?
> > 
> > In the end likely, but updating vfs code to call audit for each
> > operation and the audit functions source code right now would
> > requires
> > too much works for this series.
> 
> Really? IIUC, it's just to add mode parameter to auditor and wrap
> vfs.audit
> by a ward function.
> 
> FWIW, a ward object could manage lock, wlock and transaction to get
> rid of
> the weakref.

We are not sure to follow you here. Introducing a proper storage layer
seems complicated and out-of-scope of this series. I cannot see how we
could make sure that both core and extensions are forced to use this
new storage layer instead of vfs directly. Could you clarify what do
you have in mind?

In the mean time of this refactoring, we're gonna send a V2 that follow
your advice of wrapping audit with the ward function, the code is
cleaner that we anticipated, so it seems fine. It's probably simpler
that the previous design.

We are not sure adding a full ward object responsible for the lock
makes sense. The check are only here for devel-warning so adding a new
object just for them seems a bit too much. What is your opinion about
that? Do you see other use-cases of having this ward object?

> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list