[PATCH 6 of 8 sparse V2] dirstate: move validation of files in sparse checkout from extension

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Tue Jul 11 14:01:40 EDT 2017


At Mon, 10 Jul 2017 21:57:05 -0700,
Gregory Szorc wrote:
> 
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1499554173 25200
> #      Sat Jul 08 15:49:33 2017 -0700
> # Node ID 7b09c7c8f8457dd96f45334dad13a0ceaae92a18
> # Parent  03eed39f7af2db1678c2d39a2a7342e5c66b0804
> dirstate: move validation of files in sparse checkout from extension
> 
> The sparse extension wraps various dirstate methods to validate that
> the path being modified in dirstate exists in the sparse checkout.
> This commit moves that code to dirstate itself.
> 
> As part of the move, the error messages were tweaked slightly.
> And the code for the validation was reworked slightly. The most
> notable change is the removal of the "f is not None" check.
> This check has instead been inlined into copy(), which is the only
> place where it should be relevant.
> 
> I'm not crazy about the references to sparse's UI adjustments
> leaking into core. But what can you do.
> 
> I'm not a dirstate expert and am not sure if this is the ideal
> way to hook in path validation into dirstate now that the code is in
> core. But this does preserve the behavior of the extension. We can
> always refactor later.
> 
> Unless the sparse feature is enabled via the sparse extension,
> the matcher will be an alwaysmatcher and the new code will
> essentially no-op. This change does add the overhead of calling
> a few functions, however. If the modified methods are called in
> tight loops (this can occur for working directory updates) this could
> have a performance impact. I haven't explicitly measured the perf
> impact because I view this code as temporary until it can be
> refactored. The end state should essentially be an attribute
> lookup, so I'm not too worried about long-term perf impact.
> 
> diff --git a/hgext/sparse.py b/hgext/sparse.py
> --- a/hgext/sparse.py
> +++ b/hgext/sparse.py
> @@ -240,23 +240,6 @@ def _setupdirstate(ui):
>          return orig(self, parent, allfiles, changedfiles)
>      extensions.wrapfunction(dirstate.dirstate, 'rebuild', _rebuild)
>  
> -    # Prevent adding files that are outside the sparse checkout
> -    editfuncs = ['normal', 'add', 'normallookup', 'copy', 'remove', 'merge']
> -    hint = _('include file with `hg debugsparse --include <pattern>` or use ' +
> -             '`hg add -s <file>` to include file directory while adding')
> -    for func in editfuncs:
> -        def _wrapper(orig, self, *args):
> -            sparsematch = self._sparsematcher
> -            if not sparsematch.always():
> -                for f in args:
> -                    if (f is not None and not sparsematch(f) and
> -                        f not in self):
> -                        raise error.Abort(_("cannot add '%s' - it is outside "
> -                                            "the sparse checkout") % f,
> -                                          hint=hint)
> -            return orig(self, *args)
> -        extensions.wrapfunction(dirstate.dirstate, func, _wrapper)
> -
>  @command('^debugsparse', [
>      ('I', 'include', False, _('include files in the sparse checkout')),
>      ('X', 'exclude', False, _('exclude files in the sparse checkout')),
> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
> --- a/mercurial/dirstate.py
> +++ b/mercurial/dirstate.py
> @@ -68,6 +68,10 @@ def nonnormalentries(dmap):
>                  otherparent.add(fname)
>          return nonnorm, otherparent
>  
> +ADD_SPARSE_HINT = _("include file with 'hg debugsparse --include <pattern>' "
> +                    "or use 'hg add --sparse <file>' to include file "
> +                    "directory while adding")
> +
>  class dirstate(object):
>  
>      def __init__(self, opener, ui, root, validate, sparsematchfn):
> @@ -273,6 +277,18 @@ class dirstate(object):
>          # it's safe because f is always a relative path
>          return self._rootdir + f
>  
> +    def _ensuresparsematch(self, f):
> +        """Verifies a path is part of the sparse config and aborts if not."""
> +        matcher = self._sparsematcher
> +
> +        if matcher.always():
> +            return
> +
> +        if not matcher(f) and f not in self._map:
> +            raise error.Abort(_("cannot add '%s' because it is outside the "
> +                                "sparse checkout") % f,
> +                              hint=ADD_SPARSE_HINT)
> +

Nit picking for later refactoring.

"cannot add 'XXXX'" seems uncomfortable, when this code path is
executed for removed file. This should:

  1. use more generic phrase ("manage", "operate', 'involve' or so on), or
  2. switch message according to operation type

"use 'hg add --sparse ...." in ADD_SPARSE_HINT, too.

BTW:

  - normal() always uses _addpath()
  - add() always uses _addpath()
  - merge() uses
    - normallookup()
    - otherparent(), which always uses _addpath()
  - normallookup()
    If "self._pl[1] != nullid and f in self._map" route is
    executed, "f in self._map" means that specified file is already
    managed in dirstate, and it should be detected before.

    Therefore, normallookup() always uses _addpath(), in practice

  - remove() always uses _droppath()

  - copy()
    Both "source" and "dest" should exist in dirstate before copy()

Therefore, how about these?

  - check "inside sparse" in _addpath() and _droppath(),
    and raise Abort, SparseError (as Martin mentioned), or so

    We can switch message and hint here.

    Strictly speaking, this allows lstat() on a file outside sparse in
    normal() case. But caller (merge/update, revert, and so on) should
    avoid it.

  - check "inside sparse" in code paths below,
    and raise ProgrammingError or so (use "assert" ?)

    - "self._pl[1] != nullid and f in self._map" route in normallookup()
    - at the beginning of copy()


>      def flagfunc(self, buildfallback):
>          if self._checklink and self._checkexec:
>              def f(x):
> @@ -514,6 +530,11 @@ class dirstate(object):
>          """Mark dest as a copy of source. Unmark dest if source is None."""
>          if source == dest:
>              return
> +
> +        if source is not None:
> +            self._ensuresparsematch(source)
> +        self._ensuresparsematch(dest)
> +
>          self._dirty = True
>          if source is not None:
>              self._copymap[dest] = source
> @@ -565,6 +586,7 @@ class dirstate(object):
>  
>      def normal(self, f):
>          '''Mark a file normal and clean.'''
> +        self._ensuresparsematch(f)
>          s = os.lstat(self._join(f))
>          mtime = s.st_mtime
>          self._addpath(f, 'n', s.st_mode,
> @@ -581,6 +603,8 @@ class dirstate(object):
>  
>      def normallookup(self, f):
>          '''Mark a file normal, but possibly dirty.'''
> +        self._ensuresparsematch(f)
> +
>          if self._pl[1] != nullid and f in self._map:
>              # if there is a merge going on and the file was either
>              # in state 'm' (-1) or coming from other parent (-2) before
> @@ -620,12 +644,14 @@ class dirstate(object):
>  
>      def add(self, f):
>          '''Mark a file added.'''
> +        self._ensuresparsematch(f)
>          self._addpath(f, 'a', 0, -1, -1)
>          if f in self._copymap:
>              del self._copymap[f]
>  
>      def remove(self, f):
>          '''Mark a file removed.'''
> +        self._ensuresparsematch(f)
>          self._dirty = True
>          self._droppath(f)
>          size = 0
> @@ -644,6 +670,8 @@ class dirstate(object):
>  
>      def merge(self, f):
>          '''Mark a file merged.'''
> +        self._ensuresparsematch(f)
> +
>          if self._pl[1] == nullid:
>              return self.normallookup(f)
>          return self.otherparent(f)
> diff --git a/tests/test-sparse.t b/tests/test-sparse.t
> --- a/tests/test-sparse.t
> +++ b/tests/test-sparse.t
> @@ -98,8 +98,8 @@ Verify status only shows included files
>  Adding an excluded file should fail
>  
>    $ hg add hide3
> -  abort: cannot add 'hide3' - it is outside the sparse checkout
> -  (include file with `hg debugsparse --include <pattern>` or use `hg add -s <file>` to include file directory while adding)
> +  abort: cannot add 'hide3' because it is outside the sparse checkout
> +  (include file with 'hg debugsparse --include <pattern>' or use 'hg add --sparse <file>' to include file directory while adding)
>    [255]
>  
>  Verify deleting sparseness while a file has changes fails
> @@ -265,8 +265,8 @@ Test that add -s adds dirs to sparse pro
>    $ touch add/foo
>    $ touch add/bar
>    $ hg add add/foo
> -  abort: cannot add 'add/foo' - it is outside the sparse checkout
> -  (include file with `hg debugsparse --include <pattern>` or use `hg add -s <file>` to include file directory while adding)
> +  abort: cannot add 'add/foo' because it is outside the sparse checkout
> +  (include file with 'hg debugsparse --include <pattern>' or use 'hg add --sparse <file>' to include file directory while adding)
>    [255]
>    $ hg add -s add/foo
>    $ hg st
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

-- 
----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp


More information about the Mercurial-devel mailing list