[PATCH 2 of 2] match: add a subclass for dirstate normalizing of the matched patterns

Siddharth Agarwal sid at less-broken.com
Mon Apr 13 12:36:53 CDT 2015


On 04/12/2015 04:52 PM, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison at yahoo.com>
> # Date 1428817161 14400
> #      Sun Apr 12 01:39:21 2015 -0400
> # Node ID 6172eed8aa036002775a2ed02df47be5df02acc7
> # Parent  75835458befcf5ddcef740c1a2ef0d5ce6804928
> match: add a subclass for dirstate normalizing of the matched patterns
>
> This class is only needed on case insensitive filesystems, and only for wdir
> context matches.  It allows the user to not match the case of the items in the
> filesystem, especially when naming directories, which dirstate doesn't handle[1].
> Making dirstate handle mismatched directory cases is too expensive[2].
>
> Since dirstate doesn't apply to committed csets, this is only created by
> overriding basectx.match() in workingctx, and only on icasefs.  The default
> arguments have been dropped, because the ctx must be passed to the matcher in
> order to function.
>
> For operations that can apply to both wdir and some other context, this ends up
> normalizing the filename to the case as it exists in the filesystem, and using
> that case for the lookup in the other context.  See the diff example in the
> test.
>
> Previously, given a directory with an inexact case:
>
>   - add worked as expected
>
>   - diff, forget and status would silently ignore the request
>
>   - files would exit with 1
>
>   - commit, revert and remove would fail (even when the commands leading up to
>     them worked):
>
>         $ hg ci -m "AbCDef" capsdir1/capsdir
>         abort: CapsDir1/CapsDir: no match under directory!
>
>         $ hg revert -r '.^' capsdir1/capsdir
>         capsdir1\capsdir: no such file in rev 64dae27060b7
>
>         $ hg remove capsdir1/capsdir
>         not removing capsdir1\capsdir: no tracked files
>         [1]
>
> Globs are normalized, so that the -I and -X don't need to be specified with a
> case match.  Without that, the second last remove (with -X) removes the files,
> leaving nothing for the last remove.  However, specifying the files as
> 'glob:**.Txt' does not work.  Perhaps this requires 're.IGNORECASE'?
>
> There are only a handful of places that create matchers directly, instead of
> being routed through the context.match() method.  Some may benefit from changing
> over to using ctx.match() as a factory function:
>
>   revset.checkstatus()
>   revset.contains()
>   revset.filelog()
>   revset._matchfiles()
>   localrepository._loadfilter()
>   ignore.ignore()
>   fileset.subrepo()
>   filemerge._picktool()
>   overrides.addlargefiles()
>   lfcommands.lfconvert()
>   kwtemplate.__init__()
>   eolfile.__init__()
>   eolfile.checkrev()
>   acl.buildmatch()
>
> Currently, a toplevel subrepo can be named with an inexact case.  However, the
> path auditor gets in the way of naming _anything_ in the subrepo if the top
> level case doesn't match.

So this is a TODO then?

>
>   --- a/tests/test-subrepo-deep-nested-change.t
>   +++ b/tests/test-subrepo-deep-nested-change.t
>   @@ -170,8 +170,15 @@
>      R sub1/sub2/test.txt
>      $ hg update -Cq
>      $ touch sub1/sub2/folder/bar
>   +#if icasefs
>   +  $ hg addremove Sub1/sub2
>   +  abort: path 'Sub1\sub2' is inside nested repo 'Sub1'
>   +  [255]
>   +  $ hg -q addremove sub1/sub2
>   +#else
>      $ hg addremove sub1/sub2
>      adding sub1/sub2/folder/bar (glob)
>   +#endif
>      $ hg status -S
>      A sub1/sub2/folder/bar
>      ? foo/bar/abc
>
> The narrowmatcher class may need to be tweaked when that is fixed.
>
>
> [1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068183.html
> [2] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068191.html
>
> diff --git a/mercurial/context.py b/mercurial/context.py
> --- a/mercurial/context.py
> +++ b/mercurial/context.py
> @@ -1424,6 +1424,19 @@
>              finally:
>                  wlock.release()
>  
> +    def match(self, pats=[], include=None, exclude=None, default='glob'):
> +        r = self._repo
> +
> +        # Only a case insensitive filesystem needs magic to translate user input
> +        # to actual case in the filesystem.
> +        if not util.checkcase(r.root):
> +            return matchmod.icasefsmatcher(r.root, r.getcwd(), pats, include,
> +                                           exclude, default, False, r.auditor,
> +                                           self)
> +        return matchmod.match(r.root, r.getcwd(), pats,
> +                              include, exclude, default,
> +                              auditor=r.auditor, ctx=self)
> +
>      def _filtersuspectsymlink(self, files):
>          if not files or self._repo.dirstate._checklink:
>              return files
> diff --git a/mercurial/match.py b/mercurial/match.py
> --- a/mercurial/match.py
> +++ b/mercurial/match.py
> @@ -273,6 +273,34 @@
>      def rel(self, f):
>          return self._matcher.rel(self._path + "/" + f)
>  
> +class icasefsmatcher(match):
> +    """A matcher for wdir on case insensitive filesystems, which normalizes the
> +    given patterns to the case in the filesystem.
> +    """
> +
> +    def __init__(self, root, cwd, patterns, include, exclude, default, exact,
> +                 auditor, ctx):
> +        init = super(icasefsmatcher, self).__init__
> +        self._dsnormalize = ctx.repo().dirstate.normalize
> +
> +        init(root, cwd, patterns, include, exclude, default, exact, auditor,
> +             ctx)
> +
> +        # Exact matches must be based off of the actual user input, otherwise
> +        # inexact case matches are treated as exact, and not noted without -v.
> +        if not exact and self._files:
> +            self._fmap = set(_roots(self._kp))
> +
> +    def _normalize(self, patterns, default, root, cwd, auditor):

We shouldn't apply case normalization on exact matchers at all, I think.

Other than that, this looks fine. dirstate.normalize is a little more
expensive than necessary, but the number of patterns is usually very small.
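
To make the exact-matcher point concrete, here is a rough standalone sketch
of the behaviour I have in mind. It is not code from the patch and not
Mercurial's API: normalize_kindpats() and make_fake_normalize() are
hypothetical names, and the fake normalizer is just a dictionary lookup
standing in for dirstate.normalize.

```python
def normalize_kindpats(kindpats, dsnormalize, exact=False):
    """Return (kind, pattern) pairs with non-regex patterns case-normalized.

    kindpats    -- list of (kind, pattern) tuples
    dsnormalize -- callable standing in for dirstate.normalize (assumption)
    exact       -- when True, patterns are returned untouched
    """
    if exact:
        # Exact matchers keep the user's input as-is.
        return list(kindpats)
    result = []
    for kind, pat in kindpats:
        # Regex patterns can't be case-normalized, so leave them alone.
        if kind not in ('re', 'relre'):
            pat = dsnormalize(pat)
        result.append((kind, pat))
    return result


def make_fake_normalize(known_paths):
    """Build a toy stand-in for dirstate.normalize from on-disk paths."""
    folded = {p.lower(): p for p in known_paths}

    def normalize(path):
        # Fall back to the input when the path is unknown.
        return folded.get(path.lower(), path)

    return normalize


norm = make_fake_normalize(['CapsDir1/CapsDir'])
inexact = normalize_kindpats([('path', 'capsdir1/capsdir'),
                              ('re', 'capsdir1/.*')], norm)
exact = normalize_kindpats([('path', 'capsdir1/capsdir')], norm, exact=True)
```

With exact=True the pattern comes back unchanged, so an inexact-case name
is never silently treated as an exact match.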

- Siddharth

> +        self._kp = super(icasefsmatcher, self)._normalize(patterns, default,
> +                                                          root, cwd, auditor)
> +        kindpats = []
> +        for kind, pats in self._kp:
> +            if kind not in ('re', 'relre'):  # regex can't be normalized
> +                pats = self._dsnormalize(pats)
> +            kindpats.append((kind, pats))
> +        return kindpats
> +
>  def patkind(pattern, default=None):
>      '''If pattern is 'kind:pat' with a known kind, return kind.'''
>      return _patsplit(pattern, default)[0]
> diff --git a/tests/test-add.t b/tests/test-add.t
> --- a/tests/test-add.t
> +++ b/tests/test-add.t
> @@ -176,12 +176,48 @@
>    $ mkdir CapsDir1/CapsDir/SubDir
>    $ echo def > CapsDir1/CapsDir/SubDir/Def.txt
>  
> -  $ hg add -v capsdir1/capsdir
> +  $ hg add capsdir1/capsdir
>    adding CapsDir1/CapsDir/AbC.txt (glob)
>    adding CapsDir1/CapsDir/SubDir/Def.txt (glob)
>  
>    $ hg forget capsdir1/capsdir/abc.txt
>    removing CapsDir1/CapsDir/AbC.txt (glob)
> +
> +  $ hg forget capsdir1/capsdir
> +  removing CapsDir1/CapsDir/SubDir/Def.txt (glob)
> +
> +  $ hg add capsdir1
> +  adding CapsDir1/CapsDir/AbC.txt (glob)
> +  adding CapsDir1/CapsDir/SubDir/Def.txt (glob)
> +
> +  $ hg ci -m "AbCDef" capsdir1/capsdir
> +
> +  $ hg status -A capsdir1/capsdir
> +  C CapsDir1/CapsDir/AbC.txt
> +  C CapsDir1/CapsDir/SubDir/Def.txt
> +
> +  $ hg files capsdir1/capsdir
> +  CapsDir1/CapsDir/AbC.txt (glob)
> +  CapsDir1/CapsDir/SubDir/Def.txt (glob)
> +
> +  $ echo xyz > CapsDir1/CapsDir/SubDir/Def.txt
> +  $ hg ci -m xyz capsdir1/capsdir/subdir/def.txt
> +
> +  $ hg revert -r '.^' capsdir1/capsdir
> +  reverting CapsDir1/CapsDir/SubDir/Def.txt (glob)
> +
> +  $ hg diff capsdir1/capsdir
> +  diff -r 5112e00e781d CapsDir1/CapsDir/SubDir/Def.txt
> +  --- a/CapsDir1/CapsDir/SubDir/Def.txt	Thu Jan 01 00:00:00 1970 +0000
> +  +++ b/CapsDir1/CapsDir/SubDir/Def.txt	* +0000 (glob)
> +  @@ -1,1 +1,1 @@
> +  -xyz
> +  +def
> +
> +  $ hg remove -f 'glob:**.txt' -X capsdir1/capsdir
> +  $ hg remove -f 'glob:**.txt' -I capsdir1/capsdir
> +  removing CapsDir1/CapsDir/AbC.txt (glob)
> +  removing CapsDir1/CapsDir/SubDir/Def.txt (glob)
>  #endif
>  
>    $ cd ..


