[PATCH 2 of 2] match: add a subclass for dirstate normalizing of the matched patterns

Matt Harbison mharbison72 at gmail.com
Sun Apr 12 15:52:13 CDT 2015


# HG changeset patch
# User Matt Harbison <matt_harbison at yahoo.com>
# Date 1428817161 14400
#      Sun Apr 12 01:39:21 2015 -0400
# Node ID 6172eed8aa036002775a2ed02df47be5df02acc7
# Parent  75835458befcf5ddcef740c1a2ef0d5ce6804928
match: add a subclass for dirstate normalizing of the matched patterns

This class is only needed on case-insensitive filesystems, and only for wdir
context matches.  It lets the user name items without matching their case in
the filesystem.  This matters most when naming directories, which dirstate
doesn't handle[1].  Making dirstate handle mismatched directory cases is too
expensive[2].

Since dirstate doesn't apply to committed csets, this matcher is only created
by overriding basectx.match() in workingctx, and only on icasefs.  The default
arguments of the match constructor have been dropped in the subclass, because
the matcher cannot function unless the ctx is passed to it.

For operations that can apply to both wdir and some other context, this ends up
normalizing the filename to the case as it exists in the filesystem, and using
that case for the lookup in the other context.  See the diff example in the
test.
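
As a rough illustration of that normalization (a standalone sketch, not
the dirstate.normalize() implementation; build_foldmap() and normalize()
are hypothetical names), a lower-cased lookup table covering every file
and parent directory can map user input back to the stored case:

```python
def build_foldmap(tracked):
    """Map lower-cased paths (files and all parent dirs) to their stored case."""
    foldmap = {}
    for path in tracked:
        parts = path.split('/')
        for i in range(1, len(parts) + 1):
            prefix = '/'.join(parts[:i])
            foldmap.setdefault(prefix.lower(), prefix)
    return foldmap


def normalize(foldmap, userpath):
    """Return the stored case for userpath, or userpath itself if unknown."""
    return foldmap.get(userpath.lower(), userpath)


# Hypothetical working directory contents, taken from the test below.
tracked = ['CapsDir1/CapsDir/AbC.txt', 'CapsDir1/CapsDir/SubDir/Def.txt']
foldmap = build_foldmap(tracked)
print(normalize(foldmap, 'capsdir1/capsdir'))
```

The normalized name is what would then be used for the lookup in the other
context.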

Previously, given a directory with an inexact case:

  - add worked as expected

  - diff, forget and status would silently ignore the request

  - files would exit with 1

  - commit, revert and remove would fail (even when the commands leading up to
    them worked):

        $ hg ci -m "AbCDef" capsdir1/capsdir
        abort: CapsDir1/CapsDir: no match under directory!

        $ hg revert -r '.^' capsdir1/capsdir
        capsdir1\capsdir: no such file in rev 64dae27060b7

        $ hg remove capsdir1/capsdir
        not removing capsdir1\capsdir: no tracked files
        [1]

Glob patterns are normalized too, so the arguments to -I and -X don't need to
match the filesystem case.  Without that, the second-to-last remove in the
test (with -X) removes the files, leaving nothing for the last remove.
However, specifying the files as 'glob:**.Txt' does not work.  Perhaps this
requires 're.IGNORECASE'?
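
To make the 're.IGNORECASE' idea concrete, here is a small sketch.  It uses
the stdlib fnmatch.translate() as a stand-in for Mercurial's own glob
translation (whose '**' semantics differ), and compile_glob() is a
hypothetical helper, so this only shows the effect of the flag:

```python
import fnmatch
import re


def compile_glob(pattern, ignorecase=False):
    """Translate a shell-style glob to a regex, optionally ignoring case."""
    flags = re.IGNORECASE if ignorecase else 0
    return re.compile(fnmatch.translate(pattern), flags)


path = 'CapsDir1/CapsDir/SubDir/Def.txt'
strict = compile_glob('*.Txt')
relaxed = compile_glob('*.Txt', ignorecase=True)

# The case-sensitive regex rejects the path; the IGNORECASE one accepts it.
print(bool(strict.match(path)))
print(bool(relaxed.match(path)))
```

Whether the flag can be applied safely to the regex that the matcher builds
from all of its patterns is not answered by this sketch.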

There are only a handful of places that create matchers directly, instead of
being routed through the context.match() method.  Some may benefit from changing
over to using ctx.match() as a factory function:

  revset.checkstatus()
  revset.contains()
  revset.filelog()
  revset._matchfiles()
  localrepository._loadfilter()
  ignore.ignore()
  fileset.subrepo()
  filemerge._picktool()
  overrides.addlargefiles()
  lfcommands.lfconvert()
  kwtemplate.__init__()
  eolfile.__init__()
  eolfile.checkrev()
  acl.buildmatch()
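
The appeal of the factory approach can be sketched with toy classes.  All
names below are hypothetical stand-ins, not the real context or matcher
classes.  The caller asks the context for a matcher and never needs to know
whether case folding applies:

```python
class PlainMatcher(object):
    """Matches a path if it equals a pattern or lies under it."""

    def __init__(self, patterns):
        self.patterns = list(patterns)

    def __call__(self, path):
        return any(path == p or path.startswith(p + '/')
                   for p in self.patterns)


class FoldingMatcher(PlainMatcher):
    """Rewrites patterns to the stored case before matching."""

    def __init__(self, patterns, known):
        folded = dict((k.lower(), k) for k in known)
        normalized = [folded.get(p.lower(), p) for p in patterns]
        super(FoldingMatcher, self).__init__(normalized)


class BaseCtx(object):
    def match(self, patterns):
        return PlainMatcher(patterns)


class WorkingCtx(BaseCtx):
    def __init__(self, known, case_sensitive):
        self.known = known
        self.case_sensitive = case_sensitive

    def match(self, patterns):
        # Only a case-insensitive filesystem gets the folding matcher.
        if not self.case_sensitive:
            return FoldingMatcher(patterns, self.known)
        return super(WorkingCtx, self).match(patterns)


wctx = WorkingCtx(['CapsDir1/CapsDir'], case_sensitive=False)
m = wctx.match(['capsdir1/capsdir'])
print(m('CapsDir1/CapsDir/AbC.txt'))
```

A call site that builds its matcher directly would bypass the override, which
is why the places listed above may be worth converting.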

Currently, a top-level subrepo can be named with an inexact case.  However, the
path auditor gets in the way of naming _anything_ inside the subrepo if the
case of the top-level component doesn't match:

  --- a/tests/test-subrepo-deep-nested-change.t
  +++ b/tests/test-subrepo-deep-nested-change.t
  @@ -170,8 +170,15 @@
     R sub1/sub2/test.txt
     $ hg update -Cq
     $ touch sub1/sub2/folder/bar
  +#if icasefs
  +  $ hg addremove Sub1/sub2
  +  abort: path 'Sub1\sub2' is inside nested repo 'Sub1'
  +  [255]
  +  $ hg -q addremove sub1/sub2
  +#else
     $ hg addremove sub1/sub2
     adding sub1/sub2/folder/bar (glob)
  +#endif
     $ hg status -S
     A sub1/sub2/folder/bar
     ? foo/bar/abc

The narrowmatcher class may need to be tweaked when that is fixed.


[1] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068183.html
[2] http://www.selenic.com/pipermail/mercurial-devel/2015-April/068191.html

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -1424,6 +1424,19 @@
             finally:
                 wlock.release()
 
+    def match(self, pats=[], include=None, exclude=None, default='glob'):
+        r = self._repo
+
+        # Only a case insensitive filesystem needs magic to translate user input
+        # to actual case in the filesystem.
+        if not util.checkcase(r.root):
+            return matchmod.icasefsmatcher(r.root, r.getcwd(), pats, include,
+                                           exclude, default, False, r.auditor,
+                                           self)
+        return matchmod.match(r.root, r.getcwd(), pats,
+                              include, exclude, default,
+                              auditor=r.auditor, ctx=self)
+
     def _filtersuspectsymlink(self, files):
         if not files or self._repo.dirstate._checklink:
             return files
diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -273,6 +273,34 @@
     def rel(self, f):
         return self._matcher.rel(self._path + "/" + f)
 
+class icasefsmatcher(match):
+    """A matcher for wdir on case insensitive filesystems, which normalizes the
+    given patterns to the case in the filesystem.
+    """
+
+    def __init__(self, root, cwd, patterns, include, exclude, default, exact,
+                 auditor, ctx):
+        init = super(icasefsmatcher, self).__init__
+        self._dsnormalize = ctx.repo().dirstate.normalize
+
+        init(root, cwd, patterns, include, exclude, default, exact, auditor,
+             ctx)
+
+        # Exact matches must be based off of the actual user input, otherwise
+        # inexact case matches are treated as exact, and not noted without -v.
+        if not exact and self._files:
+            self._fmap = set(_roots(self._kp))
+
+    def _normalize(self, patterns, default, root, cwd, auditor):
+        self._kp = super(icasefsmatcher, self)._normalize(patterns, default,
+                                                          root, cwd, auditor)
+        kindpats = []
+        for kind, pats in self._kp:
+            if kind not in ('re', 'relre'):  # regex can't be normalized
+                pats = self._dsnormalize(pats)
+            kindpats.append((kind, pats))
+        return kindpats
+
 def patkind(pattern, default=None):
     '''If pattern is 'kind:pat' with a known kind, return kind.'''
     return _patsplit(pattern, default)[0]
diff --git a/tests/test-add.t b/tests/test-add.t
--- a/tests/test-add.t
+++ b/tests/test-add.t
@@ -176,12 +176,48 @@
   $ mkdir CapsDir1/CapsDir/SubDir
   $ echo def > CapsDir1/CapsDir/SubDir/Def.txt
 
-  $ hg add -v capsdir1/capsdir
+  $ hg add capsdir1/capsdir
   adding CapsDir1/CapsDir/AbC.txt (glob)
   adding CapsDir1/CapsDir/SubDir/Def.txt (glob)
 
   $ hg forget capsdir1/capsdir/abc.txt
   removing CapsDir1/CapsDir/AbC.txt (glob)
+
+  $ hg forget capsdir1/capsdir
+  removing CapsDir1/CapsDir/SubDir/Def.txt (glob)
+
+  $ hg add capsdir1
+  adding CapsDir1/CapsDir/AbC.txt (glob)
+  adding CapsDir1/CapsDir/SubDir/Def.txt (glob)
+
+  $ hg ci -m "AbCDef" capsdir1/capsdir
+
+  $ hg status -A capsdir1/capsdir
+  C CapsDir1/CapsDir/AbC.txt
+  C CapsDir1/CapsDir/SubDir/Def.txt
+
+  $ hg files capsdir1/capsdir
+  CapsDir1/CapsDir/AbC.txt (glob)
+  CapsDir1/CapsDir/SubDir/Def.txt (glob)
+
+  $ echo xyz > CapsDir1/CapsDir/SubDir/Def.txt
+  $ hg ci -m xyz capsdir1/capsdir/subdir/def.txt
+
+  $ hg revert -r '.^' capsdir1/capsdir
+  reverting CapsDir1/CapsDir/SubDir/Def.txt (glob)
+
+  $ hg diff capsdir1/capsdir
+  diff -r 5112e00e781d CapsDir1/CapsDir/SubDir/Def.txt
+  --- a/CapsDir1/CapsDir/SubDir/Def.txt	Thu Jan 01 00:00:00 1970 +0000
+  +++ b/CapsDir1/CapsDir/SubDir/Def.txt	* +0000 (glob)
+  @@ -1,1 +1,1 @@
+  -xyz
+  +def
+
+  $ hg remove -f 'glob:**.txt' -X capsdir1/capsdir
+  $ hg remove -f 'glob:**.txt' -I capsdir1/capsdir
+  removing CapsDir1/CapsDir/AbC.txt (glob)
+  removing CapsDir1/CapsDir/SubDir/Def.txt (glob)
 #endif
 
   $ cd ..

