D5493: match: support rooted globs in hgignore

valentin.gatienbaron (Valentin Gatien-Baron) phabricator at mercurial-scm.org
Fri Jan 4 20:34:59 UTC 2019


valentin.gatienbaron created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  In a .hgignore, "glob:foo" always means "**/foo". This cannot be
  avoided because there is no syntax like "^" in regexes to say you
  don't want the implied "**/" (of course one can use regexes, but glob
  syntax is nice).
  
  When you have a long list of fairly specific globs like
  path/to/some/thing, this has two consequences:
  
  1. unintended files may be ignored (not too common though)
  2. matching performance can suffer significantly
  
  Here is vanilla hg status timing on a private repository:
  
    Using syntax:glob everywhere
      real  0m2.199s
      user  0m1.545s
      sys   0m0.619s
  
    When rooting the appropriate globs
      real  0m1.434s
      user  0m0.847s
      sys   0m0.565s
  
    (Tangentially, none of this shows up in --profile's output. It
    seems that C code doesn't play well with profiling.)
  
  The matching code already supports rooted globs, but hgignore has no
  syntax to make use of them, so it seems reasonable to create such
  syntax. This patch adds a new hgignore syntax, "rootedglob". There
  might be a better name, but "rooted" is the terminology in use in
  user-facing documentation.
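  
  To illustrate the difference between the two syntaxes, here is a
  minimal standalone sketch. It does not use Mercurial's own code:
  the helper names (glob_to_regex, compile_ignore, is_ignored) are
  hypothetical, and the glob translation is a simplified assumption
  that handles only "*", "**" and "?". It reuses the paths from the
  new "Check rooted globs" test.

```python
import re


def glob_to_regex(pat):
    """Translate a small glob subset to a regex fragment.

    Simplified assumption, not Mercurial's translator:
    '**' crosses directory separators, '*' and '?' do not.
    """
    out = []
    i = 0
    while i < len(pat):
        if pat.startswith('**', i):
            out.append('.*')
            i += 2
        elif pat[i] == '*':
            out.append('[^/]*')
            i += 1
        elif pat[i] == '?':
            out.append('[^/]')
            i += 1
        else:
            out.append(re.escape(pat[i]))
            i += 1
    return ''.join(out)


# A pattern matches a file, or a directory and everything below it.
SUFFIX = '(?:/|$)'


def compile_ignore(pat, rooted):
    # Unrooted globs behave as if prefixed with "**/": they may start
    # at the repository root or after any directory separator.
    prefix = '' if rooted else '(?:|.*/)'
    return re.compile(prefix + glob_to_regex(pat) + SUFFIX)


def is_ignored(path, pat, rooted):
    # re.match anchors at the start of the path, i.e. the repository root.
    return compile_ignore(pat, rooted).match(path) is not None


paths = ['a/b.ext', 'b/a/b.ext', 'aa/b.ext']
unrooted = [p for p in paths if is_ignored(p, 'a/*.ext', rooted=False)]
rooted = [p for p in paths if is_ignored(p, 'a/*.ext', rooted=True)]
print(unrooted)  # ['a/b.ext', 'b/a/b.ext']
print(rooted)    # ['a/b.ext']
```

  The rooted pattern ignores only a/b.ext, which agrees with the
  expected "hg status -A" output in the test. Presumably the rooted
  form is also cheaper to match because the leading "any directory"
  alternative disappears.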

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5493

AFFECTED FILES
  mercurial/help/hgignore.txt
  mercurial/match.py
  tests/test-hgignore.t

CHANGE DETAILS

diff --git a/tests/test-hgignore.t b/tests/test-hgignore.t
--- a/tests/test-hgignore.t
+++ b/tests/test-hgignore.t
@@ -239,6 +239,17 @@
   dir/c.o is ignored
   (ignore rule in $TESTTMP/ignorerepo/.hgignore, line 2: 'dir/**/c.o') (glob)
 
+Check rooted globs
+
+  $ hg purge --all --config extensions.purge=
+  $ echo "syntax: rootedglob" > .hgignore
+  $ echo "a/*.ext" >> .hgignore
+  $ for p in a b/a aa; do mkdir -p $p; touch $p/b.ext; done
+  $ hg status -A 'set:**.ext'
+  ? aa/b.ext
+  ? b/a/b.ext
+  I a/b.ext
+
 Check using 'include:' in ignore file
 
   $ hg purge --all --config extensions.purge=
@@ -257,10 +268,15 @@
 Check recursive uses of 'include:'
 
   $ echo "include:nested/ignore" >> otherignore
-  $ mkdir nested
+  $ mkdir nested nested/more
   $ echo "glob:*ignore" > nested/ignore
+  $ echo "rootedglob:a" >> nested/ignore
+  $ touch a nested/a nested/more/a
   $ hg status
   A dir/b.o
+  ? nested/a
+  ? nested/more/a
+  $ rm a nested/a nested/more/a
 
   $ cp otherignore goodignore
   $ echo "include:badignore" >> otherignore
@@ -291,18 +307,26 @@
   ? dir1/file2
   ? dir2/file1
 
-Check including subincludes with regexs
+Check including subincludes with other patterns
 
   $ echo "subinclude:dir1/.hgignore" >> .hgignore
+
+  $ mkdir dir1/subdir
+  $ touch dir1/subdir/file1
+  $ echo "rootedglob:f?le1" > dir1/.hgignore
+  $ hg status
+  ? dir1/file2
+  ? dir1/subdir/file1
+  ? dir2/file1
+  $ rm dir1/subdir/file1
+
   $ echo "regexp:f.le1" > dir1/.hgignore
-
   $ hg status
   ? dir1/file2
   ? dir2/file1
 
 Check multiple levels of sub-ignores
 
-  $ mkdir dir1/subdir
   $ touch dir1/subdir/subfile1 dir1/subdir/subfile3 dir1/subdir/subfile4
   $ echo "subinclude:subdir/.hgignore" >> dir1/.hgignore
   $ echo "glob:subfil*3" >> dir1/subdir/.hgignore
diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -7,6 +7,7 @@
 
 from __future__ import absolute_import, print_function
 
+import collections
 import copy
 import itertools
 import os
@@ -1351,19 +1352,21 @@
     syntax: glob   # defaults following lines to non-rooted globs
     re:pattern     # non-rooted regular expression
     glob:pattern   # non-rooted glob
+    rootedglob:pat # rooted glob (same root as ^ in regexps)
     pattern        # pattern of the current default type
 
     if sourceinfo is set, returns a list of tuples:
     (pattern, lineno, originalline). This is useful to debug ignore patterns.
     '''
 
-    syntaxes = {
-        're': 'relre:',
-        'regexp': 'relre:',
-        'glob': 'relglob:',
-        'include': 'include',
-        'subinclude': 'subinclude',
-    }
+    syntaxes = collections.OrderedDict([
+        ('re', 'relre:'),
+        ('regexp', 'relre:'),
+        ('glob', 'relglob:'),
+        ('rootedglob', 'glob:'), # after 'glob' line, so glob:a means relglob:a
+        ('include', 'include'),
+        ('subinclude', 'subinclude'),
+    ])
     syntax = 'relre:'
     patterns = []
 
diff --git a/mercurial/help/hgignore.txt b/mercurial/help/hgignore.txt
--- a/mercurial/help/hgignore.txt
+++ b/mercurial/help/hgignore.txt
@@ -59,14 +59,17 @@
   Regular expression, Python/Perl syntax.
 ``glob``
   Shell-style glob.
+``rootedglob``
+  A variant of ``glob`` that is rooted (see below).
 
 The chosen syntax stays in effect when parsing all patterns that
 follow, until another syntax is selected.
 
-Neither glob nor regexp patterns are rooted. A glob-syntax pattern of
-the form ``*.c`` will match a file ending in ``.c`` in any directory,
-and a regexp pattern of the form ``\.c$`` will do the same. To root a
-regexp pattern, start it with ``^``.
+Neither ``glob`` nor regexp patterns are rooted. A glob-syntax
+pattern of the form ``*.c`` will match a file ending in ``.c`` in any
+directory, and a regexp pattern of the form ``\.c$`` will do the
+same. To root a regexp pattern, start it with ``^``. To get the same
+effect with glob-syntax, you have to use ``rootedglob``.
 
 Subdirectories can have their own .hgignore settings by adding
 ``subinclude:path/to/subdir/.hgignore`` to the root ``.hgignore``. See



To: valentin.gatienbaron, #hg-reviewers
Cc: mercurial-devel

