[PATCH] minifileset: allow 'path:' patterns to have an explicit trailing slash

Matt Harbison mharbison72 at gmail.com
Thu Feb 15 22:05:47 EST 2018


On Tue, 13 Feb 2018 08:38:27 -0500, Yuya Nishihara <yuya at tcha.org> wrote:

> On Tue, 13 Feb 2018 07:58:50 -0500, Matt Harbison wrote:
>>
>> > On Feb 13, 2018, at 6:27 AM, Yuya Nishihara <yuya at tcha.org> wrote:
>> >
>> >> On Mon, 12 Feb 2018 22:17:35 -0500, Matt Harbison wrote:
>> >> # HG changeset patch
>> >> # User Matt Harbison <matt_harbison at yahoo.com>
>> >> # Date 1518488713 18000
>> >> #      Mon Feb 12 21:25:13 2018 -0500
>> >> # Node ID e99e6917138593d2dddf7e0f5506dbd3f6c87743
>> >> # Parent  9b5df6e19a4f308e14703a8136cd0530c1e1d1a9
>> >> minifileset: allow 'path:' patterns to have an explicit trailing  
>> slash
>> >>
>> >> We allow for it on the command line, with `hg status`, for example.   
>> I thought
>> >> that I copied this "n.startswith(p) and (len(n) == pl or n[pl] ==  
>> '/')" pattern
>> >> from somewhere, but I don't see it now.
>> >>
>> >> diff --git a/mercurial/minifileset.py b/mercurial/minifileset.py
>> >> --- a/mercurial/minifileset.py
>> >> +++ b/mercurial/minifileset.py
>> >> @@ -26,7 +26,7 @@
>> >>                     raise error.ParseError(_('reserved character:  
>> %s') % c)
>> >>             return lambda n, s: n.endswith(ext)
>> >>         elif name.startswith('path:'): # directory or full path test
>> >> -            p = name[5:] # prefix
>> >> +            p = name[5:] if name[-1] != '/' else name[5:-1] # prefix
>> >
>> > Doesn't it mean 'a/' matches 'a'?
>>
>> Yes. (But 'a' won’t match 'ab'.)
>
> Ugh, I never thought 'path:hg/' would match the file 'hg', but it does
> probably because of util.normpath().
>
>   % hg debugwalk 'path:hg/'
>   matcher: <patternmatcher patterns='(?:hg(?:/|$))'>
>   f  hg  hg  exact

Hmm, that surprises me too.  It does look like util.normpath() is involved:

diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -199,7 +199,10 @@
          if kind in cwdrelativepatternkinds:
              pat = pathutil.canonpath(root, cwd, pat, auditor)
          elif kind in ('relglob', 'path', 'rootfilesin'):
+            isdir = pat[-1] in br'\/'
              pat = util.normpath(pat)
+            if isdir:
+                pat += b'/'
          elif kind in ('listfile', 'listfile0'):
              try:
                  files = util.readfile(pat)

$ ../hg debugwalk 'path:hg/'
matcher: <patternmatcher patterns='(?:hg\\/(?:/|$))'>
..\hg\: The filename, directory name, or volume label syntax is incorrect

(Though that pattern looks like it would have problems matching against  
files in an actual directory.  In practice, it could walk tests/, though  
it double slashed the first separator.)

> Perhaps applying normpath() would look saner than testing if name[-1] ==  
> '/'.

Yeah, that was definitely a micro optimization.

>> Basically, I spent some time last week writing ignore rules for some  
>> converted repos, and got into the habit of appending a trailing '/' to  
>> ensure the match is a directory, and not just a substring.  When I did  
>> that here, it took awhile to figure out why the path was being  
>> ignored.  ('path:' only matches directories)
>>
>> > Can't we reuse some parts of the match module to build a function or  
>> regexp
>> > from a pattern string?
>>
>> Probably.  I’ve seen a couple cases where a regex pattern would be  
>> useful.  I just assumed those other match types were part of the  
>> performance concern that was the reason for splitting out the mini  
>> language in the first place.
>
> (CC Jun)
>
> I think the O(n) concern came from how fileset filters n-length list, not
> from the matcher function itself.

Unless I'm missing something, the only time patternmatcher walks ctx is if  
there's a 'set:' kind.  So if we filter that out that, the relative kinds  
(except relglob), and 'subinclude:', I don't see why we can't create one  
of those to build the match function.  That would allow regex,  
rootfilesin, and (rel)glob support too.


More information about the Mercurial-devel mailing list