[PATCH] match: adding non-recursive directory matching

Rodrigo Damazio rdamazio at google.com
Tue Oct 25 22:51:59 EDT 2016


On Tue, Oct 25, 2016 at 4:31 PM, FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
wrote:

>
> At Mon, 24 Oct 2016 10:34:52 -0700,
> Rodrigo Damazio wrote:
> >
> > [1  <text/plain; UTF-8 (7bit)>]
> > It sounds like we'd like to do 3 somewhat orthogonal things:
> > - allow user to specify the directory the pattern is relative to
> > (root/cwd/any)
> > - allow the user to specify recursiveness/non-recursiveness consistently
> > (not covered by the *path patterns, but could be the defined behavior for
> > the globs)
> > - clean up the matcher API (discussed during Sprint)
> >
> > Doing all 3 together would probably take some time and a lot of
> > back-and-forth, so I'm wondering if it'd be ok to start by updating this
> > patch to implement "rootglob" with consistent recursiveness behavior, and
> > we can then more slowly add the other patterns and converge on a cleaner
> > API?
>
> (let's suspend posting revised series while code freeze period, to
> focus on stabilization :-))
>

Sure, I understand you're under the freeze. Feel free to prioritize
reviewing my patches appropriately.
(notice the new patch is based on default, not stable)

    https://www.mercurial-scm.org/wiki/TimeBasedReleasePlan#Code_Freeze
>
> In my previous reply, I assume that newly introduced syntaxes do:
>
>   - match recursively by default regardless of the way of passing
>     (command line, -I/-X, ....), because of similarity with almost all
>     of existing syntaxes
>
>     Only glob/relglob as PATTERN in command line require "end of name"
>     matching.
>
>   - require additional "-eon" ("end of name") suffix for non-recursive
>     matching (e.g. "rootglob-eon", "cwdre-eon", "anypath-eon", ...)
>
> But according to your revised patch, "rootglob" syntax matches
> non-recursively. Would you assume as below ?
>
>   - newly introduced syntaxes match non-recursively by default
>   - recursive matching requires any additional suffix (e.g. "-recursive")
>

Ah, the assumption is slightly different - the assumption is that for glob
types, specifically, we're doing a full match, so that to get recursiveness
at the end the user should specify /** or similar. This allows the user to
do recursive or non-recursive matching by using * or ** as appropriate.
I'll suggest that regex types also do a full match, and the user can end
them with .* if they want it to be a prefix.
I believe this is simpler and more flexible than having 18 different
pattern types just to account for the different behavior of the matching. I
considered that we could, likewise, make partial matching be the default,
but I decided against that when making the patch because then it'd be
impossible to make them non-recursive by a modifier, without doubling the
number of matchers as you suggested.

On the other hand, you assume that newly introduced *path syntaxes
> will be recursive, as below. Would you assume that default
> recursive-ness is different between *glob and *path syntaxes ?
>

path would be recursive, as will glob that ends with ** or regex that ends
with .*


> > Also, for discussion: I assume the *path patterns will be recursive when
> > they reference a directory. Do we also want a non-recursive equivalent
> > (rootexact, rootfiles, rootnonrecursive or something like that)?
>
> IMHO, making patch description explain how recursive matching will be
> controlled in the future helps reviewers to evaluate your patch.
>

I'm happy to update the documentation on my patch to better reflect the
full-matching characteristic, if it's OK to push a new version of the patch
:)


> BTW, bikeshedding about name of additional suffix:
>
>   - for non-recursive matching, in "recursive matching by default" case
>
>     - "-eon"
>
>       "end of name matching" is my coined word only for explanation,
>       and let's choose better one :-)
>
>     - "-exact" for non-recursive matching
>
>       this might confuse developers, because current implementation
>       already uses "exact" term as "matching without any special
>       handling".
>
>         https://selenic.com/repo/hg/file/438173c41587/mercurial/
> match.py#l100
>
>     - "-nonrecursive"
>
>       this is too long, isn't it ?
>
>     - "-file"
>
>       this seems better (short and understandable for end users)
>
>   - for recursive matching, in "non-recursive matching by default" case
>
>     - "-recursive"
>
>       this is too long, isn't it ?
>
>     - "-dir"
>
>       this seems better (short and understandable for end users)


> > Thanks
> > Rodrigo
> >
> >
> >
> > On Mon, Oct 24, 2016 at 6:21 AM, Pierre-Yves David <
> > pierre-yves.david at ens-lyon.org> wrote:
> >
> > >
> > >
> > > On 10/21/2016 05:13 PM, FUJIWARA Katsunori wrote:
> > >
> > >> At Tue, 18 Oct 2016 10:12:07 -0400,
> > >> Augie Fackler wrote:
> > >>
> > >>>
> > >>> On Tue, Oct 18, 2016 at 9:52 AM, Yuya Nishihara <yuya at tcha.org>
> wrote:
> > >>>
> > >>>> On Tue, 18 Oct 2016 09:40:36 -0400, Augie Fackler wrote:
> > >>>>
> > >>>>> On Oct 18, 2016, at 09:38, Yuya Nishihara <yuya at tcha.org> wrote:
> > >>>>>>
> > >>>>>>> After coordinating on irc to figure out what this proposal
> actually
> > >>>>>>> is, I've noticed that the semantics of this "exact" proposal are
> > >>>>>>> exactly what "glob" does today, which means (I think) that
> > >>>>>>> "files:foo/bar" should be representable as "glob:foo/bar/*" -
> what am
> > >>>>>>> I missing?
> > >>>>>>>
> > >>>>>>
> > >>>>>> Maybe we want a "glob" relative to the repo root?
> > >>>>>>
> > >>>>>
> > >>>>> As far as I can tell, it already is. "relglob:" is relative to your
> > >>>>> location in the repo according to the docs.
> > >>>>>
> > >>>>
> > >>>> Unfortunately that isn't.
> > >>>>
> > >>>>         'glob:<glob>' - a glob relative to cwd
> > >>>>         'relglob:<glob>' - an unrooted glob (*.c matches C files in
> all
> > >>>> dirs)
> > >>>>
> > >>>> Don't ask me why. ;-)
> > >>>>
> > >>>
> > >>> Oh wat. It looks like narrowhg might change this behavior in narrowed
> > >>> repositories, thus my additional confusion.
> > >>>
> > >>> Maybe we should add "absglob" that is always repo-root-absolute. How
> > >>> do we feel about that overall?
> > >>>
> > >>
> > >> FYI, current pattern matching is implemented as below. This was
> > >> chatted in "non-recursive directory matching" session of 4.0 sprint,
> > >> and sorry for my late posting of this translation from
> > >> http://d.hatena.ne.jp/flying-foozy/20140107/1389087728 in Japanese,
> as
> > >> my backlog of the last sprint.
> > >>
> > >>   ============ ======= ======= ===========
> > >>   pattern type root-ed cwd-ed  any-of-path
> > >>   ============ ======= ======= ===========
> > >>   wildcard     ---     glob    relglob
> > >>   regexp       re      ---     relre
> > >>   raw string   path    relpath ---
> > >>   ============ ======= ======= ===========
> > >>
> > >>   If rule is read in from file (e.g. .hgignore):
> > >>
> > >>     * "glob" is treated as "relglob"
> > >>     * "re" is treated as "relre"
> > >>
> > >>   This is mentioned in "hg help patterns" and "hg help hgignore", but
> > >>   syntax name "relglob" and "relre" themselves aren't explained.
> > >>
> > >>   "end of name" matching is required:
> > >>
> > >>     * for glob/relglob as PATTERN (e.g. argument in command line), but
> > >>     * not for glob/relglob as INCLUDES/EXCLUDES, or other pattern
> syntaxes
> > >>
> > >>   For example, file "foo/bar/baz" is:
> > >>
> > >>     * not matched at "hg files glob:foo/bar"
> > >>     * but matched at "hg file -I glob:foo/bar"
> > >>
> > >>   This isn't mentioned in any help document :-<, and the latter seems
> > >>   to cause the issue mentioned in this patch series.
> > >>
> > >> How about introducing new systematic names like below to re-organize
> > >> current complicated mapping between names and matching ? (and enable
> > >> "end of name" matching by "-eon" suffix or so)
> > >>
> > >>   ============ ======== ======= ===========
> > >>   pattern type root-ed  cwd-ed  any-of-path
> > >>   ============ ======== ======= ===========
> > >>   wildcard     rootglob cwdglob anyglob
> > >>   regexp       rootre   cwdre   anyre
> > >>   raw string   rootpath cwdpath anypath
> > >>   ============ ======== ======= ===========
> > >>
> > >
> > > Moving toward a more regular and clear feature set and naming seems a
> win.
> > > I'm +1 for moving in that direction.
> > >
> > > Cheers,
> > >
> > > --
> > > Pierre-Yves David
> > >
> > [2  <text/html; UTF-8 (quoted-printable)>]
> >
>
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20161025/60de4af5/attachment.html>


More information about the Mercurial-devel mailing list