[PATCH] match: adding non-recursive directory matching

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Tue Oct 25 19:31:49 EDT 2016


At Mon, 24 Oct 2016 10:34:52 -0700,
Rodrigo Damazio wrote:
> 
> It sounds like we'd like to do 3 somewhat orthogonal things:
> - allow user to specify the directory the pattern is relative to
> (root/cwd/any)
> - allow the user to specify recursiveness/non-recursiveness consistently
> (not covered by the *path patterns, but could be the defined behavior for
> the globs)
> - clean up the matcher API (discussed during Sprint)
> 
> Doing all 3 together would probably take some time and a lot of
> back-and-forth, so I'm wondering if it'd be ok to start by updating this
> patch to implement "rootglob" with consistent recursiveness behavior, and
> we can then more slowly add the other patterns and converge on a cleaner
> API?

(Let's hold off on posting a revised series during the code freeze
period, to focus on stabilization :-))

    https://www.mercurial-scm.org/wiki/TimeBasedReleasePlan#Code_Freeze

In my previous reply, I assumed that newly introduced syntaxes would:

  - match recursively by default, regardless of how the pattern is
    passed (command line, -I/-X, ....), for consistency with almost
    all of the existing syntaxes

    Only glob/relglob given as PATTERN on the command line require
    "end of name" matching.

  - require an additional "-eon" ("end of name") suffix for
    non-recursive matching (e.g. "rootglob-eon", "cwdre-eon",
    "anypath-eon", ...)
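
To make the difference between the two behaviors concrete, here is a
rough Python sketch. It is not the actual match.py code:
glob_to_regex() and matches() are hypothetical helpers for
illustration, and only '*', '**' and '?' are handled. It assumes that
recursive matching can be modelled by letting the pattern end at a '/'
boundary, and "end of name" matching by requiring the pattern to
consume the whole path.

```python
import re


def glob_to_regex(pat):
    """Translate a simplified glob into a regex fragment.

    Hypothetical helper: '*' matches within one path component,
    '**' matches across components, '?' matches one character.
    """
    out = []
    i = 0
    while i < len(pat):
        if pat.startswith('**', i):
            out.append('.*')
            i += 2
        elif pat[i] == '*':
            out.append('[^/]*')
            i += 1
        elif pat[i] == '?':
            out.append('[^/]')
            i += 1
        else:
            out.append(re.escape(pat[i]))
            i += 1
    return ''.join(out)


def matches(pat, path, recursive):
    """Return True if path matches pat.

    recursive=True: the pattern may stop at a '/' boundary, so a
    matched directory also matches everything below it.
    recursive=False ("end of name"): the pattern must match the
    whole path.
    """
    suffix = '(?:/|$)' if recursive else '$'
    return re.match(glob_to_regex(pat) + suffix, path) is not None
```

With this sketch, matches('foo/bar', 'foo/bar/baz', recursive=True)
is true, while the same call with recursive=False is false. An "end
of name" pattern has to spell out the file level explicitly, e.g.
'foo/bar/*'.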

But according to your revised patch, the "rootglob" syntax matches
non-recursively. Are you assuming the following?

  - newly introduced syntaxes match non-recursively by default
  - recursive matching requires an additional suffix (e.g. "-recursive")

On the other hand, you assume that the newly introduced *path syntaxes
will be recursive, as quoted below. Do you intend the default
recursiveness to differ between the *glob and *path syntaxes?

> Also, for discussion: I assume the *path patterns will be recursive when
> they reference a directory. Do we also want a non-recursive equivalent
> (rootexact, rootfiles, rootnonrecursive or something like that)?

IMHO, having the patch description explain how recursive matching will
be controlled in the future would help reviewers evaluate your patch.


BTW, bikeshedding about the name of the additional suffix:

  - for non-recursive matching, in the "recursive matching by default"
    case

    - "-eon"

      "end of name matching" is a term I coined only for this
      explanation, so let's choose a better one :-)

    - "-exact"

      this might confuse developers, because the current
      implementation already uses the term "exact" to mean "matching
      without any special handling".

        https://selenic.com/repo/hg/file/438173c41587/mercurial/match.py#l100

    - "-nonrecursive"

      this is too long, isn't it?

    - "-file"

      this seems better (short and understandable for end users)

  - for recursive matching, in the "non-recursive matching by default"
    case

    - "-recursive"

      this is too long, isn't it?

    - "-dir"

      this seems better (short and understandable for end users)
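
As a sketch of how such a suffix could be interpreted under either
default, the following Python splits a pattern spec into its base
syntax, its recursiveness, and the pattern itself. Everything here is
hypothetical: parse_kind() and parse_pattern() are not Mercurial
functions, and "-file" / "-dir" are just the candidate names from the
list above, not a decided syntax.

```python
# Candidate suffixes from the discussion; not an agreed-upon syntax.
NONRECURSIVE_SUFFIX = '-file'
RECURSIVE_SUFFIX = '-dir'


def parse_kind(kind, recursive_by_default=True):
    """Split a kind such as 'rootglob-file' into (base, recursive).

    An explicit suffix always wins; without one, the chosen default
    decides.
    """
    if kind.endswith(NONRECURSIVE_SUFFIX):
        return kind[:-len(NONRECURSIVE_SUFFIX)], False
    if kind.endswith(RECURSIVE_SUFFIX):
        return kind[:-len(RECURSIVE_SUFFIX)], True
    return kind, recursive_by_default


def parse_pattern(spec, recursive_by_default=True):
    """Split 'kind:pattern' into (base kind, recursive, pattern)."""
    kind, sep, pat = spec.partition(':')
    if not sep:
        raise ValueError('pattern has no explicit kind: %r' % spec)
    base, recursive = parse_kind(kind, recursive_by_default)
    return base, recursive, pat
```

For example, parse_pattern('rootglob-file:foo/bar') gives
('rootglob', False, 'foo/bar'), and with recursive_by_default=False,
parse_pattern('cwdre-dir:a.*') gives ('cwdre', True, 'a.*'). Either
default can be expressed this way; the open question is only which
one needs the suffix.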

> Thanks
> Rodrigo
> 
> 
> 
> On Mon, Oct 24, 2016 at 6:21 AM, Pierre-Yves David <
> pierre-yves.david at ens-lyon.org> wrote:
> 
> >
> >
> > On 10/21/2016 05:13 PM, FUJIWARA Katsunori wrote:
> >
> >> At Tue, 18 Oct 2016 10:12:07 -0400,
> >> Augie Fackler wrote:
> >>
> >>>
> >>> On Tue, Oct 18, 2016 at 9:52 AM, Yuya Nishihara <yuya at tcha.org> wrote:
> >>>
> >>>> On Tue, 18 Oct 2016 09:40:36 -0400, Augie Fackler wrote:
> >>>>
> >>>>> On Oct 18, 2016, at 09:38, Yuya Nishihara <yuya at tcha.org> wrote:
> >>>>>>
> >>>>>>> After coordinating on irc to figure out what this proposal actually
> >>>>>>> is, I've noticed that the semantics of this "exact" proposal are
> >>>>>>> exactly what "glob" does today, which means (I think) that
> >>>>>>> "files:foo/bar" should be representable as "glob:foo/bar/*" - what am
> >>>>>>> I missing?
> >>>>>>>
> >>>>>>
> >>>>>> Maybe we want a "glob" relative to the repo root?
> >>>>>>
> >>>>>
> >>>>> As far as I can tell, it already is. "relglob:" is relative to your
> >>>>> location in the repo according to the docs.
> >>>>>
> >>>>
> >>>> Unfortunately that isn't.
> >>>>
> >>>>         'glob:<glob>' - a glob relative to cwd
> >>>>         'relglob:<glob>' - an unrooted glob (*.c matches C files in all
> >>>> dirs)
> >>>>
> >>>> Don't ask me why. ;-)
> >>>>
> >>>
> >>> Oh wat. It looks like narrowhg might change this behavior in narrowed
> >>> repositories, thus my additional confusion.
> >>>
> >>> Maybe we should add "absglob" that is always repo-root-absolute. How
> >>> do we feel about that overall?
> >>>
> >>
> >> FYI, current pattern matching is implemented as below. This was
> >> chatted in "non-recursive directory matching" session of 4.0 sprint,
> >> and sorry for my late posting of this translation from
> >> http://d.hatena.ne.jp/flying-foozy/20140107/1389087728 in Japanese, as
> >> my backlog of the last sprint.
> >>
> >>   ============ ======= ======= ===========
> >>   pattern type root-ed cwd-ed  any-of-path
> >>   ============ ======= ======= ===========
> >>   wildcard     ---     glob    relglob
> >>   regexp       re      ---     relre
> >>   raw string   path    relpath ---
> >>   ============ ======= ======= ===========
> >>
> >>   If rule is read in from file (e.g. .hgignore):
> >>
> >>     * "glob" is treated as "relglob"
> >>     * "re" is treated as "relre"
> >>
> >>   This is mentioned in "hg help patterns" and "hg help hgignore", but
> >>   syntax name "relglob" and "relre" themselves aren't explained.
> >>
> >>   "end of name" matching is required:
> >>
> >>     * for glob/relglob as PATTERN (e.g. argument in command line), but
> >>     * not for glob/relglob as INCLUDES/EXCLUDES, or other pattern syntaxes
> >>
> >>   For example, file "foo/bar/baz" is:
> >>
> >>     * not matched at "hg files glob:foo/bar"
> >>     * but matched at "hg file -I glob:foo/bar"
> >>
> >>   This isn't mentioned in any help document :-<, and the latter seems
> >>   to cause the issue mentioned in this patch series.
> >>
> >> How about introducing new systematic names like below to re-organize
> >> current complicated mapping between names and matching ? (and enable
> >> "end of name" matching by "-eon" suffix or so)
> >>
> >>   ============ ======== ======= ===========
> >>   pattern type root-ed  cwd-ed  any-of-path
> >>   ============ ======== ======= ===========
> >>   wildcard     rootglob cwdglob anyglob
> >>   regexp       rootre   cwdre   anyre
> >>   raw string   rootpath cwdpath anypath
> >>   ============ ======== ======= ===========
> >>
> >
> > Moving toward a more regular and clear feature set and naming seems a win.
> > I'm +1 for moving in that direction.
> >
> > Cheers,
> >
> > --
> > Pierre-Yves David
> >

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp

