[PATCH] match: adding non-recursive directory matching

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Thu Nov 24 10:28:31 EST 2016


At Wed, 23 Nov 2016 19:55:16 -0800,
Rodrigo Damazio wrote:
> 
> Hi guys - any comments on the preferred way forward?
> 
> (I do have a follow-up patch for optimizing visitdir accordingly, but don't
> want to send it until this one is agreed upon)

Sorry for the long delay!

> On Thu, Nov 17, 2016 at 1:19 PM, Rodrigo Damazio <rdamazio at google.com>
> wrote:
> 
> >
> >
> > On Thu, Nov 17, 2016 at 7:52 AM, FUJIWARA Katsunori <foozy at lares.dti.ne.jp
> > > wrote:
> >
> >>
> >> (sorry for late reply)
> >>
> >> At Wed, 26 Oct 2016 14:02:48 -0700,
> >> Rodrigo Damazio wrote:
> >> >
> >> > On Wed, Oct 26, 2016 at 12:17 AM, FUJIWARA Katsunori <
> >> foozy at lares.dti.ne.jp>
> >> > wrote:
> >> >
> >> > >
> >> > > At Tue, 25 Oct 2016 19:51:59 -0700,
> >> > > Rodrigo Damazio wrote:
> >> > > >
> >> > > > On Tue, Oct 25, 2016 at 4:31 PM, FUJIWARA Katsunori <
> >> > > foozy at lares.dti.ne.jp>
> >> > > > wrote:
> >> > > >
> >> > > > >
> >> > > > > At Mon, 24 Oct 2016 10:34:52 -0700,
> >> > > > > Rodrigo Damazio wrote:
> >>
> >> [snip]
> >>
> >> > > > On the other hand, you assume that newly introduced *path syntaxes
> >> > > > > will be recursive, as below. Would you assume that default
> >> > > > > recursive-ness is different between *glob and *path syntaxes ?
> >> > > > >
> >> > > >
> >> > > > path would be recursive, as will glob that ends with ** or regex
> >> that
> >> > > ends
> >> > > > with .*
> >> > > >
> >> > > >
> >> > > > > > Also, for discussion: I assume the *path patterns will be
> >> recursive
> >> > > when
> >> > > > > > they reference a directory. Do we also want a non-recursive
> >> > > equivalent
> >> > > > > > (rootexact, rootfiles, rootnonrecursive or something like that)?
> >> > >
> >> > > How about adding syntax type "file"/"dir" ?
> >> > >
> >> > >   ===== ============= =================
> >> > >   type  for recursive for non-recursive
> >> > >   ===== ============= =================
> >> > >   glob  use "**"      use "*"
> >> > >   re    omit "$"      append "$"
> >> > >   path  always(*1)    ----
> >> > >   file  ----          always
> >> > >   dir   always(*2)    ----
> >> > >   ===== ============= =================
> >> > >
> >> > >   (*1) match against both file and directory
> >> > >   (*2) match against only directory
> >> > >
> >> > > "dir" might be overkill, though :-) (is it useful in resolving name
> >> > > collision at merging or so ?)
> >> > >
> >> >
> >> > foozy, thanks so much for the review and discussion.
> >> > Sounds like we do agree about the glob behavior then, so let me know if
> >> > you'd like any changes to the latest version of this patch, other than
> >> > improving documentation. I'm happy to send an updated version as soon as
> >> > someone is ready to review.
> >> >
> >> > I understand the difference between dir and path (and between the
> >> original
> >> > version of this patch and file) would be that they'd validate the type
> >> of
> >> > entry being matched (so that passing a filename to dir or dir name to
> >> file
> >> > would be an error) - is that what you have in mind?
> >>
> >> Yes > "passing a filename to dir or dir name to file would be an error"
> >>
> >>
> >> > The current matchers
> >> > don't have a good mechanism to verify the type, so some significant
> >> > rewiring would need to be done to pass that information down.
> >>
> >> The current match implementation uses two additional pattern suffixes,
> >> '(?:/|$)' and '$', to control recursive matching of "glob" and "path".
> >> The former matches recursively (for "glob" and "path"), and the latter
> >> doesn't (only for "glob").
> >>
> >> I simply think this technique can be used to implement the pattern
> >> types "file" and "dir".
> >>
> >>     path:PATTERN => ESCAPED-PATTERN(?:/|$)
> >>     file:PATTERN => ESCAPED-PATTERN$
> >>     dir:PATTERN  => ESCAPED-PATTERN/
> >>
> >
> > Yes, "files:" was the original version of this patch and the case I really
> > care about :) I changed it to rootglob after your comments.
> > Which way would be preferred to move forward?

"files:" belongs to the "path:" family, and "rootglob:" to the "glob:"
family. As we concluded before, "path:" itself can't control the
recursion of matching well.

Therefore, I think that "files:" should be implemented if it is needed,
whether or not "rootglob:" is implemented.

Of course, we first need a high-level view of this area :-)


BTW, it is a little ambiguous (at least to me) that "files:foo"
matches both the file "foo" and the files directly under the directory
"foo". A name other than "files:" might resolve this ambiguity, but I
don't have a better (and short enough) name :-<

  ========== ==== ======= ===========
  pattern    foo  foo/bar foo/bar/baz
  ========== ==== ======= ===========
  path:foo    o     o         o

  files:foo   o     o         x

  file:foo    o     x         x
  dir:foo     x     o         o
  ========== ==== ======= ===========
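
As a rough sketch of how the table above could fall out of the suffix
technique: this is not Mercurial's actual match code. The names
SUFFIXES, toregex and matches are hypothetical, and the suffix for
"files:" is only my assumption of one regex that gives the "o o x" row.
The suffixes for "path:", "file:" and "dir:" are the ones quoted above.

```python
import re

# Hypothetical table of regex suffixes per pattern type (illustration only).
SUFFIXES = {
    "path": r"(?:/|$)",        # the entry itself and everything below it
    "files": r"(?:/[^/]+)?$",  # assumption: the entry or its direct children
    "file": r"$",              # exactly the named entry
    "dir": r"/",               # only entries below the named directory
}

def toregex(kind, pattern):
    """Build an anchored regex: ESCAPED-PATTERN followed by the suffix."""
    return re.compile(re.escape(pattern) + SUFFIXES[kind])

def matches(kind, pattern, filename):
    """Return True if filename matches kind:pattern from its start."""
    return toregex(kind, pattern).match(filename) is not None

# Print one row per pattern type, in the same layout as the table.
for kind in ("path", "files", "file", "dir"):
    row = ["o" if matches(kind, "foo", name) else "x"
           for name in ("foo", "foo/bar", "foo/bar/baz")]
    print("%-10s %s" % (kind + ":foo", "  ".join(row)))
```

With this sketch, only the suffix differs between the pattern types,
so no type information about the matched entry has to be passed down
to the matcher.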


----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp

