[PATCH] match: adding non-recursive directory matching

Rodrigo Damazio rdamazio at google.com
Fri Jan 27 18:14:38 EST 2017


On Fri, Jan 27, 2017 at 1:03 AM, FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
wrote:

>
> At Thu, 26 Jan 2017 17:27:17 -0800,
> Rodrigo Damazio wrote:
> >
> > [1  <multipart/alternative (7bit)>]
> > [1.1  <text/plain; UTF-8 (7bit)>]
> > All sounds very reasonable, and "filesin:" or "rootfilesin:" LGTM.
>
> Is it OK for your solution that "rootfilesin:FOO" doesn't match
> against "file FOO", even though your patch posted in this thread made
> "files:FOO" do so ? or, is combining "rootfile:" and "rootfilesin"
> acceptable for your solution ?
>

Yes, not matching files is fine, and actually the easiest to implement (the
regex is simpler and our custom server doesn't support files anyway).
For that, rootfilesin:foo/bar can produce regex ^foo/bar/[^/]+$ or similar
which would not match a file called bar. visitdir would have to be updated
accordingly, of course, but that shouldn't be too hard (and i can take the
opportunity to add some comments to the code).

If that looks good to you, let me know and I'll send an updated patch.

> On Thu, Jan 26, 2017 at 11:24 AM, Martin von Zweigbergk <
> > martinvonz at google.com> wrote:
> >
> > > On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunori
> > > <foozy at lares.dti.ne.jp> wrote:
> > > >
> > > > At Wed, 25 Jan 2017 20:54:37 -0800,
> > > > Martin von Zweigbergk wrote:
> > > >>
> > > >> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel
> > > >> <mercurial-devel at mercurial-scm.org> wrote:
> > > >> > Getting back to this after the end-of-year hiatus (yes, I know it
> > > happens to
> > > >> > be during another code freeze :) I seem to have good timing).
> > > >> >
> > > >> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David
> > > >> > <pierre-yves.david at ens-lyon.org> wrote:
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote:
> > > >> >>>
> > > >> >>>     If I got these two pieces right, it looks like we could just
> > > apply
> > > >> >>>     the improvement to 'visitdir' to 'set:your/glob/*' and have
> your
> > > >> >>>     usecase filled while not jumping into UI changes. Would that
> > > work
> > > >> >>>     for you ?
> > > >> >>>
> > > >> >>>
> > > >> >>> Not without a third set of changes, since set expansion doesn't
> use
> > > >> >>> visitdir (or the matcher being built) at all - the dependency is
> > > that
> > > >> >>> building the matcher depends on expanding the set (and thus the
> set
> > > >> >>> can't depend on the matcher).
> > > >> >>> It would technically be doable for re:, but I'm wary of getting
> > > into the
> > > >> >>> business of parsing and special-casing regexes to assume what
> they
> > > match
> > > >> >>> or don't.
> > > >> >>
> > > >> >>
> > > >> >> Rodrigo and I chatted directly about this a couple of days ago.
> Here
> > > is a
> > > >> >> quick summary of my new understanding of the situation.
> > > >> >>
> > > >> >> Fileset
> > > >> >> -------
> > > >> >>
> > > >> >> Fileset (behind "set:") can give the right result, but it is
> powered
> > > by
> > > >> >> not very modern code, it follow the old revset principle of "get
> > > everything
> > > >> >> and then run filters on that everything". That does not fit
> Rodrigo
> > > needs at
> > > >> >> all. It was easy to make 'set:' a bit smarter in the simple case
> but
> > > then we
> > > >> >> get into the issue that the matcher class is using 'set:' in a
> > > strange,
> > > >> >> non-lazy, way that does not use all the 'visitdir' feature
> > > Rodrigo/Google
> > > >> >> needs.
> > > >> >>
> > > >> >> So in short, fileset needs a rework before being usable in a
> > > demanding
> > > >> >> context.
> > > >> >>
> > > >> >>
> > > >> >> Current path restriction capability
> > > >> >> -----------------------------------
> > > >> >>
> > > >> >> The 'Match' class already have logic to restrict the path visited
> > > >> >> (implemented in the 'visitdir' method). To clarify, this logic
> as no
> > > effect
> > > >> >> on the returned match but is only an optimization for the
> directory
> > > we
> > > >> >> visit. It seems to only kicks in when treemanifest is used.
> > > >> >> This logic already works with a couple of patterns type (all
> pattern
> > > use
> > > >> >> the same class). However, that logic currently do not support the
> > > case were
> > > >> >> one want to select some subdirectory and skips the rest of the
> > > subtrees
> > > >> >> under it.
> > > >> >
> > > >> >
> > > >> > That is correct.
> > > >> >
> > > >> >> note: Rodrigo, you seems to have a good understanding of the
> logic.
> > > Do you
> > > >> >> think you could document the involved attributes (_includeroots,
> > > >> >> _includedirs, _excluderoots, etc) That would help a lot the next
> > > poor souls
> > > >> >> looking at the code.
> > > >> >
> > > >> >
> > > >> > Sure. It took me a while to understand that "roots" means
> "recursive
> > > >> > directories" and "dirs" means "non-recursive directories" in that
> > > code - it
> > > >> > all became much more clear after that. I'll be sure to add
> comments
> > > in my
> > > >> > patch and/or rename the attributes.
> > > >> >
> > > >> >>
> > > >> >>
> > > >> >> Way forward
> > > >> >> -----------
> > > >> >>
> > > >> >> That limitation in the matcher class optimization is the main
> > > blocker for
> > > >> >> Rodrigo/Google needs. The optimization is independent of the UI
> part
> > > we
> > > >> >> actually provides to user as all patterns use the same matcher
> class
> > > and
> > > >> >> some existing class could already benefit from this optimization.
> > > >> >>
> > > >> >> Rodrigo seems to have a patch to update the matcher code to
> track and
> > > >> >> optimize the "subdir-but-not-subtree" case. He has not submitted
> > > this patch
> > > >> >> yet. Submitting that patches seems the next step to me. It will
> get
> > > the
> > > >> >> matcher code in a state that can actually be used for the
> > > >> >> narrowhg+treemanifest usecase.
> > > >> >>
> > > >> >> Once that code is in, it seems easy to make sure various patterns
> > > use it
> > > >> >> basic, easily recognizable cases. We poked at updating the code
> to
> > > have
> > > >> >> basic regexp matching a subtree recognized as such and that was
> > > quite easy.
> > > >> >>
> > > >> >>
> > > >> >> Rodrigo, does that match your current understanding of the
> situation?
> > > >> >
> > > >> >
> > > >> > It does.
> > > >> > And just to clarify on the patches - I sent an initial patch, then
> > > after
> > > >> > comments changes it significantly, so those are two different
> changes:
> > > >> >
> > > >> > The first implements a "files:" matcher which matches all files
> > > inside a
> > > >> > directory, non-recursively. This has no wildcards, so
> special-casing
> > > it in
> > > >> > visitdir and any other places needed results in clean and simple
> code
> > > ("if
> > > >> > it's files:, don't recurse").
> > > >> > The second implements "rootglob:" which allows any number of
> > > wildcards at
> > > >> > point in the path, and is part of Foozy's plan for the new set of
> > > matchers.
> > > >> > This adds some complexity in splitting dirs and roots (mentioned
> > > above) by
> > > >> > having to parse the wildcards, and then the visitdir change looks
> > > less clean
> > > >> > ("if it's a rootglob that has a single /* wildcard at the end,
> then
> > > don't
> > > >> > recurse" - other cases are possible but start to get more
> complex).
> > > >> >
> > > >> > For these reasons, I'd still prefer to get "files:" or similar in,
> > > but I'm
> > > >> > open for doing it either way. Please advise on the preferred way
> and
> > > I'll
> > > >> > send an updated patch (2 patches really - one for the matcher, one
> > > for the
> > > >> > visitdir optimization which makes it work with narrow).
> > > >>
> > > >> I'm fine with not doing rootglob:, but if I read foozy's proposal
> > > >> right, the proposed files: will be what he would call rootfiles:. I
> > > >> liked his proposal for a systematic naming, and if I got that
> right, I
> > > >> think we should call it that from the beginning so we don't end up
> > > >> with more aliases.
> > > >
> > > > Yeah, we should avoid confusion of naming !
> > > >
> > > >> I'd also like "rootfiles:foo" to *not* match the
> > > >> file "foo", but only files in the directory "foo/". I mention that
> > > >> because last I heard, he was unsure about that himself. Foozy, do
> you
> > > >> agree?
> > > >
> > > > I don't have strong opinion against mode "XXX:", which matches
> against
> > > > both "just this file" and "files directly under this directory"
> > > >
> > > > Therefore, I agree with adding new mode "XXX:", if it is needed (and
> > > > Rodrigo/Google think so).
> > > >
> > > > But, name "files:" doesn't seem to suit for "XXX:" mode (at least,
> for
> > > > me).
> > > >
> > > > Even if it matches against only "files directly under this
> directory",
> > > > "files:" doesn't yet.
> > > >
> > > > Maybe, root cause of my bad feel is that "foo" of "files:foo" should
> > > > be the directory in both cases, even though mode name "files:" has
> > > > less "(under) this directory" flavor.
> > > >
> > > > If it is possible to combine 2 modes below for solving issues of
> > > > Rodrigo/Google, I'm +1 for splitting "XXX:" into them, because naming
> > > > "YYY:" should be easier than naming "XXX:".
> > > >
> > > >   - "file:" matching against "just this file"
> > > >   - "YYY:" matching against "files directly under this directory"
> > > >
> > > > Are there any better (and short enough) names for "XXX:" or "YYY:"
> > > > than "files:" ?
> > > >
> > > >   - "filesin:" (Files-In) for "YYY:" (<=> "files under" as "dir:") ?
> > > >   - "fileorin:" (File-Or-(Files-)In) for "XXX:" ?
> > > >
> > > > Of course, I'm also OK with naming "XXX:" or "YYY:" as "files:", to
> go
> > > > forward :-)
> > >
> > > I like "file:" and "filesin:" for those two cases. But we should add
> > > the "root" prefix so we don't have to do that later, right?
> > >
> > > >
> > > >>
> > > >> >
> > > >> > Thanks
> > > >> > Rodrigo
> > > >> >
> > > >> >
> > > >> > _______________________________________________
> > > >> > Mercurial-devel mailing list
> > > >> > Mercurial-devel at mercurial-scm.org
> > > >> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> > > >> >
> > > >>
> > > >
> > > > --
> > > > ------------------------------------------------------------
> ----------
> > > > [FUJIWARA Katsunori]
> foozy at lares.dti.ne.jp
> > >
> > [1.2  <text/html; UTF-8 (quoted-printable)>]
> >
> > [2 S/MIME Cryptographic Signature <application/pkcs7-signature (base64)>]
> >
>
> --
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20170127/f7d96cd0/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4847 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20170127/f7d96cd0/attachment.bin>


More information about the Mercurial-devel mailing list