[PATCH] revset: introduce the summary predicate
Yuya Nishihara
yuya at tcha.org
Wed Jan 11 09:15:28 EST 2017
On Tue, 10 Jan 2017 20:09:45 -0500, Matt Harbison wrote:
> On Mon, 09 Jan 2017 23:33:17 -0500, Matt Harbison <mharbison72 at gmail.com>
> wrote:
> > On Mon, 09 Jan 2017 05:49:23 -0500, Pierre-Yves David
> >> * predicate
> >> * default behavior
> >> * support 'rich' stringmatcher ?
> >> * are 'literal:' case sensitive ?
> >> * are 're:' case sensitive (and supported at all) ?
> >
> > TL;DR: desc, grep, keyword, and author/user are the oddballs. grep and
> > keyword are the regex and literal halves respectively, of the same
> > search.
> >
> > After stripping out non string predicates, we basically end up with 4
> > groups:
> >
> > - util.stringmatcher based:
> >
> > Predicate: Case Sensitive?
> > "author" N
> > "user" N
> > "bookmark" Y
> > "branch" Y
> > "extra" Y
> > "named" Y
> > "subrepo" [1] Y
> > "tag" Y
> >
> > These all support 'literal:' and 're:'. Case sensitivity applies the
> > same to both prefixes, and raw pattern.
> >
> > [1] Not documented to support 'literal:' or 're:' prefixes.
> >
> >
> > - Local method implementation based:
> >
> > Predicate: Case Sensitive?
> > "desc" N
> > "grep" Y
> > "keyword" N
> >
> > None of these support prefixes. The grep param is a regex without
> > 're:', so it doesn't make a lot of sense to support stringmatcher here-
> > what stringmatcher thinks is literal is really regex. If we internally
> > bolt on 're:', it still can't support literal matches.
> >
> >
> > - match.py based (not relevant, but for completeness):
> >
> > "adds", "contains", "file", "filelog", "follow",
> > "followlines", "modifies", "removes"
> >
> > - "bisect" (I can't see how 're:' support here would be meaningful.)
> >
> >
> >> From there we'll be able to see if a pattern emerge and pick the best
> >> way to move forward.
Thanks for the nice summary. Inline comments follow.
> > The following thoughts come to mind:
> >
> > A) author/user has been thoroughly corrupted with the lower casing.
> > Maybe come up with a 3rd one that follows modern rules, and
> > deprecate/hide these? Sad, because these seem natural. OTOH, having
> > raw pattern and prefixed patterns behave the same everywhere is elegant
> > and simple. Not sure what to call it. 'committer' comes to mind, but
> > that has git implications. That was suggested when I proposed recording
> > who performed a graft in the 'extra' dict. Then it was pointed out this
> > tracking was related to the chain of custody proposal by Greg, so I let
> > it drop [1]. My enthusiasm was dampened some when I saw templates also
> > have {author} and {user}, though the latter is a filter, not a reference
> > to the field.
> >
> > [1]
> > https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-April/068692.html
Yeah, that would be unfortunate to deprecate author/user() since they are very
common terms. Also, adding case-sensitive author() would not provide much user
benefit even though that could help eliminating the inconsistency.
So I'm -1 for (A).
On the other hand, I think we should fix the re: case to not lowercase the
pattern but to perform case-insensitive matching so '\B' won't be '\b'.
> > B) We should maybe fold grep and keyword into a new predicate
> > ('search'?) that follows modern formatting, and deprecate/hide these
> > two. Simple, and drops the visible revset count by 1. I'm not sure if
> > 'grep' was borrowed from git, or if it's just inspired by the unix
> > command.
>
> I had second thoughts about this. I think if we just add documentation to
> 'keyword' that says "If you want to search these fields by regex or case
> sensitively, use 'grep'". Then the user can see how to get everything
> stringmatcher will provide, and it makes it clear to developers that
> 'keyword' doesn't need string stringmatcher support. I'll submit a patch
> during the freeze.
The documentation thing sounds nice. I also like the idea of deprecating
grep() since grep() sounds like searching the file contents.
> That leaves only 'author' and 'desc' as not providing the full
> functionality of the others.
>
> > C) If we do A + B, that means 'desc' is the only oddball left. I don't
> > like the idea that case sensitivity for a raw pattern and a 'literal:'
> > prefixed pattern would differ. They are both literals in my mind, and
> > it would be the one remaining exception. The 're:' prefix could follow
> > regular rules.
I won't insist that 'literal:' must be case-sensitive (because of (A).)
However, I would guess 're:' is also case-insensitive if 'literal:' is.
In my mindset, desc() is a case-insensitive matcher in that case.
So I lean towards adding case-insensitive desc('re:'), which would be at least
consistent in that desc() always ignores cases.
> > D) I guess we could hide/replace 'desc' with something new as well.
> > ('message'?) But I'm wondering about the BC rules. I think generally,
> > we don't change behavior without the user doing some action to opt in.
> > But the 're:' prefix was added after many of these predicates existed
> > for years. Obviously, that broke anybody's search that started with
> > 're:', just by them upgrading Mercurial. If making them change the
> > query to prefix 'literal:' was acceptable, why would changing the case
> > sensitivity not be, assuming we provide 'icase-literal:'? Is it just a
> > general feeling that searches for 're:' are rare? I understand the
> > principle and motivation, but sometimes it's hard to figure out how they
> > are applied.
My intuition about the BC rule is that we can slightly change the behavior if
a) vast majority would be unaffected (e.g. 're:' is rare)
b) there's a good way to get along with the BC (e.g. using 'literal:')
I bet some people would rely on the current case-insensitive desc(), so
I don't think we can change the default.
> >> I've missed that '{firstline}' proposal from Sean, can you point me at
> >> it? (or summarize it ?)
> >
> > Not a proposal, so much as historical knowledge about the template?
> >
> > https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-January/092099.html
I thought that the proposal was something like
search_in_template_output('re:issue\d+', '{desc|firstline'}').
More information about the Mercurial-devel
mailing list