[PATCH] revset: introduce the summary predicate

Matt Harbison mharbison72 at gmail.com
Thu Jan 12 00:40:36 EST 2017


On Wed, 11 Jan 2017 09:15:28 -0500, Yuya Nishihara <yuya at tcha.org> wrote:

> On Tue, 10 Jan 2017 20:09:45 -0500, Matt Harbison wrote:
>> On Mon, 09 Jan 2017 23:33:17 -0500, Matt Harbison  
>> <mharbison72 at gmail.com>
>> wrote:
>> > On Mon, 09 Jan 2017 05:49:23 -0500, Pierre-Yves David
>> >> * predicate
>> >> * default behavior
>> >> * support 'rich' stringmatcher ?
>> >> * are 'literal:' case sensitive ?
>> >> * are 're:' case sensitive (and supported at all) ?
>> >
>> > TL;DR: desc, grep, keyword, and author/user are the oddballs.  grep  
>> and
>> > keyword are the regex and literal halves respectively, of the same
>> > search.
>> >
>> > After stripping out non string predicates, we basically end up with 4
>> > groups:
>> >
>> > - util.stringmatcher based:
>> >
>> >      Predicate:            Case Sensitive?
>> >      "author"                   N
>> >      "user"                     N
>> >      "bookmark"                 Y
>> >      "branch"                   Y
>> >      "extra"                    Y
>> >      "named"                    Y
>> >      "subrepo"  [1]             Y
>> >      "tag"                      Y
>> >
>> > These all support 'literal:' and 're:'.  Case sensitivity applies the
>> > same to both prefixes, and raw pattern.
>> >
>> > [1] Not documented to support 'literal:' or 're:' prefixes.
>> >
>> >
>> > - Local method implementation based:
>> >
>> >      Predicate:            Case Sensitive?
>> >      "desc"                     N
>> >      "grep"                     Y
>> >      "keyword"                  N
>> >
>> > None of these support prefixes.  The grep param is a regex without
>> > 're:', so it doesn't make a lot of sense to support stringmatcher  
>> here-
>> > what stringmatcher thinks is literal is really regex.  If we  
>> internally
>> > bolt on 're:', it still can't support literal matches.
>> >
>> >
>> > - match.py based (not relevant, but for completeness):
>> >
>> >      "adds", "contains", "file", "filelog", "follow",
>> >      "followlines", "modifies", "removes"
>> >
>> > - "bisect" (I can't see how 're:' support here would be meaningful.)
>> >
>> >
>> >>  From there we'll be able to see if a pattern emerge and pick the  
>> best
>> >> way to move forward.
>
> Thanks for the nice summary. Inline comments follow.
>
>> > The following thoughts come to mind:
>> >
>> > A) author/user has been thoroughly corrupted with the lower casing.
>> > Maybe come up with a 3rd one that follows modern rules, and
>> > deprecate/hide these?  Sad, because these seem natural.  OTOH, having
>> > raw pattern and prefixed patterns behave the same everywhere is  
>> elegant
>> > and simple.  Not sure what to call it.  'committer' comes to mind, but
>> > that has git implications.  That was suggested when I proposed  
>> recording
>> > who performed a graft in the 'extra' dict.  Then it was pointed out  
>> this
>> > tracking was related to the chain of custody proposal by Greg, so I  
>> let
>> > it drop [1].  My enthusiasm was dampened some when I saw templates  
>> also
>> > have {author} and {user}, though the latter is a filter, not a  
>> reference
>> > to the field.
>> >
>> > [1]
>> >  
>> https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-April/068692.html
>
> Yeah, that would be unfortunate to deprecate author/user() since they  
> are very
> common terms. Also, adding case-sensitive author() would not provide  
> much user
> benefit even though that could help eliminating the inconsistency.
>
> So I'm -1 for (A).

Yeah... And given what Augie said, I can agree with that.

> On the other hand, I think we should fix the re: case to not lowercase  
> the
> pattern but to perform case-insensitive matching so '\B' won't be '\b'.
>
>> > B) We should maybe fold grep and keyword into a new predicate
>> > ('search'?) that follows modern formatting, and deprecate/hide these
>> > two.  Simple, and drops the visible revset count by 1.  I'm not sure  
>> if
>> > 'grep' was borrowed  from git, or if it's just inspired by the unix
>> > command.
>>
>> I had second thoughts about this.  I think if we just add documentation  
>> to
>> 'keyword' that says "If you want to search these fields by regex or case
>> sensitively, use 'grep'".  Then the user can see how to get everything
>> stringmatcher will provide, and it makes it clear to developers that
>> 'keyword' doesn't need string stringmatcher support.  I'll submit a  
>> patch
>> during the freeze.
>
> The documentation thing sounds nice.

I threw it into this next series, since it's pretty trivial.

> I also like the idea of deprecating
> grep() since grep() sounds like searching the file contents.

I don't have a strong feeling on that.  If somebody makes a new search  
function though, I wonder if it should be like revset.matching() (but with  
stringmatcher support), where the user can control the fields searched, in  
order to avoid this sort of ambiguity.  I wouldn't want to fold grep()  
into author() because of the clashing case sensitive/insensitive you  
mention below.

>> That leaves only 'author' and 'desc' as not providing the full
>> functionality of the others.
>>
>> > C) If we do A + B, that means 'desc' is the only oddball left.  I  
>> don't
>> > like the idea that case sensitivity for a raw pattern and a 'literal:'
>> > prefixed pattern would differ.  They are both literals in my mind, and
>> > it would be the one remaining exception.  The 're:' prefix could  
>> follow
>> > regular rules.
>
> I won't insist that 'literal:' must be case-sensitive (because of (A).)
> However, I would guess 're:' is also case-insensitive if 'literal:' is.
> In my mindset, desc() is a case-insensitive matcher in that case.
>
> So I lean towards adding case-insensitive desc('re:'), which would be at  
> least
> consistent in that desc() always ignores cases.

The only reason I would guess 're:' is case sensitive, is because I've  
never run into one that hasn't been.  I do like the consistency argument  
though, so let's try that.  I wonder if in addition to the  
'icase-literal:' you suggested, if this also means we need a 'case-re:',  
since it doesn't look like you can force re.I off.  I don't see any real  
benefit for author(), but I can maybe see it for desc().  The series I'm  
about to submit hints at the ability to add these with one or two lines.   
See what you think.


More information about the Mercurial-devel mailing list