[PATCH] revset: introduce the summary predicate

Thu Jan 12 00:40:36 EST 2017

On Wed, 11 Jan 2017 09:15:28 -0500, Yuya Nishihara <yuya at tcha.org> wrote:

> On Tue, 10 Jan 2017 20:09:45 -0500, Matt Harbison wrote:
>> On Mon, 09 Jan 2017 23:33:17 -0500, Matt Harbison  
>> <mharbison72 at gmail.com>
>> wrote:
>> > On Mon, 09 Jan 2017 05:49:23 -0500, Pierre-Yves David
>> >> * predicate
>> >> * default behavior
>> >> * support 'rich' stringmatcher ?
>> >> * are 'literal:' case sensitive ?
>> >> * are 're:' case sensitive (and supported at all) ?
>> >
>> > TL;DR: desc, grep, keyword, and author/user are the oddballs.  grep  
>> and
>> > keyword are the regex and literal halves respectively, of the same
>> > search.
>> >
>> > After stripping out non string predicates, we basically end up with 4
>> > groups:
>> >
>> > - util.stringmatcher based:
>> >
>> >      Predicate:            Case Sensitive?
>> >      "author"                   N
>> >      "user"                     N
>> >      "bookmark"                 Y
>> >      "branch"                   Y
>> >      "extra"                    Y
>> >      "named"                    Y
>> >      "subrepo"  [1]             Y
>> >      "tag"                      Y
>> >
>> > These all support 'literal:' and 're:'.  Case sensitivity applies the
>> > same to both prefixes, and raw pattern.
>> >
>> > [1] Not documented to support 'literal:' or 're:' prefixes.
>> >
>> >
>> > - Local method implementation based:
>> >
>> >      Predicate:            Case Sensitive?
>> >      "desc"                     N
>> >      "grep"                     Y
>> >      "keyword"                  N
>> >
>> > None of these support prefixes.  The grep param is a regex without
>> > 're:', so it doesn't make a lot of sense to support stringmatcher  
>> here-
>> > what stringmatcher thinks is literal is really regex.  If we  
>> internally
>> > bolt on 're:', it still can't support literal matches.
>> >
>> >
>> > - match.py based (not relevant, but for completeness):
>> >
>> >      "adds", "contains", "file", "filelog", "follow",
>> >      "followlines", "modifies", "removes"
>> >
>> > - "bisect" (I can't see how 're:' support here would be meaningful.)
>> >
>> >
>> >>  From there we'll be able to see if a pattern emerge and pick the  
>> best
>> >> way to move forward.
>
> Thanks for the nice summary. Inline comments follow.
>
>> > The following thoughts come to mind:
>> >
>> > A) author/user has been thoroughly corrupted with the lower casing.
>> > Maybe come up with a 3rd one that follows modern rules, and
>> > deprecate/hide these?  Sad, because these seem natural.  OTOH, having
>> > raw pattern and prefixed patterns behave the same everywhere is  
>> elegant
>> > and simple.  Not sure what to call it.  'committer' comes to mind, but
>> > that has git implications.  That was suggested when I proposed  
>> recording
>> > who performed a graft in the 'extra' dict.  Then it was pointed out  
>> this
>> > tracking was related to the chain of custody proposal by Greg, so I  
>> let
>> > it drop [1].  My enthusiasm was dampened some when I saw templates  
>> also
>> > have {author} and {user}, though the latter is a filter, not a  
>> reference
>> > to the field.
>> >
>> > [1]
>> >  
>> https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-April/068692.html
>
> Yeah, that would be unfortunate to deprecate author/user() since they  
> are very
> common terms. Also, adding case-sensitive author() would not provide  
> much user
> benefit even though that could help eliminating the inconsistency.
>
> So I'm -1 for (A).

Yeah... And given what Augie said, I can agree with that.

> On the other hand, I think we should fix the re: case to not lowercase  
> the
> pattern but to perform case-insensitive matching so '\B' won't be '\b'.
>
>> > B) We should maybe fold grep and keyword into a new predicate
>> > ('search'?) that follows modern formatting, and deprecate/hide these
>> > two.  Simple, and drops the visible revset count by 1.  I'm not sure  
>> if
>> > 'grep' was borrowed  from git, or if it's just inspired by the unix
>> > command.
>>
>> I had second thoughts about this.  I think if we just add documentation  
>> to
>> 'keyword' that says "If you want to search these fields by regex or case
>> sensitively, use 'grep'".  Then the user can see how to get everything
>> stringmatcher will provide, and it makes it clear to developers that
>> 'keyword' doesn't need string stringmatcher support.  I'll submit a  
>> patch
>> during the freeze.
>
> The documentation thing sounds nice.

I threw it into this next series, since it's pretty trivial.

> I also like the idea of deprecating
> grep() since grep() sounds like searching the file contents.

I don't have a strong feeling on that.  If somebody makes a new search  
function though, I wonder if it should be like revset.matching() (but with  
stringmatcher support), where the user can control the fields searched, in  
order to avoid this sort of ambiguity.  I wouldn't want to fold grep()  
into author() because of the clashing case sensitive/insensitive you  
mention below.

>> That leaves only 'author' and 'desc' as not providing the full
>> functionality of the others.
>>
>> > C) If we do A + B, that means 'desc' is the only oddball left.  I  
>> don't
>> > like the idea that case sensitivity for a raw pattern and a 'literal:'
>> > prefixed pattern would differ.  They are both literals in my mind, and
>> > it would be the one remaining exception.  The 're:' prefix could  
>> follow
>> > regular rules.
>
> I won't insist that 'literal:' must be case-sensitive (because of (A).)
> However, I would guess 're:' is also case-insensitive if 'literal:' is.
> In my mindset, desc() is a case-insensitive matcher in that case.
>
> So I lean towards adding case-insensitive desc('re:'), which would be at  
> least
> consistent in that desc() always ignores cases.

The only reason I would guess 're:' is case sensitive, is because I've  
never run into one that hasn't been.  I do like the consistency argument  
though, so let's try that.  I wonder if in addition to the  
'icase-literal:' you suggested, if this also means we need a 'case-re:',  
since it doesn't look like you can force re.I off.  I don't see any real  
benefit for author(), but I can maybe see it for desc().  The series I'm  
about to submit hints at the ability to add these with one or two lines.   
See what you think.