[PATCH] revset: introduce the summary predicate

Matt Harbison mharbison72 at gmail.com
Tue Jan 10 20:09:45 EST 2017


On Mon, 09 Jan 2017 23:33:17 -0500, Matt Harbison <mharbison72 at gmail.com>  
wrote:

> On Mon, 09 Jan 2017 05:49:23 -0500, Pierre-Yves David  
> <pierre-yves.david at ens-lyon.org> wrote:
>
>>
>>
>> On 01/08/2017 09:34 PM, Matt Harbison wrote:
>>> On Sun, 08 Jan 2017 07:59:36 -0500, Pierre-Yves David
>>> <pierre-yves.david at ens-lyon.org> wrote:
>>>
>>>> (ha, I wrote my previous reply in a train and it got sent when I
>>>> connected again (and received that one). I'm going to try to adress
>>>> the new content in this email and sometime repeat some of my other
>>>> reply content for clarity)
>>>>
>>>> On 01/08/2017 04:23 AM, Matt Harbison wrote:
>>>>> On Sat, 07 Jan 2017 02:56:48 -0500, Yuya Nishihara <yuya at tcha.org>
>>>>> wrote:
>>>>>
>>>>>> On Fri, 6 Jan 2017 21:29:43 -0500, Matt Harbison wrote:
>>>>>>> > On Jan 6, 2017, at 11:19 AM, Pierre-Yves David
>>>>>>> <pierre-yves.david at ens-lyon.org> wrote:
>>>>>>> >> On 01/04/2017 07:04 PM, Matt Harbison wrote:
>>>>>>> >> # HG changeset patch
>>>>>>> >> # User Matt Harbison <matt_harbison at yahoo.com>
>>>>>>> >> # Date 1483550016 18000
>>>>>>> >> #      Wed Jan 04 12:13:36 2017 -0500
>>>>>>> >> # Node ID 76d95ab94b9e206363629059fb7824002e19a9e5
>>>>>>> >> # Parent  0064a1eb28e246ded9b726c696d048143d1b23f1
>>>>>>> >> revset: introduce the summary predicate
>>>>>>> >>
>>>
>>>>>> Perhaps stringmatcher can have 3 types, icase literal, literal, and
>>>>>> re, and
>>>>>> the default of desc() is icase literal for backward compatibility.  
>>>>>> You
>>>>>> can
>>>>>> build a case-insensitive regexp object from a literal pattern.
>>>>>>
>>>>>> https://docs.python.org/2/library/re.html#re.I
>>>>>
>>>>> Yep, that's the API I was thinking of.
>>>>>
>>>>> I'm confused by the rest of your comments.  When I first skimmed your
>>>>> message, adding support for 'icasere:' using this API popped into my
>>>>> mind.  And that could support a case insensitive literal, because
>>>>> 'icasere:foo' should be equivalent to looking for the substring 'foo'
>>>>> (leaving aside efficiency, how discoverable that is, and that
>>>>> stringmatcher matches the whole string for literals).  But you seem  
>>>>> to
>>>>> be suggesting adding 'icaseliteral:'.
>>>>
>>>> I'm not 100% sure of what Yuya actually has in mind but here is my
>>>> understanding of the situation and how we could move forward.
>>>>
>>>> Currently:
>>>> ----------
>>>>
>>>>    desc(X) → X is customly matched as a case insensitive litteral,
>>>>
>>>>    We have a "generic" pattern definition syntax used by various other
>>>> reveset (implemented in "stringmatcher")
>>>>
>>>>      foo(X)
>>>>        → X is matched as a case sensitive litteral
>>>>      foo('literal:X')
>>>>        → X is matched as a case sensitive literal (same as the above)
>>>>      food('re:X')
>>>>        → X is matched as a regular expression (case sensitive)
>>>>
>>>> Proposal: (might be what yuya says)
>>>> ---------
>>>>
>>>> extend the string matcher to
>>>>
>>>>    foo('literal:X')
>>>>      → X is matched as a case sensitive literal
>>>
>>> See the comment in the new patch I sent about 'user()' already
>>> lowercasing 'literal:' and 're:'.  I'd consider it a bug, but it's been
>>> in since mid 2012.  Attempting to channel Matt, I'm guessing we are
>>> stuck with that since it is so old, but wanted to see what others  
>>> think.
>>
>> 1) Yep, we are stuck with whatever existing behavior we have for  
>> existing predicate because of BC. (but we can augment it)
>>
>> 2) Congratulation you seems to have unearthed an area where we have  
>> many predicated with close but slightly different behavior. At that  
>> point I'll ask you an inventory of what we currently have so that we  
>> can devise a sound and as consistent as possible way forward.
>>
>>    Can you provide us with a table that at least keep track of:
>>
>> * predicate
>> * default behavior
>> * support 'rich' stringmatcher ?
>> * are 'literal:' case sensitive ?
>> * are 're:' case sensitive (and supported at all) ?
>
> TL;DR: desc, grep, keyword, and author/user are the oddballs.  grep and  
> keyword are the regex and literal halves respectively, of the same  
> search.
>
> After stripping out non string predicates, we basically end up with 4  
> groups:
>
> - util.stringmatcher based:
>
>      Predicate:            Case Sensitive?
>      "author"                   N
>      "user"                     N
>      "bookmark"                 Y
>      "branch"                   Y
>      "extra"                    Y
>      "named"                    Y
>      "subrepo"  [1]             Y
>      "tag"                      Y
>
> These all support 'literal:' and 're:'.  Case sensitivity applies the  
> same to both prefixes, and raw pattern.
>
> [1] Not documented to support 'literal:' or 're:' prefixes.
>
>
> - Local method implementation based:
>
>      Predicate:            Case Sensitive?
>      "desc"                     N
>      "grep"                     Y
>      "keyword"                  N
>
> None of these support prefixes.  The grep param is a regex without  
> 're:', so it doesn't make a lot of sense to support stringmatcher here-  
> what stringmatcher thinks is literal is really regex.  If we internally  
> bolt on 're:', it still can't support literal matches.
>
>
> - match.py based (not relevant, but for completeness):
>
>      "adds", "contains", "file", "filelog", "follow",
>      "followlines", "modifies", "removes"
>
> - "bisect" (I can't see how 're:' support here would be meaningful.)
>
>
>>  From there we'll be able to see if a pattern emerge and pick the best  
>> way to move forward.
>
> The following thoughts come to mind:
>
> A) author/user has been thoroughly corrupted with the lower casing.   
> Maybe come up with a 3rd one that follows modern rules, and  
> deprecate/hide these?  Sad, because these seem natural.  OTOH, having  
> raw pattern and prefixed patterns behave the same everywhere is elegant  
> and simple.  Not sure what to call it.  'committer' comes to mind, but  
> that has git implications.  That was suggested when I proposed recording  
> who performed a graft in the 'extra' dict.  Then it was pointed out this  
> tracking was related to the chain of custody proposal by Greg, so I let  
> it drop [1].  My enthusiasm was dampened some when I saw templates also  
> have {author} and {user}, though the latter is a filter, not a reference  
> to the field.
>
> [1]  
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-April/068692.html
>
> B) We should maybe fold grep and keyword into a new predicate  
> ('search'?) that follows modern formatting, and deprecate/hide these  
> two.  Simple, and drops the visible revset count by 1.  I'm not sure if  
> 'grep' was borrowed  from git, or if it's just inspired by the unix  
> command.

I had second thoughts about this.  I think if we just add documentation to  
'keyword' that says "If you want to search these fields by regex or case  
sensitively, use 'grep'".  Then the user can see how to get everything  
stringmatcher will provide, and it makes it clear to developers that  
'keyword' doesn't need string stringmatcher support.  I'll submit a patch  
during the freeze.

That leaves only 'author' and 'desc' as not providing the full  
functionality of the others.


> C) If we do A + B, that means 'desc' is the only oddball left.  I don't  
> like the idea that case sensitivity for a raw pattern and a 'literal:'  
> prefixed pattern would differ.  They are both literals in my mind, and  
> it would be the one remaining exception.  The 're:' prefix could follow  
> regular rules.
>
> D) I guess we could hide/replace 'desc' with something new as well.  
> ('message'?)  But I'm wondering about the BC rules.  I think generally,  
> we don't change behavior without the user doing some action to opt in.   
> But the 're:' prefix was added after many of these predicates existed  
> for years.  Obviously, that broke anybody's search that started with  
> 're:', just by them upgrading Mercurial.  If making them change the  
> query to prefix 'literal:' was acceptable, why would changing the case  
> sensitivity not be, assuming we provide 'icase-literal:'?  Is it just a  
> general feeling that searches for 're:' are rare?  I understand the  
> principle and motivation, but sometimes it's hard to figure out how they  
> are applied.
>
>
>>>>    foo('icase-literal:X')
>>>>      → X is matched as a case insensitive literal
>>>>    food('re:X')
>>>>      → X is matched as a regular expression (case sensitive)
>>>>
>>>> Then, desc move to use string matcher (default to "icase-literal").
>>>>
>>>> We do not need a 'icase-re:' spec, because one can easily achieve it
>>>> using 're:(?i)foo'
>>>
>>> Ah! I missed the part in the docs where flags could be set in the  
>>> string
>>> with (?<flag>). I thought you needed to compile with re.FLAG.  When he
>>> said string literal, my mind went right to the 'literal:' prefix.
>>> Agreed, no need for 'icase-re:'.
>>
>> Someone getting slightly confused with regular expression? Impossible!  
>> ;-)
>>
>>>>> […]
>>>>> I'm about to submit a patch to add the current 're:' support to  
>>>>> 'desc'
>>>>> in the meantime, to hopefully move this along.
>>>>
>>>> Great!
>>>>
>>>>>  I'd also be curious if
>>>>> you have thoughts on how to conditionally limit this predicate to the
>>>>> first line, without limiting future functionality.
>>>>
>>>> So having digged the regexp part a bit more, it seems like one could
>>>> just use 're:^.*issue1337' to match "issue1337" on the first line ('.'
>>>> does not match new-line by default).
>>>
>>> Thanks for looking at that.  It's way less horrible than I thought it
>>> would be.  I'm curious what Sean thinks, since he mentioned {firstline}
>>> being put in as a substitute for a complex regex.  I'd be fine with
>>> skipping the firstline=True param if this case is mentioned in the help
>>> for desc().
>>
>> I've missed that '{firstline}' proposal from Sean, can you point me at  
>> it? (or summarize it ?)
>
> Not a proposal, so much as historical knowledge about the template?
>
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-January/092099.html
>
>> Thanks a lot for looking into this!
>>
>> Cheers,


More information about the Mercurial-devel mailing list