[PATCH] revset: add the 'subrev' symbol

Matt Harbison mharbison72 at gmail.com
Sat Jun 27 22:48:00 CDT 2015


On Sat, 27 Jun 2015 04:42:45 -0400, Yuya Nishihara <yuya at tcha.org> wrote:

> On Wed, 24 Jun 2015 22:16:10 -0400, Matt Harbison wrote:
>> On Wed, 24 Jun 2015 21:34:54 -0400, Pierre-Yves David
>> <pierre-yves.david at ens-lyon.org> wrote:
>> > On 06/24/2015 02:55 PM, Matt Harbison wrote:
>> >> # HG changeset patch
>> >> # User Matt Harbison <matt_harbison at yahoo.com>
>> >> # Date 1435179910 14400
>> >> #      Wed Jun 24 17:05:10 2015 -0400
>> >> # Node ID 74b79bc0f22824f1858750f987915671e3ecf1b9
>> >> # Parent  1de5b35cecd32a4691d0542d03348c13e5813c95
>> >> revset: add the 'subrev' symbol
>> >>
>> >> This finds revisions in the outer repo that track a named revision  
>> of a
>> >> subrepo.
>> >> It allows for a partial match, as long as the named revision is at
>> >> least as long
>> >> as the 'shortid' form defined by the subrepo.  Allowing shorter  
>> strings
>> >> seems
>> >> risky- the user may specify a "{rev}" value, and not realize it only
>> >> partially
>> >> matches on a full "{node}".
>> >>
>> >> A more general query is to figure out what revisions in the outer  
>> repo
>> >> track the
>> >> named revision, _or a descendant of it_.  This would make it much
>> >> easier to
>> >> figure out where a subrepo change becomes visible if there are a  
>> series
>> >> of
>> >> subrepo changes, with one lockin commit to the outer repo.  But that
>> >> looks much
>> >> more complicated.  If that makes sense to implement in this revset in
>> >> the
>> >> future, we can probably do so by adding an argument to the revset  
>> that
>> >> controls
>> >> the behavior, similar to the -key arg to sort().
>> >
>> > What I would expect from such feature is
>> >
>> >    hg log --rev 'subrepo([pattern], revs="anyrevset")'
>> >
>> > 1) This would allow to select the subrepo we are targeting (which have
>> > usability and performance benefit).
>> >
>> > 2) having revset would allow any kind of operation (including  
>> revision X
>> > or descendant to be done).
>> >
>> > Doing (1) seems simple. Doing (2) is a bit tricker because as far as I
>> > know, the subrepo is not necessarly checked out. So we should maybe
>> > design the API so it could support any revsets but only support  
>> hex-node
>> > for now.
>> > We can probably use revset now if the repo is checkout in the  
>> repository
>> > or if the subrepo cache is operational. But supporting (2) seems a bit
>> > premature(as this changeset already points).
>> >
>> > What do you think about (1) (reusing the subrepo revset?
>>
>> The only reason I didn't add this to the existing subrepos() is that I
>> don't see a precedent for how to skip the first parameter and specify  
>> the
>> second.  I assume the 'revs=' bit is for clarity here, not part of the
>> actual query that needs parsing?
>
> Tried just for fun:
>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -133,6 +133,7 @@ elements = {
>      "or": (4, None, ("or", 4)),
>      "|": (4, None, ("or", 4)),
>      "+": (4, None, ("or", 4)),
> +    "=": (3, None, ("keyvalue", 3)),
>      ",": (2, None, ("list", 2)),
>      ")": (0, None, None),
>      "symbol": (0, ("symbol",), None),
> @@ -190,7 +191,7 @@ def tokenize(program, lookup=None, symin
>          elif c == '#' and program[pos:pos + 2] == '##': # look ahead  
> carefully
>              yield ('##', None, pos)
>              pos += 1 # skip ahead
> -        elif c in "():,-|&+!~^%": # handle simple operators
> +        elif c in "():=,-|&+!~^%": # handle simple operators
>              yield (c, None, pos)
>          elif (c in '"\'' or c == 'r' and
>                program[pos:pos + 2] in ("r'", 'r"')): # handle quoted  
> strings
> @@ -281,6 +282,25 @@ def getargs(x, min, max, err):
>          raise error.ParseError(err)
>      return l
> +def getkwargs(x, keys, err):
> +    keys = keys.split()
> +    l = getlist(x)
> +    args = {}
> +    # consume positional args
> +    for k, y in zip(keys, l):
> +        if y[0] == 'keyvalue':
> +            break
> +        args[k] = y
> +    # remainder should be keyword args
> +    for y in l[len(args):]:
> +        if y[0] != 'keyvalue' or not isvalidsymbol(y[1]):
> +            raise error.ParseError(err)  # XXX better error message
> +        k = getsymbol(y[1])
> +        if k not in keys or k in args:
> +            raise error.ParseError(err)  # XXX better error message
> +        args[k] = y[2]
> +    return args
> +
>  def isvalidsymbol(tree):
>      """Examine whether specified ``tree`` is valid ``symbol`` or not
>      """
> @@ -385,6 +405,9 @@ def orset(repo, subset, *xs):
>  def notset(repo, subset, x):
>      return subset - getset(repo, subset, x)
> +def keyvalueset(repo, subset, k, v):
> +    raise error.ParseError(_("can't use a key-value pair in this  
> context"))
> +
>  def listset(repo, subset, a, b):
>      raise error.ParseError(_("can't use a list in this context"))
> @@ -1832,9 +1855,10 @@ def subrepo(repo, subset, x):
>      pattern is named, any subrepo changes are returned.
>      """
>      # i18n: "subrepo" is a keyword
> -    args = getargs(x, 0, 1, _('subrepo takes at most one argument'))
> -    if len(args) != 0:
> -        pat = getstring(args[0], _("subrepo requires a pattern"))
> +    args = getkwargs(x, 'pattern revs', _('subrepo takes at most two  
> argument'))
> +    if 'pattern' in args:
> +        pat = getstring(args['pattern'], _("subrepo requires a  
> pattern"))
> +    print args
>     m = matchmod.exact(repo.root, repo.root, ['.hgsubstate'])
> @@ -2176,6 +2200,7 @@ methods = {
>      "and": andset,
>      "or": orset,
>      "not": notset,
> +    "keyvalue": keyvalueset,
>      "list": listset,
>      "func": func,
>      "ancestor": ancestorspec,

I spent all day playing with this, and it's pretty neat.  Implementing the  
'revs=' bit is turning out to be a bit complex, so I implemented a  
'status=' where the value is some combination of 'MARC', to optionally  
tune the original behavior of 'MAR' always.  I didn't get to the part of  
parsing the 'revs=' value, but I didn't have any problems with commas or  
internal quoting.

If you submit the kwargs addition, I can submit the status work, and work  
on the subquery part more.


More information about the Mercurial-devel mailing list