[PATCH] revset: add the 'subrev' symbol
Matt Harbison
mharbison72 at gmail.com
Sat Jun 27 22:48:00 CDT 2015
On Sat, 27 Jun 2015 04:42:45 -0400, Yuya Nishihara <yuya at tcha.org> wrote:
> On Wed, 24 Jun 2015 22:16:10 -0400, Matt Harbison wrote:
>> On Wed, 24 Jun 2015 21:34:54 -0400, Pierre-Yves David
>> <pierre-yves.david at ens-lyon.org> wrote:
>> > On 06/24/2015 02:55 PM, Matt Harbison wrote:
>> >> # HG changeset patch
>> >> # User Matt Harbison <matt_harbison at yahoo.com>
>> >> # Date 1435179910 14400
>> >> # Wed Jun 24 17:05:10 2015 -0400
>> >> # Node ID 74b79bc0f22824f1858750f987915671e3ecf1b9
>> >> # Parent 1de5b35cecd32a4691d0542d03348c13e5813c95
>> >> revset: add the 'subrev' symbol
>> >>
>> >> This finds revisions in the outer repo that track a named revision
>> of a
>> >> subrepo.
>> >> It allows for a partial match, as long as the named revision is at
>> >> least as long
>> >> as the 'shortid' form defined by the subrepo. Allowing shorter
>> strings
>> >> seems
>> >> risky- the user may specify a "{rev}" value, and not realize it only
>> >> partially
>> >> matches on a full "{node}".
>> >>
>> >> A more general query is to figure out what revisions in the outer
>> repo
>> >> track the
>> >> named revision, _or a descendant of it_. This would make it much
>> >> easier to
>> >> figure out where a subrepo change becomes visible if there are a
>> series
>> >> of
>> >> subrepo changes, with one lockin commit to the outer repo. But that
>> >> looks much
>> >> more complicated. If that makes sense to implement in this revset in
>> >> the
>> >> future, we can probably do so by adding an argument to the revset
>> that
>> >> controls
>> >> the behavior, similar to the -key arg to sort().
>> >
>> > What I would expect from such feature is
>> >
>> > hg log --rev 'subrepo([pattern], revs="anyrevset")'
>> >
>> > 1) This would allow to select the subrepo we are targeting (which have
>> > usability and performance benefit).
>> >
>> > 2) having revset would allow any kind of operation (including
>> revision X
>> > or descendant to be done).
>> >
>> > Doing (1) seems simple. Doing (2) is a bit tricker because as far as I
>> > know, the subrepo is not necessarly checked out. So we should maybe
>> > design the API so it could support any revsets but only support
>> hex-node
>> > for now.
>> > We can probably use revset now if the repo is checkout in the
>> repository
>> > or if the subrepo cache is operational. But supporting (2) seems a bit
>> > premature(as this changeset already points).
>> >
>> > What do you think about (1) (reusing the subrepo revset?
>>
>> The only reason I didn't add this to the existing subrepos() is that I
>> don't see a precedent for how to skip the first parameter and specify
>> the
>> second. I assume the 'revs=' bit is for clarity here, not part of the
>> actual query that needs parsing?
>
> Tried just for fun:
>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -133,6 +133,7 @@ elements = {
> "or": (4, None, ("or", 4)),
> "|": (4, None, ("or", 4)),
> "+": (4, None, ("or", 4)),
> + "=": (3, None, ("keyvalue", 3)),
> ",": (2, None, ("list", 2)),
> ")": (0, None, None),
> "symbol": (0, ("symbol",), None),
> @@ -190,7 +191,7 @@ def tokenize(program, lookup=None, symin
> elif c == '#' and program[pos:pos + 2] == '##': # look ahead
> carefully
> yield ('##', None, pos)
> pos += 1 # skip ahead
> - elif c in "():,-|&+!~^%": # handle simple operators
> + elif c in "():=,-|&+!~^%": # handle simple operators
> yield (c, None, pos)
> elif (c in '"\'' or c == 'r' and
> program[pos:pos + 2] in ("r'", 'r"')): # handle quoted
> strings
> @@ -281,6 +282,25 @@ def getargs(x, min, max, err):
> raise error.ParseError(err)
> return l
> +def getkwargs(x, keys, err):
> + keys = keys.split()
> + l = getlist(x)
> + args = {}
> + # consume positional args
> + for k, y in zip(keys, l):
> + if y[0] == 'keyvalue':
> + break
> + args[k] = y
> + # remainder should be keyword args
> + for y in l[len(args):]:
> + if y[0] != 'keyvalue' or not isvalidsymbol(y[1]):
> + raise error.ParseError(err) # XXX better error message
> + k = getsymbol(y[1])
> + if k not in keys or k in args:
> + raise error.ParseError(err) # XXX better error message
> + args[k] = y[2]
> + return args
> +
> def isvalidsymbol(tree):
> """Examine whether specified ``tree`` is valid ``symbol`` or not
> """
> @@ -385,6 +405,9 @@ def orset(repo, subset, *xs):
> def notset(repo, subset, x):
> return subset - getset(repo, subset, x)
> +def keyvalueset(repo, subset, k, v):
> + raise error.ParseError(_("can't use a key-value pair in this
> context"))
> +
> def listset(repo, subset, a, b):
> raise error.ParseError(_("can't use a list in this context"))
> @@ -1832,9 +1855,10 @@ def subrepo(repo, subset, x):
> pattern is named, any subrepo changes are returned.
> """
> # i18n: "subrepo" is a keyword
> - args = getargs(x, 0, 1, _('subrepo takes at most one argument'))
> - if len(args) != 0:
> - pat = getstring(args[0], _("subrepo requires a pattern"))
> + args = getkwargs(x, 'pattern revs', _('subrepo takes at most two
> argument'))
> + if 'pattern' in args:
> + pat = getstring(args['pattern'], _("subrepo requires a
> pattern"))
> + print args
> m = matchmod.exact(repo.root, repo.root, ['.hgsubstate'])
> @@ -2176,6 +2200,7 @@ methods = {
> "and": andset,
> "or": orset,
> "not": notset,
> + "keyvalue": keyvalueset,
> "list": listset,
> "func": func,
> "ancestor": ancestorspec,
I spent all day playing with this, and it's pretty neat. Implementing the
'revs=' bit is turning out to be a bit complex, so I implemented a
'status=' where the value is some combination of 'MARC', to optionally
tune the original behavior of 'MAR' always. I didn't get to the part of
parsing the 'revs=' value, but I didn't have any problems with commas or
internal quoting.
If you submit the kwargs addition, I can submit the status work, and work
on the subquery part more.
More information about the Mercurial-devel
mailing list