[RFC] revision sets

Greg Ward greg-hg at gerg.ca
Tue Apr 20 16:08:54 CDT 2010


On Mon, Apr 19, 2010 at 6:32 PM, Matt Mackall <mpm at selenic.com> wrote:
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
>
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"

Cool.  Feels right to me.  I disagree with Bill about the simplicity
of parsing; this absolutely calls out for EBNF and a real parser.
Hopefully it will be a small and simple grammar with a small and
simple parser, but if you have booleans and nesting, you just gotta
have a formal grammar.
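
To make "small and simple" concrete, here is a rough sketch of what I
have in mind, assuming a grammar along these lines. Every name in it
(tokenize, Parser, parse, the tuple-based tree) is made up for
illustration and is not existing Mercurial code. It only handles simple
arguments like names and revision numbers; free-form arguments such as
the date range above would need more tokenizer work.

```python
import re

# Assumed grammar (EBNF), illustrative only:
#   expr   = term { "or" term } ;
#   term   = factor { "and" factor } ;
#   factor = "not" factor | "(" expr ")" | call ;
#   call   = NAME "(" [ arg { "," arg } ] ")" ;
#   arg    = call | ATOM ;
TOKEN = re.compile(r"\s*([A-Za-z0-9_.]+|[(),])")


def tokenize(text):
    """Split a query into names/atoms and the punctuation ( ) ,"""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character at %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """Recursive-descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, tok):
        got = self.next()
        if got != tok:
            raise SyntaxError("expected %r, got %r" % (tok, got))

    def expr(self):
        node = self.term()
        while self.peek() == "or":
            self.next()
            node = ("or", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "and":
            self.next()
            node = ("and", node, self.factor())
        return node

    def factor(self):
        tok = self.next()
        if tok == "not":
            return ("not", self.factor())
        if tok == "(":
            node = self.expr()
            self.expect(")")
            return node
        if tok in (")", ",", "and", "or"):
            raise SyntaxError("unexpected %r" % tok)
        return self.call(tok)

    def call(self, name):
        self.expect("(")
        args = []
        if self.peek() != ")":
            args.append(self.arg())
            while self.peek() == ",":
                self.next()
                args.append(self.arg())
        self.expect(")")
        return ("call", name, args)

    def arg(self):
        tok = self.next()
        if tok in ("(", ")", ","):
            raise SyntaxError("unexpected %r in argument" % tok)
        # A name followed by "(" is a nested call, e.g. parent2(1.0).
        if self.peek() == "(":
            return self.call(tok)
        return tok


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input: %r" % parser.peek())
    return tree
```

In this sketch "and" binds tighter than "or", and parentheses override
that, which is the usual convention but still an assumption on my part.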

> Further, we'd be able to add lots of interesting primitives:
>
>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"
>
> Read that as: every cset that is descended from the second parent of
> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> sorted by date in reverse order.

I was right with you up until sorted() and reversed().  The others are
predicates, which makes sense... and then you introduce something that
*looks* like a predicate but actually has a completely different
effect.

Thinking out loud: in SQL, there is a clear distinction between
selection criteria and ordering directives:

  select * from changelog
  where cond1 AND cond2 AND ... AND condN   -- selection
  order by date desc                        -- ordering

"order by" is a different part of the grammar than "where", and that
is one good thing about SQL.  Based on that idea, here's a different
way to formulate your example:

  descendant(parent2(1.0)) and
  ancestor(2.0) and
  author(george)
  sorted(date, reversed)

The idea is that you could throw in an optional "sorted(KEY[, ORDER])"
at the end of a query.  There is deliberately no "and" there, because
it's not part of the boolean logic that specifies which changesets you
want to see; it's separate, specifying how to present those
changesets.
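
Here is a minimal sketch of that separation, assuming the sort clause
is peeled off the end of the query before the boolean part is parsed.
split_query, present, and the dict-based changesets are hypothetical
names and data invented for the example, and the predicate is passed in
as a plain Python callable rather than parsed from the query.

```python
import re

# Optional trailing clause: sorted(KEY) or sorted(KEY, reversed)
SORT_CLAUSE = re.compile(
    r"(?:^|\s)sorted\(\s*(\w+)\s*(?:,\s*(\w+)\s*)?\)\s*$")


def split_query(query):
    """Return (selection_text, sort_key, reverse) for a query string."""
    m = SORT_CLAUSE.search(query)
    if m is None:
        return query.strip(), None, False
    key, order = m.group(1), m.group(2)
    if order not in (None, "reversed"):
        raise ValueError("unknown sort order: %r" % order)
    return query[:m.start()].strip(), key, order == "reversed"


def present(changesets, predicate, key=None, reverse=False):
    """Select with the predicate first, then order the result."""
    selected = [c for c in changesets if predicate(c)]
    if key is not None:
        selected.sort(key=lambda c: c[key], reverse=reverse)
    return selected


# Made-up changesets, just to exercise the two stages.
changesets = [
    {"rev": 1, "author": "george", "date": 3},
    {"rev": 2, "author": "bill", "date": 5},
    {"rev": 3, "author": "george", "date": 7},
]
selection, key, reverse = split_query(
    "author(george) sorted(date, reversed)")
result = present(changesets, lambda c: c["author"] == "george",
                 key, reverse)
```

The point is only that selection and ordering never touch each other:
the predicate decides membership, and the sort clause is applied to
whatever survives.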

Responding to Dirkjan's comments:
> 1. Lose "and" as a separator, use ',' or '&'

Sure, but you also have to support "|" for "or".  IMHO this is
screaming out for full boolean logic with nesting.

> 2. Allow non-ambiguous abbreviations like we do everywhere else

Should be doable if everything is a keyword in this mini-language.
But if predicates can be added dynamically (by an extension, say),
then it might get tricky.
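
A small sketch of why, assuming abbreviations are resolved by unique
prefix against whatever predicates happen to be registered. The resolve
function, the AmbiguousName exception, and the "bookmark" predicate are
all invented for the example.

```python
class AmbiguousName(Exception):
    """Raised when a prefix matches more than one predicate name."""


def resolve(prefix, names):
    """Expand an abbreviation to the single name it can stand for."""
    if prefix in names:
        return prefix  # an exact match always wins
    matches = sorted(n for n in names if n.startswith(prefix))
    if not matches:
        raise KeyError(prefix)
    if len(matches) > 1:
        raise AmbiguousName("%r matches %s" % (prefix, ", ".join(matches)))
    return matches[0]


builtin = {"ancestor", "author", "branch", "date", "keyword"}
before = resolve("b", builtin)  # unique, so this is "branch"

# A hypothetical extension registers one more predicate...
extended = builtin | {"bookmark"}
try:
    resolve("b", extended)
    still_unique = True
except AmbiguousName:
    still_unique = False  # ...and "b" no longer means anything
```

So a query that worked yesterday can break just because someone enabled
an extension, which is the tricky part.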

> 3. Optional parens if unambiguous

Yuck.  Keep it simple and consistent -- keep the parens.

> Also, this proposal might perhaps benefit from a small list of use cases.

How about, "anything I can do with git-rev-parse I should be able to
do with hg". ;-)

(Yeah, I know, that's a requirement not a use case.  Sue me.)

Greg

