[RFC] external revsets and templates

Jun Wu quark at fb.com
Wed Jul 13 16:17:38 EDT 2016

We have similar URL concepts for [hooks] and "hg import". I wonder whether
custom schemes like "shell:" and "python:" should be shared across these
places, or at least with "hg import".

A practical use-case is Phabricator or Patchwork integration.
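
To make the sharing idea a bit more concrete, here is a rough sketch of a
single helper that every consumer could call to split a configured source
into a scheme and a target. The scheme names are the ones mentioned in this
thread; split_source and KNOWN_SCHEMES are names I made up for illustration
and are not existing Mercurial APIs.

```python
# Hypothetical helper, not an existing Mercurial function.
KNOWN_SCHEMES = ("shell", "python", "http", "https")


def split_source(spec):
    """Return (scheme, target) for a configured source string."""
    scheme, sep, rest = spec.partition(":")
    if not sep or scheme not in KNOWN_SCHEMES:
        raise ValueError("unsupported source: %r" % spec)
    if scheme in ("http", "https"):
        # Plain URLs are recognized by their own scheme; keep them whole.
        return scheme, spec
    # For "shell:" and "python:", the target is everything after the prefix.
    return scheme, rest.strip()
```

With this, split_source("shell:some-shell-command") gives
("shell", "some-shell-command"), and an http URL is passed through unchanged
as the target.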

Excerpts from Matt Mackall's message of 2016-07-13 13:37:08 -0500:
> These are features we've been playing with a bit in the new review flow:
> https://www.mercurial-scm.org/wiki/AcceptProcess#Setting_up_the_revset_helper 
> These have been extremely useful and I'd like to make them a core feature, so I'd
> like to further iron out the syntax and feature set before moving forward.
> Currently, external revsets work like this:
>  [extrevset]
>  foo = shell:some-shell-command
> Then some-shell-command is expected to return a series of Mercurial identifiers
> (hash, rev, tag..), one per line. When "foo" is used in a revset, Mercurial
> calls the shell command, looks up each result, and returns a corresponding
> revset.
> I think we should also be able to support arguments:
>  [extrevset]
>  cvs = shell:/path/to/lookup-cvs-rev $1
> Then we can do:
>  $ hg log -r "cvs(123)"
> Also, we should allow data sources that are arbitrary URLs:
>  [extrevset]
>  tested = url:http://build.corp.example.com/hg-tested.dat
>  good = url:http://build.corp.example.com/hg-passed.dat
>  deployed = url:http://prod.example.com/hg-deployed.cgi
>  fulltext = url:http://hg-fulltext-db.example.com/query?string=$1
> ..which will allow very easy integration with complex production automation. The
> url: piece might be redundant here? We might also allow calling Python, similar
> to how we allow it in hooks.

+1 for removing "url:".

> My current implementation has no caching, which is usually fine. My plan is to
> cache the non-argument version for the repo object lifetime and leave the
> argument version uncached, but the chg use case might need a better plan.

If invocation time is negligible, I think the cache can also be implemented
in the scripts, which would work across hg processes and give better control
over the TTL.
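
A minimal sketch of what I mean by caching in the script, assuming the
script wraps the real lookup command and keeps its output in a file. The
cache path, the TTL value and the function names are all made up for
illustration; "some-shell-command" is just the placeholder from the example
config.

```python
import os
import subprocess
import sys
import time

CACHE_PATH = os.path.expanduser("~/.cache/extrevset-foo.txt")  # hypothetical
TTL_SECONDS = 300  # hypothetical
COMMAND = ["some-shell-command"]  # placeholder from the example config


def cached_output(command, cache_path, ttl):
    """Return the command's stdout, reusing a cached copy younger than ttl."""
    try:
        if time.time() - os.path.getmtime(cache_path) < ttl:
            with open(cache_path) as f:
                return f.read()
    except OSError:
        pass  # no usable cache file yet: fall through and refresh it
    output = subprocess.check_output(command, universal_newlines=True)
    cache_dir = os.path.dirname(cache_path)
    if cache_dir:
        os.makedirs(cache_dir, exist_ok=True)
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(output)
    # Rename into place so readers do not see a partially written file.
    os.replace(tmp_path, cache_path)
    return output


def main():
    # hg would read the identifiers from this script's stdout.
    sys.stdout.write(cached_output(COMMAND, CACHE_PATH, TTL_SECONDS))
```

The wrapper script would call main() at the bottom and be the thing named in
the "shell:" entry. Since the cache lives in a file, it is shared by every hg
process, and the TTL is entirely under the script's control.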

chg does not keep the repo object persistent, and doing that correctly with
a net performance win would be hard, so I don't think it's a concern.

> External templates are very similar and allow adding data to the display side
> (including in hgweb!). Instead of simply getting a list of revisions, it gets a
> list of revision[space]description pairs. For instance, I can currently get a
> list of reviewers on draft changesets thusly:
>  [exttemplate]
>  reviewers = shell:ssh mercurial-cm accept/reviewed
> ..and simply add {reviewers} to my log template. Again, this can be used for
> many things, like displaying number of test failures, deployment status,
> mappings to other SCMs or review tools.
> Caching here is more important as templates get evaluated once per changeset. My
> current hack keeps a global cache, but caching per repo is probably saner.
> Because the data format for external templates is a superset of the one used by
> external revsets, the same source can probably be shared in the cases where it
> makes sense.
> Thoughts?
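
To check my understanding of the two formats: if template sources emit
"revision[space]description" lines and revset sources emit a bare identifier
per line, one parser could serve both. The sketch below is only my reading
of the description above, and parse_source_lines is a hypothetical name.

```python
def parse_source_lines(text):
    """Return (identifier, description) pairs, one per non-empty line.

    A line holding only an identifier yields an empty description, so
    revset-style output is handled by the same code path.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only; the description may contain spaces.
        ident, _sep, description = line.partition(" ")
        pairs.append((ident, description.strip()))
    return pairs
```

A revset consumer would keep only the identifiers from the pairs, while a
template consumer would look up the description for the changeset being
displayed.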

If we support HTTP and respect the Cache-Control header, we may want
something similar for "shell:", perhaps by parsing stderr for special text.
But that sounds unnecessarily complex, at least for the first version.

As above, I think it would be cleaner to let the script do its own caching
and not think about HTTP and TTLs.
