[RFC] external revsets and templates

Thu Jul 14 17:22:14 EDT 2016

On Wed, 2016-07-13 at 21:17 +0100, Jun Wu wrote:
> We have similar URL concepts for [hooks] and "hg import". I wonder if the
> customized protocols like "shell:", "python:", etc. make sense to be shared
> among different places, at least "hg import".
> 
> A practical use-case is Phabricator or Patchwork integration.
> 
> Excerpts from Matt Mackall's message of 2016-07-13 13:37:08 -0500:
> > 
> > These are features we've been playing with a bit in the new review flow:
> > 
> > https://www.mercurial-scm.org/wiki/AcceptProcess#Setting_up_the_revset_helper
> > 
> > These have been extremely useful and I'd like to make them a core feature,
> > so I'd like to further iron out the syntax and feature set before moving
> > forward.
> > 
> > Currently, external revsets work like this:
> > 
> >  [extrevset]
> >  foo = shell:some-shell-command
> > 
> > Then some-shell-command is expected to return a series of Mercurial
> > identifiers (hash, rev, tag, ...), one per line. When "foo" is used in a
> > revset, Mercurial calls the shell command, looks up each result, and
> > returns a corresponding revset.
> > 
> > I think we should also be able to support arguments:
> > 
> >  [extrevset]
> >  cvs = shell:/path/to/lookup-cvs-rev $1
> > 
> > Then we can do:
> > 
> >  $ hg log -r "cvs(123)"
> > 
> > Also, we should allow data sources that are arbitrary URLs:
> > 
> >  [extrevset]
> >  tested = url:http://build.corp.example.com/hg-tested.dat
> >  good = url:http://build.corp.example.com/hg-passed.dat
> >  deployed = url:http://prod.example.com/hg-deployed.cgi
> >  fulltext = url:http://hg-fulltext-db.example.com/query?string=$1
> > 
> > ..which will allow very easy integration with complex production
> > automation. The url: piece might be redundant here? We might also allow
> > calling Python, similar to how we allow it in hooks.
> +1 for removing "url:".
>  
> > 
> > My current implementation has no caching, which is usually fine. My plan
> > is to cache the non-argument version for the repo object lifetime and
> > leave the argument version uncached, but the chg use case might need a
> > better plan.
> If invocation time is negligible, I think the cache can also be implemented
> in the scripts, which would work across hg processes and give better control
> over the TTL.
> 
> chg does not make the repo object persistent and it would be hard to do that
> correctly with positive perf impact. I think it's not a concern.
>  
> > 
> > External templates are very similar and allow adding data to the display
> > side (including in hgweb!). Instead of simply getting a list of revisions,
> > it gets a list of revision[space]description pairs. For instance, I can
> > currently get a list of reviewers on draft changesets thusly:
> > 
> >  [exttemplate]
> >  reviewers = shell:ssh mercurial-cm accept/reviewed
> > 
> > ..and simply add {reviewers} to my log template. Again, this can be used for
> > many things, like displaying number of test failures, deployment status,
> > mappings to other SCMs or review tools.
> > 
> > Caching here is more important as templates get evaluated once per
> > changeset. My current hack keeps a global cache, but caching per repo is
> > probably saner.
> > 
> > Because the data format for external templates is a superset of the one
> > used by external revsets, the same source can probably be shared in the
> > cases where it makes sense.
> > 
> > Thoughts?
> With HTTP considered, if we respect the Cache-Control header, we may want to
> do similar stuff for "shell:", maybe by parsing stderr for special text. But
> that sounds unnecessarily complex at least for the first version.
> 
> Same as the above, I think it will be cleaner if we let the script do its
> own caching and don't think about HTTP and TTLs.

We want to avoid calling out to a data source 50 times to show 50 log entries,
so we need to cache results across more than one template invocation. And since
we don't know whether a template element is needed until we evaluate a
template, and we also don't want to call data sources we don't use, we need to
do some sort of caching-at-first-use.
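
To make "caching-at-first-use" concrete, here is a rough sketch of the idea,
not the actual implementation. The class name, the `fetch` hook and the sample
data are all made up for illustration, and identifiers are kept as raw strings
rather than being resolved to changesets. A source is only called the first
time one of its keywords is evaluated, and its parsed output is reused for
every later changeset:

```python
import subprocess


class LazySourceCache:
    """Fetch each external data source at most once, on first use.

    Hypothetical sketch: `sources` maps a keyword name to a shell command,
    mirroring the [exttemplate] examples in this thread.
    """

    def __init__(self, sources, fetch=None):
        self._sources = sources
        # `fetch` is an injectable hook so the sketch can run without a
        # real external command; by default the command is run in a shell.
        self._fetch = fetch or self._run_shell
        self._data = {}  # name -> {identifier: description}

    @staticmethod
    def _run_shell(command):
        result = subprocess.run(command, shell=True, check=True,
                                capture_output=True, text=True)
        return result.stdout

    @staticmethod
    def _parse(output):
        # One "identifier[space]description" pair per line; a line with
        # only an identifier (the revset format) gets an empty description.
        table = {}
        for line in output.splitlines():
            line = line.strip()
            if not line:
                continue
            ident, _, desc = line.partition(" ")
            table[ident] = desc
        return table

    def lookup(self, name, identifier):
        # The source is called here, on first use, and never again.
        if name not in self._data:
            self._data[name] = self._parse(self._fetch(self._sources[name]))
        return self._data[name].get(identifier, "")


# Stand-in for the real command so the example is self-contained.
calls = []


def fake_fetch(command):
    calls.append(command)
    return "abc123 alice bob\ndef456 carol\n"


cache = LazySourceCache(
    {"reviewers": "ssh mercurial-cm accept/reviewed",
     "deployed": "some-unused-command"},
    fetch=fake_fetch,
)
for node in ["abc123", "def456", "abc123"]:
    print(node, cache.lookup("reviewers", node))
```

The three lookups above trigger a single fetch for "reviewers", and the
"deployed" source is never called because no template asked for it. Hanging
one such cache off the repo object would give the per-repo lifetime discussed
above.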

-- 
Mathematics is the supreme nostalgia of our time.