[PATCH 1 of 5] extdata: add extdatasource reader

Fri Sep 23 13:47:18 EDT 2016

On Thu, 2016-09-22 at 18:20 -0500, Kevin Bullock wrote:
> > 
> > On Sep 22, 2016, at 13:21, Matt Mackall <mpm at selenic.com> wrote:
> > 
> > # HG changeset patch
> > # User Matt Mackall <mpm at selenic.com>
> > # Date 1473794045 18000
> > #      Tue Sep 13 14:14:05 2016 -0500
> > # Node ID 19bf2776dfe39befdc479253e1e7d030b41c08f9
> > # Parent  5271ae66615207f39cc41d78f4541bc6f8ca6ff6
> > extdata: add extdatasource reader
> > 
> > This adds basic support for extdata, a way to add external data
> > sources for revsets and templates. An extdata data source is simply a
> > list of lines of the form:
> > 
> > <revision identifier>[<space><freeform text>]\n
> > 
> > An extdata source is configured thusly:
> > 
> > [extdata]
> > name = <a url or path>
> > 
> > URLs of the form shell: launch shell commands to generate data.
> > 
> > diff -r 5271ae666152 -r 19bf2776dfe3 mercurial/scmutil.py
> > --- a/mercurial/scmutil.py	Wed Sep 21 17:05:27 2016 -0400
> > +++ b/mercurial/scmutil.py	Tue Sep 13 14:14:05 2016 -0500
> > @@ -29,6 +29,7 @@
> >     phases,
> >     revset,
> >     similar,
> > +    url,
> >     util,
> > )
> > 
> > @@ -1418,3 +1419,66 @@
> >             return
> > 
> >         self._queue.put(fh, block=True, timeout=None)
> > +
> > +def extdatasources(repo):
> > +    sources = set()
> > +    for k, v in repo.ui.configitems("extdata"):
> > +        sources.add(k)
> > +    return sources
> > +
> > +def extdatasource(repo, source):
> > +    """gather a map of rev -> value dict from the specified source
> > +
> > +    A source spec is treated as a URL, with a special case shell: type
> > +    for parsing the output from a shell command.
> > +
> > +    The data is parsed as a series of newline-separated records where
> > +    each record is a revision specifier optionally followed by a space
> > +    and a freeform string value. If the revision is known locally, it
> > +    is converted to a rev, otherwise the record is skipped.
> > +
> > +    Note that both key and value are treated as UTF-8 and converted to
> > +    the local encoding. This allows uniformity between local and
> > +    remote data sources.
> That's a bit unfortunate. If we're expecting them to be read as UTF-8, can't
> we just keep them in UTF-8 all the way thru?

We always work internally in the local encoding. Sane local encodings include
utf-8. We normally have two strategies: files owned by users are stored in the
local encoding (hgrc), and files owned by Mercurial are stored in utf-8
(bookmarks). So this note points out that extdata differs from the usual case,
because URLs might (or might not) be remote, shared resources that need to be
in utf-8 for portability.
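
To make the conversion concrete, here is a minimal sketch in standalone
Python 3 of parsing the record format from the commit message and re-encoding
both key and value from UTF-8 into a local encoding. It is an illustration
only, not the code in the patch: parse_extdata and its local_encoding
parameter are hypothetical names, and the sketch does not resolve revision
identifiers against a repository.

```python
def parse_extdata(raw, local_encoding="latin-1"):
    """Parse UTF-8 extdata bytes into a {key: value} dict of local-encoded bytes.

    Hypothetical helper for illustration. Each record is
    "<revision identifier>[<space><freeform text>]" on its own line.
    """
    data = {}
    for line in raw.splitlines():
        if not line:
            continue  # ignore empty lines
        # Split on the first space only; the value may itself contain spaces.
        key, _sep, value = line.partition(b" ")
        # The source is always read as UTF-8, then converted to the local
        # encoding, replacing characters the local encoding cannot represent.
        key = key.decode("utf-8").encode(local_encoding, "replace")
        value = value.decode("utf-8").encode(local_encoding, "replace")
        data[key] = value
    return data
```

With a Latin-1 local encoding, a UTF-8 value such as "café" ends up as the
four bytes b"caf\xe9", and a record with no freeform text maps to an empty
value.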

> > 
> > +    """
> > +
> > +    spec = repo.ui.config("extdata", source)
> > +    if not spec:
> > +        raise util.Abourt(_("unknown extdata source '%s'") % source)
> Typo here. I suppose there's no great way to test against this.

Indeed, both of the current callers prevent this code from being reached.

> > 
> > +
> > +    try:
> > +        # prepare for future expansion
> > +        expand = spec % ()
> > +    except TypeError:
> > +        raise error.Abort(_("extdata doesn't support parameters yet"),
> > +                          hint=_("use double % for escaping"))
> > +
> > +    data = {}
> > +    if spec.startswith("shell:"):
> > +        # external commands should be run relative to the repo root
> > +        cmd = spec[6:]
> > +        cwd = os.getcwd()
> > +        os.chdir(repo.root)
> > +        try:
> > +            src = util.popen(cmd)
> Erm, don't we want to use util.popen2 or one of the other variants that use
> subprocess instead?

The universal advantages of subprocess are overstated. For the simple task of
reading stdout from a subprocess, util.popen is perfectly suited. If it
weren't, we'd fix util.popen.

> ...and maybe handle ENOENT gracefully?

We can't, because cmd is an arbitrary shell expression.
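
A minimal sketch of why, using Python 3's standard os.popen as a stand-in for
util.popen (an assumption; read_shell is a hypothetical helper, not part of
the patch). The process actually launched is the shell, which exists, so a
missing program inside the expression never raises ENOENT in the caller. It
only shows up as a non-zero exit status from the shell.

```python
import os

def read_shell(cmd):
    """Run cmd through the shell; return (stdout text, close status).

    Hypothetical helper for illustration. os.popen's close() returns None
    when the command exited with status 0, and a non-zero status otherwise.
    """
    pipe = os.popen(cmd)
    try:
        output = pipe.read()
    finally:
        status = pipe.close()
    return output, status

# The shell starts fine even though the program does not exist, so no OSError
# is raised here; the failure is only visible in the returned status.
output, status = read_shell("no-such-extdata-command-xyz 2>/dev/null")
```

A caller could check the status, but it cannot distinguish "command not
found" from any other failure of the expression in a portable way.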

> pacem in terris / мир / शान्ति / ‎‫سَلاَم‬ / 平和
> Kevin R. Bullock
-- 
Mathematics is the supreme nostalgia of our time.