[PATCH 2 of 2 V2] filterlang: add a small language to filter files

Yuya Nishihara yuya at tcha.org
Thu Jan 11 10:16:31 EST 2018


On Thu, 11 Jan 2018 00:17:39 -0500, Matt Harbison wrote:
> # HG changeset patch
> # User Matt Harbison <matt_harbison at yahoo.com>
> # Date 1515641014 18000
> #      Wed Jan 10 22:23:34 2018 -0500
> # Node ID 548e748cb3f4eea0aedb36a2b2e9fe3b77ffb263
> # Parent  962b2bdd70d094ce4bf9a8135495788166b04510
> filterlang: add a small language to filter files

> I also made the 'always' token a
> predicate for consistency, and introduced 'never' to improve readability.

Perhaps '**' or '.' could be an "always" symbol given patterns are relative
to the repository root in filterlang.

> Finally, I changed the extension operator from '.' to '*'.  This matches how git
> tracks by extension, but might be slightly confusing here because '**' recurses
> in Mercurial, but '*' usually doesn't.

I prefer '**' or 'relglob:*' for fileset compatibility.

> diff --git a/mercurial/filterlang.py b/mercurial/filterlang.py
> new file mode 100644
> --- /dev/null
> +++ b/mercurial/filterlang.py
> @@ -0,0 +1,73 @@
> +# filterlang.py - a simple language to select files

The module name seems too generic. How about minifileset.py, ufileset.py,
etc., or merging these functions into fileset.py?

> +from . import (
> +    error,
> +    fileset,
> +    util,
> +)

Missing the import of _ from i18n; _() is used further down in this file.

> +def _compile(tree):
> +    op = tree[0]
> +    if op in ('symbol', 'string'):
> +        name = fileset.getstring(tree, 'invalid file pattern')
> +        op = name[0]
> +        if op == '*': # file extension test, ex. "*.tar.gz"
> +            return lambda n, s: n.endswith(name[1:])

Better to make sure there are no metacharacters in name[1:].
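
Something like the following, as a rough sketch only. compile_ext and
_GLOBCHARS are made-up names, the set of reserved characters is my
assumption, and ValueError stands in for error.ParseError so that the
snippet runs on its own:

```python
# Hypothetical helper, not part of the patch: reject extension patterns
# that still contain glob metacharacters after the leading '*'.
_GLOBCHARS = frozenset('*?[]{}\\')  # assumed set of reserved characters

def compile_ext(name):
    """Return a predicate f(n, s) that tests whether n ends with name[1:]."""
    if not name.startswith('*'):
        raise ValueError('not an extension pattern: %s' % name)
    ext = name[1:]
    if any(c in _GLOBCHARS for c in ext):
        # stand-in for error.ParseError in the real module
        raise ValueError('reserved character in pattern: %s' % name)
    return lambda n, s: n.endswith(ext)
```

With this, '*.tar.gz' compiles to a plain suffix test, while '*.t?r' is
rejected instead of being silently treated as a literal suffix.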

> +        elif op == '/': # directory or full path test
> +            p = name[1:].rstrip('/') # prefix
> +            pl = len(p)
> +            f = lambda n, s: n.startswith(p) and (len(n) == pl or n[pl] == '/')
> +            return f

Perhaps this could be spelled 'path:', like the existing pattern kind.
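
A minimal sketch of what I mean, keeping the prefix test from the patch and
only changing the spelling. compile_path is a made-up name, and ValueError
again stands in for error.ParseError:

```python
def compile_path(name):
    """Return a predicate f(n, s) for a hypothetical 'path:DIR' pattern."""
    kind = 'path:'
    if not name.startswith(kind):
        raise ValueError('not a path pattern: %s' % name)
    p = name[len(kind):].rstrip('/')  # prefix without trailing slash
    pl = len(p)
    # Match the path itself or anything below it, but not 'foo/barbaz'
    # for the prefix 'foo/bar'.
    return lambda n, s: n.startswith(p) and (len(n) == pl or n[pl] == '/')
```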

> +        else:
> +            raise error.ParseError('invalid symbol: %s' % name)

This message needs _().

> +    elif op in ['or', 'and']:
> +        funcs = [_compile(t) for t in tree[1:]]
> +        summary = {'or': any, 'and': all}[op]
> +        return lambda n, s: summary(f(n, s) for f in funcs)

IIRC, ('or'/'and', x, y) isn't flattened in fileset.py, so the tree would have
exactly 2 operands.
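
To illustrate that shape, here is a sketch that handles exactly two
operands. It assumes "a or b or c" arrives as nested binary nodes, which I
haven't verified against the parser, and compile_binary is a made-up name:

```python
def compile_binary(op, left, right):
    """Combine two already-compiled predicates f(n, s) for a binary node."""
    if op == 'or':
        return lambda n, s: left(n, s) or right(n, s)
    if op == 'and':
        return lambda n, s: left(n, s) and right(n, s)
    raise ValueError('unsupported operator: %s' % op)

# "a or b or c" as nested binary nodes: ('or', ('or', a, b), c)
is_py = lambda n, s: n.endswith('.py')
is_c = lambda n, s: n.endswith('.c')
is_big = lambda n, s: s > 1024
matcher = compile_binary('or', compile_binary('or', is_py, is_c), is_big)
```

The list comprehension in the patch still works for two operands, so this
is only about not implying an n-ary tree that never shows up.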

> +    elif op == 'not':
> +        return lambda n, s: not _compile(tree[1])(n, s)
> +    elif op == 'group':
> +        return _compile(tree[1])
> +    elif op == 'func':
> +        name = tree[1][1]
> +        symbols = {
> +            'always': lambda n, s: True,
> +            'never': lambda n, s: False,
> +            'size': lambda n, s: fileset.sizematcher(tree[2])(s),
> +        }
> +
> +        if name in symbols:
> +            return symbols[name]
> +
> +        raise error.UnknownIdentifier(name, symbols.keys())
> +    elif op in ('negate', 'minus'):
> +        raise error.ParseError('unsupported operator: %s' % '-')
> +    elif op in ('list'):

('list') is just the string 'list', so this is a substring test. Use
== 'list', in ('list',) or in {'list'}.
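
A quick demonstration of the difference in plain Python; the variable names
are only for illustration:

```python
as_string = ('list')   # parentheses only: this is the str 'list'
as_tuple = ('list',)   # trailing comma: a 1-tuple

# 'in' on a str is a substring test, so unrelated ops can match.
substring_hit = 'is' in as_string
# 'in' on a tuple compares whole elements.
tuple_hit = 'is' in as_tuple
exact_hit = 'list' in as_tuple
```

Here substring_hit is True while tuple_hit is False, which is why the
parenthesized form is a latent bug even though it works for 'list' itself.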

> +        raise error.ParseError(_("can't use a list in this context"),
> +                               hint=_('see hg help "filesets.x or y"'))

