[PATCH] revset: introduce the summary predicate

Gregory Szorc gregory.szorc at gmail.com
Sat Jan 7 13:43:54 EST 2017


On Wed, Jan 4, 2017 at 10:04 AM, Matt Harbison <mharbison72 at gmail.com>
wrote:

> # HG changeset patch
> # User Matt Harbison <matt_harbison at yahoo.com>
> # Date 1483550016 18000
> #      Wed Jan 04 12:13:36 2017 -0500
> # Node ID 76d95ab94b9e206363629059fb7824002e19a9e5
> # Parent  0064a1eb28e246ded9b726c696d048143d1b23f1
> revset: introduce the summary predicate
>
> This is similar to the 'desc()' predicate, however the search is limited
> to the
> first line, which I couldn't figure out how to do with the existing
> functionality.  Unlike 'desc()', this supports regular expressions.  The
> 'matching()' revset already treats the summary line distinctly from the
> full
> description, but that finds similar revisions instead of searching with a
> string.
>
> I'm torn on whether it should support case insensitivity, like 'desc()' and
> 'keyword()'.  It can be handy, but it seems like it would be surprising if
> literals and regex are handled differently, or if the regex was adjusted
> to be
> case insensitive internally.
>
> The use case is to be able to select bug fixes out of the summary line,
> while
> ignoring similar markers within the body of the commit.  I've seen
> previously
> where the bug bot will erroneously update a bug that is mentioned as a
> parenthetical within the detail of the commit message.  To illustrate on
> the
> local Mercurial repo (some old markers have an internal space):
>
>   $ hg log -r "desc(r' (issue')" -T "{rev}\n" | wc -l
>   1672
>   $ hg log -r "desc(r' (issue')" -T "{desc|firstline}\n"  \
>   >            | egrep '\(issue' | wc -l
>   1633
>   $ hg log -r "desc(r' (issue')" -T "{desc|firstline}\n"  \
>   >            | egrep '\(issue[0-9]+\)' | wc -l
>   1560
>   $ hg log -r "summary(r're: \(issue\d+\)')" -T "{rev}\n" | wc -l
>   1560
>
> I'm not sure why the splitlines() logic is that complicated, but I
> copy/pasted
> from templatefilters.firstline() for consistency.
>

Yeah, that is unfortunate. I'm not sure if it matters, but calling
splitlines() may lead to performance issues having to create tons of
throwaway strings as part of the returned list. Normally I'd say to not
worry about it, but revsets run very quickly and it could create enough GC
overhead to cause problems on large repos.

A `desc[:desc.index('\n')]` (inside a try..except ValueError) feels like it
should be faster (and it should just work unless we care about bare CR,
which was used on some older platforms). Even a `desc.split('\n', 1)` feels
better than full splitlines(). The patch shouldn't be blocked on it, but it
would be nice to have measurements for peace of mind.


>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -2197,6 +2197,32 @@ def subrepo(repo, subset, x):
>
>      return subset.filter(matches, condrepr=('<subrepo %r>', pat))
>
> + at predicate('summary(string)', safe=True)
> +def summary(repo, subset, x):
> +    """Changesets with the given string in the summary line.  The match
> is case
> +    sensitive.
> +
> +    If `value` starts with `re:`, the remainder of the value is treated as
> +    a regular expression. To match a value that actually starts with
> `re:`,
> +    use the prefix `literal:`.
> +    """
> +
> +    # i18n: "summary" is a keyword
> +    s = getstring(x, _("summary requires a string"))
> +    kind, value, matcher = _substringmatcher(s)
> +
> +    def _matchvalue(r):
> +        _desc = repo[r].description()
> +        _sum = ''
> +        try:
> +            _sum = _desc.splitlines(True)[0].rstrip('\r\n')
> +        except IndexError:
> +            pass
> +
> +        return matcher(_sum)
> +
> +    return subset.filter(lambda r: _matchvalue(r), condrepr=('<summary
> %r>', s))
> +
>  def _substringmatcher(pattern):
>      kind, pattern, matcher = util.stringmatcher(pattern)
>      if kind == 'literal':
> diff --git a/tests/test-commit.t b/tests/test-commit.t
> --- a/tests/test-commit.t
> +++ b/tests/test-commit.t
> @@ -252,13 +252,50 @@ full log
>    commit-foo-subdir
>
>
> +  $ cat > ../commit-log-test <<EOF
> +  > issue fix (bug1234)
> +  >
> +  > multiline description here
> +  > EOF
> +  $ echo "edited" >> foo/plain-file
> +  $ hg ci -l ../commit-log-test
> +
> +Summary doesn't search past the first line
> +  $ hg log -r "summary('description')"
> +
> +Summary works with regex
> +  $ hg log -r "summary(r're:\(bug\d+\)')"
> +  changeset:   2:5364ddbe8265
> +  tag:         tip
> +  user:        test
> +  date:        Thu Jan 01 00:00:00 1970 +0000
> +  summary:     issue fix (bug1234)
> +
> +
> +Summary works with partial matches
> +  $ hg log -r "summary('foo')"
> +  changeset:   0:65d4e9386227
> +  user:        test
> +  date:        Thu Jan 01 00:00:00 1970 +0000
> +  summary:     commit-foo-subdir
> +
> +  changeset:   1:95b38e3a5b2e
> +  user:        test
> +  date:        Thu Jan 01 00:00:00 1970 +0000
> +  summary:     commit-foo-dot
> +
>
>  subdir log
>
>    $ cd foo
>    $ hg log .
> +  changeset:   2:5364ddbe8265
> +  tag:         tip
> +  user:        test
> +  date:        Thu Jan 01 00:00:00 1970 +0000
> +  summary:     issue fix (bug1234)
> +
>    changeset:   1:95b38e3a5b2e
> -  tag:         tip
>    user:        test
>    date:        Thu Jan 01 00:00:00 1970 +0000
>    summary:     commit-foo-dot
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20170107/5290b447/attachment.html>


More information about the Mercurial-devel mailing list