[PATCH 4 of 9] grep: add -e option for specifying multiple patterns

Pierre-Yves David pierre-yves.david at logilab.fr
Tue Oct 16 06:37:01 CDT 2012


On Sun, Oct 14, 2012 at 10:54:21PM +0200, Idan Kamara wrote:
> # HG changeset patch
> # User Idan Kamara <idankk86 at gmail.com>
> # Date 1350073273 -7200
> # Node ID 06edbeb509881597122b46de831da4332076c4a1
> # Parent  806bab6620c0308b61a4cf9c0e95927650ccb513
> grep: add -e option for specifying multiple patterns

Please highlight that this option is the same than GNU grep one.

> 
> diff --git a/mercurial/commands.py b/mercurial/commands.py
> --- a/mercurial/commands.py
> +++ b/mercurial/commands.py
> @@ -2885,12 +2885,15 @@
>       _('print only filenames and revisions that match')),
>      ('n', 'line-number', None, _('print matching line numbers')),
>      ('', 'no-filename', None, _('suppress filename from output')),
> +    ('e', 'regexp', [],
> +     _('use this pattern to find matches (must be used if a pattern'
> +       ' starts with -), multiple patterns are or-ed'), _('PATTERN')),

This help text need to be improved. Not sure of how yet.

>      ('r', 'rev', [],
>       _('only search files changed within revision range'), _('REV')),
>      ('u', 'user', None, _('list the author (long with -v)')),
>      ('d', 'date', None, _('list the date (short with -q)')),
>      ] + walkopts,
> -    _('[OPTION]... PATTERN [FILE]...'))
> +    _('[OPTION]... [-e] PATTERN... [FILE]...'))
>  def grep(ui, repo, pattern, *pats, **opts):
>      """search for a pattern in specified files and revisions
>  
> @@ -2913,7 +2916,13 @@
>      if opts.get('ignore_case'):
>          reflags |= re.I
>      try:
> -        regexp = re.compile(pattern, reflags)
> +        regexps = []
> +        if opts.get('regexp'):
> +            for p in opts.get('regexp'):
> +                regexps.append(re.compile(p, reflags))
> +            pats = (pattern,) + pats
> +        else:
> +            regexps.append(re.compile(pattern, reflags))

This hunk confused me a bit until I realise that if -e is used we "pattern"
must be seen as a filename.

Adding a small comment to highlight this would help imo.

Moreover what happen if "this pattern" is not specified? You do not seem to
tests this case at all.


>      except re.error, inst:
>          ui.warn(_("grep: invalid match pattern: %s\n") % inst)
>          return 1
> @@ -2927,9 +2936,18 @@
>          begin = 0
>          linenum = 0
>          while True:
> -            match = regexp.search(body, begin)
> -            if not match:
> +            # Search all regexps and pick the one that starts earlier in the
> +            # file. We could probably be more efficient here by only searching
> +            # the regexp that was picked in the previous iteration and those
> +            # whose span intersected with it.

I was confused by this comment at first reading. You actually matches again
*all* regexps but only the first match will be highlighted by color. Right?

Please be more specific about what you mean by "pick".

None of my interpretation seems to be consistent with GNU grep -e. Can you
explain why you diverged from it ?

> +            matches = []
> +            for regexp in regexps:
> +                match = regexp.search(body, begin)
> +                if match:
> +                    matches.append(match)
> +            if not matches:
>                  break
> +            match = min(matches, key=lambda m: m.span()[0])
>              mstart, mend = match.span()
>              linenum += body.count('\n', begin, mstart) + 1
>              lstart = body.rfind('\n', begin, mstart) + 1 or begin

-- 
Pierre-Yves David

http://www.logilab.fr/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20121016/4a9ac2c6/attachment.pgp>


More information about the Mercurial-devel mailing list