[PATCH 1 of 2] grep: prevent "hg grep" from sending binary data to stdout when it's a tty

Craig Leres leres at ee.lbl.gov
Sun Apr 10 12:10:27 CDT 2011


# HG changeset patch
# User Craig Leres <leres at ee.lbl.gov>
# Date 1301365553 25200
# Node ID fd97715da1bbcc27d5781420649e0112d37b0743
# Parent  b4f55a927ea02d5cd70c025853aea12e35a3ee2f
grep: prevent "hg grep" from sending binary data to stdout when it's a tty

The old hg grep output behaviour is kept when stdout is not a tty
or when the (new) -a/--text flag is used. Warnings are sent to
stderr, unlike GNU grep, which uses stdout. Since stdout is
(probably) a tty when a warning is issued, sending warnings to
stderr lets the user redirect them somewhere other than the tty.
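
The decision described above can be sketched roughly as follows. This is
only an illustration in Python 3, not the code in the patch: the function
name report_binary_match and its parameters are hypothetical, and it
checks isatty() directly where the patch relies on ui.formatted().

```python
import sys


def report_binary_match(fn, rev, text_flag=False, out=None, err=None):
    """Warn on err instead of writing binary matches to a tty.

    Returns True when the match text should be withheld from out,
    False when the old output behaviour should be kept.
    """
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    # Old behaviour: stdout is not a tty, or -a/--text was given.
    if text_flag or not out.isatty():
        return False
    # The warning goes to the error stream so it can be redirected
    # separately from the tty.
    err.write("Binary file %s rev %d matches\n" % (fn, rev))
    return True
```

With this shape, redirecting stderr (for example 2>/dev/null) would hide
the warning while leaving the remaining output on the tty.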

In the case where hg grep is used like GNU grep (searching only the
first matching revision), one warning is displayed per binary file
that matches the search pattern.

The other case is where hg grep is used to search multiple revisions
with --all or --follow. In this case one warning is output per
file+rev. The per-match output lines are the same as before, except
that for a matching line containing binary data the last part,
consisting of the separator (usually ':') and the matching text, is
suppressed.
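
A minimal sketch of the per-line suppression, assuming binary data is
detected by looking for a NUL byte (the heuristic the new test exercises).
The names looks_binary and format_match are hypothetical, and the code is
Python 3 working on bytes rather than the commands.py code in the diff.

```python
def looks_binary(data):
    """Assumed heuristic: data is binary if it contains a NUL byte."""
    return bool(data) and b"\0" in data


def format_match(fn, rev, change, line, formatted, multirev, sep=b":"):
    """Build one output line for a match.

    fn, change and line are bytes; rev is an int. When output is
    formatted (a tty) and multiple revisions are being searched, the
    final separator and the matching text are left out for binary lines.
    """
    cols = [fn, str(rev).encode("ascii")]
    if change:
        cols.append(change)
    out = sep.join(cols)
    suppress = formatted and multirev
    if not (suppress and looks_binary(line)):
        out += sep + line
    return out
```

For example, format_match(b"port", 5, b"+", b"\0portable", True, True)
yields a line that ends after the change marker, while a text line keeps
its separator and matching text.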

Examples:

    # plain grep
    $ hg grep -r5 port port
    Binary file port rev 5 matches

    # all
    $ hg grep --all port port
    port:6:-
    Binary file port rev 5 matches
    port:5:+
    port:4:-:import/export
    port:3:+:import/export
    port:2:-:import
    port:2:-:export
    port:2:+:export
    port:2:+:vaportight
    port:2:+:import/export
    port:1:+:export
    port:0:+:import
* * *
tests: create a revision of the test file containing a NUL byte to test
the binary grep mods

diff -r b4f55a927ea0 -r fd97715da1bb mercurial/commands.py
--- a/mercurial/commands.py	Sun Mar 20 19:17:54 2011 -0700
+++ b/mercurial/commands.py	Mon Mar 28 19:25:53 2011 -0700
@@ -1778,12 +1778,15 @@
 
     matches = {}
     copies = {}
+    binaries = {}
     def grepbody(fn, rev, body):
         matches[rev].setdefault(fn, [])
         m = matches[rev][fn]
         for lnum, cstart, cend, line in matchlines(body):
             s = linestate(line, lnum, cstart, cend)
             m.append(s)
+        if m and ui.formatted() and (fn, rev) not in binaries:
+            binaries[(fn, rev)] = util.binary(body)
 
     def difflinestates(a, b):
         sm = difflib.SequenceMatcher(None, a, b)
@@ -1809,7 +1812,17 @@
             iter = difflinestates(pstates, states)
         else:
             iter = [('', l) for l in states]
+        binary = binaries.get((fn, rev), False)
+        # Potentially suppress per line output
+        suppress = ui.formatted() and (opts.get('all') or follow)
         for change, l in iter:
+            if ui.formatted() and binary:
+                ui.warn(_('Binary file %s rev %d matches\n') % (fn, rev))
+                # Warn at most once per file+rev
+                binary = False
+                # Don't display anything else if not looking at multiple revs
+                if not (opts.get('all') or follow):
+                    return True
             cols = [fn, str(rev)]
             before, match, after = None, None, None
             if opts.get('line_number'):
@@ -1830,7 +1843,9 @@
                 match = l.line[l.colstart:l.colend]
                 after = l.line[l.colend:]
             ui.write(sep.join(cols))
-            if before is not None:
+            # Omit final sep and match text when it's binary
+            if (before is not None and not (suppress and
+                util.binary(''.join((before, match, after))))):
                 ui.write(sep + before)
                 ui.write(match, label='grep.match')
                 ui.write(after)
diff -r b4f55a927ea0 -r fd97715da1bb tests/test-grep.t
--- a/tests/test-grep.t	Sun Mar 20 19:17:54 2011 -0700
+++ b/tests/test-grep.t	Mon Mar 28 19:25:53 2011 -0700
@@ -15,6 +15,11 @@
   $ head -n 3 port > port1
   $ mv port1 port
   $ hg commit -m 4 -u spam -d '4 0'
+  $ cp port port.save
+  $ printf '\0portable\n' >> port
+  $ hg commit -m 5 -u beans -d '5 0'
+  $ mv port.save port
+  $ hg commit -m 6 -u beans -d '6 0'
 
 pattern error
 
@@ -25,21 +30,23 @@
 simple
 
   $ hg grep port port
-  port:4:export
-  port:4:vaportight
-  port:4:import/export
+  port:6:export
+  port:6:vaportight
+  port:6:import/export
 
 simple with color
 
   $ hg --config extensions.color= grep --config color.mode=ansi \
   >     --color=always port port
-  port:4:ex\x1b[0;31;1mport\x1b[0m (esc)
-  port:4:va\x1b[0;31;1mport\x1b[0might (esc)
-  port:4:im\x1b[0;31;1mport\x1b[0m/export (esc)
+  port:6:ex\x1b[0;31;1mport\x1b[0m (esc)
+  port:6:va\x1b[0;31;1mport\x1b[0might (esc)
+  port:6:im\x1b[0;31;1mport\x1b[0m/export (esc)
 
 all
 
   $ hg grep --traceback --all -nu port port
+  port:6:4:-:beans:\x00portable (esc)
+  port:5:4:+:beans:\x00portable (esc)
   port:4:4:-:spam:import/export
   port:3:4:+:eggs:import/export
   port:2:1:-:spam:import
@@ -53,11 +60,32 @@
 other
 
   $ hg grep import port
-  port:4:import/export
+  port:6:import/export
 
   $ hg cp port port2
   $ hg commit -m 4 -u spam -d '5 0'
 
+binary and formatted
+
+  $ hg grep -r5 port port --config ui.formatted=true
+  Binary file port rev 5 matches
+
+binary and formatted and all revs
+
+  $ hg grep --all port port --config ui.formatted=true
+  port:6:-
+  Binary file port rev 5 matches
+  port:5:+
+  port:4:-:import/export
+  port:3:+:import/export
+  port:2:-:import
+  port:2:-:export
+  port:2:+:export
+  port:2:+:vaportight
+  port:2:+:import/export
+  port:1:+:export
+  port:0:+:import
+
 follow
 
   $ hg grep --traceback -f 'import$' port2
@@ -65,7 +93,9 @@
   $ echo deport >> port2
   $ hg commit -m 5 -u eggs -d '6 0'
   $ hg grep -f --all -nu port port2
-  port2:6:4:+:eggs:deport
+  port2:8:4:+:eggs:deport
+  port:6:4:-:beans:\x00portable (esc)
+  port:5:4:+:beans:\x00portable (esc)
   port:4:4:-:spam:import/export
   port:3:4:+:eggs:import/export
   port:2:1:-:spam:import

