[PATCH] revsetlang: do not pass in non-bytes to parse()

Yuya Nishihara yuya at tcha.org
Tue Apr 17 13:21:19 UTC 2018


# HG changeset patch
# User Yuya Nishihara <yuya at tcha.org>
# Date 1523969998 -32400
#      Tue Apr 17 21:59:58 2018 +0900
# Node ID 235258eb2600f6d41ce9bc5c8ab5a2601b19dce8
# Parent  925707ac2855944b0607bec68986a273fb5321ae
revsetlang: do not pass in non-bytes to parse()

Since parse() isn't a simple function, we shouldn't expect it to raise
TypeError or ValueError for invalid inputs. Previously, TypeError was
raised at 'if pos != len(spec)', which was quite late to report an
error.

This patch also makes tokenize() detect an invalid object before
converting it to py3-safe bytes.

Spotted while adding the 'revset(...)' hack to _parsewith().

diff --git a/mercurial/revsetlang.py b/mercurial/revsetlang.py
--- a/mercurial/revsetlang.py
+++ b/mercurial/revsetlang.py
@@ -89,6 +89,9 @@ def tokenize(program, lookup=None, symin
     [('symbol', '@', 0), ('::', None, 1), ('end', None, 3)]
 
     '''
+    if not isinstance(program, bytes):
+        raise error.ProgrammingError('revset statement must be bytes, got %r'
+                                     % program)
     program = pycompat.bytestr(program)
     if syminitletters is None:
         syminitletters = _syminitletters
@@ -581,6 +584,8 @@ def _formatargtype(c, arg):
     elif c == 's':
         return _quote(arg)
     elif c == 'r':
+        if not isinstance(arg, bytes):
+            raise TypeError
         parse(arg) # make sure syntax errors are confined
         return '(%s)' % arg
     elif c == 'n':
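
The two checks added by the hunks above follow the same pattern: reject a non-bytes value at the entry point instead of letting the parser fail somewhere later. Below is a minimal standalone sketch of that pattern, not Mercurial's actual code. ProgrammingError here is a local stand-in class, and check_program() and format_revset_arg() are hypothetical helper names used only for illustration.

```python
class ProgrammingError(Exception):
    """Local stand-in for an internal 'caller bug' exception (assumption)."""


def check_program(program):
    """Return program unchanged if it is bytes, else fail immediately."""
    # Check the type before any conversion so the error points at the
    # caller's mistake rather than at a later length comparison.
    if not isinstance(program, bytes):
        # The value is wrapped in a 1-tuple so that a tuple argument is
        # formatted as a whole instead of being unpacked by '%'.
        raise ProgrammingError(
            'revset statement must be bytes, got %r' % (program,))
    return program


def format_revset_arg(arg):
    """Wrap a bytes revset fragment in parentheses (hypothetical helper)."""
    # Raise TypeError up front instead of relying on a parser to raise
    # it for a non-bytes argument.
    if not isinstance(arg, bytes):
        raise TypeError('expected bytes, got %s' % type(arg).__name__)
    return b'(%s)' % arg
```

With this sketch, check_program(b'@::') returns the input unchanged, while check_program(u'@::') raises ProgrammingError before any tokenizing would start.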

