[PATCH STABLE] templater: parse \"...\" as string for 2.9.2-3.4 compatibility (issue4733)

Yuya Nishihara yuya at tcha.org
Sat Jun 27 08:35:09 CDT 2015


# HG changeset patch
# User Yuya Nishihara <yuya at tcha.org>
# Date 1435237658 -32400
#      Thu Jun 25 22:07:38 2015 +0900
# Branch stable
# Node ID 72f4416c0f70b5671cc009c9819f2e0ad2902a4a
# Parent  6047b60cdd0984b60191bdb3b272d87c2691ffe4
templater: parse \"...\" as string for 2.9.2-3.4 compatibility (issue4733)

As of Mercurial 3.4, there were several syntax rules to process nested
template strings. Unfortunately, they were inconsistent and conflicted
each other.

 a. buildmap() rule
    - template string is _parsed_ as string, and parsed as template
    - <\"> is not allowed in nested template:
      {xs % "{f(\"{x}\")}"} -> parse error
    - template escaping <\{> is handled consistently:
      {xs % "\{x}"} -> escaped
 b. _evalifliteral() rule
    - template string is _interpreted_ as string, and parsed as template
      in crafted environment to avoid double processing of escape sequences
    - <\"> is allowed in nested template:
      {if(x, "{f(\"{x}\")}")}
    - <\{> and escape sequences in string literal in nested template are not
      handled well
 c. pad() rule
    - template string is first interpreted as string, and parsed as template,
      which means escape sequences are processed twice
    - <\"> is allowed in nested template:
      {pad("{xs % \"{x}\"}', 10)}

Because of the issue of template escaping, issue4714, 7298da81f5a9 (in stable)
unified the rule (b) to (a). Then, 576d6c74784b (in default) unified the rule
(c) to (b) = (a). But they disabled the following syntax that was somewhat
considered valid.

  {if(rev, "{if(rev, \"{rev}\")}")}
  {pad("{files % \"{file}\"}", 10)}

So, this patch introduces \"...\" literal to work around the escaped-quoted
nested template strings. Because this parsing rule exists only for the backward
compatibility, it is designed to copy the behavior of old _evalifliteral() as
possible.

Future patches will introduce a better parsing rule similar to a command
substitution of POSIX shells or a string interpolation of Ruby, where extra
escapes won't be necessary at all.

  {pad("{files % "{file}"}", 10)}
        ~~~~~~~~~~~~~~~~~~
        parsed as a template, not as a string

Because <\> character wasn't allowed in a template fragment, this patch won't
introduce more breakages. But the syntax of nested templates are interpreted
differently by people, there might be unknown issues. So if we want, we could
instead remove db7463aa080f, 890845af1ac2 and 7298da81f5a9 from the stable
branch as the bug fixed by these patches existed for longer periods.

554d6fcc3c8, "strip single backslash before quotation mark in quoted template",
should be superseded by this patch. I'll remove it later.

diff --git a/mercurial/templater.py b/mercurial/templater.py
--- a/mercurial/templater.py
+++ b/mercurial/templater.py
@@ -59,6 +59,41 @@ def tokenizer(data):
                 pos += 1
             else:
                 raise error.ParseError(_("unterminated string"), s)
+        elif (c == '\\' and program[pos:pos + 2] in (r"\'", r'\"')
+              or c == 'r' and program[pos:pos + 3] in (r"r\'", r'r\"')):
+            # handle escaped quoted strings for compatibility with 2.9.2-3.4,
+            # where some of nested templates were preprocessed as strings and
+            # then compiled. therefore, \"...\" was allowed. (issue4733)
+            #
+            # processing flow of _evalifliteral() at 5ab28a2e9962:
+            # outer template string    -> stringify()  -> compiletemplate()
+            # ------------------------    ------------    ------------------
+            # {f("\\\\ {g(\"\\\"\")}"}    \\ {g("\"")}    [r'\\', {g("\"")}]
+            #             ~~~~~~~~
+            #             escaped quoted string
+            if c == 'r':
+                pos += 1
+                token = 'rawstring'
+            else:
+                token = 'string'
+            quote = program[pos:pos + 2]
+            s = pos = pos + 2
+            while pos < end: # find closing escaped quote
+                if program.startswith('\\\\\\', pos, end):
+                    pos += 4 # skip over double escaped characters
+                    continue
+                if program.startswith(quote, pos, end):
+                    try:
+                        # interpret as if it were a part of an outer string
+                        data = program[s:pos].decode('string-escape')
+                    except ValueError: # unbalanced escapes
+                        raise error.ParseError(_("syntax error"), s)
+                    yield (token, data, s)
+                    pos += 1
+                    break
+                pos += 1
+            else:
+                raise error.ParseError(_("unterminated string"), s)
         elif c.isalnum() or c in '_':
             s = pos
             pos += 1
diff --git a/tests/test-command-template.t b/tests/test-command-template.t
--- a/tests/test-command-template.t
+++ b/tests/test-command-template.t
@@ -2291,6 +2291,50 @@ Test string escaping of quotes:
   $ hg log -Ra -r0 -T '{r"\\\""}\n'
   \\\"
 
+Test compatibility with 2.9.2-3.4 of escaped quoted strings in nested
+_evalifliteral() templates (issue4733):
+
+  $ cd latesttag
+
+  $ hg log -r 2 -T '{if(rev, "\"{rev}")}\n'
+  "2
+  $ hg log -r 2 -T '{if(rev, "{if(rev, \"\\\"{rev}\")}")}\n'
+  "2
+  $ hg log -r 2 -T '{if(rev, "{if(rev, \"{if(rev, \\\"\\\\\\\"{rev}\\\")}\")}")}\n'
+  "2
+
+  $ hg log -r 2 -T '{if(rev, "\\\"")}\n'
+  \"
+  $ hg log -r 2 -T '{if(rev, "{if(rev, \"\\\\\\\"\")}")}\n'
+  \"
+  $ hg log -r 2 -T '{if(rev, "{if(rev, \"{if(rev, \\\"\\\\\\\\\\\\\\\"\\\")}\")}")}\n'
+  \"
+
+  $ hg log -r 2 -T '{if(rev, r"\\\"")}\n'
+  \\\"
+  $ hg log -r 2 -T '{if(rev, "{if(rev, r\"\\\\\\\"\")}")}\n'
+  \\\"
+  $ hg log -r 2 -T '{if(rev, "{if(rev, \"{if(rev, r\\\"\\\\\\\\\\\\\\\"\\\")}\")}")}\n'
+  \\\"
+
+escaped single quotes and errors:
+
+  $ hg log -r 2 -T "{if(rev, '{if(rev, \'foo\')}')}"'\n'
+  foo
+  $ hg log -r 2 -T "{if(rev, '{if(rev, r\'foo\')}')}"'\n'
+  foo
+  $ hg log -r 2 -T '{if(rev, "{if(rev, \")}")}\n'
+  hg: parse error at 11: unterminated string
+  [255]
+  $ hg log -r 2 -T '{if(rev, \"\\"")}\n'
+  hg: parse error at 11: syntax error
+  [255]
+  $ hg log -r 2 -T '{if(rev, r\"\\"")}\n'
+  hg: parse error at 12: syntax error
+  [255]
+
+  $ cd ..
+
 Test leading backslashes:
 
   $ cd latesttag


More information about the Mercurial-devel mailing list