[PATCH RFC] tests: require regexes in unified tests to be marked with " (re)"

Brodie Rao brodie at bitheap.org
Tue Sep 21 18:11:31 CDT 2010


On Tue, Sep 21, 2010 at 5:52 PM, Adrian Buehlmann <adrian at cadifra.com> wrote:
> On 21.09.2010 20:36, Brodie Rao wrote:
>> # HG changeset patch
>> # User Brodie Rao <brodie at bitheap.org>
>> # Date 1285094123 18000
>> # Node ID 2ba3c74f410285bb2da18f6f5fa6ea3b863b8c94
>> # Parent  dbca8f134f00fb10f7154b2f6cd9b8188095b1de
>> tests: require regexes in unified tests to be marked with " (re)"
>>
>> Consider this test:
>>
>>   $ hg glog --template '{rev}:{node|short} "{desc}"\n'
>>   @  2:20c4f79fd7ac "3"
>>   |
>>   | o  1:38f24201dcab "2"
>>   |/
>>   o  0:2a18120dc1c9 "1"
>>
>> Because each line beginning with "|" can be compiled as a regular
>> expression (equivalent to ".*|"), they will match any output.
>>
>> By requiring regular expressions to be marked with " (re)", we reduce
>> the potential for false negatives.
>>
>> To match a line ending in " (re)", the line can be suffixed with "
>> (re) (plain)". " (plain)" has no effect anywhere else.
>
> I don't really like that. You justify this (re) syntax with an example
> (graphlog output) that won't work anymore if combined with regular
> expressions (unless you put a \ before every glog |).

The point of the new syntax is to make it possible to write tests like
those without having to do any escaping, and to prevent tests from
silently passing when they're really failing. If you're intentionally
writing regular expressions in graphlog output, I think having to
escape pipes is reasonable.

If you don't want to escape everything and you just want very simple
pattern matching, you could extend the syntax to add a (glob) marker.
It could match the line using something like
re.match(fnmatch.translate([expected line]), [actual line]).

Another option would be to mark up blocks of output instead of
individual lines with a syntax like this:

 $ echo '1\n...\n2' # compare with regex
 \d
 \.\.\.
 \d

That's easier to read, but it's not as flexible.

> Can't we just solve the graphlog | problem instead?
>
> The only problematic character is '|' and near as I can tell it hasn't
> been used anywhere in it's regex meaning in the tests so far.
>
> Can't we simply declare that if a line starts with a '|', then '|' won't
> be interpreted as regex '|' anywhere for the rest of that line?

You don't think this would be a problem with other special characters
or lines containing | that don't start with |? I admit that the pipe
example is probably the only one that would be overly problematic, but
I don't like the idea of having the ambiguity there.

I could just be making a mountain out of a molehill, though.

> (two other inline comments follow below)

(re) isn't needed in those cases. I modified run-tests.py to
automatically insert the markup for me, and those happened to be
regular expressions already. I'll make them plain strings.


More information about the Mercurial-devel mailing list