Unified tests and graphlog output

Brodie Rao brodie at bitheap.org
Mon Sep 20 16:45:40 CDT 2010


On Mon, Sep 20, 2010 at 3:04 PM, Peter Arrenbrecht
<peter.arrenbrecht at gmail.com> wrote:
> On Mon, Sep 20, 2010 at 8:18 PM, Brodie Rao <brodie at bitheap.org> wrote:
>> On Mon, Sep 20, 2010 at 12:48 PM, Adrian Buehlmann <adrian at cadifra.com> wrote:
>>> I found a problem with unified tests containing glog calls (I noticed
>>> this while trying to unify tests-rebase*).
>>>
>>> Assume having the following unified test file (test-test.t):
>>>
>>>
>>>  $ echo "[extensions]" >> $HGRCPATH
>>>  $ echo "graphlog=" >> $HGRCPATH
>>>  $ hg init
>>>  $ echo "foo" > a
>>>  $ hg ci -A -m 1
>>>  adding a
>>>  $ echo "bar" >> a
>>>  $ hg ci -m 2
>>>  $ hg up 0
>>>  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>>  $ echo "bla" >> a
>>>  $ hg ci -m 3
>>>  created new head
>>>
>>>  $ hg glog --template '{rev}:{node|short} "{desc}"\n'
>>>  @  2:20c4f79fd7ac "3"
>>>  |
>>>  | o  1:38f24201dcab "2"
>>>  |/
>>>  o  0:2a18120dc1c9 "1"
>>>
>>>
>>> This all looks shiny and nice.
>>>
>>> But...
>>>
>>> Assume you edit the graphlog output part of that file to:
>>>
>>>
>>>  $ hg glog --template '{rev}:{node|short} "{desc}"\n'
>>>  @  2:20c4f79fd7ac "3"
>>>  |
>>>  | o  1:38f24201dcab "2" EVIL HACKERS WERE HERE
>>>  |/
>>>  o  0:2a18120dc1c9 "1"
>>>
>>>
>>> Then running this test in a terminal goes like this:
>>>
>>>
>>>  $ python run-tests.py test-test.t -i
>>>  .
>>>  # Ran 1 tests, 0 skipped, 0 failed.
>>>
>>>
>>> Huh?
>>>
>>> The test doesn't detect that the command output does not match the
>>> expected output?!
>>>
>>> The symptom is that iff there is a '|' character at the beginning of a
>>> line, the rest of the line is irrelevant.
>>>
>>> Yikes.
>>>
>>> I haven't looked at the code of the test framework, but my first guess
>>> is that we are hit here by the regular expression matching feature of
>>> the unified tests.
>>>
>>> Even more evil: if -- alternatively -- you instead change the graphlog
>>> part to:
>>>
>>>
>>>  $ hg glog --template 'HACKED {rev}:{node|short} "{desc}"\n'
>>>  @  2:20c4f79fd7ac "3"
>>>  |
>>>  | o  1:38f24201dcab "2"
>>>  |/
>>>  o  0:2a18120dc1c9 "1"
>>>
>>>
>>> (note the "HACKED" inserted into the template option)
>>>
>>> and then accept the changed test file (test-test.t.err) after running
>>> that test, you will end having:
>>>
>>>
>>>  $ hg glog --template 'HACKED {rev}:{node|short} "{desc}"\n'
>>>  @  HACKED 2:20c4f79fd7ac "3"
>>>  |
>>>  | o  1:38f24201dcab "2"
>>>  |/
>>>  o  HACKED 0:2a18120dc1c9 "1"
>>>
>>>
>>> in your "test-test.t" file (note that "HACKED" has -not- been inserted
>>> for node 1, because it is behind a '|' !).
>>>
>>> Can this be improved, so that unified tests are more graphlog-proof?
>>
>> You're right:
>>
>>  >>> import re
>>  >>> re.match('|foo', 'bar')
>>  <_sre.SRE_Match object at 0x1007f2ed0>
>>
>> This is a general problem of falling back to regular expression
>> matching. It works both ways too; you could have a regular expression
>> that literally matches the output before being compiled/processed.
>>
>> You could do one of the following:
>>
>> 1. Escape your graphlog output.
>>
>> 2. Require regular expressions to be specially marked. You'd have to
>> update all the existing unified tests, but there's no chance you'd
>> accidentally run into this issue again.
>>
>> #2 seems like the better long-term choice to me.
>
> Agreed. Of course we'd have to escape output that conflicts with this
> markup. I myself would prefer a trailing mark. So for instance:
>
>  $ hg rebase
>  rebasing...
>  saved backup file to .* (re)
>  $ echo '(re)'
>  (re) (plain)
>
> The idea being that if you find a closing ) on the line, you look for
> the closest ( to its left and interpret the inner part as a directive.

I like this. You can make it even simpler by just looking for "
(plain)\n" or " (re)\n" at the end of each line.

Another option might be to prefix regexes with " ^" instead of two
spaces. I can't decide if that's easier or harder to read, though.

I managed to write a patch in pretty short order that implements (re)
and (plain), using the test runner itself to insert the (re) marks for
me. As of f85338f509e4, there are 195 regexes in the test suite.


More information about the Mercurial-devel mailing list