[PATCH 1 of 1] tests: asciify output of tsttest

Sun Sep 19 08:49:52 CDT 2010

Mads Kiilerich wrote:
>   Yuya Nishihara wrote, On 09/19/2010 08:53 AM:
> > # HG changeset patch
> > # User Yuya Nishihara<yuya at tcha.org>
> > # Date 1284876231 -32400
> > # Node ID 617a551466acb0fc5f04ef8973d779b4f8939785
> > # Parent  d643ae555a4de42ca2798a43a48a7e7c2c729d0a
> > tests: asciify output of tsttest
> >
> > It encodes non-ascii and most control characters of outputs to python-style
> > \-escapes. It aims to avoid trouble with outputs for non-ascii, color
> > and progress tests.
> >
> > This patch is based on Mads Kiilerich's.
> 
> Thanks for picking it up. I haven't had the time to finish it - but I 
> always have time to give comments ;-)
> 
> > diff --git a/tests/run-tests.py b/tests/run-tests.py
> > --- a/tests/run-tests.py
> > +++ b/tests/run-tests.py
> > @@ -460,6 +460,11 @@ def battest(test, options):
> >       vlog("# Running", cmd)
> >       return run(cmd, options)
> >
> > +def asciify(line):
> > +    """Escape non-ascii and most control characters"""
> > +    return re.sub(r'[^\x20-\x7f\n\t]',
> 
> Regexp compilation is a relatively expensive process, so I think that a 
> general rule of thumb is that regexps used in loops always should be 
> compiled outside the loop.

Hmm, but re.py looks implementing compiled cache of max 100 patterns.

> My intent was to make the encoding as un-intrusive as possible and thus 
> never escape ' or ".
> 
> I didn't know that encoding. It seems like that encoding is defined as
>      eval("'" + x.encode('string-escape') + "'") == x
> and it will thus always escape '.
> 
> An alternative is to use repr(x)[1:-1] which only will escape ' if the 
> string also contains ".
> 
> It probably doesn't matter.
> 
> BUT if you use string-escape then there is probably no reason to use 
> regexps first and it is probably faster to always just string-escape. 
> And then there is no need to define an asciify function at all.

It'll be simple if we can just write s.encode('string-escape') or
repr(s)[1:-1] without re.sub(), but it's necessary to avoid unwanted
escapes like \t, \n, \'.

> >   def tsttest(test, options):
> >       t = open(test)
> >       out = []
> > @@ -501,6 +506,8 @@ def tsttest(test, options):
> >       finally:
> >           os.remove(name)
> >
> > +    output = [asciify(l) for l in output]
> 
> Hmm. An extra iteration through the list and changing the "type" of 
> output from "list of raw output lines" to "list of asciified output 
> lines". I would prefer to asciify while we loop through the list.

I thought it was readable than l = asciify(l) at if-else clause inside loop,
but you're right, it shouldn't change the "type" of output.

> I think that with this patch in place tests/printrepr.py should go away. 
> I'm not sure if it should be done in the same patch or in another.

After all test cases are migrated to tsttest?
Currently shell-based tests use printrepr.py.

Yuya,