Test timeouts

Greg Ward greg at gerg.ca
Fri May 29 08:47:27 CDT 2009


I've just noticed that run-tests.py implements timeouts at two levels,
both by arming a SIGALRM.  First, runtests() imposes a timeout on
the entire set of tests (or it tries to).  Then, runone() imposes a
timeout on each individual test.  If I recall Unix Signal Handling
101, a process has only one alarm timer, so the second alarm replaces
the first, and the SIGALRM set up by runtests() is pointless and
mostly harmless.
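
The replacement behaviour can be checked from Python itself, since
signal.alarm() returns the seconds left on whatever alarm was pending
before the call.  A minimal sketch, Unix only; the function name
rearm() is made up for illustration and is not part of run-tests.py:

```python
import signal

def rearm(first, second):
    """Arm two alarms back to back and report what each call displaced."""
    # Ignore SIGALRM during the experiment so a stray alarm cannot kill us.
    old_handler = signal.signal(signal.SIGALRM, signal.SIG_IGN)
    try:
        signal.alarm(0)                             # start with nothing pending
        displaced_by_first = signal.alarm(first)    # nothing was pending
        displaced_by_second = signal.alarm(second)  # time left on the first alarm
        left_on_second = signal.alarm(0)            # cancel; time left on the second
        return displaced_by_first, displaced_by_second, left_on_second
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)

print(rearm(100, 50))
```

The second call reports the time remaining on the first alarm, which
is then gone: only the most recent alarm is ever pending.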

However, I can make it cause problems as follows:

  * run some tests with a short timeout, e.g.
    "./run-tests.py -j 2 --timeout 10 -v test-symlink*"
  * after the children are spawned, hit Ctrl-S to pause the terminal
    and keep it paused for 10 sec
  * hit Ctrl-Q to unpause the terminal
  * ka-boom: parent process dies with this exception:

Traceback (most recent call last):
  File "./run-tests.py", line 746, in <module>
    main()
  File "./run-tests.py", line 740, in main
    runchildren(options, tests)
  File "./run-tests.py", line 565, in runchildren
    test, skip, fail = map(int, l[:3])
ValueError: need more than 0 values to unpack

What happens is this: at least one of the children is hit by the
timeout arranged by runtests(), the one I claim is pointless.  Since
the child is not expecting a Timeout exception at that scope, nothing
catches it and the child dies like this:

Traceback (most recent call last):
  File "./run-tests.py", line 716, in <module>
    main()
  File "./run-tests.py", line 712, in main
    runtests(options, expecthg, tests)
  File "./run-tests.py", line 606, in runtests
    ret = runone(options, test, skips, fails)
  File "./run-tests.py", line 416, in runone
    vlog("# Ret was:", ret)
  File "./run-tests.py", line 102, in vlog
    print
  File "./run-tests.py", line 308, in alarmed
    raise Timeout
__main__.Timeout
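
The scope problem in that traceback can be reproduced in isolation.
The sketch below only assumes that the per-test code catches Timeout
around the test itself and not around the logging that follows; the
names Timeout and alarmed come from the traceback, while run_one() and
demo() are hypothetical stand-ins, not the real run-tests.py code:

```python
import signal
import time

class Timeout(Exception):
    """Raised by the SIGALRM handler."""

def alarmed(signum, frame):
    raise Timeout

def run_one(test_seconds, log_seconds):
    try:
        time.sleep(test_seconds)   # the "test": a timeout is expected here
        result = "passed"
    except Timeout:
        result = "timed out"
    time.sleep(log_seconds)        # stand-in for vlog(): not guarded
    return result

def demo(alarm_after, test_seconds, log_seconds):
    """Run run_one() with a timer armed from outside it."""
    old_handler = signal.signal(signal.SIGALRM, alarmed)
    signal.setitimer(signal.ITIMER_REAL, alarm_after)
    try:
        return run_one(test_seconds, log_seconds)
    except Timeout:
        return "escaped"           # Timeout left run_one() uncaught
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # disarm the timer
        signal.signal(signal.SIGALRM, old_handler)

print(demo(0.05, 0.5, 0))     # timer fires during the guarded sleep
print(demo(0.2, 0.01, 0.5))   # timer fires during the unguarded sleep
```

When the timer fires inside the guarded region the test is simply
reported as timed out; when it fires a moment later, the exception
propagates out of run_one(), which is the situation shown in the
traceback.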

That child dies *without* writing the 3-line test summary to the
reporting FD passed by the parent.  So when the parent tries to read
that 3-line test summary, it blows up in turn with the ValueError
above.
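
Independently of where the alarm is armed, the parent could refuse to
unpack a summary that is not there.  This is a sketch of such a guard,
not the patch proposed here; it assumes the summary is three integers,
one per line, and parse_summary() is a hypothetical helper name:

```python
def parse_summary(text):
    """Return (tests, skipped, failed), or None if the summary is unusable."""
    lines = text.splitlines()
    if len(lines) < 3:
        # The child died before writing its summary.
        return None
    try:
        tests, skipped, failed = map(int, lines[:3])
    except ValueError:
        # One of the three lines was not an integer.
        return None
    return tests, skipped, failed

print(parse_summary("3\n1\n0\n"))
print(parse_summary(""))
```

A None result would let the parent report the child as crashed
instead of dying with a ValueError of its own.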

(If the "Ctrl-S to pause terminal" scenario sounds artificial, it's
not: I was using it while trying to understand what run-tests.py does,
hopping from its output back and forth to the code.  It's reasonable
for some tests to suffer timeout failures when I do that, but I did
not expect the parent run-tests.py to crash!)

I think the SIGALRM set up by runtests() adds no value.  I suggest
removing it.  Patch coming soon...

Greg

