Potential improvements to run-tests.py

Mads Kiilerich mads at kiilerich.com
Sun Oct 26 08:29:15 CDT 2014

On 10/24/2014 06:50 PM, Gregory Szorc wrote:
> I've been writing a lot of tests for Mozilla's Mercurial 
> infrastructure recently [1]. Our test "harness" is a glorified (and 
> very hacky) wrapper around run-tests.py [2].
> As part of writing all these tests, I've identified a number of areas 
> for improving run-tests.py. I hope to find time to contribute some 
> patches upstream once things quiet down here. In the mean time, I 
> thought I'd send ideas to get people thinking (and possibly to entice 
> other contributors).
> The common theme for the following improvements is increasing 
> re-usability of the test harness. This will enable extension authors 
> and other Mercurial users to more easily write tests. Yes, we have 
> cram [3] today. But I'm mainly focused on Mercurial. And, I think that 
> with a little work, we could refactor Mercurial's testing code to have 
> a generic base layer that is effectively cram.

As a general comment: Cleanup and improvements of the Mercurial test 
runner are of course much appreciated. Features and complexity that don't 
have any use in Mercurial itself are more questionable. The purpose of 
the test suite is to test Mercurial as efficiently as possible. Because 
of that we can be very agile and refactor anything. Opening up for other 
use cases would give a bigger maintenance burden ... even though we 
probably wouldn't promise any kind of backward compatibility anyway.

> * Ability to declare your own variables for substitutions. When 
> comparing output, we substitute variables/strings like $HGPORT. These 
> come from a hard-coded list. I want to make it possible for the test 
> to define its own variables.

What syntax would you use for that? Do you see any use cases for that in 
the Mercurial tests?

> * Consider dynamic port numbers. Once variable substitutions can be 
> declared from tests, we don't need to do the silly pre-defined port 
> numbers mess ($HGPORT, $HGPORT1, etc). Instead, we can let things bind 
> to an open port automatically. This will help prevent failures from 
> tests trying to use a port some other program is using or from tests 
> that don't clean up well causing cascading test failures.

It is silly to call things silly(!). The port number handling is simple 
and has been working ok so far. It could definitely be done in a much 
more sophisticated way. That might be worth it - I'm not sure.

If using automatic listen port assignment, after the process (hgweb or 
something else) has started listening on some automatic port, how would 
you retrieve the port number so that other commands could use it, 
efficiently and from shell script? It could of course be done, but it is 
not completely trivial.

AFAICS, it would also require that substitutions should be dynamic and 
change while the test output is processed. Again, more complexity, and 
I'm not sure it is worth it.

Instead, I would suggest that the runner should pre-scan the test text 
for which ports a test needs, allocate random ports and verify they 
really are free, set the environment variables, and finally fail the 
test if the ports are still in use after the test has completed. 
(TIMEWAITish issues might make that tricky to implement reliably.)
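The pre-scan approach could look roughly like this (untested sketch; the function name and the "hold the socket while scanning" trick are my own, not anything in run-tests.py today):

```python
import re
import socket

def allocate_test_ports(test_text):
    """Scan a .t test for $HGPORT/$HGPORT1/... references and assign
    each referenced variable a port the OS reports as free."""
    # Collect the distinct port variables the test actually uses.
    names = sorted(set(re.findall(r'\$(HGPORT\d*)', test_text)))
    env = {}
    held = []
    for name in names:
        # Bind to port 0 so the kernel picks a free port; keep the
        # socket open while scanning so two names never collide.
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(('localhost', 0))
        held.append(s)
        env[name] = str(s.getsockname()[1])
    for s in held:
        # Release the ports; a small race window remains between
        # closing here and the test binding them again.
        s.close()
    return env
```

Note the race window after close(): that, plus the TIMEWAIT issue above, is why verifying the ports at the end of the test would still be worthwhile.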

> * Consider a generic "register cleanup" mechanism from .t tests. Many 
> of the .t tests I'm writing start Docker containers and stop/remove 
> them when the test is complete. If we ctrl+c the test, the containers 
> keep running. I'd like to add a mechanism that allows tests to declare 
> what cleanup actions to run when the test exits. This is probably 
> "write a list of commands to a file which will be executed by the 
> shell." This could eventually eliminate $DAEMON_PIDS handling from the 
> test harness itself.

LGTM. The common case of just having daemon pids should remain as simple 
as it is.
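The "write a list of commands to a file" idea could look roughly like this on the runner side (untested sketch; the cleanup.sh file name is made up):

```python
import os
import subprocess

def run_registered_cleanup(testtmp):
    """Execute the commands a test appended to a well-known file,
    even if the test itself was interrupted."""
    path = os.path.join(testtmp, 'cleanup.sh')  # hypothetical name
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            cmd = line.strip()
            if not cmd or cmd.startswith('#'):
                continue
            # Run each cleanup command; ignore failures so one bad
            # command does not block the remaining cleanups.
            subprocess.call(cmd, shell=True)
```

The runner would call this from a finally block (and from its SIGINT handler) so that ctrl+c still stops the Docker containers.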

(We could of course also use cgroups or Docker separation on Linux. But 
the stability issues we see are mainly on other platforms, so the gain 
from using that on Linux would not be big.)

> * hghave as a module. Calling hghave as a separate process is silly. 
> mpm added #require a little while back. run-tests.py should load 
> hghave from a module and invoke it as an API.

Yes, there is a TODO for that in the code. There are also a few direct 
hghave invocations left.

One reason we still use separate processes is that it gives some 
separation: run-tests is testing Mercurial and shouldn't (?) depend on 
it. Currently there are a couple of mercurial.util imports in hghave. 
That is no big deal as long as they are separate short-lived processes, 
but more of a problem if it is done in the runner process.
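In-process #require handling could look something like this (sketch only; it assumes hghave exposes its checks as a mapping of feature name to a (check function, description) pair, which I believe is close to what hghave.py already has):

```python
def missing_features(checks, features):
    """Return reasons to skip a test, calling each check function
    in-process instead of forking an hghave process per test."""
    missing = []
    for feature in features:
        negate = feature.startswith('no-')
        name = feature[3:] if negate else feature
        if name not in checks:
            missing.append('unknown feature: ' + name)
            continue
        func, desc = checks[name]
        available = func()
        # Skip when the availability is the wrong way around.
        if available == negate:
            missing.append(('system supports %s' if negate
                            else 'missing %s') % desc)
    return missing
```

An empty result would mean "run the test"; a non-empty one would be reported as the skip reason, as hghave's output is today.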

> * Extensible hghave. Downstream test authors may need to write custom 
> hghave checks. For example, I added "#require docker." The current 
> design of hghave makes this difficult.

AFAICS it is very easy now:
   $ docker.io version | grep -q "^Client version:" || exit 80

But yes, if hghave is completely replaced by direct invocation as a 
Python module, it could be nice to have ways to extend that. 
Something like:
   #require mymodule.docker
and perhaps make hghave explicit:
   #require hghave.symlink
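Resolving such dotted names could be as simple as (untested sketch; it assumes the extension module exposes the same checks mapping as hghave would):

```python
import importlib

def resolve_check(spec, default_module='hghave'):
    """Map a #require spec to a check: a dotted name loads the check
    from that module, a bare name falls back to hghave."""
    if '.' in spec:
        modname, checkname = spec.rsplit('.', 1)
    else:
        modname, checkname = default_module, spec
    mod = importlib.import_module(modname)
    # Expect the module to provide checks[name] = (function, desc).
    return mod.checks[checkname]
```

That would keep "#require symlink" working unchanged while letting downstream test authors ship their own docker-style checks.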

> * Allow the running of the same .t test from multiple Mercurial 
> versions or Python versions, all from the same TestRunner invocation. 
> Currently, we can't do this because things like result recording 
> assume the test name is globally unique.

That seems to contradict the direct use of hghave as a module that 
checks which capabilities the current Python has.

