Potential improvements to run-tests.py

Fri Oct 24 11:50:32 CDT 2014

I've been writing a lot of tests for Mozilla's Mercurial infrastructure 
recently [1]. Our test "harness" is a glorified (and very hacky) wrapper 
around run-tests.py [2].

As part of writing all these tests, I've identified a number of areas 
for improving run-tests.py. I hope to find time to contribute some 
patches upstream once things quiet down here. In the meantime, I 
thought I'd send ideas to get people thinking (and possibly to entice 
other contributors).

The common theme for the following improvements is increasing 
re-usability of the test harness. This will enable extension authors and 
other Mercurial users to more easily write tests. Yes, we have cram [3] 
today. But I'm mainly focused on Mercurial. And, I think that with a 
little work, we could refactor Mercurial's testing code to have a 
generic base layer that is effectively cram.

* Ability to declare your own variables for substitutions. When 
comparing output, we substitute variables/strings like $HGPORT. These 
come from a hard-coded list. I want to make it possible for the test to 
define its own variables.

* Consider dynamic port numbers. Once variable substitutions can be 
declared from tests, we don't need to do the silly pre-defined port 
numbers mess ($HGPORT, $HGPORT1, etc). Instead, we can let things bind 
to an open port automatically. This will help prevent failures from 
tests trying to use a port some other program is using or from tests 
that don't clean up well causing cascading test failures.
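
As a rough sketch of how the two ideas above might fit together (the 
function names and the $MYPORT variable are made up for illustration 
and are not part of run-tests.py): bind to port 0 so the OS picks a 
free port, then register that value as a substitution so the expected 
output stays stable.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))
        return sock.getsockname()[1]
    finally:
        sock.close()


def substitute(output, variables):
    """Replace concrete values in output with $NAME placeholders.

    `variables` maps a hypothetical test-declared name to its value.
    Longer values are replaced first so that one value being a prefix
    of another does not produce a partial substitution.
    """
    ordered = sorted(variables.items(), key=lambda kv: -len(kv[1]))
    for name, value in ordered:
        output = output.replace(value, "$" + name)
    return output


port = find_free_port()
variables = {"MYPORT": str(port)}
line = "listening at http://localhost:%d/" % port
normalized = substitute(line, variables)
```

One caveat with this approach: the port is released before the server 
under test binds to it, so another process could grab it in between. 
Having the server bind to port 0 itself and report the port back would 
avoid that.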

* Consider a generic "register cleanup" mechanism from .t tests. Many of 
the .t tests I'm writing start Docker containers and stop/remove them 
when the test is complete. If we ctrl+c the test, the containers keep 
running. I'd like to add a mechanism that allows tests to declare what 
cleanup actions to run when the test exits. This is probably "write a 
list of commands to a file which will be executed by the shell." This 
could eventually eliminate $DAEMON_PIDS handling from the test harness 
itself.
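
A minimal sketch of the "list of commands in a file" idea, assuming a 
hypothetical per-test cleanup file (register_cleanup and run_cleanups 
are invented names, not existing harness functions). Commands run in 
reverse order of registration, and the harness would call run_cleanups 
from a finally block so it also fires on ctrl+c.

```python
import os
import subprocess
import tempfile


def register_cleanup(path, command):
    """Append a shell command to the cleanup file at `path`."""
    with open(path, "a") as fh:
        fh.write(command + "\n")


def run_cleanups(path):
    """Run registered commands, last registered first.

    Returns a list of (command, exit code) pairs and removes the file.
    A failing command does not stop the remaining ones from running.
    """
    if not os.path.exists(path):
        return []
    with open(path) as fh:
        commands = [line.strip() for line in fh if line.strip()]
    results = []
    for command in reversed(commands):
        results.append((command, subprocess.call(command, shell=True)))
    os.remove(path)
    return results


# How a harness might wrap a test: the finally block also runs when
# the test body is interrupted with KeyboardInterrupt.
cleanup_file = os.path.join(tempfile.mkdtemp(), "cleanup")
try:
    register_cleanup(cleanup_file, "echo stopping container")
finally:
    run_cleanups(cleanup_file)
```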

* hghave as a module. Calling hghave as a separate process is silly. mpm 
added #require a little while back. run-tests.py should load hghave from 
a module and invoke it as an API.

* Extensible hghave. Downstream test authors may need to write custom 
hghave checks. For example, I added "#require docker". The current 
design of hghave makes this difficult.

* Allow the running of the same .t test from multiple Mercurial versions 
or Python versions, all from the same TestRunner invocation. Currently, 
we can't do this because things like result recording assume the test 
name is globally unique.

* Make it easier to run tests from multiple directories. Our repository 
has many different components and .t tests in various directories. When 
you run a .t test from a child directory, $TESTDIR is the directory 
where the tests started executing from, not where the .t test is. 
Relative paths, etc. are a bit wonky, making it a little harder to write 
tests and maintain.

* Consider moving all the testing code into reusable modules. Right now, 
run-tests.py can't be used with "import" (because of the hyphen). There 
are also a handful of support files (hghave.py, heredoctest.py, 
killdaemons.py, silenttestrunner.py, etc) that run-tests assumes are 
importable. I'd like to establish *mercurial.testing* (or something 
like it) and move all the potentially reusable test code there. 3rd 
party consumers could potentially consume these files from a Mercurial 
install. Or, they could install a separate pypi package containing just 
the testing code. Worst case, they copy a directory from the Mercurial 
source tree.

* Support running via nose. I'd like to be able to run .t tests via 
nose. This requires more refactoring of the unittest-derived classes and 
some layer violations to be cleaned up. The benefit is that things like 
code coverage generation and result capturing could be deferred to nose 
instead of being implemented in run-tests.py. This may require 
Mercurial to drop Python 2.4/2.5 before this is fully achievable.

* Better handle varying output from different Mercurial versions. The 
test harness I built runs tests against Mercurial 2.5.4 through @ 
(because I want to test extensions against all the versions users could 
be running). The output of some Mercurial commands has changed over that 
time. For example, 3.2 adds the "entering bookmark" messages. I don't like 
|>/dev/null| because that undermines some of the utility of a test. I 
think we may need to introduce a "this output is optional" syntax or 
something similar.
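
One way the "this output is optional" idea could work, sketched with an 
assumed " (?)" suffix on expected lines (the marker and the matches 
function are illustrative only; no syntax has been decided). An 
expected line carrying the marker may be present or absent in the 
actual output; everything else must match exactly.

```python
OPTIONAL = " (?)"  # assumed marker, chosen only for this sketch


def matches(expected, actual):
    """Return True if `actual` lines satisfy `expected` lines.

    Expected lines ending in OPTIONAL may be missing from `actual`.
    Backtracking handles an optional line followed by an identical
    required line.
    """
    if not expected:
        return not actual
    head, rest = expected[0], expected[1:]
    optional = head.endswith(OPTIONAL)
    text = head[:-len(OPTIONAL)] if optional else head
    # First try consuming one actual line for this expected line.
    if actual and actual[0] == text and matches(rest, actual[1:]):
        return True
    # Otherwise an optional line may simply be skipped.
    return optional and matches(rest, actual)


expected = [
    "line one",
    "extra message from newer versions (?)",
    "line two",
]
old_output = ["line one", "line two"]
new_output = ["line one", "extra message from newer versions", "line two"]
```

Both old_output and new_output satisfy expected here, while any 
unexpected extra line still fails the test, which is the property 
that redirecting to /dev/null throws away.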

I know I have more ideas. But these are the major ones.

Feedback before I start patchbombing is encouraged. You likely have a 
month or two before I find the time to work on these.

[1] http://gregoryszorc.com/blog/2014/10/14/robustly-testing-version-control-at-mozilla/
[2] https://hg.mozilla.org/hgcustom/version-control-tools/file/default/run-mercurial-tests.py
[3] https://bitbucket.org/brodie/cram