Potential improvements to run-tests.py

Tue Nov 4 15:54:19 CST 2014

On 10/24/2014 06:50 PM, Gregory Szorc wrote:
> I've been writing a lot of tests for Mozilla's Mercurial infrastructure
> recently [1]. Our test "harness" is a glorified (and very hacky) wrapper
> around run-tests.py [2].
>
> As part of writing all these tests, I've identified a number of areas
> for improving run-tests.py. I hope to find time to contribute some
> patches upstream once things quiet down here. In the mean time, I
> thought I'd send ideas to get people thinking (and possibly to entice
> other contributors).

First, your message is far too long. In my opinion, if you want people on
this list to read it and reply in a decent time, you must write more
concise emails. The wiki is a good place to write a more detailed plan
(but not an excuse to write a novel there either).

I recommend killing this discussion and resuming each topic individually
(one at a time).

> The common theme for the following improvements is increasing
> re-usability of the test harness. This will enable extension authors and
> other Mercurial users to more easily write tests. Yes, we have cram [3]
> today. But I'm mainly focused on Mercurial. And, I think that with a
> little work, we could refactor Mercurial's testing code to have a
> generic base layer that is effectively cram.

Our current test tooling is awesome and I would be super happy to reuse it
in other contexts (other projects, education, etc).

However, generic code is extra work and a maintenance burden, and I'm not
sure we are ready to take it on.

But we really should work on making our testing tools easy to use for the
Mercurial ecosystem. Copying the test tools around into extension
repositories sounds like a terrible idea that will bite us continuously
in the future.

> * Ability to declare your own variables for substitutions. When
> comparing output, we substitute variables/strings like $HGPORT. These
> come from a hard-coded list. I want to make it possible for the test to
> define its own variables.

Looks simple and good. Could be used to replace some of the (glob)
patterns we have.
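
For concreteness, a minimal sketch of test-declared substitutions. The
function name `apply_substitutions` and the mapping it takes are
hypothetical, not existing run-tests.py API; the only idea shown is that
literal values in the output are replaced by their variable names before
comparison.

```python
def apply_substitutions(line, substitutions):
    """Replace literal values in an output line with their variable names.

    `substitutions` maps a variable name (e.g. "HGPORT") to the literal
    value it stands for in this test run. Hypothetical helper.
    """
    # Replace longer values first, so that a value which is a prefix of
    # another one (e.g. "2000" and "20001") does not clobber it.
    ordered = sorted(substitutions.items(),
                     key=lambda item: len(item[1]), reverse=True)
    for name, value in ordered:
        line = line.replace(value, "$" + name)
    return line
```

A test would declare its own entries (say, the address of a container it
started) next to the built-in ones, and the expected output would then
mention the variable instead of needing a (glob).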

> * Consider dynamic port numbers. Once variable substitutions can be
> declared from tests, we don't need to do the silly pre-defined port
> numbers mess ($HGPORT, $HGPORT1, etc). Instead, we can let things bind
> to an open port automatically. This will help prevent failures from
> tests trying to use a port some other program is using or from tests
> that don't clean up well causing cascading test failures.

It may be seen as a feature that failures to clean up are made visible by
later test failures (arguable, but I have not felt massive pain from
the HGPORT thing so far).
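
For reference, "bind to an open port automatically" usually means
binding to port 0 and asking the OS which port it picked. A minimal
sketch, assuming plain TCP on the loopback interface (the helper name
`find_free_port` is made up for illustration):

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port and return its number."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 tells the kernel to pick any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]
    finally:
        sock.close()
```

Note that the port is released before the caller uses it, so another
process could grab it in between. Having the server itself bind to port
0 and report the chosen port avoids that race, and that reported value
is what a test-declared substitution would then capture.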

> * Consider a generic "register cleanup" mechanism from .t tests. Many of
> the .t tests I'm writing start Docker containers and stop/remove them
> when the test is complete. If we ctrl+c the test, the containers keep
> running. I'd like to add a mechanism that allows tests to declare what
> cleanup actions to run when the test exists. This is probably "write a
> list of commands to a file which will be executed by the shell." This
> could eventually eliminate $DAEMON_PIDS handling from the test harness
> itself.

+1 for proper generic cleanup mechanism
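
A minimal sketch of the "list of commands in a file" variant, under the
assumption that the harness calls it when the test ends or is
interrupted. The function name `run_cleanup` and the one-command-per-line
file format are hypothetical:

```python
import subprocess


def run_cleanup(path):
    """Run each shell command listed in `path`, last registered first.

    Returns a list of (command, exit code) pairs. A failing command
    does not stop the remaining ones from running.
    """
    with open(path) as fh:
        commands = [line.strip() for line in fh if line.strip()]
    results = []
    # Last registered runs first, mirroring how atexit orders handlers.
    for command in reversed(commands):
        results.append((command, subprocess.call(command, shell=True)))
    return results
```

A test would append lines to that file as it starts daemons or
containers; killing the processes listed in $DAEMON_PIDS could become
just another registered command.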

> * hghave as a module. Calling hghave as a separate process is silly. mpm
> added #require a little while back. run-tests.py should load hghave from
> a module and invoke it as an API.

Not sure what you mean here. Is it related to explicitly decorating tests
(and test sections?) and letting run-tests.py decide by itself whether to
run them?

> * Extensible hghave. Downstream test authors may need to write custom
> hghave checks. For example, I added "#require docker." The current
> design of hghave makes this difficult.

Sounds reasonable

> * Allow the running of the same .t test from multiple Mercurial versions
> or Python versions, all from the same TestRunner invocation. Currently,
> we can't do this because things like result recording assume the test
> name is globally unique.

I'm a bit confused about how you plan to handle multiple outputs. And I'm
not sure I get this test name thing.

> * Make it easier to run tests from multiple directories. Our repository
> has many different components and .t tests in various directories. When
> you run a .t test from a child directory, $TESTDIR is the directory
> where the tests started executing from, not where the .t test is.
> Relative paths, etc are a bit wonky, making it a little harder to write
> tests and maintain.

I agree that the current TESTDIR behavior makes it hard to run the tests
in a flexible way. Patches welcome.

> * Consider moving all the testing code into reusable modules. Right now,
> run-tests.py can't be used with "import" (because of the hyphen). There
> are also a handful of support files (hghave.py, heredoctest.py,
> killdaemons.py, silenttestrunner.py, etc) that run-tests assumes are
> importable. I'd like to establishing *mercurial.testing* (or something
> like it) and move all the potentially reusable test code there. 3rd
> party consumers could potentially consume these files from a Mercurial
> install. Or, they could install a separate pypi package containing just
> the testing code. Worst case, they copy a directory from the Mercurial
> source tree.

We have to find a proper balance between generic code and maintenance.
But having tools reusable by extensions seems the way to go.

Nowadays, I've stopped using copied versions of the test tools in
extension repos and always use the run-tests.py from my Mercurial
checkout. It works well for developers, but gives some trouble to
Continuous Integration setups that do not have a Mercurial checkout handy.
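
On the hyphen problem quoted above: a file that cannot be named in an
`import` statement can still be loaded by path. A minimal sketch using
the standard `importlib` machinery of recent Python 3 releases (on
Python 2, `imp.load_source` plays the same role); the helper name
`load_module_from_path` is made up for illustration:

```python
import importlib.util


def load_module_from_path(name, path):
    """Load the Python file at `path` as a module named `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # Executes the file's top-level code, like a normal import would.
    spec.loader.exec_module(module)
    return module
```

This only works around the file name. It does not remove the need for a
checkout or an installed copy, which is the part that hurts on CI.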

> * Support running via nose. I'd like to be able to run .t tests via
> nose. This requires more refactoring of the unittest-derived classes and
> some layer violations to be cleaned up. Benefits to doing this are
> things like code coverage generation and result capturing could be
> deferred to nose and not implemented in run-tests.py. This may require
> Mercurial to drop Python 2.4/2.5 before this is fully achievable.

+2. Integration with other tools is the candy we are running after.

> * Better handle varying output from different Mercurial versions. The
> test harness I built runs tests against Mercurial 2.5.4 through @
> (because I want to test extensions against all the versions users could
> be running). The output of some Mercurial commands has changed over that
> time. e.g. 3.2 adds the "entering bookmark" messages. I don't like
> |>/dev/null| because that undermines some of the utility of a test. I
> think we may need to introduce a "this output is optional" syntax or
> something similar.

Evolve has this kind of pain too. We could have an (optional) flag, just
as we have the (glob) and (re) flags. This could maybe work alongside
`hghave` to give something like (skipif prehg32), but we may be
over-engineering there.
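
A minimal sketch of how an (optional) flag could be matched, assuming
the flag is written as a suffix on the expected line the same way
(glob) is. The function name `output_matches` is hypothetical, and
(glob)/(re) handling is left out to keep it short:

```python
OPTIONAL = " (optional)"


def output_matches(expected, actual):
    """Return True if `actual` lines match `expected` lines.

    Expected lines ending in " (optional)" may be absent from `actual`.
    """
    if not expected:
        # Nothing more is expected, so no output may remain.
        return not actual
    head, rest = expected[0], expected[1:]
    if head.endswith(OPTIONAL):
        text = head[:-len(OPTIONAL)]
        # Try consuming the optional line first, then try skipping it.
        if actual and actual[0] == text and output_matches(rest, actual[1:]):
            return True
        return output_matches(rest, actual)
    return bool(actual) and actual[0] == head and output_matches(rest, actual[1:])
```

Trying both branches matters when an optional line has the same text as
the required line after it; a purely greedy matcher would reject that
output.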

> I know I have more ideas. But these are the major ones.
>
> Feedback before I start patchbombing is encouraged. You likely have a
> month or two before I find the time to work on these.

This time, you should avoid patchbombing 200 patches at a time.
Otherwise Matt Mackall may decide to run for Minnesota office, declare
it independent from the United States, invade British Columbia and burn
your computer to the ground.

-- 
Pierre-Yves David