Diff for "WritingTests"

Differences between revisions 3 and 4

Mercurial contains a simple regression test framework that allows both Python unit tests and shell-script driven regression tests.

Running the test suite

To run the tests, do:

$ make tests
cd tests && ./run-tests
............................................
Ran 44 tests, 0 failed.

This finds all scripts in the tests/ directory named test-* and executes them. The scripts can be either shell scripts or Python. Each test is run in a temporary directory that is removed when the test is complete.

You can also run tests individually:

$ cd tests/
$ ./run-tests test-pull test-undo
..
Ran 2 tests, 0 failed.

A test-<x> succeeds if the script returns success and its output matches test-<x>.out. If the new output doesn't match, it is stored in test-<x>.err.
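The pass/fail rule above can be sketched in a few lines of shell. This is only an illustration of the rule, not the actual run-tests code: the scratch directory, the two stand-in scripts (test-hello and test-changed) and the variable names are all made up for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the pass/fail rule; not the real run-tests code.
demo=${TMPDIR:-/tmp}/run-tests-sketch.$$
mkdir "$demo" || exit 1
cd "$demo" || exit 1

# A stand-in test whose output matches its recorded reference.
cat > test-hello <<'EOF'
echo hello
EOF
echo hello > test-hello.out

# A stand-in test whose output no longer matches its reference.
cat > test-changed <<'EOF'
echo new output
EOF
echo old output > test-changed.out

ran=0
failed=0
failures=
for t in test-*; do
    case $t in
        *.out|*.err) continue ;;    # skip reference and leftover files
    esac
    ran=`expr $ran + 1`
    # Pass only if the script succeeds AND its output equals test-<x>.out.
    if sh "$t" > "$t.err" 2>&1 && cmp -s "$t.out" "$t.err"; then
        rm -f "$t.err"
    else
        # The unexpected output stays behind in test-<x>.err.
        failed=`expr $failed + 1`
        failures="$failures $t"
    fi
done
echo "Ran $ran tests, $failed failed."

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

In this sketch test-changed is counted as failed because its output differs from test-changed.out, even though the script itself exits successfully.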

Also, run-tests has some useful options:

$ ./run-tests.py --help
usage: run-tests.py [options] [tests]

options:
  -h, --help            show this help message and exit
  -v, --verbose         output verbose messages
  -t TIMEOUT, --timeout=TIMEOUT
                        kill errant tests after TIMEOUT seconds
  -c, --cover           print a test coverage report
  -s, --cover_stdlib    print a test coverage report inc. standard libraries
  -C, --annotate        output files annotated with coverage
  -r, --retest          retest failed tests
  -f, --first           exit on the first test failure
  -R, --restart         restart at last error
  -i, --interactive     prompt to accept changed output

Writing a shell script test

Creating a regression test is easy. Simply create a shell script that executes the necessary commands to exercise Mercurial.

Here's an example:

hg init
touch a
hg add a
hg commit -m "Added a" -d "0 0"

touch main
hg add main
hg commit -m "Added main" -d "0 0"
hg checkout 0

echo Main should be gone
ls

Then run your test:

$ ./run-tests test-example
.
test-example generated unexpected output:
Main should be gone
a

Ran 1 tests, 1 failed.

Double-check your script's output, then save the output so that future runs can check for the expected output:

$ mv test-example.err test-example.out
$ ./run-tests test-example
.
Ran 1 tests, 0 failed.

Writing a Python unit test

A unit test operates much like a regression test, but is written in Python. Here's an example:

#!/usr/bin/env python

import sys
from mercurial import bdiff, mpatch

# Check that the delta from a to b, applied to a, reproduces b.
def test1(a, b):
    d = bdiff.bdiff(a, b)
    c = a
    if d:
        c = mpatch.patches(a, [d])
    if c != b:
        print "***", `a`, `b`
        print "bad:"
        print `c`[:200]
        print `d`

# Run the round-trip check in both directions.
def test(a, b):
    print "***", `a`, `b`
    test1(a, b)
    test1(b, a)

test("a\nc\n\n\n\n", "a\nb\n\n\n")
test("a\nb\nc\n", "a\nc\n")
test("", "")
test("a\nb\nc", "a\nb\nc")
test("a\nb\nc\nd\n", "a\nd\n")
test("a\nb\nc\nd\n", "a\nc\ne\n")
test("a\nb\nc\n", "a\nc\n")
test("a\n", "c\na\nb\n")
test("a\n", "")
test("a\n", "b\nc\n")
test("a\n", "c\na\n")
test("", "adjfkjdjksdhfksj")
test("", "ab")
test("", "abc")
test("a", "a")
test("ab", "ab")
test("abc", "abc")
test("a\n", "a\n")
test("a\nb", "a\nb")

print "done"

Making Tests Repeatable

There are some tricky points that you should be aware of when writing tests:

- hg commit wants user interaction; pass the message with -m "text".

- hg up -m wants user interaction; set HGMERGE to something noninteractive:

cat <<'EOF' > merge
#!/bin/sh
echo merging for `basename $1`
EOF
chmod +x merge

env HGMERGE=./merge hg update -m 1

- changeset hashes depend on the user and date, which makes things like hg history output change from run to run; fix both with -u and -d:

hg commit -m "test" -u test -d "0 0"

- diff shows the current time; strip it with sed:

hg diff | sed "s/\(\(---\|+++\) [a-zA-Z0-9_/.-]*\).*/\1/"
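As a rough illustration of the timestamp stripping, here is a sketch that runs on a made-up diff header instead of real hg diff output, so it works without a repository. The sample lines and the function name sample are assumptions for the example. It also spells the expression as two -e scripts: the "\|" alternation used above is a GNU sed extension, so splitting it may travel better to other platforms.

```shell
#!/bin/sh
# Made-up header lines standing in for "hg diff" output.
sample() {
    printf '%s\n' \
        '--- a/foo.txt  Thu Jan 01 00:00:00 1970 +0000' \
        '+++ b/foo.txt  Tue Dec 19 22:24:46 2006 +0000' \
        '@@ -1 +1 @@' \
        '-old' \
        '+new'
}

# One expression per header type, with "^" at the very start of each.
# "|" is used as the delimiter so "/" needs no escaping.
stripped=`sample | sed \
    -e 's|^\(--- [a-zA-Z0-9_/.-]*\).*|\1|' \
    -e 's|^\(+++ [a-zA-Z0-9_/.-]*\).*|\1|'`

printf '%s\n' "$stripped"
```

Only the two header lines are rewritten; the hunk lines pass through unchanged, so the output stays the same from run to run.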

Making tests portable

You also need to be careful that the tests are portable from one platform to another. You're probably working on Linux, where the GNU toolchain has more (or different) functionality than on MacOS, *BSD, Solaris, AIX, etc. While testing on all platforms is the only sure-fire way to confirm that your code is portable, here's a list of problems that have been found and fixed in the tests. Another, more comprehensive list may be found in the GNU Autoconf manual, online here:

http://www.gnu.org/software/autoconf/manual/html_node/Portable-Shell.html

1. sh

The Bourne shell is a very basic shell. /bin/sh on Linux is typically bash, which even in Bourne-shell mode has many features that Bourne shells on other Unix systems don't have (and even on Linux /bin/sh isn't guaranteed to be bash). You'll need to be careful about constructs that seem ubiquitous, but are actually not available in the least common denominator. Explicitly using another shell (ksh, bash, a POSIX shell, etc.) may seem like another option, but these may not exist in a portable location, so it is generally not a good idea. You may find that rewriting the test in Python will be easier.

- don't use pushd/popd; save the output of "pwd" and use "cd" in place of the pushd, and cd back to the saved pwd instead of popd.

- don't use math expressions like let, (( ... )), or $(( ... )); use "expr" instead.
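Both substitutions can be sketched together. The scratch directory and the variable names are made up for the example; only pwd, cd and expr come from the advice above.

```shell
#!/bin/sh
# Instead of pushd: remember where we started.
saved=`pwd`

work=${TMPDIR:-/tmp}/portable-sh.$$    # hypothetical scratch directory
mkdir "$work" || exit 1
cd "$work" || exit 1
touch a b c

# Instead of let or $(( ... )): do the arithmetic with expr.
count=0
for f in *; do
    count=`expr $count + 1`
done

# Instead of popd: cd back to the saved directory.
cd "$saved" || exit 1
rm -rf "$work"

echo "counted $count files"
```

One thing to keep in mind: expr exits with a non-zero status when its result is 0, so check its output and not its exit status.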

2. grep

- don't use the -q option; redirect stdout to /dev/null instead.

- don't use extended regular expressions with grep; use egrep instead, and don't escape any regex operators.
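A small sketch of both grep points, run against a made-up log file (the file name and its contents are assumptions for the example):

```shell
#!/bin/sh
log=${TMPDIR:-/tmp}/portable-grep.$$    # hypothetical scratch file
printf '%s\n' 'adding a' 'removing b' > "$log"

# Instead of grep -q: throw the matching lines away and use the exit status.
if grep 'adding' "$log" > /dev/null; then
    found=yes
else
    found=no
fi

# Alternation and grouping via egrep, with the operators left unescaped.
if egrep 'adding|removing' "$log" > /dev/null; then
    either=yes
else
    either=no
fi
if egrep '^(merging|updating) ' "$log" > /dev/null; then
    other=yes
else
    other=no
fi

rm -f "$log"
echo "found=$found either=$either other=$other"
```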

3. sed

- make sure that the beginning-of-line matcher ("^") is at the very beginning of the expression -- it may not be supported inside parens.
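For instance, a substitution that needs both an anchor and a group can keep the "^" outside the group. The input string here is made up for the example:

```shell
#!/bin/sh
line='abc abc'

# Portable: "^" is the first character of the expression, outside \( \).
# The form to avoid would put it inside the group, as in '\(^abc\)'.
anchored=`echo "$line" | sed 's/^\(abc\)/[\1]/'`

echo "$anchored"
```

Only the occurrence at the start of the line is bracketed; the second "abc" is left alone.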

4. echo

- echo may interpret "\n" and print a newline; use printf instead if you want a literal "\n" (backslash + n).
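A short sketch of the printf approach. Passing the text as an argument to a '%s\n' format keeps the backslash literal, while escapes written in the format string itself are still interpreted:

```shell
#!/bin/sh
# The argument is printed as-is: backslash + n stays two characters.
literal=`printf '%s\n' 'a\nb'`

# Escapes in the format string are interpreted: this prints two lines.
lines=`printf 'a\nb\n' | wc -l | tr -d ' '`

printf '%s\n' "$literal"
echo "format string produced $lines lines"
```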

5. false

- false is guaranteed only to return a non-zero value; you cannot depend on it being 1. On Solaris in particular, /bin/false returns 255. Rewrite your test to not depend on a particular return value, or create a temporary "false" executable, and call that instead.
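Both workarounds might look like this. The stand-in executable is called false1 purely for the example, and the scratch directory is an assumption too:

```shell
#!/bin/sh
# Portable: branch on success versus failure, never on the exact value.
if false; then
    result=succeeded
else
    result=failed
fi

# When a test really needs an exit status of exactly 1, call a private
# stand-in instead (the name "false1" is made up for this sketch).
dir=${TMPDIR:-/tmp}/portable-false.$$
mkdir "$dir" || exit 1
cat > "$dir/false1" <<'EOF'
#!/bin/sh
exit 1
EOF
chmod +x "$dir/false1"

status=0
"$dir/false1" || status=$?
rm -rf "$dir"

echo "false: $result, false1 exited with $status"
```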

6. diff

- don't use the -N option. There's no particularly good workaround short of writing a reasonably complicated replacement script. If you can't rewrite the test so that it doesn't need -N, substituting gdiff for diff will probably do.

- Revision 3 as of 2005-09-14 21:26:23
  Size: 3227
  Editor: mpm
  Comment:
+ Revision 4 as of 2006-12-19 22:24:46
  Size: 6549
  Editor: mpm
  Comment:
 Line 1:
+[[TableOfContents]]
-Line 30:
+Line 32:
+Also, run-tests has some useful options:

{{{
$ ./run-tests.py --help
usage: run-tests.py [options] [tests]

options:
  -h, --help            show this help message and exit
  -v, --verbose         output verbose messages
  -t TIMEOUT, --timeout=TIMEOUT
                        kill errant tests after TIMEOUT seconds
  -c, --cover           print a test coverage report
  -s, --cover_stdlib    print a test coverage report inc. standard libraries
  -C, --annotate        output files annotated with coverage
  -r, --retest          retest failed tests
  -f, --first           exit on the first test failure
  -R, --restart         restart at last error
  -i, --interactive     prompt to accept changed output
}}}
-Line 73:
+Line 95:
-}}}

There are some tricky points here that you should be aware of when
writing tests:

 * hg commit wants user interaction - use -m "text"

 * hg up -m wants user interaction, set HGMERGE to something noninteractive:

{{{
cat <<'EOF' > merge
#!/bin/sh
echo merging for `basename $1`
EOF
chmod +x merge

env HGMERGE=./merge hg update -m 1
}}}

 * changeset hashes will change based on user and date which make
  things like hg history output change - use -d:

{{{
hg commit -m "test" -u test -d "0 0"
}}}

 * diff will show the current time - strip with sed:

{{{
hg diff | sed "s/\(\(---\|+++\) [a-zA-Z0-9_/.-]*\).*/\1/"
-Line 154:
+Line 146:
+== Making Tests Repeatable ==

There are some tricky points here that you should be aware of when
writing tests:

 * hg commit wants user interaction - use -m "text"

 * hg up -m wants user interaction, set HGMERGE to something noninteractive:

{{{
cat <<'EOF' > merge
#!/bin/sh
echo merging for `basename $1`
EOF
chmod +x merge

env HGMERGE=./merge hg update -m 1
}}}

 * changeset hashes will change based on user and date which make
  things like hg history output change - use -d:

{{{
hg commit -m "test" -u test -d "0 0"
}}}

 * diff will show the current time - strip with sed:

{{{
hg diff | sed "s/\(\(---\|+++\) [a-zA-Z0-9_/.-]*\).*/\1/"
}}}

== Making tests portable ==

You also need to be careful that the tests are portable from one platform
to another.  You're probably working on Linux, where the GNU toolchain has
more (or different) functionality than on MacOS, *BSD, Solaris, AIX, etc.
While testing on all platforms is the only sure-fire way to make sure that
you've written portable code, here's a list of problems that have been
found and fixed in the tests.  Another, more comprehensive list may be
found in the GNU Autoconf manual, online here:

    http://www.gnu.org/software/autoconf/manual/html_node/Portable-Shell.html

=== sh ===

The Bourne shell is a very basic shell.  /bin/sh on Linux is typically
bash, which even in Bourne-shell mode has many features that Bourne shells
on other Unix systems don't have (and even on Linux /bin/sh isn't
guaranteed to be bash).  You'll need to be careful about constructs that
seem ubiquitous, but are actually not available in the least common
denominator.  While using another shell (ksh, bash explicitly, posix shell,
etc.) explicitly may seem like another option, these may not exist in a
portable location, and so are generally probably not a good idea.  You may
find that rewriting the test in python will be easier.

- don't use pushd/popd; save the output of "pwd" and use "cd" in place of
  the pushd, and cd back to the saved pwd instead of popd.

- don't use math expressions like let, (( ... )), or $(( ... )); use "expr"
  instead.

=== grep ===

- don't use the -q option; redirect stdout to /dev/null instead.

- don't use extended regular expressions with grep; use egrep instead, and
  don't escape any regex operators.

=== sed ===

- make sure that the beginning-of-line matcher ("^") is at the very
  beginning of the expression -- it may not be supported inside parens.

=== echo ===

- echo may interpret "\n" and print a newline; use printf instead if you
  want a literal "\n" (backslash + n).

=== false ===

- false is guaranteed only to return a non-zero value; you cannot depend on
  it being 1.  On Solaris in particular, /bin/false returns 255.  Rewrite
  your test to not depend on a particular return value, or create a
  temporary "false" executable, and call that instead.

=== diff ===

- don't use the -N option.  There's no particularly good workaround short
  of writing a reasonably complicated replacement script, but substituting
  gdiff for diff if you can't rewrite the test not to need -N will probably
  do.