Making tests hashing-algorithm-agnostic

Tue Oct 28 14:51:41 CDT 2014

I want to play around with different hashing algorithms, and the current
hard-coded hashes in the tests make that very difficult.  I'd been working
on a way to store and re-use named captures later in the test, so you would
have test fragments that looked something like this:

  $ hg log
  changeset:   3:(?P<hash3>[0-9a-f]{12}) (re)
  tag:         tip
  parent:      0:(?P=hash2) (re)

This would capture 12 hex characters and label them 'hash3', so you can
compare against that value later on using the (?P=name) syntax shown on the
last line.  This doesn't work as well when hashes appear on the command
line, so I had a way of solving that as well: run a command, capture the
output into a file, and then use `cat <filename>` in the commands.
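
To make the capture-and-reuse idea concrete, here is a minimal sketch of how
a test runner could carry named captures from one expected line to the next.
The names match_line and BACKREF are made up for illustration and are not
taken from run-tests.py; the sketch also assumes the caller has already
stripped the trailing " (re)" marker.

```python
import re

# Hypothetical helper, not part of run-tests.py.
# Finds back-references such as (?P=hash2) in an expected line.
BACKREF = re.compile(r'\(\?P=(\w+)\)')


def match_line(expected, actual, captures):
    """Match one expected regex line against one line of actual output.

    Back-references to names already stored in `captures` are replaced by
    the literal text captured earlier.  Named groups matched on this line
    are added to `captures` for use by later lines.
    """
    def substitute(m):
        name = m.group(1)
        if name in captures:
            return re.escape(captures[name])
        # Unknown here: leave it for the regex engine, which resolves it
        # if the group is defined earlier on this same line.
        return m.group(0)

    pattern = BACKREF.sub(substitute, expected)
    m = re.fullmatch(pattern, actual)
    if m is None:
        return False
    captures.update(m.groupdict())
    return True
```

A runner would keep one captures dict per test file and pass it to every
(re) line in order, so a hash captured by an early command can be checked
in the output of a later one.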

After I had this mostly working, I was informed that generic captures like
this would not work well, because things like a missing sorted() could
change hashes in tests, so the hard-coded ones were preferred.  So I went a
different direction.

Attempt 2 involved passing a --replacements <filename> flag to
run-tests.py, where a replacements file looked something like:
0189ba417d34=\b251d831eeec5\b
055a42cdd887=\b10517e47bbbb\b
... etc ...
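
Applying such a file could look roughly like the sketch below.  It assumes
each line reads canonical-hash=regex, and that text matching the regex in
the actual output is rewritten to the canonical hash before comparing with
the expected output; that direction is my reading of the format, and the
function names are hypothetical.

```python
import re


def load_replacements(lines):
    """Parse 'canonical=regex' lines into (compiled regex, canonical) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line or '=' not in line:
            continue  # skip blanks and anything that is not a mapping
        canonical, pattern = line.split('=', 1)
        pairs.append((re.compile(pattern), canonical))
    return pairs


def apply_replacements(text, pairs):
    """Rewrite every regex match in `text` to its canonical hash."""
    for pattern, canonical in pairs:
        # A function replacement avoids backslash processing of the value.
        text = pattern.sub(lambda m, c=canonical: c, text)
    return text
```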

This was also pretty easy to implement in run-tests.py, but had the
drawback that the replacements file is extremely difficult to generate
properly.  With the first method, I could look for any sequence of
10-or-more hex characters and just throw a generic hash capture on them,
and be done with it.  With this one, I need to compare the before and after
output from a run and diff them to pull out every hash pair, and this isn't
pleasant.
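
For comparison, the generic conversion used by the first method can be
sketched in a few lines.  This is only an illustration: generalize_line and
the hashN naming are assumptions, and the rest of each line is escaped here
because a line marked (re) is treated as a regex as a whole.

```python
import re

# Any run of 10 or more hex characters is assumed to be a hash.
HEXRUN = re.compile(r'\b[0-9a-f]{10,}\b')


def generalize_line(line, names):
    """Turn hard-coded hashes in one expected line into named captures.

    `names` maps each original hash to its capture name and is shared
    across lines: the first sighting of a hash becomes a capturing group,
    later sightings become back-references.  Lines without a hash are
    returned unchanged.
    """
    parts = []
    pos = 0
    for m in HEXRUN.finditer(line):
        parts.append(re.escape(line[pos:m.start()]))
        h = m.group(0)
        if h in names:
            parts.append('(?P=%s)' % names[h])
        else:
            names[h] = 'hash%d' % (len(names) + 1)
            parts.append('(?P<%s>[0-9a-f]{%d})' % (names[h], len(h)))
        pos = m.end()
    if not parts:
        return line
    parts.append(re.escape(line[pos:]))
    return ''.join(parts) + ' (re)'
```

Inverting `names` afterwards gives the capture-name-to-hash table, so the
original hard-coded values are not lost by the conversion.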

So, I'm trying to figure out how I can use the first method, but keep the
hard-coded hashes.  I think the best way of doing this would be to just dump
the hashes at the end of the file and teach run-tests to optionally compare
them or not.  Something like this, but I honestly haven't tried this out so
the syntax will almost certainly change:

#if comparecaptures==sha1
  #comparecapture: hash1 = 0189ba417d34
  #comparecapture: hash2 = 055a42cdd887
  # ...
#endif

This way, if you want to run the tests generically without the comparison,
because you're working on a different hashing strategy, you can just pass
something like --compare-captures=none to run-tests.py.  It'll default to
sha1, and any other value makes the #if block fail.  That way, when we have
sha256 or something else available in the main repo, we just add another
block at the end of the tests with the canonical hashes and we're done.
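
A rough sketch of the comparison step, assuming the block syntax above stays
as written (it probably won't) and that the runner already holds a dict of
captured values.  parse_expected and compare_captures are hypothetical
names, and `mode` stands for whatever --compare-captures was set to.

```python
import re

IF_RE = re.compile(r'^#if comparecaptures==(\w+)\s*$')
CAPTURE_RE = re.compile(r'^\s*#comparecapture:\s*(\w+)\s*=\s*([0-9a-f]+)\s*$')


def parse_expected(lines, mode):
    """Collect name -> hash from the #if block whose label equals `mode`."""
    expected = {}
    active = False
    for line in lines:
        m = IF_RE.match(line)
        if m:
            active = (m.group(1) == mode)
            continue
        if line.strip() == '#endif':
            active = False
            continue
        m = CAPTURE_RE.match(line)
        if active and m:
            expected[m.group(1)] = m.group(2)
    return expected


def compare_captures(captures, expected):
    """Return (name, expected, actual) for every capture that differs."""
    return [(name, want, captures.get(name))
            for name, want in sorted(expected.items())
            if captures.get(name) != want]
```

With mode set to none, no block matches, parse_expected returns an empty
dict, and the test passes on the generic captures alone.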

Thoughts?