Making tests hashing-algorithm-agnostic

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Nov 4 15:55:51 CST 2014



On 10/29/2014 08:23 PM, Kyle Lippincott wrote:
>
>
> On Wed, Oct 29, 2014 at 8:54 AM, Mads Kiilerich <mads at kiilerich.com>
> wrote:
>
>     On 10/28/2014 08:51 PM, Kyle Lippincott wrote:
>
>         I want to play around with different hashing algorithms, and the
>         current hard-coded hashes in the tests make that very difficult.
>
>
>         Thoughts?

Any strategy that involves directly touching the test files is going to 
be impractical, awful, fragile, or all of the above.

>     I do not see how swappable hash algorithms possibly could be a
>     feature we should care about in the test suite.
>
>
> Because otherwise you can't run the tests and get any useful information
> from them.  It's the same reasoning behind b87acfda5268, where
> generaldelta changed a bunch of 'base' values in 'hg debugindex' output,
> so the tests were modified to make them agnostic to this change.

The difference is that the generaldelta change affected a limited 
number of tests and specific commands. Whether generaldelta is enabled 
or not does not change the changeset hashes, which are pervasive.

>     For local testing while hacking, you can apply whatever hacks get
>     the job done for you.
>
>
> There aren't really any local hacks available.  It'd take a week, at
> least, to replicate the kind of stuff I'm doing here, which isn't
> pleasant and conducive to experimentation.

Moreover, you would need to touch ALL the hashes in ALL the test files. 
We absolutely need a generic and non-invasive solution to get anywhere.


>     The actual hashes matter for some tests. Ignoring hash changes will
>     probably make it very hard to debug and is not a good idea.
>
>
> Right, that's why I want to have a way of keeping the hash values (the
> final part of my original email, that basically just verifies that what
> was captured matched what we expected to see).

I agree on that too. I think the first proposal is properly dead and 
buried by now. It also has the drawback of turning all the Mercurial 
tests into hard-to-read gibberish that differs from the expected 
Mercurial output (yes, I'm shooting at an ambulance here).

>     Instead, I can imagine putting some hacks in the code that for all
>     content calculates both hashes and append them to a file in /tmp .
>     Once that has been created, it will be quite trivial to make a
>     search'n'replace, either in run-tests to replace your hashes with
>     the good old ones in the test output, or just modifying the test
>     files once and for all. The latter would probably also automatically
>     handle the places where a test specifies hashes explicitly ... except
>     for the places where short prefixes are specified. You will still
>     see some failures in places where the test depends on the sort order
>     of hashes or uniqueness of short prefixes.
>
>
> Putting that code in the actual hg binary would likely not fly, but
> getting an extension to do it would be possible.  My biggest concern
> with this,

Yes, it should be easy to build an extension that wraps all hash 
computations to compute both variants and appends all of that to a 
gigantic mapping file.

Such a mapping file could be used by run-tests.py to pre-process all 
test files and replace all hashes (run-tests.py being unaware that they 
are hashes).

This should catch 99% of the hash issues.
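
To make the recording side concrete, here is a rough standalone sketch, 
not actual extension code. The names node_hash, record_translation and 
format_mapping are made up for illustration. It hashes the sorted 
parent ids followed by the text, once with sha1 and once with a 
candidate algorithm, and records both the long and the 12-character 
short forms. Passing the same parent ids to both algorithms is a 
simplification; a real experiment would have to feed each variant its 
own parent hashes.

```python
import hashlib

# Stand-in for a null parent id (20 zero bytes); an assumption of this sketch.
NULLID = b"\0" * 20


def node_hash(algo, text, p1=NULLID, p2=NULLID):
    """Hash the two parent ids (sorted) followed by the text."""
    h = hashlib.new(algo)
    first, second = sorted([p1, p2])
    h.update(first)
    h.update(second)
    h.update(text)
    return h.hexdigest()


def record_translation(mapping, text, p1=NULLID, p2=NULLID, new_algo="sha256"):
    """Compute both variants and store old -> new for long and short forms."""
    old = node_hash("sha1", text, p1, p2)
    new = node_hash(new_algo, text, p1, p2)
    mapping[old] = new
    mapping[old[:12]] = new[:12]  # short form as shown in most test output
    return old, new


def format_mapping(mapping):
    """Render the mapping as 'old new' lines, ready to append to a file."""
    return "".join("%s %s\n" % (old, new) for old, new in sorted(mapping.items()))
```

A wrapper in a real extension would call something like 
record_translation for every hash Mercurial computes and append the 
formatted lines to the mapping file.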

Proposal summary:

1) build an extension that stores translations for all hashes computed 
by Mercurial (both long and short versions)

2) extend run-tests.py to make it possible to "pre-process tests" with a 
list of substitutions

3) feed the generated file to run-tests.py

4) optimize the thing, because the number of test lines and the number 
of different hashes will be huge.

5) play with different hash approaches

6) build a way to share the hashes with others so that they can test 
your patches without spending hours getting all the hashes in line.
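
Steps 2 to 4 could look roughly like the sketch below. This is not 
existing run-tests.py code; translate_line and translate_test are 
hypothetical names. Instead of building one pattern per known hash, it 
matches any lowercase hex token of 12 to 40 characters and looks it up 
in the mapping, so the cost per line does not grow with the number of 
hashes. It assumes hashes show up as whole tokens; prefixes shorter 
than 12 characters are left alone.

```python
import re

# Any run of 12 to 40 lowercase hex digits delimited by word boundaries.
# The 12-character lower bound is an assumption matching the usual short form.
HEX_TOKEN = re.compile(r"\b[0-9a-f]{12,40}\b")


def translate_line(line, mapping):
    """Replace every hex token found in the mapping; leave others untouched."""
    return HEX_TOKEN.sub(lambda m: mapping.get(m.group(0), m.group(0)), line)


def translate_test(text, mapping):
    """Pre-process a whole test file, keeping line endings as they are."""
    return "".join(translate_line(line, mapping) for line in text.splitlines(True))
```

The pre-processor never needs to know that the tokens are hashes: 
anything absent from the mapping passes through unchanged.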

> though, is what happens during the sha1 -> sha256
> transition?  Do the tests keep specifying only the sha1 hashes, or do we
> build a replacement map, or just have two completely separate tests
> directories, or..?  Coming up with an answer now that is as easy as
> possible for experimenting with even less 'likely' transitions would be
> a net win, I think.

The sha1 and sha256 hashes will likely have to be in different fields 
for compatibility purposes. We do not want to ask users to make a 
permanent repository migration that will change all hashes. The new 
hashes will live alongside the old hashes for some time.

-- 
Pierre-Yves David

PS: you also appear to use a broken MUA.

