Hash collision
Martin Geisler
mg at lazybytes.net
Wed Apr 29 08:17:00 CDT 2009
Brian Mearns <bmearns at ieee.org> writes:
> I'm curious about the possibility for collisions in the hashes that
> mercurial uses to identify all of its revisions. I realize there are
> yet to be any reported collisions in sha-1, but isn't it conceivable?
It is a fact that there *must* be collisions for SHA-1 and all other
hash functions, for that matter. SHA-1 takes an arbitrary long string
and turns it into 20 bytes (160 bit). So if you hash all 160 bit
strings through SHA-1, and then hash any 161 bit string, you must find
a collision.
> Does mercurial do anything to detect or defend against collisions,
> or is it really so unlikely that it's not worth worrying about?
Don't worry about it. There are 2^160 different hash values, and this
is a truely astronomical number. If a collision is ever found it will
be a major scientific discovery.
> And what exactly would be the result if, say, two revisions in a
> repo ended up with the same hash? Would it be disastrous for the
> repository, or just a minor snag?
I'm not really sure what would happen.
--
Martin Geisler
VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.
More information about the Mercurial
mailing list