SHA-1 and changeset signatures

Matt Mackall mpm at selenic.com
Fri Aug 26 16:40:38 CDT 2005


On Fri, Aug 26, 2005 at 02:08:46PM -0700, Eric Hopper wrote:
> On Fri, Aug 26, 2005 at 10:40:50AM -0700, Matt Mackall wrote:
> > But the trouble is that the recently discovered attacks are not
> > finding a collisions with the hash of an existing text (a preimage
> > attack), but finding one between two random hashes. The existing text
> > case is still expected to be well beyond the reach of currently
> > available hardware.
> 
> Yes, but source code is text someone creates.  So it's possible for
> someone to create two sets of source changes (especially as I said if
> some of the changes are to binary files) that hash to the same value.

I don't think this is correct.

The attack finds two completely arbitrary texts, X and Y, that hash to
the same value, with absolutely no control over the content of X or Y.
In other words, the end result is two meaningless blobs X and Y.

Let's assume you could insert meaningless blob X as file F into a repo
in such a way that it was innocuous and unsuspicious.

Now you've got to insert meaningless blob Y in X's place in such a way
that it actually does something useful.

You could, for instance, make an exploit that triggered off a
particular bit changing. This doesn't meet the useful criterion: if
you can insert such an exploit, you don't need a hash collision.

Or you could make an exploit E, calculate C = Y^E, then put file C and
file F=X in the repo, and arrange to execute C^F. Which means you need
to choose a C such that C^X is harmless and C^Y is not. And somehow
your convoluted mechanism must also not be suspicious. Again, if you can do
that, you can almost certainly sneak in a more mundane exploit.

-- 
Mathematics is the supreme nostalgia of our time.


More information about the Mercurial mailing list