SHA-1 and changeset signatures

Matt Mackall mpm at
Fri Aug 26 12:24:22 CDT 2005

On Fri, Aug 26, 2005 at 09:39:54AM -0700, Eric Hopper wrote:
> One thing I would really like to see with Mercurial is the ability to
> verify changesets come from the person they say they come from.  One way
> to do this is to sign the hash for the changeset.  And to some extent,
> this can be handled outside Mercurial though having a piece of changeset
> meta-data explicitly earmarked for storing such signatures would be
> nice.

Putting aside the hash strength issues, we can get very close to this
now. By signing <commit text> + <manifest hash>, we've signed
everything except the author (implicit in the signature anyway) and
the date. Here's an example from the Mercurial repo:

Hash: SHA1

Make annotate use option --rev instead od --revision like other

manifest hash: fe9c9cd9d42657f60d302b557f1f33640fd51199
Version: GnuPG v1.4.1 (GNU/Linux)


This was done by turning on the signing code in hgeditor.
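As a rough sketch of what ends up being signed: the payload is just the commit text followed by a "manifest hash:" line, as in the example above. The helper name `build_signed_text` is hypothetical and is not taken from hgeditor; the actual signing step (handing this text to GnuPG for clear-signing) is left out.

```python
def build_signed_text(commit_text: str, manifest_hash: str) -> str:
    """Return the text that would be handed to GnuPG for clear-signing.

    Hypothetical helper: the layout (commit text, blank line,
    "manifest hash: <hex>") simply mirrors the example in this message.
    """
    # A SHA-1 hash is 20 bytes, i.e. 40 hex characters.
    is_hex = all(c in "0123456789abcdef" for c in manifest_hash)
    if len(manifest_hash) != 40 or not is_hex:
        raise ValueError("expected a 40-character lowercase hex SHA-1 hash")
    return f"{commit_text.rstrip()}\n\nmanifest hash: {manifest_hash}\n"


payload = build_signed_text(
    "Make annotate use option --rev instead od --revision like other",
    "fe9c9cd9d42657f60d302b557f1f33640fd51199",
)
```

Because the manifest hash covers every file in the changeset, a signature over this text covers the commit message and the full tree contents, but not the author or date fields.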
> The thing is, SHA-1 is broken, and the specifics of the break affect
> this in a very direct way.  It would be good to find another hash
> function to use.
> Sadly, there are currently no good candidate hash functions.
> SHA-{256,384,512} are similar in structure to SHA-1 and could probably
> be broken by similar analytical techniques.  And from what I know, there
> are no other good, well-tested hash functions around.
> Even so, having the ability to switch what kind of identifiers are used
> for changesets would be good.  Then when everybody agrees on some new
> hash functions as being pretty safe, Mercurial can switch to them
> without a lot of pain.

Any change in hash function is going to require a conversion script to
rewrite your repo (and break compatibility with other repos for that
project). As SHA-1 is still strong enough for our purposes, I'm in
wait-and-see mode.

If and when we do commit to a format change, I've got some other index
changes that I'd like to do. Currently our index format is:

4  bytes  offset
4  bytes  compressed length
4  bytes  delta base
4  bytes  link revision
20 bytes  revision hash
20 bytes  p1 hash
20 bytes  p2 hash
76 bytes  total
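The layout above can be sketched with Python's struct module. The field widths come from the list; the byte order (big-endian) and signedness of the integers are assumptions made for this sketch, as are the names `INDEX_CURRENT`, `pack_entry` and `unpack_entry`.

```python
import struct

# Four 4-byte integers followed by three 20-byte SHA-1 hashes.
# Assumption: big-endian, signed integers (only the widths are given above).
INDEX_CURRENT = struct.Struct(">4l20s20s20s")


def pack_entry(offset, comp_len, delta_base, link_rev, node, p1, p2):
    """Pack one index entry into its fixed-size binary form."""
    return INDEX_CURRENT.pack(offset, comp_len, delta_base, link_rev,
                              node, p1, p2)


def unpack_entry(data):
    """Unpack one entry into (offset, comp_len, base, link, node, p1, p2)."""
    return INDEX_CURRENT.unpack(data)


# Placeholder values; using all-zero bytes for "no parent" is an assumption.
NO_PARENT = b"\0" * 20
entry = pack_entry(0, 120, 0, 0, b"\x11" * 20, NO_PARENT, NO_PARENT)
```

With this format `INDEX_CURRENT.size` is 76, matching the total above, and 60 of those bytes are hashes.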

I'd eventually like to change this to something like:

8  bytes  offset (allow more than 4G of compressed history per file)
4  bytes  uncompressed length (we occasionally want this)
4  bytes  compressed length
4  bytes  delta base
4  bytes  link revision
4  bytes  p1 revision number (the hash appears earlier in the index)
4  bytes  p2 revision number
32 bytes  revision hash (room for SHA-256)
64 bytes total
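A similar sketch for the proposed layout, again assuming big-endian fields, an unsigned 8-byte offset, and -1 as a hypothetical "no parent" revision number. It also shows how a parent's hash can still be recovered from its revision number, since that parent's own entry appears earlier in the index. The names `INDEX_PROPOSED` and `parent_hashes` are made up for illustration.

```python
import struct

# 8-byte offset, six 4-byte integers, one 32-byte hash field.
# Assumption: big-endian; only the field widths are given above.
INDEX_PROPOSED = struct.Struct(">Q6l32s")
NO_PARENT = -1  # hypothetical marker for a missing parent


def parent_hashes(entries, rev):
    """Resolve the p1/p2 revision numbers of entry `rev` to hashes."""
    fields = INDEX_PROPOSED.unpack(entries[rev])
    p1_rev, p2_rev = fields[5], fields[6]

    def lookup(r):
        # The parent's hash is the last field of its own (earlier) entry.
        return None if r == NO_PARENT else INDEX_PROPOSED.unpack(entries[r])[7]

    return lookup(p1_rev), lookup(p2_rev)


# Fields: offset, uncompressed len, compressed len, delta base,
#         link rev, p1 rev, p2 rev, revision hash.
root = INDEX_PROPOSED.pack(0, 300, 120, 0, 0, NO_PARENT, NO_PARENT,
                           b"\xaa" * 32)
# An offset past 4 GiB no longer overflows the offset field.
child = INDEX_PROPOSED.pack(5 * 2**30, 310, 40, 0, 1, 0, NO_PARENT,
                            b"\xbb" * 32)
index = [root, child]
```

Storing parents as 4-byte revision numbers instead of 20-byte hashes is what lets the entry shrink from 76 to 64 bytes even with a wider offset, an extra length field, and a 32-byte hash.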

Mathematics is the supreme nostalgia of our time.
