[PATCH] lazymanifest: don't depend on printf's 'hh' format to work

Matt Harbison matt_harbison at yahoo.com
Thu Mar 12 20:37:20 CDT 2015


On Thu, 12 Mar 2015 12:18:46 -0400, Martin von Zweigbergk  
<martinvonz at google.com> wrote:

> # HG changeset patch
> # User Martin von Zweigbergk <martinvonz at google.com>
> # Date 1426176405 25200
> #      Thu Mar 12 09:06:45 2015 -0700
> # Node ID ffee672e53d8861b75f4190e3ab4e17428134776
> # Parent  7cf9a9e0cf893e7ae82dc576a03c843fd6640438
> lazymanifest: don't depend on printf's 'hh' format to work
>
> Where we convert a 20-byte binary to a 40-byte hex string in
> lazymanifest_setitem(), we use sprintf("%02hhx", hash[i]). As Matt
> Harbison found out, 'hh' seems to be ignored on some platforms (Visual
> Studio?). If char is signed on such platforms, the value gets
> sign-extended as it gets promoted into an int when passed into the
> variadic sprintf(). The resulting integer value will then be printed
> with leading f's (14 of them on 64-bit systems), since the '2' in the
> format string indicates only minimum number of characters. This is
> both incorrect and runs the risk of writing outside of allocated
> memory (as Matt reported).
>
> To fix, let's cast the value to unsigned char before passing it to
> sprintf(). Also drop the poorly supported 'hh' formatting that now
> becomes unnecessary.
>
> diff -r 7cf9a9e0cf89 -r ffee672e53d8 mercurial/manifest.c
> --- a/mercurial/manifest.c	Wed Mar 11 15:22:34 2015 -0700
> +++ b/mercurial/manifest.c	Thu Mar 12 09:06:45 2015 -0700
> @@ -466,7 +466,10 @@
>  	}
>  	memcpy(dest, path, plen + 1);
>  	for (i = 0; i < 20; i++) {
> -		sprintf(dest + plen + 1 + (i * 2), "%02hhx", hash[i]);
> +		/* Cast to unsigned, so it will not get sign-extended when promoted
> +		 * to int (as is done when passing to a variadic function)
> +		 */
> +		sprintf(dest + plen + 1 + (i * 2), "%02x", (unsigned char)hash[i]);
>  	}
>  	memcpy(dest + plen + 41, flags, flen);
>  	dest[plen + 41 + flen] = '\n';

To wrap this up, this fixes it on my 64bit system as well.  I added a  
couple more traces prior to applying this, and the "%02hhx" was converting  
the last char into "fffe", explaining the overrun.  The full hash was  
printed as "a", so verify wouldn't have been happy.

FWIW, it looks like MSVC can build with char either signed or unsigned  
based on limits.h, as mentioned in one of the links you referenced.  But  
this certainly seems safer.

Thanks for picking this up and figuring it out.

--Matt


More information about the Mercurial-devel mailing list