Finding latent encoding bugs

Adrian Buehlmann adrian at cadifra.com
Wed Oct 29 03:46:36 CDT 2008


On 29.10.2008 01:23, Matt Mackall wrote:
> Python likes to pretend that Unicode objects are just like strings, an
> idea that seems nice in theory, but generally results in code working
> for the developer but not in the field. Because Unicode strings can
> 'infect' normal strings, the bug can crop up far from where the Unicode
> string was introduced.
> 
> So we try to follow three guidelines:
> 
> (a) never pass Unicode objects inside hg, only utf-8 or local strings
> (b) explicitly transcode strings (with util.tolocal or fromlocal)
> (c) minimize transcoding by doing everything in the local encoding where
> possible, centralizing transcoding to the (very few) places that need it
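
A minimal sketch of guidelines (a) and (b), written for Python 3 where byte
strings are a distinct type. The two helpers are simplified, hypothetical
stand-ins for util.tolocal and util.fromlocal, not Mercurial's actual
implementation, and the local encoding is assumed to be latin-1 purely for
illustration:

```python
# Hypothetical stand-ins for util.tolocal / util.fromlocal (not Mercurial's code).
LOCAL_ENCODING = "latin-1"  # assumed local encoding for this example


def tolocal(s):
    """Transcode a UTF-8 byte string to the local encoding."""
    # "replace" substitutes "?" for characters the local encoding lacks.
    return s.decode("utf-8").encode(LOCAL_ENCODING, "replace")


def fromlocal(s):
    """Transcode a local-encoding byte string to UTF-8."""
    return s.decode(LOCAL_ENCODING).encode("utf-8")


stored = "B\u00fchlmann".encode("utf-8")  # e.g. a name stored as UTF-8
shown = tolocal(stored)                   # byte string in the local encoding
roundtrip = fromlocal(shown)              # back to UTF-8 for storage
```

The decoded Unicode value exists only inside the two helpers, so callers only
ever see byte strings and nothing can 'infect' the surrounding code.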

I've started a new wiki page for these guidelines:
http://www.selenic.com/mercurial/wiki/index.cgi/DevelopmentGuidelines

It is linked from DeveloperInfo.

