[PATCH 0 of 6] Unicode support in commit messages

Matt Mackall mpm at selenic.com
Sun Nov 12 18:29:10 CST 2006


On Mon, Nov 13, 2006 at 03:03:08AM +0600, Andrey wrote:

> Oh, I forgot to mention that I had to patch my sources a little to be
> able to run those benchmarks. That's because the kernel tree has log
> messages in both Latin-1 and UTF-8. Right now my patched hg uses a
> hardcoded UTF-8 encoding for log messages and chokes on Latin-1
> messages, so I had to change it to use Latin-1. Perhaps an option
> should be added to hgrc for that.

Throwing exceptions for existing repos is not acceptable. Adding an
option to make things start working again is not really a good idea
either.

Once you've moved the conversion routines into util, you can do
something like:

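   try:
      t = t.decode("utf-8")
   except UnicodeDecodeError:
      # not valid UTF-8, so try the user's local encoding next
      try:
         t = t.decode(localencoding())
      except UnicodeDecodeError:
         # last resort: never raise, substitute undecodable bytes
         t = t.decode("utf-8", "replace")
   return t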
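
For anyone who wants to try the fallback order on its own, here is a
minimal self-contained sketch in Python 3, where the input is bytes. The
name decode_message and its local_encoding parameter are made up for
illustration, and locale.getpreferredencoding() only stands in for the
localencoding() helper mentioned above; none of this is actual hg code.

```python
import locale


def decode_message(raw, local_encoding=None):
    """Decode bytes: strict UTF-8, then the local encoding, then lossy UTF-8.

    Hypothetical helper for illustration only, not part of Mercurial.
    """
    if local_encoding is None:
        # Assumption: the platform's preferred encoding plays the role
        # of localencoding() from the snippet above.
        local_encoding = locale.getpreferredencoding(False) or "ascii"
    for encoding in ("utf-8", local_encoding):
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this encoding, or an unknown encoding name:
            # move on to the next candidate.
            continue
    # Last resort: never raise, replace undecodable bytes with U+FFFD.
    return raw.decode("utf-8", "replace")
```

Note that a Latin-1 local encoding accepts every byte sequence, so the
last step only matters for stricter local encodings such as ASCII or
UTF-8 itself.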

-- 
Mathematics is the supreme nostalgia of our time.

