Unicode support in log messages and file names

Andrey grooz-work at gorodok.net
Sat Nov 11 14:46:44 CST 2006


> I've tried exactly this one year ago when Mercurial was much smaller
> and after talking to other people we (including Matt) decided that
> the desired way is to immediately convert from local encoding to
> UTF-8, like Vicent Seguí Pascual originally proposed.
>
> Unfortunately I wanted to do it exactly like you by that time, the
> result is that we have no unified log encoding yet.
>
> You can see his patches in the list archive from July 2005.
>
> Thomas

Well, I still believe that using unicode strings internally is the right way. 
In fact Python-3000 is going to use them by default instead of bytestrings.

As for changelog encoding, it should be possible to add a config option for it
(defaulting to UTF-8). Another option should probably be added for source
file encoding (for properly displaying diffs and other stuff). I think it
could be implemented without too much effort.
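
To make the idea concrete, here is a minimal sketch of "unicode inside, bytes
only at the edges", written in present-day Python where str is already a
unicode string. All the names here (to_internal, to_changelog,
from_changelog, decode_for_diff, and the two default constants) are made up
for illustration and are not existing Mercurial functions or config options.

```python
import locale

# Hypothetical defaults; a real implementation would read these from config.
DEFAULT_CHANGELOG_ENCODING = "utf-8"
DEFAULT_SOURCE_ENCODING = "utf-8"


def to_internal(raw, local_encoding=None):
    """Decode bytes from the terminal or editor into a unicode string."""
    if local_encoding is None:
        # Fall back to the user's locale when no encoding is given.
        local_encoding = locale.getpreferredencoding(False)
    return raw.decode(local_encoding)


def to_changelog(text, changelog_encoding=DEFAULT_CHANGELOG_ENCODING):
    """Encode an internal unicode string for storage in the changelog."""
    return text.encode(changelog_encoding)


def from_changelog(stored, changelog_encoding=DEFAULT_CHANGELOG_ENCODING):
    """Decode stored changelog bytes back into a unicode string."""
    return stored.decode(changelog_encoding)


def decode_for_diff(data, source_encoding=DEFAULT_SOURCE_ENCODING):
    """Decode file content for display; undecodable bytes become U+FFFD."""
    return data.decode(source_encoding, errors="replace")


# A log message typed in a Latin-1 terminal ends up as UTF-8 in the changelog.
message = to_internal("Seguí".encode("latin-1"), "latin-1")
stored = to_changelog(message)
```

Only the boundary functions know about encodings; everything between them
works on unicode strings, so changing the changelog encoding means changing
one default.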


