[PATCH] fixutf8 module

Stefan Rusek stefan at rusek.org
Mon Feb 2 08:34:52 CST 2009


On Mon, Feb 2, 2009 at 2:48 PM, Shun-ichi GOTO <shunichi.goto at gmail.com> wrote:
> Excellent!
> This extension will completely obsolete the win32mbcs extension.
> All my local tests pass with your extension.
>

Thanks for the positive feedback. Please report any issues you have
with it so we can get them fixed.

> BTW, for the internal-utf-8 way, there is an odd issue of text conversion
> in some places (e.g. changelog, localrepo) that depend on the local
> encoding inside the API.
> When I tried the utf-8 approach before, this issue bothered me...

One thing the extension does to deal with that problem is to wrap
util.fromlocal() so that it allows strings to be double fromlocal()ed
safely and so that tolocal() does nothing. That way ui.write() always
gets utf-8 strings passed to it. It is not ideal, but it works.
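
The wrapping idea can be sketched roughly as follows. This is a
standalone illustration, not the extension's actual code: the function
bodies, the LOCAL_ENCODING constant, and the choice of cp1252 are all
assumptions made for the example. It treats any input that is already
valid utf-8 as converted, which is what makes a second fromlocal() call
harmless.

```python
# Hypothetical stand-ins for the wrapped util.fromlocal()/util.tolocal();
# the names mirror Mercurial's, but the bodies are illustrative only.
LOCAL_ENCODING = "cp1252"  # assumed local encoding for this example


def fromlocal(s: bytes) -> bytes:
    """Convert local-encoded bytes to utf-8; safe to apply twice."""
    try:
        # Already valid utf-8: assume it was converted earlier.
        s.decode("utf-8")
        return s
    except UnicodeDecodeError:
        # Not valid utf-8: treat it as local-encoded and convert.
        return s.decode(LOCAL_ENCODING).encode("utf-8")


def tolocal(s: bytes) -> bytes:
    """No-op: strings stay utf-8 all the way to the output layer."""
    return s
```

The trade-off is that a local-encoded string that also happens to be
valid utf-8 is passed through unconverted, which is part of why the
approach is not ideal.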

It would not be a difficult change to push the calls to util.tolocal()
into the ui class, pull all the calls to util.fromlocal() into
dispatch._parse(), and make i18n.gettext() return utf-8. This change
would eliminate about 1/3 of what the fixutf8 extension needs to do,
and make Mercurial use utf-8 everywhere internally. (In a few
places there are comments to the effect that the code is written with
the expectation that it would be passed utf-8 strings anyway.) This
change would make it so the majority of hg would only have to deal
with one string encoding instead of the current reality of multiple
string encodings across multiple platforms.
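
As a rough sketch of that layering (not Mercurial's actual code:
parse_args(), the Ui class, run(), and the cp1252 local encoding are
hypothetical stand-ins for dispatch._parse(), the ui class, and the
platform encoding), conversion would happen once on the way in and once
on the way out, with everything in between handling only utf-8:

```python
import io

LOCAL_ENCODING = "cp1252"  # assumed local encoding for this example


def parse_args(raw_args):
    """Input boundary: convert local-encoded arguments to utf-8 once."""
    return [a.decode(LOCAL_ENCODING).encode("utf-8") for a in raw_args]


class Ui:
    """Output boundary: the only place that converts back to local."""

    def __init__(self, out):
        self.out = out

    def write(self, *msgs):
        for m in msgs:
            # Characters the local encoding cannot represent become '?'.
            text = m.decode("utf-8")
            self.out.write(text.encode(LOCAL_ENCODING, "replace"))


def run(raw_args, ui):
    """Core logic: sees and produces utf-8 byte strings only."""
    for arg in parse_args(raw_args):
        ui.write(b"arg: ", arg, b"\n")
```

For example, calling run([b"caf\xe9"], Ui(io.BytesIO())) converts the
argument to utf-8 on entry and writes it back out in the local encoding.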

--Stefan

