Encoding inconsistencies

Azra Aiyl azraiyl at gmail.com
Sat Jun 14 11:33:32 CDT 2008


Hello,

I installed the latest Mercurial under Windows. Afterwards I created a new repo
and added files. I created my first commit with the -m option; the message had
umlauts in it. I created my second commit with an editor that defaults to UTF-8
(I have verified this with a hex editor).

1. hg log looks ugly on cmd.exe without an --encoding parameter, although cmd.exe
is able to print the umlauts I normally use.

In my opinion, either I should be able to specify the encoding in Mercurial.ini
(defaults may be a workaround), or hg should try to detect it. In Python this is
easy and should work in most cases.

import sys
s1 = u"\xe4\xf6\xfc" # auml, ouml, uuml
print s1 # Python 2 encodes the unicode string implicitly here
s2 = s1.encode(sys.stdout.encoding) # explicit encoding for the console
print s2
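
A slightly more defensive variant of the same idea, as a sketch only: as far as
I know sys.stdout.encoding is not always set (for example when output is
redirected), so a fallback is needed. The function names are hypothetical and
this is not how hg does it; it is written in Python 3 syntax.

```python
import locale
import sys


def guess_output_encoding(stream=None):
    """Return a best-guess encoding name for the given output stream."""
    stream = sys.stdout if stream is None else stream
    # The attribute can be missing or None, e.g. for some redirected streams.
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        # Fall back to the locale's preferred encoding.
        encoding = locale.getpreferredencoding(False)
    return encoding or "ascii"


def encode_for_output(text, encoding=None):
    """Encode text for output, replacing characters that cannot be represented."""
    return text.encode(encoding or guess_output_encoding(), "replace")
```

Using "replace" means an unsupported character becomes "?" instead of raising
an error, which seems the friendlier choice for log output.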

2. When my favorite editor writes its files as UTF-8, hg treats them as
cp1251, I think (perhaps because the bytes decode without any error), and
therefore my commits show up wrong in hg serve. With --encoding utf-8 (or
defaults) I can get around this.
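
To illustrate what I suspect is happening, here is a small sketch in Python 3
syntax. It only demonstrates the general effect; the helper name is made up
and I am assuming, not claiming, that hg behaves this way. UTF-8 bytes decode
under a single-byte codepage without any error, just to the wrong characters,
while legacy single-byte umlauts are usually not valid UTF-8.

```python
def decodes_cleanly(raw, encoding):
    """Return True if the bytes decode under the given encoding without error."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


text = u"\xe4\xf6\xfc"                 # auml, ouml, uuml
utf8_bytes = text.encode("utf-8")      # what a UTF-8 editor writes
legacy_bytes = text.encode("latin-1")  # what a single-byte editor writes

# No error is raised, but the result is six wrong characters instead of three.
misread = utf8_bytes.decode("cp1251")

utf8_as_cp1251_ok = decodes_cleanly(utf8_bytes, "cp1251")
legacy_as_utf8_ok = decodes_cleanly(legacy_bytes, "utf-8")
```

Here utf8_as_cp1251_ok is True and legacy_as_utf8_ok is False, which is why
trying UTF-8 first and falling back to the local codepage looks like a
reasonable heuristic to me.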

In my opinion, UTF-8 should be the default, or I should be able to declare
a default encoding for each editor. Maybe one could introduce an editor section,
as with merge-tools.

3. When I use hg serve without specifying an encoding, cp1251 is used. When I
use --encoding utf-8, the HTML output (by the way, it isn't valid Transitional
according to the W3C validator) is encoded in UTF-8; so far so good. But when I
specify utf-8 in the web section of Mercurial.ini, the page is declared as
UTF-8 but the output from hg is not encoded correctly. Unlike points 1 and 2,
this is not intended, I think.

Thanks in advance for any clarification and/or hints.

P.S. Is there any way to search the archives online without opening each
thread?

