Managing multiple encodings in one repository
davidrushby at gmail.com
Fri Apr 6 00:03:15 CDT 2007
On 4/5/07, Matt Mackall <mpm at selenic.com> wrote:
> On Thu, Apr 05, 2007 at 08:11:06PM +0400, David Rushby wrote:
> > On 4/5/07, Matt Mackall <mpm at selenic.com> wrote:
> > >> If I save Mercurial.ini as (for example) UTF-8, then specify
> > >> "--encoding=utf8" or environment variable HGENCODING=utf8, the
> > >> username emerges garbled.
> > >
> > >What precisely is happening? Is Mercurial properly reading your .ini
> > >as UTF-8 and then displaying it as UTF-8, which your console tries to
> > >interpret as Windows-1251? This will manifest as all the non-ASCII
> > >characters being represented as multiple characters.
> > No, that's not what's happening. Mercurial is trying to pretend that the
> > contents of Mercurial.ini are stored in the system default encoding,
> > even when I specify another encoding.
> > Here's a simple way to reproduce the problem (on Windows, at least):
> > ...
> UTF-16 is a whole 'nother story. If this had been UTF-8 without the
> stupid BOM marker (\xff\xfe), this would have worked just fine.
You're right. Excuse me if my tone about this issue was presumptuous.
When I saved Mercurial.ini as UTF-8, set HGENCODING=utf8, and worked
with text files whose contents were purely UTF-8, everything was
> > >> 2) Be able to see encoding-normalized output from commands that
> > >> might operate on files with different encodings.
> See the encode and decode filters:
Thanks for the advice. I'll delve into these, because the
_fallbackencoding hack didn't help. Unfortunately, it really is
necessary for me to work with source files in multiple encodings.
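
To make the BOM and mixed-encoding points above concrete, here is a rough
Python sketch of normalizing file contents to BOM-less UTF-8. It is only an
illustration and not Mercurial code: to_utf8 and its fallback parameter are
hypothetical names, cp1251 is assumed as the legacy encoding, and UTF-32 is
ignored entirely.

```python
import codecs

# Byte-order marks checked for; the bytes after each one are decoded with
# the paired codec. UTF-32 is deliberately not handled in this sketch.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def to_utf8(raw, fallback="cp1251"):
    """Return the bytes in `raw` re-encoded as UTF-8 without a BOM.

    `fallback` is an assumed legacy encoding, used only when the data
    carries no BOM and is not valid UTF-8.
    """
    for bom, codec in _BOMS:
        if raw.startswith(bom):
            # Drop the BOM, decode the remainder, re-encode as plain UTF-8.
            return raw[len(bom):].decode(codec).encode("utf-8")
    try:
        raw.decode("utf-8")  # already valid UTF-8 (or pure ASCII)
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")
```

A function like this could be wrapped in a small script; whether it is
suitable as the command behind an encode or decode filter is untested here.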