Initial support of Unicode filenames
victor.stinner at haypocalc.com
Fri Oct 28 18:42:52 CDT 2011
On Saturday 29 October 2011 00:58:46, Matt Mackall wrote:
> On Sat, 2011-10-29 at 00:28 +0200, Victor Stinner wrote:
> > Hi,
> > On Windows, filenames are stored as Unicode. There is a bytes API
> > providing backward compatibility, but it should not be used, because
> > you may get an invalid filename (with question marks, ?) if a filename is
> > not encodable to the ANSI code page.
> > The attached patch uses Unicode filenames to avoid encoding issues on
> > Windows. The patch to ui.py uses backslashreplace to escape unencodable
> > characters when writing filenames to the console (so that it does not
> > fail if a character is not encodable to the console code page).
> I'm afraid I've already vetoed about a dozen variants of this suggestion
> over the years. For starters, it is not backward-compatible with
> existing Windows users.
The goal of the patch is not to provide full Unicode support. It's just a
step forward to improve Mercurial. In my case, I just want to fix "hg st" when
the directory contains an unencodable filename. It shouldn't change how
filenames are stored in Mercurial.
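To illustrate the backslashreplace idea from the patch description, here is a
minimal sketch. It is not the code from the patch: the helper name
console_safe and the cp1252 console encoding are assumptions chosen for the
example, and it is written for Python 3.

```python
def console_safe(filename, encoding):
    """Return a version of `filename` that can always be encoded to `encoding`.

    Hypothetical helper for illustration only (not part of Mercurial).
    """
    # Characters the encoding cannot represent are replaced by backslash
    # escapes (\xNN, \uNNNN or \UNNNNNNNN) instead of raising
    # UnicodeEncodeError.
    data = filename.encode(encoding, "backslashreplace")
    # Decoding again yields text that is safe to write to the console.
    return data.decode(encoding)


# "café_日本.txt": the accented letter exists in cp1252, the kanji do not.
name = "caf\u00e9_\u65e5\u672c.txt"
print(console_safe(name, "cp1252"))
```

The escaped name is not a valid path any more, so this is only suitable for
display, which is all "hg st" needs when it prints a filename.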
> Suggested reading:
If I understood correctly, filenames are stored as bytes in Mercurial, in the
locale encoding (so the ANSI code page on Windows). I understand that the
migration from bytes to Unicode is not trivial in this case.
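The problem with names that do not fit in the ANSI code page can be sketched
as follows. This only approximates the Windows bytes API using Python's
"replace" error handler; the function name to_ansi and the cp1252 code page
are assumptions made for the example.

```python
def to_ansi(filename, codepage="cp1252"):
    """Approximate a lossy conversion of a Unicode filename to bytes.

    Hypothetical helper: unencodable characters become "?", similar to
    what the quoted message describes for the Windows bytes API.
    """
    return filename.encode(codepage, "replace")


# Two different Unicode names ...
first = to_ansi("\u65e5\u672c.txt")
second = to_ansi("\u4e2d\u6587.txt")

# ... collapse to the same byte string, which no longer names either file.
print(first, second, first == second)
```

Once a name has been converted this way, the original file cannot be found
again from the bytes, which is why a lossless Unicode path is attractive on
Windows.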
I will have to read a bit more to understand the situation correctly.