Add a Unicode mode, but keep the bytes mode

Victor Stinner victor.stinner at haypocalc.com
Fri Nov 4 18:48:29 CDT 2011


On Friday, November 4, 2011 18:20:28, Andrey wrote:
> Great work.
> 
> On Friday, November 4, 2011 1:47:14 PM UTC+1, Victor Stinner wrote:
> > The default kind will be bytes until enough third-party tools are
> > compatible
> > with Unicode (e.g. make).
> 
> I think the default should be Unicode. Users begin to use non-ASCII file
> names only when they get support from all the tools they use.
> First, you create a file, run the tools, check the result and only then
> commit and push the changeset.
> 
> When new repositories are created with the proper style (UTF-8 encoded file
> names), it helps to solve the backwards compatibility. Otherwise we will be
> forever stuck with the legacy layout.

The problem is that you would need the latest Mercurial version to check out
(and work on) such a repository.

Well, we could allow checking out such a repository, but old Mercurial versions
store filenames in the locale encoding (not in UTF-8). So if a new file is added
and pushed with an old Mercurial version to a "Unicode compliant" (new)
Mercurial server, it doesn't work: I don't think the server can ask the client
for its locale encoding, and the hash will be different if the filename is
stored differently.
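
To make the encoding problem concrete, here is a minimal Python sketch. It is
not Mercurial's real storage or hashing code: the file name, the Latin-1 locale
and the toy_hash helper are all assumptions chosen for illustration. It only
shows that one name gives different bytes, and so a different hash, depending
on the encoding used to store it.

```python
import hashlib

# Hypothetical file name containing one non-ASCII character.
name = "caf\u00e9.txt"

# What an old client on a Latin-1 locale would store (assumed locale).
legacy_bytes = name.encode("latin-1")
# What a Unicode-mode repository would store.
utf8_bytes = name.encode("utf-8")


def toy_hash(filename_bytes, content=b"hello\n"):
    """Stand-in for any hash covering the stored file name.

    This is NOT Mercurial's actual manifest or revlog format.
    """
    return hashlib.sha1(filename_bytes + b"\0" + content).hexdigest()


print(legacy_bytes)
print(utf8_bytes)
print(toy_hash(legacy_bytes) == toy_hash(utf8_bytes))
```

The two byte strings differ in length as well as content, so the server cannot
treat them as the same entry without knowing which encoding the client used.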

That's why I see this new Unicode mode as a requirement (.hg/requires).
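
As a rough sketch of how such a requirement would protect old clients, the
check below reads .hg/requires and refuses to continue when it lists something
the client does not know. This is an illustration and not Mercurial's own code:
check_requires is a hypothetical helper, and any requirement name for the
Unicode mode (for example "unicode-filenames") is invented here.

```python
import os

# Requirements this hypothetical client understands. A Unicode-mode entry
# such as "unicode-filenames" (an invented name) is deliberately absent.
SUPPORTED = {"revlogv1", "store", "fncache", "dotencode"}


def check_requires(repo_root, supported=SUPPORTED):
    """Return the repository's requirements, or raise if any is unknown."""
    path = os.path.join(repo_root, ".hg", "requires")
    with open(path, encoding="ascii") as f:
        # One requirement per line; ignore blank lines.
        required = {line.strip() for line in f if line.strip()}
    unknown = required - supported
    if unknown:
        raise RuntimeError(
            "repository requires unknown features: "
            + ", ".join(sorted(unknown))
        )
    return required
```

With a check like this, an old client stops with an error instead of writing
locale-encoded names into a repository that expects UTF-8.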

Victor

