Unicode support for non-unicode locales

Matt Mackall mpm at selenic.com
Mon Oct 8 13:07:56 CDT 2007


On Tue, Oct 09, 2007 at 01:59:52AM +0900, Shun-ichi GOTO wrote:
> 2007/10/9, Shun-ichi GOTO <shunichi.goto at gmail.com>:
> > If we treat filenames as raw byte data, some filenames might be broken
> > by path operations. So the Python code should handle filenames as unicode
> > characters by decoding them.
> 
> In fact, current Mercurial cannot manage some filenames.
> For example, the filename "正規表現.txt" is such a case.
> The 4 characters "正規表現" are Japanese for "regular expression",
> and the 2nd byte of the 3rd character is '\' (0x5c).
> So hg ci -Am "test" fails when adding this file.
> 
> {{{
> [c:\temp\test]hg ci -Am initial
> adding 正規・現.txt
> removing 正規・現.txt
> dir1/正規・現.txt not tracked!
> 正規・現.txt not tracked!
> nothing changed
> }}}
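
The byte-level problem described above can be reproduced in a few lines of
Python. This is only a sketch, not Mercurial's code. It assumes the filename
is 正規表現.txt and that the local encoding is Shift_JIS (Python's "cp932"
codec), as is usual on a Japanese Windows system; the variable names are
made up for illustration.

```python
# Assumption: the file name is 正規表現.txt and the locale encoding is
# Shift_JIS (Python codec "cp932"). Neither is taken from Mercurial itself.
name = "正規表現.txt"
raw = name.encode("cp932")

# The third character encodes to two bytes, and the second one is 0x5c,
# which is also the byte for the ASCII backslash.
third = "表".encode("cp932")

# Byte-level path handling: splitting on the separator byte finds a
# "separator" in the middle of the third character.
byte_parts = raw.split(b"\\")

# Character-level handling: decode first, then split on the separator.
char_parts = raw.decode("cp932").split("\\")

print(third)        # the two bytes of the third character
print(byte_parts)   # two fragments, neither of them a valid name
print(char_parts)   # one intact file name
```

Splitting the raw bytes yields two fragments, while decoding before splitting
keeps the name in one piece. The catch is that the decode step only works when
the encoding is actually known.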

Yes, Mercurial will be unhappy with wide character sets in various
situations. It's either that or be unhappy with single byte character
sets much more often.

-- 
Mathematics is the supreme nostalgia of our time.

