[PATCH 3 of 5] Add util.splitpath() and use it instead of using split() directly

Matt Mackall mpm at selenic.com
Tue Jan 8 19:07:47 CST 2008


On Wed, 2008-01-09 at 09:15 +0900, Shun-ichi GOTO wrote:
> 2008/1/9, Matt Mackall <mpm at selenic.com>:
> > > > On Tue, 2008-01-08 at 16:24 +0900, Shun-ichi GOTO wrote:
> > > > > 2008/1/8, Matt Mackall <mpm at selenic.com>:
> > > > > > Here's a big question that we have to answer with regard to MBCS: what
> > > > > > happens if you check in a path with a 0x5c on a shift-jis machine and I
> > > > > > check it out in ascii-land? I suspect the answer is: I get an extra
> > > > > > directory level and much confusion.
> > > > >
> > > > > Is this question about the behaviour of patched Mercurial?
> > > > > Or about current hg?
> > > > >
> > > > > If the former, people in ascii-windows-land may get confused
> > > > > by an error.
> > > > > People in ascii-unix-land are fine: no extra directory.
> > > > > This is the limitation of my patch that I first described in
> > > > > [PATCH 0 of 5], "effects and save local only".
> > > >
> > > > How does it work? What byte sequence gets stored in the manifest?
> > >
> > > ??
> > > The shift_jis-encoded bytes are stored in the manifest.
> >
> > Ok, so when an ascii-land user checks this out, they'll see extra
> > slashes, right? They'll have no idea it's in shift-jis.
> 
> No.
> Shift_JIS (and Big5) never have a slash ('/', 0x2f) as the second byte.
> The sjis 1st byte is in the range 81-9F, E0-FC;
> the 2nd byte is in the range 40-7E, 80-FC.

Ahh, ok. Then we might be able to (and probably should) skip the UTF-8
step, which makes things much easier.
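
To make the byte-range argument concrete, here is a minimal Python sketch.
It assumes the lead/trail ranges quoted above and uses Python's built-in
"shift_jis" codec; the names SJIS_TRAIL and trail_can_be are hypothetical
helpers for illustration only, not part of Mercurial or of the proposed
util.splitpath().

```python
# Trail-byte ranges for Shift_JIS, as quoted in this thread (assumption:
# 40-7E and 80-FC, both inclusive).
SJIS_TRAIL = set(range(0x40, 0x7F)) | set(range(0x80, 0xFD))


def trail_can_be(byte):
    """Return True if `byte` may appear as a Shift_JIS second byte."""
    return byte in SJIS_TRAIL


# U+8868 encodes to 0x95 0x5C, so its second byte is an ASCII backslash.
name = "\u8868.txt".encode("shift_jis")
path = b"dir/" + name

# Splitting on '/' keeps the file name whole, because 0x2F is never a
# trail byte.
slash_parts = path.split(b"/")

# Splitting on '\' (as a naive Windows-style split would) cuts the
# two-byte character in half and invents an extra path component.
backslash_parts = path.split(b"\\")

print(trail_can_be(0x2F), trail_can_be(0x5C))
print(slash_parts)
print(backslash_parts)
```

Under these assumptions, a manifest that stores raw shift_jis bytes with
'/' as the separator stays unambiguous on checkout; only code that splits
on 0x5c is at risk.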

-- 
Mathematics is the supreme nostalgia of our time.


