[PATCH 1 of 8 RFC] vfs: replace invocation of file APIs of os module by ones via vfs

Matt Mackall mpm at selenic.com
Sun Jun 17 01:25:52 CDT 2012


On Sat, 2012-06-16 at 11:07 +0200, Adrian Buehlmann wrote:
> On 2012-06-15 20:00, FUJIWARA Katsunori wrote:
> > 
> > At Fri, 15 Jun 2012 10:31:45 -0500,
> > Matt Mackall wrote:
> >>
> >> On Fri, 2012-06-15 at 23:45 +0900, FUJIWARA Katsunori wrote:
> >>> # HG changeset patch
> >>> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> >>> # Date 1339768793 -32400
> >>> # Node ID a14b63be9a04e7fac445fea69bbaf840ca3f4063
> >>> # Parent  622aa57a90b1d1f09b3204458b087de12ce2de82
> >>> vfs: replace invocation of file APIs of os module by ones via vfs
> >>
> >> You seem to have missed the importance of step 1:
> >>
> >> "Rename opener to vfs"
> >> http://mercurial.selenic.com/wiki/WindowsUTF8Plan#Steps
> >>
> >> The whole point of this exercise is to have one (or just a few) central
> >> objects we route all our file operations through that are attached to
> >> repository objects so that repository objects can easily switch their
> >> modes as needed. Conveniently, we have something very much like that
> >> already: it's called an opener. It's even beginning to grow some of
> >> these sorts of methods.
> >>
> >> In particular, we'll want to take one of these objects, wopener, and
> >> switch it to UTF-8 mode, while leaving the other two in native mode.
> >> Which means we need to be calling methods at a high enough level that we
> >> know which part of the repository we're operating on...
> > 
> > Sorry, I misunderstood, because I was also thinking about "create a
> > filesystem abstraction object in util.py" from "Abstracting filesystem
> > API for UTF-8 support on Windows", which you posted to devel-ml.
> > 
> > # http://www.selenic.com/pipermail/mercurial-devel/2011-December/036385.html
> 
> Have you guys already put some thought into how to deal with repo root
> paths that contain "wide" characters?

No, and in fact I'd rather not think about that now. This topic is
already muddy/complex enough, so I've tried to carve off a solvable
piece to tackle.
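
For reference, the piece in question (routing file operations through a
few opener-like objects attached to the repository, only one of which
gets switched to UTF-8 mode) might look roughly like this. It is a minimal
sketch in present-day Python 3; the class name, the encoding parameter
and the methods are made up for illustration and are not Mercurial's
actual opener API.

```python
import os


class SketchVfs:
    """Hypothetical opener/vfs stand-in: all file access goes through it."""

    def __init__(self, base, encoding=None):
        self.base = base
        # encoding=None means "native mode": names go to the OS unchanged.
        self.encoding = encoding

    def join(self, name):
        # In "UTF-8 mode" names arrive as UTF-8 bytes and are decoded to
        # text before being handed to the OS.
        if self.encoding is not None and isinstance(name, bytes):
            name = name.decode(self.encoding)
        return os.path.join(self.base, name)

    def __call__(self, name, mode="r"):
        # Opening a file is just another operation routed through join().
        return open(self.join(name), mode)

    def exists(self, name):
        return os.path.exists(self.join(name))

    def unlink(self, name):
        os.unlink(self.join(name))
```

Callers would then use something like wvfs.exists(name) rather than
os.path.exists(os.path.join(root, name)), so the object that knows which
part of the repository it serves also decides how names are encoded.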

Interestingly, of all the complaints we've had about Unicode on Windows,
I'm pretty sure no one has ever reported being unable to run hg commands
in a directory whose name can't be encoded in their system's ANSI code
page. If I had to guess, that's because doing a mkdir/chdir into such a
directory in the shell is non-trivial. Also, it just doesn't come up
nearly as often as the "my umlaut doesn't look like an umlaut on UTF-8
systems" issue.
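
To make "can't be encoded in their system's ANSI code page" concrete,
here is a small sketch in present-day Python 3. The helper name and the
sample directory names are invented for illustration; cp1252 and cp932
stand in for a Western European and a Japanese ANSI code page.

```python
def encodable_in(name, codepage):
    """Return True if every character of name exists in codepage."""
    try:
        name.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


# An umlaut fits in the Western European code page...
umlaut_ok = encodable_in("M\u00fcller", "cp1252")
# ...but a Japanese directory name does not,
kanji_in_western = encodable_in("\u65e5\u672c\u8a9e", "cp1252")
# although it is fine on a system whose ANSI code page is cp932.
kanji_in_japanese = encodable_in("\u65e5\u672c\u8a9e", "cp932")
```

Only the middle case is the kind of directory being talked about here: a
name the system's own code page has no bytes for.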

> Other paths also seem interesting, like config files or paths to
> merge tools. Interesting paths can also originate from registry keys
> (see 133a7922a900).

Similarly, that path has also worked for years under the assumption that
paths are ANSI-decodable. The bug report arose when someone had
something that wasn't _ASCII_-decodable, which no one noticed for a
year.

http://bz.selenic.com/show_bug.cgi?id=3467
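
The difference between ANSI-decodable and ASCII-decodable can be shown
in a few lines of present-day Python 3. The byte string below is a
made-up path, not the one from the bug report, and cp1252 is only an
assumed ANSI code page.

```python
def decodable_as(raw, encoding):
    """Return True if the byte string raw is valid in encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical tool path as bytes, containing one cp1252 "u umlaut" (0xFC).
raw_path = b"C:\\Tools\\K\xfcnstler\\merge.exe"

ansi_ok = decodable_as(raw_path, "cp1252")   # decodes under the code page
ascii_ok = decodable_as(raw_path, "ascii")   # 0xFC is outside 7-bit ASCII
```

Code that quietly assumes ASCII only breaks once a byte above 0x7F shows
up, which is how that sort of assumption can go unnoticed for so long.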

-- 
Mathematics is the supreme nostalgia of our time.