[PATCH 2 of 2] dirstate: normalize on case insensitive filesystems on Mac (issue1663)

Matt Mackall mpm at selenic.com
Thu Jul 23 10:14:28 CDT 2009


On Thu, 2009-07-23 at 12:08 +0200, Simon Heimberg wrote:
> On Wednesday, 22.07.2009 at 14:04 -0500, Matt Mackall wrote:
> > On Wed, 2009-07-22 at 15:22 +0200, Simon Heimberg wrote:
> > > # HG changeset patch
> > > # User Simon Heimberg <simohe at besonet.ch>
> > > # Date 1248264291 -7200
> > > # Node ID f812e62a12b68c035b1aef3b3732f8486c376373
> > > # Parent  ca876099803a9e71497d9deaee3c9fb7ff47ee81
> > > dirstate: normalize on case insensitive filesystems on Mac (issue1663)
> > > 
> > > os.path.normcase does not change the path on Mac OS X (it uses the posixpath module)
> > > 
> > > diff -r ca876099803a -r f812e62a12b6 mercurial/dirstate.py
> > > --- a/mercurial/dirstate.py	Mit Jul 22 12:52:02 2009 +0200
> > > +++ b/mercurial/dirstate.py	Mit Jul 22 14:04:51 2009 +0200
> > > @@ -351,8 +351,14 @@
> > >          except KeyError:
> > >              self._ui.warn(_("not in dirstate: %s\n") % f)
> > >  
> > > +    _usenormcase = os.path.normcase("A") == "a"
> > > +
> > >      def _normalize(self, path, knownpath):
> > > -        norm_path = os.path.normcase(path)
> > > +        if self._usenormcase:
> > > +            norm_path = os.path.normcase(path)
> > > +        else:
> > > +            #case insensitive filesystem on Mac OS X
> > > +            norm_path = path.lower()
> > 
> > You're going to have to be more clever than lower(), I'm afraid.
> > Consider a file named 'Ä' and the possibility that your local character
> > set might be set to MacRoman. There's also the whole issue of Unicode
> > normalization.
> > 
> This is as clever as os.path.normcase on Windows and classic Mac OS (not
> OS X); see ntpath.py and macpath.py in Python. We could use
> macpath.normcase, so that we pick up improvements whenever Python does.
> 
> > I think we need to have a more general facility for dealing with all
> > forms of folding (ie any non-direct filename matching/mangling) that
> > allows us to deal with all the stupid Windowsisms and Macisms.
> > Case-folding is just the most commonplace form of it.
> > 
> Maybe this could also happen on Linux or Unix when a special file system
> is mounted or certain mount options are used.

Perhaps.

> More general is good, but I have no idea how to do it. Any hints?
> What are other cases of folding?

Windows:

foo. -> foo
(there might be some magic with trailing spaces too)

There's also some way-underdocumented folding of Unicode to ASCII when
using the legacy APIs.
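
A rough sketch of the trailing-dot rule, for illustration only. The helper
name fold_windows_name is made up here, and treating trailing spaces the
same way as trailing dots is an assumption based on the "might be some
magic" remark above, not something verified against every Windows API:

```python
def fold_windows_name(name):
    """Hypothetical helper: approximate Windows folding of one path component.

    Assumptions: trailing dots and spaces are dropped, and names are
    compared case-insensitively (ASCII lower() is used as a stand-in).
    """
    stripped = name.rstrip(". ")
    # Leave names made up only of dots/spaces (e.g. "." and "..") untouched
    if not stripped:
        return name
    return stripped.lower()


# "foo." and "FOO" would refer to the same file under these assumptions
assert fold_windows_name("foo.") == fold_windows_name("FOO")
```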

Mac:

Unicode names are 'normalized' (aka mangled) into something
approximating Unicode Normalization Form D (NFD). That means when you
write out a file named 'Ä' ('\xc3\x84'), you get back a file named 'Ä'
('A\xcc\x88').
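
The effect can be reproduced with the standard unicodedata module. This is
only a sketch, assuming Python 3 and plain NFD; the filesystem's own
normalization merely approximates NFD, and fold_mac_name is a made-up
helper name, not anything in Mercurial:

```python
import unicodedata

composed = "\u00c4"  # 'Ä' as a single code point
decomposed = unicodedata.normalize("NFD", composed)  # 'A' + combining diaeresis

# The UTF-8 byte strings match the two names quoted above
assert composed.encode("utf-8") == b"\xc3\x84"
assert decomposed.encode("utf-8") == b"A\xcc\x88"

# bytes.lower() only touches ASCII letters, so the composed form is unchanged
assert composed.encode("utf-8").lower() == b"\xc3\x84"


def fold_mac_name(name):
    """Hypothetical helper: decompose first, then lowercase, before comparing."""
    return unicodedata.normalize("NFD", name).lower()


# After folding, the name written and the name read back compare equal
assert fold_mac_name(composed) == fold_mac_name(decomposed)
```

Folding on decoded text rather than on raw bytes is a design choice of this
sketch; it sidesteps the local-character-set problem but requires knowing
the encoding of the stored names.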


-- 
http://selenic.com : development and support for Mercurial and Linux



