[PATCH 2 of 2] dirstate: normalize on case insensitive filesystems on Mac (issue1663)

Matt Mackall mpm at selenic.com
Fri Jul 24 15:53:18 CDT 2009

On Fri, 2009-07-24 at 22:39 +0200, Dan Villiom Podlaski Christiansen wrote:
> On 24/07/2009, at 21.44, Matt Mackall wrote:
> > Posix it ain't. It might be time for a mac.py.
> Indeed; might I suggest the name macosx.py instead? :)
> > Have you considered unicodedata.normalize("NFD", f).lower()?

> I haven't actually, as I wasn't aware of that module :)
> (One question, though: Would the ‘X.lower()’ trick mean that only  
> lowercase names are shown to the user? If so, I really don't like it  
> from a user interface perspective…)

No, it's simply proposed as a way of comparing filenames.

if fold(filename internally) == fold(filename on disk):
  files are the same
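
A minimal sketch of that comparison in Python, assuming the names are
already decoded to Unicode strings. The names fold and same_file are
hypothetical, not existing Mercurial functions, and NFD plus lower() only
approximates what a case-insensitive filesystem actually does:

```python
import unicodedata

def fold(name):
    """Hypothetical helper: decompose to NFD, then lowercase.

    The result is used only as a comparison key; the name shown to the
    user is left untouched.
    """
    return unicodedata.normalize("NFD", name).lower()

def same_file(internal_name, disk_name):
    """Treat two names as the same file if their folded forms match."""
    return fold(internal_name) == fold(disk_name)
```

Since only the folded keys are compared, the stored and displayed names
keep their original case.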

> Unfortunately, the issue is slightly more complex than that; the  
> normalisation required for HFS+ doesn't correspond to any standard  
> Unicode normalisation. It might be better to simply implement the  
> normalisation ourselves, based on the HFS volume format specification. 
> [1] One thing though; not all volumes on Mac OS X are case  
> independent, but I suspect the Unicode normalisation is universal.  
> (I'd have to dig much deeper into documentation, references & source  
> to be certain.)

I believe you can mount BSD FFS volumes as well, which are not

> > (there are other hairy issues here, like filenames in Latin1)
> That issue should be ‘solved’ rather simply on Mac OS X, I believe: by  
> definition, such file names cannot exist, ever. I remember mounting an  
> NTFS volume once that used some non-UTF-8 encoding for its file names;  
> whether GUI or CLI, the system *really* doesn't like such file names.

Yes, but Mercurial must handle more or less arbitrary null-terminated
byte strings on other systems, so we should give this corner case some

http://selenic.com : development and support for Mercurial and Linux
