[PATCH 2 of 2] dirstate: normalize on case insensitive filesystems on Mac (issue1663)

Dan Villiom Podlaski Christiansen danchr at gmail.com
Fri Jul 24 12:38:58 CDT 2009


On 22/07/2009, at 21.04, Matt Mackall wrote:

> You're going to have to be more clever than lower(), I'm afraid.
> Consider a file named 'Ä' and the possibility that your local character
> set might be set to MacRoman. There's also the whole issue of Unicode
> normalization.
>
> I think we need to have a more general facility for dealing with all
> forms of folding (ie any non-direct filename matching/mangling) that
> allows us to deal with all the stupid Windowsisms and Macisms.
> Case-folding is just the most commonplace form of it.
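
To make the objection to lower() concrete, here is a minimal sketch in
Python 3 (not the Python 2 that Mercurial used at the time). The helper
names naive_fold and charset_fold are hypothetical and are not part of
Mercurial; the sketch only shows that byte-level lowercasing leaves
non-ASCII names untouched, whereas decoding with the local character set
first lets them fold.

```python
import unicodedata


def naive_fold(name: bytes) -> bytes:
    """Byte-level folding: bytes.lower() only touches ASCII letters."""
    return name.lower()


def charset_fold(name: bytes, encoding: str) -> str:
    """Hypothetical folding: decode, normalize to NFC, then lowercase."""
    text = name.decode(encoding)
    return unicodedata.normalize("NFC", text).lower()


# 'Ä' (U+00C4) and 'ä' (U+00E4) encoded in MacRoman.
upper = "\u00c4".encode("mac_roman")
lower = "\u00e4".encode("mac_roman")

# Byte-level folding treats the two names as different files...
naive_match = naive_fold(upper) == naive_fold(lower)

# ...while charset-aware folding treats them as the same name.
charset_match = charset_fold(upper, "mac_roman") == charset_fold(lower, "mac_roman")
```

This is only an illustration of the problem, assuming the local
character set is known; it is not the approach taken in the patch.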

I've attached a proof-of-concept patch below which tries to solve this.

The patch contains a test case: a single revision with a single file,
‘Å’. The test case was created on FreeBSD, where the file name is the
byte sequence ‘\xc3\x85’. On Mac OS X, the kernel transforms this to
‘A\xcc\x8a’, so ‘hg status’ reports an unknown file when the
repository is checked out.
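
The two byte sequences are the composed and decomposed UTF-8 forms of
the same character. A minimal sketch in Python 3, assuming standard NFD
decomposition (HFS+ uses its own variant of decomposition, which agrees
with NFD for this character); same_name is a hypothetical helper, not
something in Mercurial:

```python
import unicodedata

# Name as stored in the repository (created on FreeBSD): composed form.
composed_bytes = b"\xc3\x85"
composed = composed_bytes.decode("utf-8")

# Decomposed form: 'A' followed by U+030A COMBINING RING ABOVE.
decomposed = unicodedata.normalize("NFD", composed)
decomposed_bytes = decomposed.encode("utf-8")


def same_name(a: bytes, b: bytes) -> bool:
    """Hypothetical check: compare two UTF-8 names after NFC normalization."""
    norm_a = unicodedata.normalize("NFC", a.decode("utf-8"))
    norm_b = unicodedata.normalize("NFC", b.decode("utf-8"))
    return norm_a == norm_b


# A plain byte comparison sees two different names.
bytes_equal = composed_bytes == decomposed_bytes

# Comparing normalized forms sees one name.
names_equal = same_name(composed_bytes, decomposed_bytes)
```

A byte-for-byte comparison of the tracked name against the name
returned by the filesystem therefore fails, which is consistent with
the unknown file reported by ‘hg status’.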

After a bit of digging around, the only reliable way to get the
‘proper’ name of a file appeared to be a roundtrip to Carbon. I
implemented two versions of it: one using the ctypes bindings, and a
fallback using the direct toolbox glue.

The code may not be pretty, and probably not fast either, but it  
mostly works. I'm not sure about symbolic links, though…

--

Dan Villiom Podlaski Christiansen
danchr at gmail.com

================================================
# HG changeset patch
# User Dan Villiom Podlaski Christiansen <danchr at gmail.com>
# Date 1248453300 -7200
# Node ID ec820e35ba877efc9b93b98881c1f2fff2bb6a02
# Parent  d98cef25b5afed5d8aa325ef87f98789367d8b6e
util: add normalizepath() for getting the 'true' path on Mac OS X.

diff --git a/mercurial/posix.py b/mercurial/posix.py
--- a/mercurial/posix.py
+++ b/mercurial/posix.py
@@ -104,6 +104,62 @@ def pconvert(path):
  def localpath(path):
      return path

+realpath = None
+
+if sys.platform == 'darwin' and not realpath:
+    try:
+        # ctypes interface introduced in Python 2.5
+        from ctypes import cdll
+
+        framework = cdll.LoadLibrary('/System/Library/Frameworks/'
+                                       'CoreServices.framework/CoreServices')
+        __FSPathMakeRef = framework.FSPathMakeRef
+        __FSRefMakePath = framework.FSRefMakePath
+        del framework, cdll
+
+        def realpath(path):
+            '''Obtain the canonical version of a path.
+
+               Mac OS X implementation that queries CoreServices using the
+               'ctypes' bindings available in Python 2.5 onwards.'''
+            if not os.path.exists(path):
+                return os.path.realpath(path)
+
+            from ctypes import c_uint8, create_string_buffer
+            fsref = (c_uint8 * 80)()
+            pathbuf = create_string_buffer('', 1024)
+
+            # assume these always return 0 (noErr)
+            __FSPathMakeRef(path, fsref, None)
+            __FSRefMakePath(fsref, pathbuf, len(pathbuf))
+
+            # the string buffer is large, so intern the result; just in case...
+            return intern(pathbuf.value)
+    except:
+        pass
+
+if sys.platform == 'darwin' and not realpath:
+    try:
+        # Mac toolbox glue interface: possibly disabled, absent in Python 3.0+
+        import Carbon.File, MacOS
+
+        def realpath(path):
+            '''Obtain the canonical version of a path.
+
+               Mac OS X fallback implementation using the Carbon toolbox glue
+               interface, which was removed in Python 3.0, and possibly disabled
+               in other versions.'''
+            try:
+                return Carbon.File.FSPathMakeRef(path)[0].as_pathname()
+            except MacOS.Error:
+                return os.path.realpath(path)
+    except ImportError:
+        pass
+
+# fall back to stdlib implementation
+if not realpath:
+    realpath = os.path.realpath
+
  def shellquote(s):
      if os.sys.platform == 'OpenVMS':
          return '"%s"' % s
diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -635,10 +635,19 @@ def fspath(name, root):
      with root. Note that this function is unnecessary, and should not be
      called, for case-sensitive filesystems (simply because it's expensive).
      '''
+
+    # If name is relative, make it absolute
+    if not os.path.isabs(name):
+        name = os.path.join(root, name)
+
+    # Obtain canonical forms
+    name = realpath(name)
+    root = realpath(root)
+
      # If name is absolute, make it relative
-    if name.lower().startswith(root.lower()):
+    if name.startswith(root):
          l = len(root)
-        if name[l] == os.sep or name[l] == os.altsep:
+        if len(name) > l and (name[l] == os.sep or name[l] == os.altsep):
              l = l + 1
          name = name[l:]

diff --git a/mercurial/windows.py b/mercurial/windows.py
--- a/mercurial/windows.py
+++ b/mercurial/windows.py
@@ -126,6 +126,10 @@ def localpath(path):
  def normpath(path):
      return pconvert(os.path.normpath(path))

+def realpath(path):
+    '''Obtain the canonical version of a path.'''
+    return os.path.normpath(os.path.normcase(os.path.realpath(path)))
+
  def samestat(s1, s2):
      return False

diff --git a/tests/test-path-normalization b/tests/test-path-normalization
new file mode 100755
--- /dev/null
+++ b/tests/test-path-normalization
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+hg clone --quiet $TESTDIR/test-path-normalization.hg t
+exec hg st -R t
diff --git a/tests/test-path-normalization.hg b/tests/test-path-normalization.hg
new file mode 100644
index 0000000000000000000000000000000000000000..73ea086c424d1d011310533a43f74582f1835ff5
GIT binary patch
literal 404
zc$@*00c-w9M=>x$T4*^jL0KkKStkp}5&!@R|Nq~TKxj-4f938JM*u(X-2nmwAP|az
z0s;&$10huyG9fSl8pgpYk?AI$L at -87ni^muCL`4J0s|%l+5^=wri4-IYGX{1noLQi
znWl!ON2!PgOcCl~44PyOG%$$N2*?4D0B8UJ05kv^001FOxrvJ)oInhWL;!$1gcJZN
zHZl at eeuSoK!cnRk*OImct{HOm*$28g5RV}0Q)qZk!TafYGR#&(9PZYW4Opx#oDjRt
zXR7Q0(0w=z^+uQ?2z9ZtxtUbzFjEt-rUu77!kqwjg{AlpOJ<rA)to$4`M at Q3CLPIr
z2tMnSAC^Yql68MDi2N`@n=Kfzs1UbRVQ&<f*+BjwM?_6n?kQx6tEGjftTqthPX<VU
zIZ5a_y=AUg(p at v0MO83}c}Q_o3h4|%DNO~$*6X at F!f%ZY7RZ-yut_j5z+MQPbzuV9
y)YKcl8)+f;{cSo!gJGv<(`jfj0`924OM$pI*#|fSE_%4gaTjt$I8cx$3&#@aQlq2*

