[PATCH 2 of 2] dirstate: normalize on case insensitive filesystems on Mac (issue1663)
Dan Villiom Podlaski Christiansen
danchr at gmail.com
Fri Jul 24 16:37:15 CDT 2009
On 24/07/2009, at 22.53, Matt Mackall wrote:
> if fold(filename internally) == fold(filename on disk):
> files are the same
Ah, that makes things much simpler :) Attached below is another stab,
this time using the F_GETPATH fcntl, which I just remembered existed.
It opens the file and asks the kernel for its path. This seems much
simpler and more reliable than trying to redo all the logic ourselves.
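For illustration only, here is a minimal sketch of the same idea in present-day Python 3 rather than the Python 2 of the patch. The name true_path is hypothetical, and the constant 50 is simply the F_GETPATH value quoted from fcntl.h in the patch; on anything other than Mac OS X the sketch just hands the path back unchanged.

```python
import fcntl
import os
import sys

F_GETPATH = 50  # value quoted from fcntl.h in the patch; Darwin-specific assumption


def true_path(path):
    """Return the path as the kernel reports it on Mac OS X, else `path` as given."""
    if sys.platform != 'darwin':
        return path
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        # Missing or unreadable file: nothing to ask the kernel about.
        return path
    try:
        # The kernel fills the buffer with a NUL-terminated path.
        buf = fcntl.fcntl(fd, F_GETPATH, b'\0' * 1024)
    except OSError:
        return path
    finally:
        os.close(fd)
    return os.fsdecode(buf.rstrip(b'\0'))
```

As far as I know F_GETPATH reports an absolute path, so a caller comparing against repository-relative names would still have to strip the root first.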
>> Unfortunately, the issue is slightly more complex than that; the
>> normalisation required for HFS+ doesn't correspond to any standard
>> Unicode normalisation. It might be better to simply implement the
>> normalisation ourselves, based on the HFS volume format
>> specification.
>> [1] One thing though; not all volumes on Mac OS X are case
>> independent, but I suspect the Unicode normalisation is universal.
>> (I'd have to dig much deeper into documentation, references & source
>> to be certain.)
>
> I believe you can mount BSD FFS volumes as well, which are not
> UTF16-impaired.
Indeed you can; I just tried with a disk image. It allowed me to
create both ISO-8859-1 ‘å’ and composed UTF-8 ‘å’. Interestingly, if
you move them to an HFS+ volume, the former is converted to ‘%E5’, and
the latter to the familiar decomposed form…
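To make the three spellings concrete, a small sketch in Python 3 follows. It uses the standard NFD form as a stand-in for what HFS+ stores, which is only an approximation given the caveat about HFS+ normalisation quoted above, and percent_escape is a hypothetical helper that merely imitates the '%E5' result I saw, not how the file system actually does it.

```python
import unicodedata

composed = '\u00e5'                                   # 'å' as one code point
decomposed = unicodedata.normalize('NFD', composed)   # 'a' followed by U+030A

utf8_composed = composed.encode('utf-8')      # two bytes: C3 A5
utf8_decomposed = decomposed.encode('utf-8')  # three bytes: 61 CC 8A
latin1 = composed.encode('latin-1')           # one byte: E5, not valid UTF-8


def percent_escape(raw):
    """Decode UTF-8 names; percent-escape the high bytes of anything else."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return ''.join('%%%02X' % b if b >= 0x80 else chr(b) for b in raw)
```

All three byte strings are distinct file names on a byte-oriented file system, which is why a plain os.path.normcase() cannot reconcile them.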
>
>>> (there are other hairy issues here, like filenames in Latin1)
>>
>> That issue should be ‘solved’ rather simply on Mac OS X, I believe: by
>> definition, such file names cannot exist, ever. I remember mounting an
>> NTFS volume once that used some non-UTF-8 encoding for its file names;
>> whether GUI or CLI, the system *really* doesn't like such file names.
>
> Yes, but Mercurial must handle more or less arbitrary null-terminated
> byte strings on other systems, so we should give this corner case some
> consideration.
All things considered, isn't it safe to assume that any Mac OS X
installation uses UTF-8 for file system encoding?
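If that assumption were ever to be checked rather than trusted, a test along these lines would do. This is a sketch in Python 3, and is_utf8 is a hypothetical helper, not something in Mercurial.

```python
def is_utf8(name):
    """Return True if the byte string `name` decodes cleanly as UTF-8."""
    try:
        name.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True
```

A Latin-1 name such as b'\xe5' fails the check, while both the composed and the decomposed UTF-8 spellings of 'å' pass it.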
--
Dan Villiom Podlaski Christiansen
danchr at gmail.com
===============================================
# HG changeset patch
# User Dan Villiom Podlaski Christiansen <danchr at gmail.com>
# Date 1248470961 -7200
# Node ID f132b7058ffa2d3b4a544fe4ded53ac7e35a26ac
# Parent d98cef25b5afed5d8aa325ef87f98789367d8b6e
util: add realpath() for getting the 'true' path on Mac OS X.
diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -59,7 +59,7 @@ class dirstate(object):
     def _foldmap(self):
         f = {}
         for name in self._map:
-            f[os.path.normcase(name)] = name
+            f[util.realpath(name)] = name
         return f
 
     @propertycache
@@ -340,7 +340,7 @@ class dirstate(object):
             self._ui.warn(_("not in dirstate: %s\n") % f)
 
     def _normalize(self, path, knownpath):
-        norm_path = os.path.normcase(path)
+        norm_path = util.realpath(path)
         fold_path = self._foldmap.get(norm_path, None)
         if fold_path is None:
             if knownpath or not os.path.exists(os.path.join(self._root, path)):
diff --git a/mercurial/posix.py b/mercurial/posix.py
--- a/mercurial/posix.py
+++ b/mercurial/posix.py
@@ -7,7 +7,7 @@
 
 from i18n import _
 import osutil
-import os, sys, errno, stat, getpass, pwd, grp
+import os, sys, errno, stat, getpass, pwd, grp, fcntl
 
 posixfile = open
 nulldev = '/dev/null'
@@ -104,6 +104,19 @@ def pconvert(path):
 def localpath(path):
     return path
 
+if sys.platform == 'darwin':
+    def realpath(path):
+        try:
+            # fcntl.h: O_SYMLINK = 0x200000, F_GETPATH = 50
+            f = os.open(path, 0x200000)
+            r = fcntl.fcntl(f, 50, '\0' * 1024)
+            os.close(f)
+            return r.rstrip('\0')
+        except (IOError, OSError):
+            return path
+else:
+    realpath = os.path.realpath
+
 def shellquote(s):
     if os.sys.platform == 'OpenVMS':
         return '"%s"' % s
diff --git a/mercurial/windows.py b/mercurial/windows.py
--- a/mercurial/windows.py
+++ b/mercurial/windows.py
@@ -126,6 +126,10 @@ def localpath(path):
 def normpath(path):
     return pconvert(os.path.normpath(path))
 
+def realpath(path):
+    '''Obtain the canonical version of a path.'''
+    return os.path.normpath(os.path.normcase(os.path.realpath(path)))
+
 def samestat(s1, s2):
     return False
 
diff --git a/tests/test-path-normalization b/tests/test-path-normalization
new file mode 100755
--- /dev/null
+++ b/tests/test-path-normalization
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+hg clone --quiet $TESTDIR/test-path-normalization.hg t
+exec hg st -R t
diff --git a/tests/test-path-normalization.hg b/tests/test-path-normalization.hg
new file mode 100644
index 0000000000000000000000000000000000000000..73ea086c424d1d011310533a43f74582f1835ff5
GIT binary patch
literal 404
zc$@*00c-w9M=>x$T4*^jL0KkKStkp}5&!@R|Nq~TKxj-4f938JM*u(X-2nmwAP|az
z0s;&$10huyG9fSl8pgpYk?AI$L@-87ni^muCL`4J0s|%l+5^=wri4-IYGX{1noLQi
znWl!ON2!PgOcCl~44PyOG%$$N2*?4D0B8UJ05kv^001FOxrvJ)oInhWL;!$1gcJZN
zHZl@eeuSoK!cnRk*OImct{HOm*$28g5RV}0Q)qZk!TafYGR#&(9PZYW4Opx#oDjRt
zXR7Q0(0w=z^+uQ?2z9ZtxtUbzFjEt-rUu77!kqwjg{AlpOJ<rA)to$4`M@Q3CLPIr
z2tMnSAC^Yql68MDi2N`@n=Kfzs1UbRVQ&<f*+BjwM?_6n?kQx6tEGjftTqthPX<VU
zIZ5a_y=AUg(p@v0MO83}c}Q_o3h4|%DNO~$*6X@F!f%ZY7RZ-yut_j5z+MQPbzuV9
y)YKcl8)+f;{cSo!gJGv<(`jfj0`924OM$pI*#|fSE_%4gaTjt$I8cx$3&#@aQlq2*