[PATCH] OS X: try cheap ascii .lower() in normcase before making full unicode dance

Mads Kiilerich mads at kiilerich.com
Tue Jan 29 10:42:39 CST 2013


# HG changeset patch
# User Mads Kiilerich <madski at unity3d.com>
# Date 1359475301 -3600
# Branch stable
# Node ID b7e3f9a8796fc31b6d81d134072b4daa3e0d42ec
# Parent  a58d8936647aa270854cd919fe8e8b2da1c1c669
OS X: try cheap ascii .lower() in normcase before making full unicode dance

This is similar to what is done in encoding.lower, introduced in c481761033bd.

In a repository with 50000+ files, this has been measured to make 'hg up'
and 'hg st' about 13% faster.

This might make Mercurial slightly slower for users who mainly use non-ASCII
filenames, since those paths now pay for a failed ASCII decode before the full
Unicode handling. That is a reasonable trade-off.

Some numbers:

hg up
before:
time: real 2.100 secs (user 1.900+0.000 sys 0.200+0.000)
time: real 2.110 secs (user 1.930+0.000 sys 0.180+0.000)
time: real 2.080 secs (user 1.900+0.000 sys 0.180+0.000)
after:
time: real 1.830 secs (user 1.660+0.000 sys 0.180+0.000)
time: real 1.840 secs (user 1.660+0.000 sys 0.180+0.000)
time: real 1.810 secs (user 1.640+0.000 sys 0.180+0.000)

hg st
before:
time: real 1.680 secs (user 1.370+0.000 sys 0.310+0.000)
time: real 1.650 secs (user 1.340+0.000 sys 0.300+0.000)
time: real 1.660 secs (user 1.350+0.000 sys 0.310+0.000)
after:
time: real 1.460 secs (user 1.150+0.000 sys 0.310+0.000)
time: real 1.430 secs (user 1.120+0.000 sys 0.300+0.000)
time: real 1.420 secs (user 1.120+0.000 sys 0.300+0.000)

diff --git a/mercurial/posix.py b/mercurial/posix.py
--- a/mercurial/posix.py
+++ b/mercurial/posix.py
@@ -195,6 +195,11 @@
 
     def normcase(path):
         try:
+            path.decode('ascii') # raises UnicodeDecodeError on non-ASCII bytes
+            return path.lower()
+        except UnicodeDecodeError:
+            pass
+        try:
             u = path.decode('utf-8')
         except UnicodeDecodeError:
             # percent-encode any characters that don't round-trip
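
For readers who want to try the idea outside the Mercurial tree, here is a
minimal standalone sketch of the fast path in Python 3. The name
normcase_sketch is hypothetical, and the slow path is a simplified stand-in
(NFD-normalize, then lowercase) that replaces undecodable bytes instead of
percent-encoding them as posix.py does.

```python
import unicodedata


def normcase_sketch(path: bytes) -> bytes:
    """Lowercase a path, taking a cheap route when it is pure ASCII.

    Hypothetical helper for illustration; not Mercurial's actual normcase.
    """
    try:
        # Fast path: decode() raises UnicodeDecodeError on any non-ASCII
        # byte, so reaching lower() means plain byte lowercasing is safe.
        path.decode('ascii')
        return path.lower()
    except UnicodeDecodeError:
        pass
    # Slow path (simplified assumption): undecodable bytes are replaced
    # here, whereas the real code percent-encodes them.
    u = path.decode('utf-8', 'replace')
    return unicodedata.normalize('NFD', u).lower().encode('utf-8')
```

An all-ASCII path such as b'Foo/BAR.TXT' never reaches the unicodedata call,
which is where the saving comes from when most filenames are ASCII.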

