[PATCH 2 of 2] encoding: quickly uppercase ASCII strings

Martin Geisler mg at lazybytes.net
Mon Jul 23 15:59:53 CDT 2012


# HG changeset patch
# User Martin Geisler <mg at aragost.com>
# Date 1343076805 21600
# Branch stable
# Node ID ac88a42b300552cf41443be14cac596362e1b7dc
# Parent  483526b958bcbc483f00f559db0de7221067dc12
encoding: quickly uppercase ASCII strings

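This copies the performance hack from encoding.lower: a pure-ASCII
string is uppercased directly with s.upper(), and only strings with
non-ASCII bytes take the slower encoding-aware path. The case-folding
logic that kicks in on case-insensitive filesystems hits this function
hard: in a repository with 75k files, the timings went from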

  hg perfstatus
  ! wall 3.156000 comb 3.156250 user 1.625000 sys 1.531250 (best of 3)

to

  hg perfstatus
  ! wall 2.390000 comb 2.390625 user 1.078125 sys 1.312500 (best of 5)

This is a 24% decrease in wall time. A similar decrease (about 23%) is
seen when running the status command normally, where the time went from

  hg status --time
  time: real 4.322 secs (user 2.219+0.000 sys 2.094+0.000)

to

  hg status --time
  time: real 3.307 secs (user 1.750+0.000 sys 1.547+0.000)
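
For illustration only, the fast path can be sketched outside Mercurial
roughly as follows. This is a Python 3 adaptation working on bytes, not
the patched function itself: the name upper_sketch and the encoding
parameter are hypothetical, and the slow path here is a plain
decode/upper/encode round trip rather than Mercurial's localstr-aware
logic.

```python
def upper_sketch(s: bytes, encoding: str = "utf-8") -> bytes:
    """Uppercase a byte string, with a fast path for pure ASCII.

    Hypothetical stand-in for encoding.upper; 'encoding' plays the
    role of the local encoding and is an assumption of this sketch.
    """
    try:
        # Raises UnicodeDecodeError if any byte is >= 0x80.
        s.decode("ascii")
        # Pure ASCII: bytes.upper() is sufficient and cheap.
        return s.upper()
    except UnicodeDecodeError:
        pass
    # Slow path: go through Unicode so non-ASCII letters are handled.
    return s.decode(encoding).upper().encode(encoding)
```

The result of the ASCII check is thrown away; it is used purely to
detect non-ASCII input, so the common case costs one extra scan of the
string instead of a full decode and re-encode.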

diff --git a/mercurial/encoding.py b/mercurial/encoding.py
--- a/mercurial/encoding.py
+++ b/mercurial/encoding.py
@@ -190,6 +190,11 @@
 def upper(s):
     "best-effort encoding-aware case-folding of local string s"
     try:
+        s.decode('ascii') # throw exception for non-ASCII character
+        return s.upper()
+    except UnicodeDecodeError:
+        pass
+    try:
         if isinstance(s, localstr):
             u = s._utf8.decode("utf-8")
         else:

