[PATCH] Optimize node-to-node localrepo.status

Jesse Glick typrase at gmail.com
Wed May 2 20:15:09 CDT 2012

# HG changeset patch
# User Jesse Glick <jesse.glick at oracle.com>
# Date 1336007415 14400
# Branch stable
# Node ID 419630251325c7d081fddd48e816da2cfaaaae51
# Parent  979b1b7340fba32b4d7f499e6d89093758147520
Optimize node-to-node localrepo.status.

Templating with {file_mods}/{file_adds}/{file_dels} calls localrepo.status()
a lot. For a large repo like NetBeans main, the command used by the Hudson
plugin for Mercurial to calculate its changelog can take a long time. Some
simple optimizations can reduce this overhead by around 26%, though it is still
far slower than using only {files}, since localrepo.status() is not cheap.
(Issue3415 may be another manifestation of the same problem.)

1. Introduce match.always() to check whether a match object always says yes,
i.e. None was passed in as the matcher. If so, mfmatches should not bother
iterating over every file in the repository.

2. Introduce manifestdict.withflags() to get a set of all files which have any
flags set, since these are likely to be a minority. Otherwise checking .flags()
for every file is a lot of dictionary lookups and is quite slow.
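
As a rough illustration of the two points above, here is a standalone sketch.
ToyMatcher, ToyManifest and changed() are hypothetical stand-ins written for
this example, not Mercurial's real classes, and the content comparison done by
the real status code (ctx1[fn].cmp(ctx2[fn])) is left out.

```python
# Toy stand-ins for illustration only; these are NOT Mercurial's classes.

class ToyMatcher:
    """Accepts every file when built with names=None (assumed analogue of
    the 'always' matcher), otherwise only the listed names."""
    def __init__(self, names=None):
        self._names = None if names is None else set(names)

    def __call__(self, fn):
        return self._names is None or fn in self._names

    def always(self):
        # True only when this matcher can never reject a file.
        return self._names is None


class ToyManifest(dict):
    """Maps file name -> node id, with a sparse side table of flags."""
    def __init__(self, nodes, flags=None):
        super().__init__(nodes)
        self._flags = dict(flags or {})

    def flags(self, fn):
        return self._flags.get(fn, "")

    def withflags(self):
        # Only files that carry a flag; expected to be a small set.
        return set(self._flags)

    def copy(self):
        return ToyManifest(self, self._flags)


def mfmatches(mf, match):
    """Point 1: skip the per-file loop when the matcher accepts everything."""
    mf = mf.copy()
    if match.always():
        return mf
    for fn in list(mf):
        if not match(fn):
            del mf[fn]
    return mf


def changed(mf1, mf2):
    """Point 2: compare flags only for files known to have flags."""
    withflags = mf1.withflags() | mf2.withflags()
    return sorted(
        fn for fn in mf2
        if fn in mf1 and (
            (fn in withflags and mf1.flags(fn) != mf2.flags(fn))
            or mf1[fn] != mf2[fn]
        )
    )
```

In this sketch the flag comparison is reached only for names in the combined
withflags set, so a file without flags in either manifest costs one set
membership test instead of two dictionary lookups.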

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1334,6 +1334,8 @@
         def mfmatches(ctx):
             mf = ctx.manifest().copy()
+            if match.always():
+                return mf
             for fn in mf.keys():
                 if not match(fn):
                     del mf[fn]
@@ -1419,10 +1421,11 @@
                 mf2 = mfmatches(ctx2)
             modified, added, clean = [], [], []
+            withflags = mf1.withflags() | mf2.withflags()
             for fn in mf2:
                 if fn in mf1:
                     if (fn not in deleted and
-                        (mf1.flags(fn) != mf2.flags(fn) or
+                        ((fn in withflags and mf1.flags(fn) != mf2.flags(fn)) or
                          (mf1[fn] != mf2[fn] and
                           (mf2[fn] or ctx1[fn].cmp(ctx2[fn]))))):
diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -19,6 +19,8 @@
         self._flags = flags
     def flags(self, f):
         return self._flags.get(f, "")
+    def withflags(self):
+        return set(self._flags.keys())
     def set(self, f, flags):
         self._flags[f] = flags
     def copy(self):
diff --git a/mercurial/match.py b/mercurial/match.py
--- a/mercurial/match.py
+++ b/mercurial/match.py
@@ -118,6 +118,8 @@
         return self._files
     def anypats(self):
         return self._anypats
+    def always(self):
+        return False
 class exact(match):
     def __init__(self, root, cwd, files):
@@ -126,6 +128,8 @@
 class always(match):
     def __init__(self, root, cwd):
         match.__init__(self, root, cwd, [])
+    def always(self):
+        return True
 class narrowmatcher(match):
     """Adapt a matcher to work on a subdirectory only.
