[PATCH 1 of 2] context: use normal manifest lookups instead of manifest.find

Bryan O'Sullivan bos at serpentine.com
Tue Mar 12 16:31:31 CDT 2013


# HG changeset patch
# User Bryan O'Sullivan <bryano at fb.com>
# Date 1363123858 25200
# Node ID be03cde093c4ab03c92e31fc55d5680c35bd8327
# Parent  8cc8715b36030fc84f17426cc384155eb743a5f7
context: use normal manifest lookups instead of manifest.find

The manifest.find method was not managing the manifest's cache properly.
If that cache had not been primed prior to a lookup, and a lookup missed,
we were not adding the reconstructed entry to the cache.

This caused a huge slowdown in "hg grep", where we would reconstruct a
manifest entry from scratch for every single file in every revision we
were visiting.

In addition, manifest.find was using a Python binary search to scan the
text of the manifest. This is quite a lot slower than our C code for
parsing the entire manifest and simply looking an entry up. That code
made its debut in 2008, so manifest.find is simply bad news from every
perspective at this point.

For a manifest containing 150,000 entries with 6,000 changes in one
revision, this change improves "hg grep" performance from 650 seconds
to 2. It seems possible to me that other commands could be positively
affected, too, but I haven't really looked at which.

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -287,23 +287,18 @@ class changectx(object):
         return troubles
 
     def _fileinfo(self, path):
-        if '_manifest' in self.__dict__:
-            try:
-                return self._manifest[path], self._manifest.flags(path)
-            except KeyError:
-                raise error.LookupError(self._node, path,
-                                        _('not found in manifest'))
-        if '_manifestdelta' in self.__dict__ or path in self.files():
-            if path in self._manifestdelta:
-                return (self._manifestdelta[path],
-                        self._manifestdelta.flags(path))
-        node, flag = self._repo.manifest.find(self._changeset[0], path)
-        if not node:
+        if ('_manifest' in self.__dict__ or not
+            (('_manifestdelta' in self.__dict__ or path in self.files()) and
+             path in self._manifestdelta)):
+            mf = self._manifest
+        else:
+            mf = self._manifestdelta
+        try:
+            return mf[path], mf.flags(path)
+        except KeyError:
             raise error.LookupError(self._node, path,
                                     _('not found in manifest'))
 
-        return node, flag
-
     def filenode(self, path):
         return self._fileinfo(path)[0]
 


More information about the Mercurial-devel mailing list