[PATCH 4 of 4] filelog: cmp: don't read data if hashes are identical

Nicolas Dumazet nicdumz at gmail.com
Thu Jul 8 22:14:49 CDT 2010


# HG changeset patch
# User Nicolas Dumazet <nicdumz.commits at gmail.com>
# Date 1278326994 -32400
# Branch stable
# Node ID ff22f8a4c0c584dd67ee4871867e080f2a3fe802
# Parent  ec8a19680c658188fa97cc726aefeb2a25bc5060
filelog: cmp: don't read data if hashes are identical

filelog.renamed() is an expensive call: it reads the filelog when
p1 == nullid. It is more efficient to compute the hash first and to bail
out early if the computed hash matches the stored nodeid.

The 'samehashes' variable is not strictly necessary, but it helps comprehension.

diff --git a/mercurial/filelog.py b/mercurial/filelog.py
--- a/mercurial/filelog.py
+++ b/mercurial/filelog.py
@@ -62,9 +62,18 @@
         returns True if text is different than what is stored.
         """
 
-        # for renames, we have to go the slow way
-        if text.startswith('\1\n') or self.renamed(node):
+        t = text
+        if text.startswith('\1\n'):
+            t = '\1\n\1\n' + text
+
+        samehashes = not revlog.revlog.cmp(self, node, t)
+        if samehashes:
+            return False
+
+        # renaming a file produces a different hash, even if the data
+        # remains unchanged. Check if it's the case (slow):
+        if self.renamed(node):
             t2 = self.read(node)
             return t2 != text
 
-        return revlog.revlog.cmp(self, node, text)
+        return True
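
To illustrate the control flow of the patched cmp() outside of Mercurial, here is a minimal, self-contained sketch. It is not Mercurial code: ToyFilelog, hash_revision, the copied_from argument and the reads counter are hypothetical names made up for this example, the hashing (SHA-1 over the sorted parent ids followed by the stored text) is a simplified rendition of what revlog is understood to do, and it uses Python 3 bytes where the patch itself targets Python 2 strings.

```python
import hashlib

NULLID = b"\0" * 20
META = b"\x01\n"  # marker framing the metadata header, '\1\n' in the patch


def hash_revision(text, p1, p2):
    """SHA-1 over the sorted parent ids followed by the stored text."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


class ToyFilelog:
    """Hypothetical in-memory stand-in for a filelog (illustration only)."""

    def __init__(self):
        self.revs = {}   # node -> (raw stored text, p1, p2)
        self.reads = 0   # counts how often stored data is actually read

    def add(self, text, copied_from=None, p1=NULLID, p2=NULLID):
        raw = text
        # A header is needed for copies, and to escape data that itself
        # starts with the marker.
        if copied_from is not None or text.startswith(META):
            header = b"" if copied_from is None else b"copy: " + copied_from + b"\n"
            raw = META + header + META + text
        node = hash_revision(raw, p1, p2)
        self.revs[node] = (raw, p1, p2)
        return node

    def _raw(self, node):
        self.reads += 1
        return self.revs[node][0]

    def read(self, node):
        raw = self._raw(node)
        if not raw.startswith(META):
            return raw
        end = raw.index(META, 2)  # second marker closes the header
        return raw[end + 2:]

    def renamed(self, node):
        p1 = self.revs[node][1]
        if p1 != NULLID:
            return False
        # Slow path: the stored data must be read to inspect the header.
        return self._raw(node).startswith(META + b"copy: ")

    def cmp(self, node, text):
        """Return True if text differs from what is stored for node."""
        t = text
        if text.startswith(META):
            t = META + META + text  # same escaping as add() without a copy
        _, p1, p2 = self.revs[node]
        if hash_revision(t, p1, p2) == node:
            return False            # fast path: no stored data read
        # A copy/rename hashes differently even when the data is unchanged.
        if self.renamed(node):
            return self.read(node) != text
        return True
```

In this sketch, comparing identical content against an ordinary revision returns on the hash check alone, so the reads counter stays at zero; only a hash mismatch falls through to renamed() and, for copies, to read().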

