[PATCH 1 of 4 py3] similar: avoid sorting and making copies of addedfiles and removedfiles sets

Augie Fackler raf at durin42.com
Tue Mar 21 19:13:00 UTC 2017


# HG changeset patch
# User Augie Fackler <augie at google.com>
# Date 1489903420 14400
#      Sun Mar 19 02:03:40 2017 -0400
# Node ID a4745fd9219ed5b408bfc0403a4a8e6acd41df6c
# Parent  66c3ae6d886cae0e3a3cff6a0058e2d2a866fd9d
similar: avoid sorting and making copies of addedfiles and removedfiles sets

Porting to Python 3 exposed some weirdness in this code:
workingfilectx doesn't define rich comparison operators, so in
Python 2 sorted() fell back to the default comparison, which orders
objects by id(value), i.e. the pointer address of the object. Python 3
has no such fallback and raises TypeError instead. Inspection of
_findexactmatches and _findsimilarmatches revealed that they don't
care about the sort order of their inputs, so we remove the sort (and
potentially make addremove faster, since it no longer sorts or copies
anything). We now have to do one little extra set dance to avoid
mutating the addedfiles set during its iteration, but that's a small
price to pay for the cleaner code.

diff --git a/mercurial/similar.py b/mercurial/similar.py
--- a/mercurial/similar.py
+++ b/mercurial/similar.py
@@ -107,13 +107,14 @@ def findrenames(repo, added, removed, th
             if fp in parentctx and parentctx[fp].size() > 0])
 
     # Find exact matches.
-    for (a, b) in _findexactmatches(repo,
-            sorted(addedfiles), sorted(removedfiles)):
-        addedfiles.remove(b)
+    addedremove = set()
+    for (a, b) in _findexactmatches(repo, addedfiles, removedfiles):
+        addedremove.add(b)
         yield (a.path(), b.path(), 1.0)
+    addedfiles -= addedremove
 
     # If the user requested similar files to be matched, search for them also.
     if threshold < 1.0:
         for (a, b, score) in _findsimilarmatches(repo,
-                sorted(addedfiles), sorted(removedfiles), threshold):
+                addedfiles, removedfiles, threshold):
             yield (a.path(), b.path(), score)
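
For readers who want to see the two behaviours described in the commit
message in isolation, here is a minimal standalone sketch. It is not
Mercurial code: FakeFileCtx, sorting_fails, exactmatches and this
findrenames are hypothetical stand-ins, and the toy matcher pairs files
by basename rather than by content hash. It assumes Python 3, where
sorting objects without rich comparison operators raises TypeError, and
it mirrors the "collect, then subtract" pattern the patch uses so that
the added set is not mutated while a generator may still be iterating
over it.

```python
class FakeFileCtx(object):
    """Hypothetical stand-in for workingfilectx: no rich comparisons."""

    def __init__(self, path):
        self._path = path

    def path(self):
        return self._path


def sorting_fails(items):
    """Return True if sorted() cannot order the items (Python 3)."""
    try:
        sorted(items)
    except TypeError:
        return True
    return False


def exactmatches(added, removed):
    """Toy matcher: pair files whose basenames agree, yielding (old, new)."""
    byname = dict((r.path().rsplit('/', 1)[-1], r) for r in removed)
    for a in added:  # lazily iterates the caller's set
        name = a.path().rsplit('/', 1)[-1]
        if name in byname:
            yield byname[name], a


def findrenames(added, removed):
    """Mirror the patched loop: defer removals until iteration is done."""
    matched = set()
    for old, new in exactmatches(added, removed):
        # Calling added.remove(new) here would change the set while
        # exactmatches() is still iterating over it.
        matched.add(new)
        yield old.path(), new.path(), 1.0
    added -= matched  # in-place update, applied after the loop finishes
```

The old code got away with addedfiles.remove(b) inside the loop only
because sorted() handed the matcher a copy; once the set itself is
passed in, the removal has to wait until the generator is exhausted.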

