D987: copies: add a config to limit the number of candidates to check in heuristics

pulkit (Pulkit Goyal) phabricator at mercurial-scm.org
Sun Oct 8 01:00:17 UTC 2017


pulkit created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The heuristics algorithm finds possible candidates for a move/copy and then
  checks whether each one is actually a copy or move. In some cases there can be
  a lot of candidates, which can slow the algorithm down.
  
  This patch introduces a config option,
  `experimental.copytrace.movecandidateslimit`, that limits the number of
  candidates to check. The limit defaults to 5.
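
  As a rough standalone illustration of the limiting behaviour described above
  (not the patch itself): the helper names `parse_limit` and `limit_candidates`
  are hypothetical and are not part of Mercurial's API. The sketch assumes the
  configured value may arrive either as an int or as a string, and falls back to
  the default of 5 when it cannot be parsed.

```python
def parse_limit(raw, default=5):
    """Return raw as an int, or default when raw cannot be parsed.

    Hypothetical helper for illustration; not part of Mercurial.
    """
    if isinstance(raw, int):
        return raw
    try:
        return int(raw)
    except (TypeError, ValueError):
        # None or a non-numeric string: fall back to the default limit
        return default


def limit_candidates(candidates, raw_limit, default=5):
    """Return (candidates to check, True if some candidates were skipped)."""
    limit = parse_limit(raw_limit, default)
    return candidates[:limit], len(candidates) > limit


# Seven candidates with a limit of "5": only the first five are checked.
checked, truncated = limit_candidates(list("abcdefg"), "5")
```

  With an unparsable value such as "abc", `parse_limit` returns the default, so
  at most five candidates would be checked.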

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D987

AFFECTED FILES
  mercurial/configitems.py
  mercurial/copies.py

CHANGE DETAILS

diff --git a/mercurial/copies.py b/mercurial/copies.py
--- a/mercurial/copies.py
+++ b/mercurial/copies.py
@@ -644,6 +644,11 @@
 
         [experimental]
         copytrace = heuristics
+
+    In some cases the copy/move candidates found by heuristics can be very large
+    in number and that will make the algorithm slow. The number of possible
+    candidates to check can be limited by using the config
+    `experimental.copytrace.movecandidateslimit` which defaults to 5.
     """
 
     if c1.rev() is None:
@@ -704,14 +709,32 @@
             # f is guaranteed to be present in c2, that's why
             # c2.filectx(f) won't fail
             f2 = c2.filectx(f)
-            for candidate in movecandidates:
+            # we can have a lot of candidates which can slow down the heuristics
+            # config value to limit the number of candidate moves to check
+            maxcandidates = repo.ui.config('experimental',
+                                           'copytrace.movecandidateslimit')
+
+            # checking the config value to prevent traceback
+            if not isinstance(maxcandidates, int):
+                try:
+                    maxcandidates = int(maxcandidates)
+                except ValueError:
+                    repo.ui.debug("error parsing copytrace.movecandidateslimit:"
+                                  " '%s', setting it to 5\n" % maxcandidates)
+                    maxcandidates = 5
+
+            for candidate in movecandidates[:maxcandidates]:
                 f1 = c1.filectx(candidate)
                 if _related(f1, f2, anc.rev()):
                     # if there are a few related copies then we'll merge
                     # changes into all of them. This matches the behaviour
                     # of upstream copytracing
                     copies[candidate] = f
 
+            if len(movecandidates) > maxcandidates:
+                repo.ui.debug("more candidates than the limit: %d\n" %
+                              len(movecandidates))
+
     return copies, {}, {}, {}, {}
 
 def _related(f1, f2, limit):
diff --git a/mercurial/configitems.py b/mercurial/configitems.py
--- a/mercurial/configitems.py
+++ b/mercurial/configitems.py
@@ -185,6 +185,9 @@
 coreconfigitem('experimental', 'copytrace',
     default='on',
 )
+coreconfigitem('experimental', 'copytrace.movecandidateslimit',
+    default=5,
+)
 coreconfigitem('experimental', 'copytrace.sourcecommitlimit',
     default=100,
 )



To: pulkit, #hg-reviewers
Cc: mercurial-devel

