[PATCH STABLE] scmutil: avoid quadratic membership testing (issue5969)

Gregory Szorc gregory.szorc at gmail.com
Sat Aug 25 01:23:05 UTC 2018


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1535160115 25200
#      Fri Aug 24 18:21:55 2018 -0700
# Branch stable
# Node ID db8e86a65460c9bc4794afc4cd6c7e4bb69e3b0b
# Parent  bd63ada7e1f838d7a579edcbd8e3c8ff7ec46a43
scmutil: avoid quadratic membership testing (issue5969)

tr.changes['revs'] is an xrange, which has an O(n) __contains__
implementation. The `rev not in newrevs` lookup a few lines below
will therefore be O(n^2) if all incoming changesets are public.
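
For illustration only, the cost can be sketched as below. `LinearRange`
and `count_comparisons` are hypothetical names, not Mercurial code; the
class merely imitates a range whose membership test falls back to
iteration, which is the behavior described above for xrange.

```python
class LinearRange(object):
    """Hypothetical stand-in for a range whose `in` test scans linearly."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.comparisons = 0  # total element comparisons performed

    def __iter__(self):
        return iter(range(self.start, self.stop))

    def __contains__(self, value):
        # Walk the range element by element until a match is found.
        for item in self:
            self.comparisons += 1
            if item == value:
                return True
        return False


def count_comparisons(n):
    """Test each of n revisions for membership in a linear range of size n."""
    newrevs = LinearRange(0, n)
    for rev in range(n):
        found = rev in newrevs
        assert found
    return newrevs.comparisons
```

With this sketch, n lookups cost n * (n + 1) / 2 comparisons in total,
so doubling n roughly quadruples the work.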

This issue isn't present on @ (the default branch) because 45e05d39d9ce introduced
a custom type implementing an xrange primitive with O(1) contains
and switched tr.changes['revs'] to be an instance of that type.
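
A minimal sketch of such a type, assuming integer bounds and a step of
1, might look like the following. `ConstantTimeRange` is a hypothetical
name used for illustration and is not the implementation added by
45e05d39d9ce.

```python
class ConstantTimeRange(object):
    """Hypothetical range-like type with O(1) membership testing."""

    def __init__(self, start, stop):
        self._start = start
        self._stop = stop

    def __contains__(self, value):
        # Two comparisons replace a scan over every element.
        return self._start <= value < self._stop

    def __iter__(self):
        return iter(range(self._start, self._stop))

    def __len__(self):
        return max(0, self._stop - self._start)
```

Membership here costs the same regardless of how many revisions the
range spans, and no per-revision memory is allocated.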

We work around the problem on the stable branch by casting the
xrange to a set. This is a bit hacky because it requires allocating
memory to hold each integer in the range. But we are already
holding the full set of pulled revision numbers in memory
multiple times (such as in `tr.changes['phases']`). So this is
a relatively minor problem.
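
The shape of the workaround, reduced to a standalone sketch:
`preexisting_revs` and its parameters are hypothetical names, and the
real code in scmutil.py does more than this filter.

```python
def preexisting_revs(phase_changed_revs, pulled_revs):
    """Return the revisions whose phase changed but which were not just pulled."""
    # One O(n) pass builds the set; each later lookup is O(1) on average.
    newrevs = set(pulled_revs)
    return [rev for rev in phase_changed_revs if rev not in newrevs]
```

The set costs memory proportional to the number of pulled revisions,
which is the trade-off described above.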

This issue has been present since the phases reporting code was
introduced in the 4.7 cycle by eb9835014d20.

This change should be reverted/ignored when stable is merged into
default.

On the mozilla-unified repository with 483492 changesets, `hg clone`
user CPU time drops by roughly 4x:

before: 1843.700s user; 29.810s sys
after:   461.170s user; 29.360s sys

diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -1565,7 +1565,10 @@ def registersummarycallback(repo, otr, t
             """Report statistics of phase changes for changesets pre-existing
             pull/unbundle.
             """
-            newrevs = tr.changes.get('revs', xrange(0, 0))
+            # TODO set() is only appropriate for 4.7 since revs post
+            # 45e05d39d9ce is a pycompat.membershiprange, which has O(1)
+            # membership testing.
+            newrevs = set(tr.changes.get('revs', xrange(0, 0)))
             phasetracking = tr.changes.get('phases', {})
             if not phasetracking:
                 return

