# [PATCH] revset: improves time complexity of 'roots(xxx)'

Pierre-Yves David pierre-yves.david at ens-lyon.org
Mon Jun 22 15:51:10 CDT 2015

```# HG changeset patch
# User Pierre-Yves David <pierre-yves.david at fb.com>
# Date 1434993552 25200
#      Mon Jun 22 10:19:12 2015 -0700
# Node ID 6a2c2c13784761bbd8b1c46d2888257d234ae591
# Parent  7fdd1782fc4ee9da87d8af13e806dc9055db2c38
revset: improves time complexity of 'roots(xxx)'

The canonical way of doing 'roots(X)' is 'X - children(X)'. This is what the
implementation used to be. However, computing children is expensive because it
is unbounded. Any changesets in the repository may be a children of '0' so you
have to look at all changesets in the repository to compute children(0).
Moreover the current revsets implementation for children is not lazy, leading to
bad performance when fetching the first result.

There is a more restricted algorithm to compute roots:

roots(X) = [r for r in X if not parents(r) & X]

This achieve the same result while only looking for parent/children relation in
the X set itself, making the algorithm 'O(len(X))' membership operation.
Another advantages is that it turns the check into a simple filter, preserving
all laziness property of the underlying revsets.

The speed is very significant and some lazyness is restored.

-) revset without 'roots(...)' to compare to base line
0) before this change
1) after this change

revset #0: roots((tip~100::) - (tip~100::tip))
plain         min           last
-) 0.001082      0.000993      0.000790
0) 0.001366      0.001385      0.001339
1) 0.001257  92% 0.001028  74% 0.000821  61%

revset #1: roots((0::) - (0::tip))
plain         min           last
-) 0.134551      0.144682      0.068453
0) 0.161822      0.171786      0.157683
1) 0.137583  85% 0.146204  85% 0.070012  44%

revset #2: roots(tip~100:)
plain         min           first         last
-) 0.000219      0.000225      0.000231      0.000229
0) 0.000513      0.000529      0.000507      0.000539
1) 0.000463  90% 0.000269  50% 0.000267  52% 0.000463  85%

revset #3: roots(:42)
plain         min           first         last
-) 0.000119      0.000146      0.000146      0.000146
0) 0.000231      0.000254      0.000253      0.000260
1) 0.000216  93% 0.000186  73% 0.000184  72% 0.000244  93%

revset #4: roots(not public())
plain         min           first
-) 0.000478      0.000502      0.000504
0) 0.000611      0.000639      0.000634
1) 0.000604      0.000560  87% 0.000558

revset #5: roots((0:tip)::)
plain         min           max           first         last
-) 0.057795      0.004905      0.058260      0.004908      0.038812
0) 0.132845      0.118931      0.130306      0.114280      0.127742
1) 0.111659  84% 0.005023   4% 0.111658  85% 0.005022   4% 0.092490  72%

revset #6: roots(0::tip)
plain         min           max           first         last
-) 0.032971      0.033947      0.033460      0.032350      0.033125
0) 0.083671      0.081953      0.084074      0.080364      0.086069
1) 0.074720  89% 0.035547  43% 0.077025  91% 0.033729  41% 0.083197

revset #7: 42:68 and roots(42:tip)
plain         min           max           first         last
-) 0.006827      0.000251      0.006830      0.000254      0.006771
0) 0.000337      0.000353      0.000366      0.000350      0.000366
1) 0.000318  94% 0.000297  84% 0.000353      0.000293  83% 0.000351

revset #8: roots(0:tip)
plain         min           max           first         last
-) 0.002119      0.000145      0.000147      0.000147      0.000147
0) 0.047441      0.040660      0.045662      0.040284      0.043435
1) 0.038057  80% 0.000187   0% 0.034919  76% 0.000186   0% 0.035097  80%

revset #0: roots(:42 + tip~42:)
plain         min           max           first         last          sort
-) 0.000321      0.000317      0.000319      0.000308      0.000369      0.000343
0) 0.000772      0.000751      0.000811      0.000750      0.000802      0.000783
1) 0.000632  81% 0.000369  49% 0.000617  76% 0.000358  47% 0.000601  74% 0.000642  81%

diff --git a/contrib/all-revsets.txt b/contrib/all-revsets.txt
--- a/contrib/all-revsets.txt
+++ b/contrib/all-revsets.txt
@@ -21,12 +21,10 @@ parents(20000)
# Used in revision 186fd06283b4
(_intlist('20000\x0020001')) and merge()
# Used in revision 911f5a6579d1
p1(20000)
p2(10000)
-# Used in revision f140d6207cca
-roots(0:tip)
# Used in revision b6dc3b79bb25
0::
# Used in revision faf4f63533ff
bookmark()
# Used in revision 22ba2c0825da
@@ -99,28 +97,37 @@ 0::tip
all()
draft()
::tip
draft() and ::tip
::tip and draft()
-roots(0::tip)
author(lmoscovicz)
author(mpm)
-42:68 and roots(42:tip)
::p1(p1(tip))::
public()
:10000 and public()
:10000 and draft()
-roots((0:tip)::)
(not public() - obsolete())

# The one below is used by rebase
(children(ancestor(tip~5, tip)) and ::(tip~5))::

# those two `roots(...)` inputs are close to what phase movement use.
roots((tip~100::) - (tip~100::tip))
roots((0::) - (0::tip))

+# more roots testing
+roots(tip~100:)
+roots(:42)
+roots(not public())
+roots((0:tip)::)
+roots(0::tip)
+42:68 and roots(42:tip)
+# Used in revision f140d6207cca
+roots(0:tip)
+# test disjoint set with multiple roots
+roots((:42) + (tip~42:))
+
# Testing the behavior of "head()" in various situations
diff --git a/mercurial/revset.py b/mercurial/revset.py
--- a/mercurial/revset.py
+++ b/mercurial/revset.py
@@ -1753,13 +1753,17 @@ def reverse(repo, subset, x):
def roots(repo, subset, x):
"""``roots(set)``
Changesets in set with no parent changeset in set.
"""
s = getset(repo, fullreposet(repo), x)
-    subset = subset & s# baseset([r for r in s if r in subset])
-    cs = _children(repo, subset, s)
-    return subset - cs
+    parents = repo.changelog.parentrevs
+    def filter(r):
+        for p in parents(r):
+            if 0 <= p and p in s:
+                return False
+        return True
+    return subset & s.filter(filter)

def sort(repo, subset, x):
"""``sort(set[, [-]key...])``
Sort set by keys. The default sort order is ascending, specify a key
as ``-key`` to sort in descending order.
```