[PATCH] revset: improves time complexity of 'roots(xxx)'

Augie Fackler raf at durin42.com
Mon Jun 22 17:55:05 CDT 2015


On Mon, Jun 22, 2015 at 01:51:10PM -0700, Pierre-Yves David wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at fb.com>
> # Date 1434993552 25200
> #      Mon Jun 22 10:19:12 2015 -0700
> # Node ID 6a2c2c13784761bbd8b1c46d2888257d234ae591
> # Parent  7fdd1782fc4ee9da87d8af13e806dc9055db2c38
> revset: improves time complexity of 'roots(xxx)'

Queued, thanks

>
> The canonical way of doing 'roots(X)' is 'X - children(X)'. This is what the
> implementation used to be. However, computing children is expensive because it
> is unbounded. Any changesets in the repository may be a children of '0' so you
> have to look at all changesets in the repository to compute children(0).
> Moreover the current revsets implementation for children is not lazy, leading to
> bad performance when fetching the first result.
>
>
> There is a more restricted algorithm to compute roots:
>
>     roots(X) = [r for r in X if not parents(r) & X]
>
> This achieve the same result while only looking for parent/children relation in
> the X set itself, making the algorithm 'O(len(X))' membership operation.
> Another advantages is that it turns the check into a simple filter, preserving
> all laziness property of the underlying revsets.
>
> The speed is very significant and some lazyness is restored.
>
> -) revset without 'roots(...)' to compare to base line
> 0) before this change
> 1) after this change
>
> revset #0: roots((tip~100::) - (tip~100::tip))
>    plain         min           last
> -) 0.001082      0.000993      0.000790
> 0) 0.001366      0.001385      0.001339
> 1) 0.001257  92% 0.001028  74% 0.000821  61%
>
> revset #1: roots((0::) - (0::tip))
>    plain         min           last
> -) 0.134551      0.144682      0.068453
> 0) 0.161822      0.171786      0.157683
> 1) 0.137583  85% 0.146204  85% 0.070012  44%
>
> revset #2: roots(tip~100:)
>    plain         min           first         last
> -) 0.000219      0.000225      0.000231      0.000229
> 0) 0.000513      0.000529      0.000507      0.000539
> 1) 0.000463  90% 0.000269  50% 0.000267  52% 0.000463  85%
>
> revset #3: roots(:42)
>    plain         min           first         last
> -) 0.000119      0.000146      0.000146      0.000146
> 0) 0.000231      0.000254      0.000253      0.000260
> 1) 0.000216  93% 0.000186  73% 0.000184  72% 0.000244  93%
>
> revset #4: roots(not public())
>    plain         min           first
> -) 0.000478      0.000502      0.000504
> 0) 0.000611      0.000639      0.000634
> 1) 0.000604      0.000560  87% 0.000558
>
> revset #5: roots((0:tip)::)
>    plain         min           max           first         last
> -) 0.057795      0.004905      0.058260      0.004908      0.038812
> 0) 0.132845      0.118931      0.130306      0.114280      0.127742
> 1) 0.111659  84% 0.005023   4% 0.111658  85% 0.005022   4% 0.092490  72%
>
> revset #6: roots(0::tip)
>    plain         min           max           first         last
> -) 0.032971      0.033947      0.033460      0.032350      0.033125
> 0) 0.083671      0.081953      0.084074      0.080364      0.086069
> 1) 0.074720  89% 0.035547  43% 0.077025  91% 0.033729  41% 0.083197
>
> revset #7: 42:68 and roots(42:tip)
>    plain         min           max           first         last
> -) 0.006827      0.000251      0.006830      0.000254      0.006771
> 0) 0.000337      0.000353      0.000366      0.000350      0.000366
> 1) 0.000318  94% 0.000297  84% 0.000353      0.000293  83% 0.000351
>
> revset #8: roots(0:tip)
>    plain         min           max           first         last
> -) 0.002119      0.000145      0.000147      0.000147      0.000147
> 0) 0.047441      0.040660      0.045662      0.040284      0.043435
> 1) 0.038057  80% 0.000187   0% 0.034919  76% 0.000186   0% 0.035097  80%
>
> revset #0: roots(:42 + tip~42:)
>    plain         min           max           first         last          sort
> -) 0.000321      0.000317      0.000319      0.000308      0.000369      0.000343
> 0) 0.000772      0.000751      0.000811      0.000750      0.000802      0.000783
> 1) 0.000632  81% 0.000369  49% 0.000617  76% 0.000358  47% 0.000601  74% 0.000642  81%
>
> diff --git a/contrib/all-revsets.txt b/contrib/all-revsets.txt
> --- a/contrib/all-revsets.txt
> +++ b/contrib/all-revsets.txt
> @@ -21,12 +21,10 @@ parents(20000)
>  # Used in revision 186fd06283b4
>  (_intlist('20000\x0020001')) and merge()
>  # Used in revision 911f5a6579d1
>  p1(20000)
>  p2(10000)
> -# Used in revision f140d6207cca
> -roots(0:tip)
>  # Used in revision b6dc3b79bb25
>  0::
>  # Used in revision faf4f63533ff
>  bookmark()
>  # Used in revision 22ba2c0825da
> @@ -99,28 +97,37 @@ 0::tip
>  all()
>  draft()
>  ::tip
>  draft() and ::tip
>  ::tip and draft()
> -roots(0::tip)
>  author(lmoscovicz)
>  author(mpm)
> -42:68 and roots(42:tip)
>  ::p1(p1(tip))::
>  public()
>  :10000 and public()
>  :10000 and draft()
> -roots((0:tip)::)
>  (not public() - obsolete())
>
>  # The one below is used by rebase
>  (children(ancestor(tip~5, tip)) and ::(tip~5))::
>
>  # those two `roots(...)` inputs are close to what phase movement use.
>  roots((tip~100::) - (tip~100::tip))
>  roots((0::) - (0::tip))
>
> +# more roots testing
> +roots(tip~100:)
> +roots(:42)
> +roots(not public())
> +roots((0:tip)::)
> +roots(0::tip)
> +42:68 and roots(42:tip)
> +# Used in revision f140d6207cca
> +roots(0:tip)
> +# test disjoint set with multiple roots
> +roots((:42) + (tip~42:))
> +
>  # Testing the behavior of "head()" in various situations
>  head()
>  head() - public()
>  draft() and head()
>  head() and author("mpm")
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -1753,13 +1753,17 @@ def reverse(repo, subset, x):
>  def roots(repo, subset, x):
>      """``roots(set)``
>      Changesets in set with no parent changeset in set.
>      """
>      s = getset(repo, fullreposet(repo), x)
> -    subset = subset & s# baseset([r for r in s if r in subset])
> -    cs = _children(repo, subset, s)
> -    return subset - cs
> +    parents = repo.changelog.parentrevs
> +    def filter(r):
> +        for p in parents(r):
> +            if 0 <= p and p in s:
> +                return False
> +        return True
> +    return subset & s.filter(filter)
>
>  def sort(repo, subset, x):
>      """``sort(set[, [-]key...])``
>      Sort set by keys. The default sort order is ascending, specify a key
>      as ``-key`` to sort in descending order.
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> https://selenic.com/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list