[PATCH STABLE] revset: force immediate revset evaluation for roots() (issue4313)

Pierre-Yves David pierre-yves.david at ens-lyon.org
Thu Jul 24 04:39:33 CDT 2014



On 07/24/2014 04:09 AM, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1406167462 25200
> #      Wed Jul 23 19:04:22 2014 -0700
> # Branch stable
> # Node ID da352529e4b1b99c313b77058d259535b96b9f2f
> # Parent  54ff2789d75e0662a536b3da61ca25c07436966b
> revset: force immediate revset evaluation for roots() (issue4313)
>
> Lazy revsets make roots() extremely slow for large inputs. The
> underlying reason is that baseset.__sub__ creates a lazyset and the
> subsequent evaluation of that lazyset results in back-and-forth
> evaluation between the 2 sets, leading to an explosion of function
> evaluations and drastic slowdown. This is possibly a regression
> from dd716807fd23.
>
> On the author's machine, this slowdown manifested during evaluation of
> 'roots(%ln::)' in phases.retractboundary after unbundling the Firefox
> repository. Using `time hg unbundle firefox.hg` as a benchmark:
>
> Before: 8:00
> After:  4:28
> Delta: -3:32

Please have a look at contrib/revsetbenchmarks.{py,txt}

Such regression should be added to the benchmark list (if not already 
present) and output in your changeset description should be extract from 
the script run.

>
> For reference, the subset and cs baseset instances impacted by this
> change were of lengths 193634 and 193627, respectively.
>
> The proper fix for this issue is to make baseset.__sub__ more intelligent.
> However, the patch author is not super familiar with this code, would like
> a fix for a bad performance regression to make the next stable release,
> is concerned about a related change in dd716807fd23 and unintended
> consequences of changing baseset.__sub__ and believes that this
> targeted approach is the safest for right now.
>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -1474,9 +1474,13 @@ def roots(repo, subset, x):
>       """
>       s = getset(repo, spanset(repo), x).set()
>       subset = baseset([r for r in s if r in subset.set()])
>       cs = _children(repo, subset, s)
> -    return subset - cs
> +    # This casting was added for performance reasons because __sub__ on 2
> +    # baseset will create lazyset and the subsequent evaluation appears to
> +    # be O(n^2). This hack should be removed if baseset.__sub__ gains the
> +    # smarts to react intelligently to 2 baseset instances.
> +    return baseset(subset.set() - cs.set())
>
>   def secret(repo, subset, x):
>       """``secret()``
>       Changeset in secret phase."""
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list