[PATCH 5 of 5] obsolete: order of magnitude speedup in _computebumpedset
Augie Fackler
raf at durin42.com
Wed Jan 1 17:18:59 CST 2014
On Mon, Dec 23, 2013 at 04:32:22PM -0800, pierre-yves.david at ens-lyon.org wrote:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at ens-lyon.org>
> # Date 1387841391 28800
> # Mon Dec 23 15:29:51 2013 -0800
> # Node ID 41a5810bb9cb062cd10f6a12b261fdd66fab8aed
> # Parent 1677c9118fd971197916b7f37177b2a5c03ad802
> obsolete: order of magnitude speedup in _computebumpedset
queueing these, thanks. I may make some English tweaks depending on my
motivation level.
>
> Reminder: a changeset is said to be "bumped" if it tries to obsolete an
> immutable changeset.
>
>
> The previous algorithm for computing bumped changesets was:
>
> 1) Get all public changesets
> 2) Find all their successors
> 3) Search for changesets that are eligible for being "bumped"
> (mutable and non-obsolete)
>
> The entry size of this algorithm is `O(len(public))`, which is mostly the same
> as `O(len(repo))`. Even though this approach means fewer obsolescence markers
> are traversed, it is not very scalable.
>
> The new algorithm is:
>
> 1) For each potentially bumped changeset (mutable and non-obsolete)
> 2) iterate over its precursors
> 3) if a precursor is public, the changeset is bumped
>
> We traverse more obsolescence markers, but the entry size is much smaller
> since the number of potentially bumped changesets should remain mostly stable
> over time: `O(1)`.
>
> On a confidential gigantic repo this moves the bumped computation from 15.19s
> to 0.46s (a ×33 speedup…). On "smaller" repos (mercurial, cubicweb's review)
> no significant gain was seen: the additional traversal of obsolescence markers
> probably counterbalances the advantage.
>
> Other optimisations could be done in the future (eg: sharing the precursors
> cache with divergence detection)
>
> diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
> --- a/mercurial/obsolete.py
> +++ b/mercurial/obsolete.py
> @@ -82,10 +82,11 @@ The header is followed by the markers. E
> additional encoding. Keys cannot contain '\0' or ':' and values
> cannot contain '\0'.
> """
> import struct
> import util, base85, node
> +import phases
> from i18n import _
>
> _pack = struct.pack
> _unpack = struct.unpack
>
> @@ -784,18 +785,30 @@ def _computeextinctset(repo):
>
>
> @cachefor('bumped')
> def _computebumpedset(repo):
> """the set of revs trying to obsolete public revisions"""
> - # get all possible bumped changesets
> - tonode = repo.changelog.node
> - publicnodes = (tonode(r) for r in repo.revs('public()'))
> - successors = allsuccessors(repo.obsstore, publicnodes,
> - ignoreflags=bumpedfix)
> - # revision public or already obsolete don't count as bumped
> - query = '%ld - obsolete() - public()'
> - return set(repo.revs(query, _knownrevs(repo, successors)))
> + bumped = set()
> + # utility functions (avoid attribute lookups in the loop)
> + phase = repo._phasecache.phase # would be faster to grab the full list
> + public = phases.public
> + cl = repo.changelog
> + torev = cl.nodemap.get
> + obs = getrevs(repo, 'obsolete')
> + for rev in repo:
> + # We only evaluate mutable, non-obsolete revisions
> + if (public < phase(repo, rev)) and (rev not in obs):
> + node = cl.node(rev)
> + # (future) A cache of precursors may be worth it if splits are very common
> + for pnode in allprecursors(repo.obsstore, [node],
> + ignoreflags=bumpedfix):
> + prev = torev(pnode) # unfiltered! but so is phasecache
> + if (prev is not None) and (phase(repo, prev) <= public):
> + # we have a public precursor
> + bumped.add(rev)
> + break # Next draft!
> + return bumped
>
> @cachefor('divergent')
> def _computedivergentset(repo):
> """the set of rev that compete to be the final successors of some revision.
> """
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel