[PATCH evolve-ext] inhibit: improve transaction marker perf

Durham Goode durham at fb.com
Sun Nov 8 10:38:00 CST 2015



On 11/8/15 8:28 AM, Gregory Szorc wrote:
> On Sat, Nov 7, 2015 at 5:20 PM, Durham Goode <durham at fb.com 
> <mailto:durham at fb.com>> wrote:
>
>     # HG changeset patch
>     # User Durham Goode <durham at fb.com <mailto:durham at fb.com>>
>     # Date 1446945001 28800
>     #      Sat Nov 07 17:10:01 2015 -0800
>     # Node ID 7c680f209f7af35c7c476eecc2f9eec13b32ad62
>     # Parent  48547b4c77defdd17c670b1eb0eb94272edf0207
>     inhibit: improve transaction marker perf
>
>     The old algorithm was a revset "::X and obsolete()". This was
>     inefficient because
>     it requires walking all the way down the ancestor chain (since the
>     revset did
>     not know it could stop walking at public nodes).
>
>     The new algorithm uses changelog.ancestors() directly and provides
>     a function to
>     stop the iteration when we reach a public node. During a commit to
>     a repo with
>     hundreds of thousands of commits, this change reduces the
>     inhibitmarkers time
>     from 1.5 seconds to effectively 0 seconds.
>
>     diff --git a/hgext/inhibit.py b/hgext/inhibit.py
>     --- a/hgext/inhibit.py
>     +++ b/hgext/inhibit.py
>     @@ -23,6 +23,7 @@ from mercurial import scmutil
>      from mercurial import commands
>      from mercurial import lock as lockmod
>      from mercurial import bookmarks
>     +from mercurial import phases
>      from mercurial import util
>      from mercurial.i18n import _
>
>     @@ -129,7 +130,15 @@ def _inhibitmarkers(repo, nodes):
>          if not _inhibitenabled(repo):
>              return
>
>     -    newinhibit = repo.set('::%ln and obsolete()', nodes)
>     +    newinhibit = set()
>     +    revs = [repo[node].rev() for node in nodes]
>     +    phase = repo._phasecache.phase
>     +    obsoletes = obsolete.getrevs(repo, 'obsolete')
>     +    for anc in repo.changelog.ancestors(revs, inclusive=True,
>     +                stopfunc=lambda rev: phase(repo, rev) ==
>     phases.public):
>     +        if anc in obsoletes:
>     +            newinhibit.add(repo[anc])
>     +
>
>
> I must be missing something obvious: why can't this be written as:
>
> obsoletes = ...
> for rev in repo.revs('::%ln and not public()'):
>     if rev in obsoletes:
>         ...
>
> I don't understand why stopfunc is needed and why a revset can't do 
> the same thing.
The way "::X and not public()" works is it iterates down the ancestor 
line and checks each commit to see if it's not public. The iterator has 
no knowledge that finding something that is public means it can stop 
iterating, so it will iterate over all the commits.

In theory we could try to make revsets smarter (i.e. by allowing the 
ancestor revset to communicate to the public revset that it's doing a 
descendant traversal, and allowing the public revset to signal a stop), 
but that's rather complex, and the implementation would probably end up 
adding the stopfunc to lazyancestors anyway (since that's how you'd 
likely implement it).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20151108/29f91981/attachment.html>


More information about the Mercurial-devel mailing list