[PATCH evolve-ext] inhibit: improve transaction marker perf
Durham Goode
durham at fb.com
Sun Nov 8 10:52:36 CST 2015
On 11/8/15 8:46 AM, Gregory Szorc wrote:
> On Sun, Nov 8, 2015 at 8:38 AM, Durham Goode <durham at fb.com
> <mailto:durham at fb.com>> wrote:
>
>
>
> On 11/8/15 8:28 AM, Gregory Szorc wrote:
>> On Sat, Nov 7, 2015 at 5:20 PM, Durham Goode <durham at fb.com
>> <mailto:durham at fb.com>> wrote:
>>
>> # HG changeset patch
>> # User Durham Goode <durham at fb.com <mailto:durham at fb.com>>
>> # Date 1446945001 28800
>> # Sat Nov 07 17:10:01 2015 -0800
>> # Node ID 7c680f209f7af35c7c476eecc2f9eec13b32ad62
>> # Parent 48547b4c77defdd17c670b1eb0eb94272edf0207
>> inhibit: improve transaction marker perf
>>
>> The old algorithm was a revset "::X and obsolete()". This was
>> inefficient because
>> it requires walking all the way down the ancestor chain
>> (since the revset did
>> not know it could stop walking at public nodes).
>>
>> The new algorithm uses changelog.ancestors() directly and
>> provides a function to
>> stop the iteration when we reach a public node. During a
>> commit to a repo with
>> hundreds of thousands of commits, this change reduces the
>> inhibitmarkers time
>> from 1.5 seconds to effectively 0 seconds.
>>
>> diff --git a/hgext/inhibit.py b/hgext/inhibit.py
>> --- a/hgext/inhibit.py
>> +++ b/hgext/inhibit.py
>> @@ -23,6 +23,7 @@ from mercurial import scmutil
>> from mercurial import commands
>> from mercurial import lock as lockmod
>> from mercurial import bookmarks
>> +from mercurial import phases
>> from mercurial import util
>> from mercurial.i18n import _
>>
>> @@ -129,7 +130,15 @@ def _inhibitmarkers(repo, nodes):
>> if not _inhibitenabled(repo):
>> return
>>
>> - newinhibit = repo.set('::%ln and obsolete()', nodes)
>> + newinhibit = set()
>> + revs = [repo[node].rev() for node in nodes]
>> + phase = repo._phasecache.phase
>> + obsoletes = obsolete.getrevs(repo, 'obsolete')
>> + for anc in repo.changelog.ancestors(revs, inclusive=True,
>> + stopfunc=lambda rev: phase(repo, rev) ==
>> phases.public):
>> + if anc in obsoletes:
>> + newinhibit.add(repo[anc])
>> +
>>
>>
>> I must be missing something obvious: why can't this be written as:
>>
>> obsoletes = ...
>> for rev in repo.revs('::%ln and not public()'):
>> if rev in obsoletes:
>> ...
>>
>> I don't understand why stopfunc is needed and why a revset can't
>> do the same thing.
> The way "::X and not public()" works is it iterates down the
> ancestor line and checks each commit to see if it's not public.
> The iterator has no knowledge that finding something that is
> public means it can stop iterating, so it will iterate over all
> the commits.
>
> In theory we could try to make revsets smarter (i.e. by allowing
> the ancestor revset to communicate to the public revset that it's
> doing a descendant traversal, and allowing the public revset to
> signal a stop), but that's rather complex, and the implementation
> would probably end up adding the stopfunc to lazyancestors anyway
> (since that's how you'd likely implement it).
>
>
> I see. Thanks for explaining. And here I thought all that work in
> recent releases to drastically speed up "not public()" implemented
> things like this because it had to.
"not public()" is optimized, so it's quick. But testing a series of
commits against it is still O(size of series). In theory we could
optimize "::X and not public()" to iterate over the "not public()" set
instead of the ancestors, then test against the ancestor set (which is
cheaper), but that's still O(distance from first not-public commit),
which is still too slow in this case.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20151108/c76a57dd/attachment.html>
More information about the Mercurial-devel
mailing list