[PATCH 4 of 9 "] discovery: use a lower level but faster way to retrieve parents

Gregory Szorc gregory.szorc at gmail.com
Thu Mar 7 12:54:11 EST 2019


On Thu, Mar 7, 2019 at 4:23 AM Pierre-Yves David <
pierre-yves.david at ens-lyon.org> wrote:

>
>
> On 3/6/19 9:35 PM, Yuya Nishihara wrote:
> > On Tue, 05 Mar 2019 18:39:15 +0100, Pierre-Yves David wrote:
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> >> # Date 1551311787 -3600
> >> #      Thu Feb 28 00:56:27 2019 +0100
> >> # Node ID ffd98d208aa7f92e13bf233b6d752cd2d292ebbe
> >> # Parent  82035c1d714f8f3911632ea1271002745fc620f4
> >> # EXP-Topic discovery-speedup
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
> -r ffd98d208aa7
> >> discovery: use a lower level but faster way to retrieve parents
> >
> >> diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
> >> --- a/mercurial/setdiscovery.py
> >> +++ b/mercurial/setdiscovery.py
> >> @@ -165,6 +165,12 @@ class partialdiscovery(object):
> >>           # common.bases and all its ancestors
> >>           return self._common.basesheads()
> >>
> >> +    def _parentsgetter(self):
> >> +        getrev = self._repo.changelog.index.__getitem__
> >> +        def getparents(r):
> >> +            return getrev(r)[5:6]
> >> +        return getparents
> >
> > Queued, but it's probably better to use revlog._uncheckedparentrevs()
> than
> > peeking a raw index entry. Can you send a follow up?
>
> Unfortunately, the overhead of _uncheckedparentrevs is still quite
> significant. (Attribute access and exception handling)
>
> before:               66.831393 seconds (median of 3)
> my-change:            49.357107 seconds (median of 3)
> _uncheckedparentrevs: 58.941629 seconds (median of 3)
>

These timings are bonkers and I am quite surprised that attribute access
and exception handling is adding that much overhead! Are we sure there
isn't a `repo.changelog` in a tight loop? Or maybe we're being grossly
inefficient about DAG traversals and missing some caching opportunities?

I agree with Yuya that accessing the raw index entries is undesirable. This
extremely low-level interface will almost certainly not be supported when
we establish a formal storage interface for the changelog. If there are
performance concerns here, we *really* need an inline comment to that
effect. And it would be great if it pointed to a perf* command that could
help measure.

I'll accept this. But without an inline comment and a documented way to
reproduce the performance problem, there's a very good chance we regress
this optimization when establishing a formal interface for changelog.
Please follow up to help mitigate this possibility.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20190307/5fe5e78d/attachment.html>


More information about the Mercurial-devel mailing list