[PATCH 4 of 9 "] discovery: use a lower level but faster way to retrieve parents

Yuya Nishihara yuya at tcha.org
Thu Mar 7 18:09:41 EST 2019


On Thu, 7 Mar 2019 10:03:54 -0800, Gregory Szorc wrote:
> On Thu, Mar 7, 2019 at 9:54 AM Gregory Szorc <gregory.szorc at gmail.com>
> wrote:
> 
> >
> > On Thu, Mar 7, 2019 at 4:23 AM Pierre-Yves David <
> > pierre-yves.david at ens-lyon.org> wrote:
> >
> >>
> >>
> >> On 3/6/19 9:35 PM, Yuya Nishihara wrote:
> >> > On Tue, 05 Mar 2019 18:39:15 +0100, Pierre-Yves David wrote:
> >> >> # HG changeset patch
> >> >> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> >> >> # Date 1551311787 -3600
> >> >> #      Thu Feb 28 00:56:27 2019 +0100
> >> >> # Node ID ffd98d208aa7f92e13bf233b6d752cd2d292ebbe
> >> >> # Parent  82035c1d714f8f3911632ea1271002745fc620f4
> >> >> # EXP-Topic discovery-speedup
> >> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/
> >> -r ffd98d208aa7
> >> >> discovery: use a lower level but faster way to retrieve parents
> >> >
> >> >> diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
> >> >> --- a/mercurial/setdiscovery.py
> >> >> +++ b/mercurial/setdiscovery.py
> >> >> @@ -165,6 +165,12 @@ class partialdiscovery(object):
> >> >>           # common.bases and all its ancestors
> >> >>           return self._common.basesheads()
> >> >>
> >> >> +    def _parentsgetter(self):
> >> >> +        getrev = self._repo.changelog.index.__getitem__
> >> >> +        def getparents(r):
> >> >> +            return getrev(r)[5:6]
> >> >> +        return getparents
> >> >
> >> > Queued, but it's probably better to use revlog._uncheckedparentrevs()
> >> than
> >> > peeking a raw index entry. Can you send a follow up?
> >>
> >> Unfortunately, the overhead of _uncheckedparentrevs is still quite
> >> significant. (Attribute access and exception handling)
> >>
> >> before:               66.831393 seconds (median of 3)
> >> my-change:            49.357107 seconds (median of 3)
> >> _uncheckedparentrevs: 58.941629 seconds (median of 3)
> >>
> >
> > These timings are bonkers and I am quite surprised that attribute access
> > and exception handling is adding that much overhead! Are we sure there
> > isn't a `repo.changelog` in a tight loop? Or maybe we're being grossly
> > inefficient about DAG traversals and missing some caching opportunities?
> >
> > I agree with Yuya that accessing the raw index entries is undesirable.
> > This extremely low-level interface will almost certainly not be supported
> > when we establish a formal storage interface for the changelog. If there
> > are performance concerns here, we *really* need an inline comment to that
> > effect. And it would be great if it pointed to a perf* command that could
> > help measure.
> >
> > I'll accept this. But without an inline comment and a documented way to
> > reproduce the performance problem, there's a very good chance we regress
> > this optimization when establishing a formal interface for changelog.
> > Please follow up to help mitigate this possibility.
> >
> 
> ... and the very next changeset in this series (c98420914c1 on committed)
> says "on our private pathological case, this speeds things up from about 53
> seconds to about 7.5 seconds." And 5baf06d2bb4 drops to 6.3s. So it was
> indeed a very inefficient algorithm plus lack of caching contributing to
> extreme amounts of overhead.
> 
> Is this micro optimization needed by the end state of the series? If not,
> perhaps it should be reverted or my original comment asking for a follow-up
> should be ignored. Whatever the outcome, a perf* command or more details on
> how to simulate this performance problem (such as the number of
> changesets/heads involved) would really help. Even 6.3s for discovery feels
> a bit steep and if we're serious about scaling, we need to be presented
> with esoteric use cases like your private repo to consider alternate ways
> of discovery.

I'm also curious about that.

FWIW, if the cost of running Python matters, maybe we can slightly optimize
_uncheckedparentrevs() assuming the given rev is a valid revlog revision.

In [13]: dis.dis(getparents)
  4           0 LOAD_DEREF               0 (getrev)
              3 LOAD_FAST                0 (r)
              6 CALL_FUNCTION            1
              9 LOAD_CONST               1 (5)
             12 LOAD_CONST               2 (6)
             15 SLICE+3
             16 RETURN_VALUE

Micro optimized: return self.index[rev][5:6]

[4]: dis.dis(repo.changelog._uncheckedparentrevs)
715           0 LOAD_FAST                0 (self)
              3 LOAD_ATTR                0 (index)
                                  # extra dict lookup compared to getparents()
              6 LOAD_FAST                1 (rev)
              9 BINARY_SUBSCR
             10 LOAD_CONST               1 (5)
             13 LOAD_CONST               2 (6)
             16 SLICE+3
             17 RETURN_VALUE

Original:
In [9]: dis.dis(repo.changelog._uncheckedparentrevs)
704           0 SETUP_EXCEPT            17 (to 20)

705           3 LOAD_FAST                0 (self)
              6 LOAD_ATTR                0 (index)
              9 LOAD_FAST                1 (rev)
             12 BINARY_SUBSCR
             13 STORE_FAST               2 (entry)
             16 POP_BLOCK
             17 JUMP_FORWARD            44 (to 64)

706     >>   20 DUP_TOP
             21 LOAD_GLOBAL              1 (IndexError)
             24 COMPARE_OP              10 (exception match)
             27 POP_JUMP_IF_FALSE       63
             30 POP_TOP
             31 POP_TOP
             32 POP_TOP
707          33 LOAD_FAST                1 (rev)
             36 LOAD_GLOBAL              2 (wdirrev)
             39 COMPARE_OP               2 (==)
             42 POP_JUMP_IF_FALSE       57
708          45 LOAD_GLOBAL              3 (error)
             48 LOAD_ATTR                4 (WdirUnsupported)
             51 RAISE_VARARGS            1
             54 JUMP_FORWARD             0 (to 57)
709     >>   57 RAISE_VARARGS            0
             60 JUMP_FORWARD             1 (to 64)
        >>   63 END_FINALLY

711     >>   64 LOAD_FAST                2 (entry)
             67 LOAD_CONST               1 (5)
             70 BINARY_SUBSCR
             71 LOAD_FAST                2 (entry)
             74 LOAD_CONST               2 (6)
             77 BINARY_SUBSCR
             78 BUILD_TUPLE              2
             81 RETURN_VALUE


More information about the Mercurial-devel mailing list