[PATCH 4 of 9 "] discovery: use a lower level but faster way to retrieve parents

Pierre-Yves David pierre-yves.david at ens-lyon.org
Thu Mar 7 13:05:28 EST 2019



On 3/7/19 6:54 PM, Gregory Szorc wrote:
> 
> On Thu, Mar 7, 2019 at 4:23 AM Pierre-Yves David 
> <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>> 
> wrote:
> 
> 
> 
>     On 3/6/19 9:35 PM, Yuya Nishihara wrote:
>      > On Tue, 05 Mar 2019 18:39:15 +0100, Pierre-Yves David wrote:
>      >> # HG changeset patch
>      >> # User Pierre-Yves David <pierre-yves.david at octobus.net
>     <mailto:pierre-yves.david at octobus.net>>
>      >> # Date 1551311787 -3600
>      >> #      Thu Feb 28 00:56:27 2019 +0100
>      >> # Node ID ffd98d208aa7f92e13bf233b6d752cd2d292ebbe
>      >> # Parent  82035c1d714f8f3911632ea1271002745fc620f4
>      >> # EXP-Topic discovery-speedup
>      >> # Available At https://bitbucket.org/octobus/mercurial-devel/
>      >> #              hg pull
>     https://bitbucket.org/octobus/mercurial-devel/ -r ffd98d208aa7
>      >> discovery: use a lower level but faster way to retrieve parents
>      >
>      >> diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
>      >> --- a/mercurial/setdiscovery.py
>      >> +++ b/mercurial/setdiscovery.py
>      >> @@ -165,6 +165,12 @@ class partialdiscovery(object):
>      >>           # common.bases and all its ancestors
>      >>           return self._common.basesheads()
>      >>
>      >> +    def _parentsgetter(self):
>      >> +        getrev = self._repo.changelog.index.__getitem__
>      >> +        def getparents(r):
>      >> +            return getrev(r)[5:6]
>      >> +        return getparents
>      >
>      > Queued, but it's probably better to use
>     revlog._uncheckedparentrevs() than
>      > peeking a raw index entry. Can you send a follow up?
> 
>     Unfortunately, the overhead of _uncheckedparentrevs is still quite
>     significant. (Attribute access and exception handling)
> 
>     before:               66.831393 seconds (median of 3)
>     my-change:            49.357107 seconds (median of 3)
>     _uncheckedparentrevs: 58.941629 seconds (median of 3)
> 
> 
> These timings are bonkers and I am quite surprised that attribute access 
> and exception handling is adding that much overhead! Are we sure there 
> isn't a `repo.changelog` in a tight loop? Or maybe we're being grossly 
> inefficient about DAG traversals and missing some caching opportunities?

I used the very same code has the current changesets but only changed 
the _parentgetter function to:

+    def _parentsgetter(self):
+        return self._repo.changelog._uncheckedparentrevs
+

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list