[PATCH] discovery: be more conservative when adjusting the sample size

Yuya Nishihara yuya at tcha.org
Wed Jun 5 19:33:36 EDT 2019


On Wed, 5 Jun 2019 17:02:36 +0200, Pierre-Yves David wrote:
> 
> 
> On 6/5/19 2:37 PM, Yuya Nishihara wrote:
> > On Wed, 05 Jun 2019 12:36:03 +0200, Pierre-Yves David wrote:
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> >> # Date 1559726605 -7200
> >> #      Wed Jun 05 11:23:25 2019 +0200
> >> # Node ID 2c67430451fafcdd68770436c520e2d008428986
> >> # Parent  12793787439538411013edffe0f9b98762d38a37
> >> # EXP-Topic discovery-large-undecided
> >> # Available At https://bitbucket.org/octobus/mercurial-devel/
> >> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 2c67430451fa
> >> discovery: be more conservative when adjusting the sample size
> >>
> >> Since 5b34972a0094, the discovery will increase the sample size when it detect a
> >> "complex" undecided set. However this detection focussed on the number of roots
> >> only, this could regress discovery performance when the undecided set has many
> >> roots that eventually get merged into a few heads.
> >>
> >> To prevent such misbehavior, we adjust the logic to take in account both heads
> >> and roots. The sample size will be increased only if both are especially large.
> >>
> >> Performance testing on the same case as 5b34972a0094, does not show a
> >> significant difference.
> >>
> >> diff --git a/mercurial/setdiscovery.py b/mercurial/setdiscovery.py
> >> --- a/mercurial/setdiscovery.py
> >> +++ b/mercurial/setdiscovery.py
> >> @@ -236,14 +236,19 @@ class partialdiscovery(object):
> >>           sample = set(repo.revs('heads(%ld)', revs))
> >>           parentrevs = self._parentsgetter()
> >>   
> >> +        ### update from both directions
> >> +        revsroots = set(repo.revs('roots(%ld)', revs))
> >> +        nbroots = len(revsroots)
> >> +        revsheads = sample.copy()
> >> +        nbheads = len(revsheads)
> >> +
> >> +        if not self._respectsize:
> >> +            size = max(size, min(nbroots, nbheads))
> >> +
> >>           # update from heads
> >> -        revsheads = sample.copy()
> >>           _updatesample(revs, revsheads, sample, parentrevs)
> >>   
> >>           # update from roots
> >> -        revsroots = set(repo.revs('roots(%ld)', revs))
> >> -        if not self._respectsize:
> >> -            size = max(size, len(revsroots))
> > 
> > Looks good, but any reason to move these codes around? IIUC, _updatesample()
> > never mutates revsheads.
> 
> The update to size need to happen before both `_updatesample(…,revheads, 
> …)` and `_updatesample(…,revroots, …)` so the code has to move higher. 
> For consistency I am gathering them all in one place.
> 
> Does this answer your question ?

Not really. The size isn't used until _limitsample(). Anyway, that isn't
important, but I felt some orphaned comment looked weird.


More information about the Mercurial-devel mailing list