[PATCH] annotate: pre-calculate the "needed" dictionary

Jun Wu quark at fb.com
Sun Sep 4 16:50:25 EDT 2016


Good catch about both issues! I will prepare a V2.

Excerpts from Yuya Nishihara's message of 2016-09-05 00:21:45 +0900:
> On Fri, 2 Sep 2016 16:10:30 +0100, Jun Wu wrote:
> > # HG changeset patch
> > # User Jun Wu <quark at fb.com>
> > # Date 1472826059 -3600
> > #      Fri Sep 02 15:20:59 2016 +0100
> > # Node ID c9a9040c95add098eb8e280fa60edaa766154141
> > # Parent  d130a38ef41f3c9e2d2f26df3535d89a93f87301
> > # Available At https://bitbucket.org/quark-zju/hg-draft 
> > #              hg pull https://bitbucket.org/quark-zju/hg-draft  -r c9a9040c95ad
> > annotate: pre-calculate the "needed" dictionary
> > 
> > The "needed" dict is used as a reference counter to free items in the giant
> > "hist" dict. However, currently it is not very accurate and can lead to
> > dropping "hist" items unnecessarily, for example, with the following DAG,
> > 
> >       -3-
> >      /   \
> >   --1--2--4--
> > 
> > The current algorithm will visit and calculate rev 1 twice, undesired.
> 
> Also that would generate wrong result probably because hist[f] was deleted too
> early and pcache[f] was emptied. Compare "hg annotate mercurial/commands.py"
> for example.
> 
> Can you add a test case for that? This patch appears to fix old bug, so I think
> it's good for stable.
> 
> > +        # 1st DFS pre-calculates pcache and needed
> >          visit = [base]
> > -        hist = {}
> >          pcache = {}
> >          needed = {base: 1}
> >          while visit:
> >              f = visit[-1]
> > -            pcached = f in pcache
> > -            if not pcached:
> > -                pcache[f] = parents(f)
> > +            visit.pop()
> 
> Nit: could be f = visit.pop()
> 
> > +            pl = parents(f)
> > +            pcache[f] = pl
> > +            for p in pl:
> > +                if p not in needed:
> > +                    needed[p] = 1
> > +                else:
> > +                    needed[p] += 1
> > +                if p not in pcache:
> > +                    visit.append(p)
> 
> There can be more than one route to 'p' from 'visit[]', so 'p not in pcache'
> isn't enough to avoid unneeded calculation of 'parents(f)', and needed[p] can
> be over-counted.


More information about the Mercurial-devel mailing list