[PATCH 1 of 3] hbisect.get: use simpler code

Tue Sep 20 11:49:11 CDT 2011

Matt, All,

On Tuesday 20 September 2011 01:34:21 Matt Mackall wrote:
> On Tue, 2011-09-20 at 01:16 +0200, Yann E. MORIN wrote:
> > # HG changeset patch
> > # User "Yann E. MORIN" <yann.morin.1998 at anciens.enib.fr>
> > # Date 1316467030 -7200
> > # Node ID 2dcb74f39e48c629ae2a7d68dbc6760b41d9bd15
> > # Parent  353a1ba928f682a0a3e29bf54b90c7cdc2f98fef
> > hbisect.get: use simpler code
> > 
> > Use repo.set() wherever possible, instead of locally trying to
> > reproduce complex graph computations.
> > 
> > Also, use lists instead of sets, they're easier to work with as
> > we need to pass some of them to repo.set() to use the shiny new
> > 'l' format flag (Thx, Matt!).
> > 
> > The previous series went in too fast for this cleanup pass to be
> > included, so here it is. ;-)
> > 
> > Signed-off-by: "Yann E. MORIN" <yann.morin.1998 at anciens.enib.fr>
> > 
> > diff --git a/mercurial/hbisect.py b/mercurial/hbisect.py
> > --- a/mercurial/hbisect.py
> > +++ b/mercurial/hbisect.py
> > @@ -168,36 +168,22 @@
> >          return [repo.changelog.rev(n) for n in state[status]]
> >      else:
> >          # Build sets of good, bad, and skipped csets
> > -        goods = set(repo.changelog.rev(n) for n in state['good'])
> > -        bads  = set(repo.changelog.rev(n) for n in state['bad'])
> > -        skips = set(repo.changelog.rev(n) for n in state['skip'])
> > -
> > -        # Build kinship of good csets
> > -        ga = goods.copy()   # Goods' Ancestors
> > -        gd = goods.copy()   # Goods' Descendants
> > -        for g in goods:
> > -            ga |= set(repo.changelog.ancestors(g))
> > -            gd |= set(repo.changelog.descendants(g))
> > -
> > -        # Build kinship of bad csets
> > -        ba = bads.copy()    # Bads' Ancestors
> > -        bd = bads.copy()    # Bads' Descendants
> > -        for b in bads:
> > -            ba |= set(repo.changelog.ancestors(b))
> > -            bd |= set(repo.changelog.descendants(b))
> > +        goods = get(repo, 'good')
> > +        bads  = get(repo, 'bad')
> > +        skips = get(repo, 'skip')
> >  
> >          # Build the range of the bisection
> > -        range  = set(c for c in ba if c in gd)
> > -        range |= set(c for c in ga if c in bd)
> > +        range  = [r.rev() for r in repo.set('%ld::%ld', goods, bads)]
> > +        range += [r.rev() for r in repo.set('%ld::%ld', bads, goods)]
> 
> In general, it's best to have one big revset query than to piece
> together multiple results. Why? Because the revset engine has an
> optimizer (much like a query planner in a database engine) that can
> avoid doing extra work in a bunch of cases.
> 
> >          if status == 'range':
> >              return [c for c in range]
> >          elif status == 'pruned':
> >              # We do not want skipped csets that are out-of-range
> > -            return [c for c in range if c in (goods | bads | skips)]
> > +            return [c for c in range if c in (goods + bads + skips)]
> 
> ..and by the time we've gotten here, we've already potentially created a
> huge set of context objects and then turned them into ints when we could
> have just had a small set of contexts in the first place. So each of
> these should be -a whole self-contained revset expression-.

OK, I see what you mean.

I have changed the code to use repo.set() as much as possible, and it is
indeed faster in most cases. I'm testing with the DAG in test-bisect2.t,
so it is a rather small one, but I noticed some 2s -> 0.3s speed up to
compute the 'untested' set.

In some cases, though, the template passed to repo.set() is really complex,
so I'll try to use the %s format to use pretty-looking calls.

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'