[PATCH 1 of 3] hbisect.get: use simpler code
Yann E. MORIN
yann.morin.1998 at anciens.enib.fr
Tue Sep 20 11:49:11 CDT 2011
Matt, All,
On Tuesday 20 September 2011 01:34:21 Matt Mackall wrote:
> On Tue, 2011-09-20 at 01:16 +0200, Yann E. MORIN wrote:
> > # HG changeset patch
> > # User "Yann E. MORIN" <yann.morin.1998 at anciens.enib.fr>
> > # Date 1316467030 -7200
> > # Node ID 2dcb74f39e48c629ae2a7d68dbc6760b41d9bd15
> > # Parent 353a1ba928f682a0a3e29bf54b90c7cdc2f98fef
> > hbisect.get: use simpler code
> >
> > Use repo.set() wherever possible, instead of locally trying to
> > reproduce complex graph computations.
> >
> > Also, use lists instead of sets, they're easier to work with as
> > we need to pass some of them to repo.set() to use the shiny new
> > 'l' format flag (Thx, Matt!).
> >
> > The previous series went in too fast for this cleanup pass to be
> > included, so here it is. ;-)
> >
> > Signed-off-by: "Yann E. MORIN" <yann.morin.1998 at anciens.enib.fr>
> >
> > diff --git a/mercurial/hbisect.py b/mercurial/hbisect.py
> > --- a/mercurial/hbisect.py
> > +++ b/mercurial/hbisect.py
> > @@ -168,36 +168,22 @@
> > return [repo.changelog.rev(n) for n in state[status]]
> > else:
> > # Build sets of good, bad, and skipped csets
> > - goods = set(repo.changelog.rev(n) for n in state['good'])
> > - bads = set(repo.changelog.rev(n) for n in state['bad'])
> > - skips = set(repo.changelog.rev(n) for n in state['skip'])
> > -
> > - # Build kinship of good csets
> > - ga = goods.copy() # Goods' Ancestors
> > - gd = goods.copy() # Goods' Descendants
> > - for g in goods:
> > - ga |= set(repo.changelog.ancestors(g))
> > - gd |= set(repo.changelog.descendants(g))
> > -
> > - # Build kinship of bad csets
> > - ba = bads.copy() # Bads' Ancestors
> > - bd = bads.copy() # Bads' Descendants
> > - for b in bads:
> > - ba |= set(repo.changelog.ancestors(b))
> > - bd |= set(repo.changelog.descendants(b))
> > + goods = get(repo, 'good')
> > + bads = get(repo, 'bad')
> > + skips = get(repo, 'skip')
> >
> > # Build the range of the bisection
> > - range = set(c for c in ba if c in gd)
> > - range |= set(c for c in ga if c in bd)
> > + range = [r.rev() for r in repo.set('%ld::%ld', goods, bads)]
> > + range += [r.rev() for r in repo.set('%ld::%ld', bads, goods)]
>
> In general, it's best to have one big revset query than to piece
> together multiple results. Why? Because the revset engine has an
> optimizer (much like a query planner in a database engine) that can
> avoid doing extra work in a bunch of cases.
>
> > if status == 'range':
> > return [c for c in range]
> > elif status == 'pruned':
> > # We do not want skipped csets that are out-of-range
> > - return [c for c in range if c in (goods | bads | skips)]
> > + return [c for c in range if c in (goods + bads + skips)]
>
> ..and by the time we've gotten here, we've already potentially created a
> huge set of context objects and then turned them into ints when we could
> have just had a small set of contexts in the first place. So each of
> these should be -a whole self-contained revset expression-.
OK, I see what you mean.
I have changed the code to use repo.set() as much as possible, and it is
indeed faster in most cases. I'm testing with the DAG in test-bisect2.t,
so it is a rather small one, but I noticed some 2s -> 0.3s speed up to
compute the 'untested' set.
In some cases, though, the template passed to repo.set() is really complex,
so I'll try to use the %s format to use pretty-looking calls.
Regards,
Yann E. MORIN.
--
.-----------------.--------------------.------------------.--------------------.
| Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ |
| +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. |
'------------------------------^-------^------------------^--------------------'
More information about the Mercurial-devel
mailing list