[PATCH 1 of 3] hbisect.get: use simpler code

Mon Sep 19 18:34:21 CDT 2011

On Tue, 2011-09-20 at 01:16 +0200, Yann E. MORIN wrote:
> # HG changeset patch
> # User "Yann E. MORIN" <yann.morin.1998 at anciens.enib.fr>
> # Date 1316467030 -7200
> # Node ID 2dcb74f39e48c629ae2a7d68dbc6760b41d9bd15
> # Parent  353a1ba928f682a0a3e29bf54b90c7cdc2f98fef
> hbisect.get: use simpler code
> 
> Use repo.set() wherever possible, instead of locally trying to
> reproduce complex graph computations.
> 
> Also, use lists instead of sets, they're easier to work with as
> we need to pass some of them to repo.set() to use the shiny new
> 'l' format flag (Thx, Matt!).
> 
> The previous series went in too fast for this cleanup pass to be
> included, so here it is. ;-)
> 
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998 at anciens.enib.fr>
> 
> diff --git a/mercurial/hbisect.py b/mercurial/hbisect.py
> --- a/mercurial/hbisect.py
> +++ b/mercurial/hbisect.py
> @@ -168,36 +168,22 @@
>          return [repo.changelog.rev(n) for n in state[status]]
>      else:
>          # Build sets of good, bad, and skipped csets
> -        goods = set(repo.changelog.rev(n) for n in state['good'])
> -        bads  = set(repo.changelog.rev(n) for n in state['bad'])
> -        skips = set(repo.changelog.rev(n) for n in state['skip'])
> -
> -        # Build kinship of good csets
> -        ga = goods.copy()   # Goods' Ancestors
> -        gd = goods.copy()   # Goods' Descendants
> -        for g in goods:
> -            ga |= set(repo.changelog.ancestors(g))
> -            gd |= set(repo.changelog.descendants(g))
> -
> -        # Build kinship of bad csets
> -        ba = bads.copy()    # Bads' Ancestors
> -        bd = bads.copy()    # Bads' Descendants
> -        for b in bads:
> -            ba |= set(repo.changelog.ancestors(b))
> -            bd |= set(repo.changelog.descendants(b))
> +        goods = get(repo, 'good')
> +        bads  = get(repo, 'bad')
> +        skips = get(repo, 'skip')
>  
>          # Build the range of the bisection
> -        range  = set(c for c in ba if c in gd)
> -        range |= set(c for c in ga if c in bd)
> +        range  = [r.rev() for r in repo.set('%ld::%ld', goods, bads)]
> +        range += [r.rev() for r in repo.set('%ld::%ld', bads, goods)]

In general, it's best to have one big revset query than to piece
together multiple results. Why? Because the revset engine has an
optimizer (much like a query planner in a database engine) that can
avoid doing extra work in a bunch of cases.

>          if status == 'range':
>              return [c for c in range]
>          elif status == 'pruned':
>              # We do not want skipped csets that are out-of-range
> -            return [c for c in range if c in (goods | bads | skips)]
> +            return [c for c in range if c in (goods + bads + skips)]

..and by the time we've gotten here, we've already potentially created a
huge set of context objects and then turned them into ints when we could
have just had a small set of contexts in the first place. So each of
these should be -a whole self-contained revset expression-.

>          elif status == 'untested':
>              # Return the csets in range that are not pruned
> -            return [c for c in range if not c in (goods | bads | skips)]
> +            return [c for c in range if not c in (goods + bads + skips)]
>  
>          else:
>              raise error.ParseError(_('invalid bisect state'))
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel

-- 
Mathematics is the supreme nostalgia of our time.