[PATCH] graphmod: restore generator nature of dagwalker

Idan Kamara idankk86 at gmail.com
Sat Apr 30 07:11:01 CDT 2011


# HG changeset patch
# User Idan Kamara <idankk86 at gmail.com>
# Date 1304165458 -10800
# Node ID 61fe0686f5da8736d1cc87618c6137d7234434f2
# Parent  bcfe78c3d15c928cd0ab80625c640c81a0ae7d62
graphmod: restore generator nature of dagwalker

9966c95b8c4f introduced the ability to walk the DAG
given arbitrary revisions, but changed the behaviour of
it to return a list of all nodes (and create a changectx
for each one) rather than doing it lazily.

This has a pretty significant impact on performance for large
repositories (tested on CPython repo, with output disabled):

  $ time hg glog

  real	0m2.642s
  user	0m2.560s
  sys	0m0.080s

Before 9966c95b8c4f:

  $ time hg glog

  real	0m0.143s
  user	0m0.112s
  sys	0m0.032s

And after this fix:

  $ time hg glog

  real	0m0.213s
  user	0m0.184s
  sys	0m0.028s

diff -r bcfe78c3d15c -r 61fe0686f5da mercurial/graphmod.py
--- a/mercurial/graphmod.py	Sat Apr 30 12:56:28 2011 +0200
+++ b/mercurial/graphmod.py	Sat Apr 30 15:10:58 2011 +0300
@@ -31,39 +31,27 @@
     returned.
     """
     if not revs:
-        return []
-
-    ns = [repo[r].node() for r in revs]
-    revdag = list(nodes(repo, ns))
+        return
 
     cl = repo.changelog
     lowestrev = min(revs)
     gpcache = {}
-    leafs = {}
 
-    for i, (id, c, ctx, parents) in enumerate(revdag):
+    for rev in revs:
+        ctx = repo[rev]
+        parents = sorted(set([p.rev() for p in ctx.parents() if p.rev() in revs]))
         mpars = [p.rev() for p in ctx.parents() if
                  p.rev() != nullrev and p.rev() not in parents]
-        grandparents = []
 
         for mpar in mpars:
             gp = gpcache.get(mpar) or grandparent(cl, lowestrev, revs, mpar)
             gpcache[mpar] = gp
             if gp is None:
-                leafs.setdefault(mpar, []).append((i, ctx))
-            else:
-                grandparents.append(gp)
+                parents.append(mpar)
+            elif gp not in parents:
+                parents.append(gp)
 
-        if grandparents:
-            for gp in grandparents:
-                if gp not in revdag[i][3]:
-                    revdag[i][3].append(gp)
-
-    for parent, leafs in leafs.iteritems():
-        for i, ctx in leafs:
-            revdag[i][3].append(parent)
-
-    return revdag
+        yield (ctx.rev(), CHANGESET, ctx, parents)
 
 def nodes(repo, nodes):
     """cset DAG generator yielding (id, CHANGESET, ctx, [parentids]) tuples


More information about the Mercurial-devel mailing list