[PATCH 4 of 5] topoiter: support for non-contiguous revset
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Thu Nov 20 02:10:46 CST 2014
On 11/19/2014 10:49 PM, Martin von Zweigbergk wrote:
> Sorry, just some naive comments and questions below.
>
> On Tue Nov 18 2014 at 3:57:18 PM Pierre-Yves David
> <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>>
> wrote:
>
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at fb.com
> <mailto:pierre-yves.david at fb.com>>
> # Date 1409850336 -7200
> # Thu Sep 04 19:05:36 2014 +0200
> # Node ID 026932aa1c4f05cf930c5853ae8946__36399de111
> # Parent ee131567469924b1642956ad5adfa9__f6122ba6d3
> topoiter: support for non-contiguous revset
>
> The algorithm now work when some revision are skipped. We now use "first
> included ancestors" instead of just "parent" to link changesets with
> each other.
>
> diff --git a/mercurial/graphmod.py b/mercurial/graphmod.py
> --- a/mercurial/graphmod.py
> +++ b/mercurial/graphmod.py
> @@ -18,17 +18,17 @@ Data depends on type.
> """
>
> from mercurial.node import nullrev
> import util
>
> +import heapq
> +
> CHANGESET = 'C'
>
> def topoiter(revs, parentsfunc):
> """topologically iter over a set of revision, one branch at a
> time.
>
> - Currently does not handle non contigious <revs> input.
> -
> Currently consider every changeset under a merge to be on the
> same branch
> using revision number to sort them.
>
> Could be easily extend to give priority to an initial branch.
>
> @@ -89,20 +89,31 @@ def topoiter(revs, parentsfunc):
> # and the <blocked> set contains the parents revisions of
> already emitted
> # revision.
> #
> # You could pre-seed the <parents> set of groups[0] to a specific
> # changesets to select what the first emitted branch should be.
> - #
> - # We do not support revisions will hole yet, but adding such
> support
> - # would be easy. The iteration will have to be done using both
> input
> - # revision and parents (see cl.ancestors function + a few
> tweaks) but
> - # only revisions parts of the initial set should be emitted.
> groups = [([], unblocked)]
> - for current in revs:
> - if True:
> - # Look for a group awaiting on the current revision.
> - matching = [i for i, g in enumerate(groups) if current
> in g[1]]
> + pendingheap = []
> + pendingset = set()
> +
> + heapq.heapify(pendingheap)
>
>
> Not really necessary since pendingheap is already a valid heap, but I'm
> new to Python so I don't know what's considered best practice.
Yeah but better safe than sorry. It does not hurt to be over explicit
about this being a heap.
> + heappop = heapq.heappop
> + heappush = heapq.heappush
>
>
> Is this functionally equivalent to "from heapq import heappop" etc?
> Faster? You want the smaller scope?
It equivalent.
It is faster too because python is bad at lookup.
>
> + for currentrev in revs:
> + # heap works with smallest element, we want highest so we
> invert
> + if currentrev not in pendingset:
> + heappush(pendingheap, -currentrev)
> + pendingset.add(currentrev)
> + # iterates on pending rev until after the current rev have been
> + # processeed.
> + rev = None
> + while rev != currentrev:
> + rev = -heappop(pendingheap)
> + pendingset.remove(rev)
> +
> + # seek for a group awaiting on the current revision.
>
>
> Should 'Look' have been 'seek' in 1/5 instead? Or did you do it on
> purpose to make the patch easier to read?
It should have been seek. (Not super critical)
>
> + matching = [i for i, g in enumerate(groups) if rev in g[1]]
>
> if matching:
> # The main idea is to gather together all set that
> await on the
> # same revision.
> #
> @@ -128,23 +139,29 @@ def topoiter(revs, parentsfunc):
> for i in reversed(matching):
> del groups[i]
> else:
> # his is a new head we create a new group for it.
> targetidx = len(groups)
> - groups.append(([], set([current])))
> + groups.append(([], set([rev])))
>
>
> Would s/current/rev/ had made sense in 1/5 to reduce this patch a bit?
No, rev is something we want emitted, current is either rev or a skipped
parent of rev.
>
>
> gr = groups[targetidx]
>
> # We now adds the current nodes to this groups. This
> is done after
> # the group merging because all elements from a group
> that relied
> # on this rev must preceed it.
> #
> # we also update the <parents> set to includes the
> parents on the
> # new nodes.
> - gr[0].append(current)
> - gr[1].remove(current)
> - gr[1].update([p for p in parentsfunc(current) if p >
> nullrev])
> + if rev == currentrev: # only display stuff in rev
> + gr[0].append(rev)
> + gr[1].remove(rev)
> + parents = [p for p in parentsfunc(rev) if p > nullrev]
>
> + gr[1].update(parents)
> + for p in parents:
> + if p not in pendingset:
> + pendingset.add(p)
> + heappush(pendingheap, -p)
>
> # Look for a group to display
> #
> # When unblocked is empty (if clause), We are not
> waiting over any
> # revision during the first iteration (if no priority
> was given) or
> @@ -173,10 +190,15 @@ def topoiter(revs, parentsfunc):
> # unless it is group[0] in which case you just
> empty it.
> if targetidx:
> del groups[targetidx]
> else:
> gr[0][:] = []
> + # Check if we have some group waiting for revision we are not
> going to
> + # iterate over
> + for g in groups:
> + for r in g[0]:
> + yield r
>
> def dagwalker(repo, revs):
> """cset DAG generator yielding (id, CHANGESET, ctx,
> [parentids]) tuples
>
> This generator function walks through revisions (which should
> be ordered
> diff --git a/tests/test-glog-topological.__t
> b/tests/test-glog-topological.__t
> --- a/tests/test-glog-topological.__t
> +++ b/tests/test-glog-topological.__t
> @@ -35,10 +35,13 @@ later.
> | |
> o | 1
> |/
> o 0
>
> +
> +(display all nodes)
> +
> $ hg --config experimental.graph-__topological=1 log -G
> o 8
> |
> o 3
> |
> @@ -54,5 +57,24 @@ later.
> | |
> | o 4
> |/
> o 0
>
> +
> +(revset skipping nodes)
> +
> + $ hg --config experimental.graph-__topological=1 log -G --rev
> 'not (2+6)'
>
>
> Sorry, but I didn't quite understand the algorithm. Would it also work
> well and efficiently with e.g. 'hg log hgext/'? I don't know enough
> about revsets, I suppose.
Sure it just work with a list of revs and parentrev function. The revset
used for 'hgext/' should be lazy so we are all good.
--
Pierre-Yves David
More information about the Mercurial-devel
mailing list