Enriching a file log by branches, tags and bookmarks

Tue Apr 7 18:25:58 CDT 2015

> On 7 Apr 2015, at 14:55, Marc Strapetz <marc.strapetz at syntevo.com> wrote:
> 
> I'm taking this thread from the main list to the developers list:
> 
> I'm looking for a way to enrich a file log (or sub-tree log) by branches, tags and bookmarks: every such "tag" should be displayed at the "closest" commit of the file log, i.e. at that commit which contains the file in identical state (content).
> 
> Here is an example of what this should look like for setup.py of the Mercurial repository (note that this repository is quite outdated, so the most recent tags do not show up):
> 
> http://i.imgur.com/AYt885I.png
> 
> Using a loop of revset queries gives me the information I'm looking for, but is too inefficient (pseudo-code):
> 
> $ for all $X in tags(): hg log -r "max(ancestors(tagged(X)) and file(setup.py))" -T "$X {rev}\n"
> 
> For this approach, e.g. ancestors() as well as file(setup.py) is run for every tag (N times), while I'm having in mind a solution which would traverse the commit hierarchy only once or twice, possibly increasing memory usage by O(N) for the sake of constant running time. (Is that reasonable for large repositories, like for the CPython repository?)
> 
> Unfortunately my Python and Mercurial knowledge are quite limited and hence I'm looking here for someone who is able to write an extension to solve this problem on a work-for-hire basis.
> 
> -Marc
> 
> 

Below is an extension that horribly abuses revsets to do something like what you are asking for. I’ve only tested it with hg 3.2.3. I hope that the lazy revset functions in recent versions of Mercurial make it fairly efficient, and in particular that iterating over "sort(::n, -rev)" is an efficient way to walk down the ancestry of a revision.

It’s a horrible abuse because it adds a revset function that doesn’t actually do any filtering or searching. Instead, it loads the tags from the repository, then looks through the revisions passed to it to find, for each tagged revision, the last (highest-numbered) of those revisions that is an ancestor of it. I called it “pushtags” because it pushes the tags down to the revision that is going to appear in the output, but this is an awful name because “push” normally means something very different in Mercurial.
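To make the idea concrete without any Mercurial API, here is a minimal sketch on a toy history. Everything in it is made up for illustration: the names `ancestors` and `push_tags`, the `parents` table and the tag names are assumptions, not part of the extension.

```python
def ancestors(parents, rev):
    """Return the set of ancestors of rev, including rev itself."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r in seen:
            continue
        seen.add(r)
        stack.extend(parents.get(r, ()))
    return seen


def push_tags(parents, tags, subset):
    """Map each subset revision to the tags pushed down onto it."""
    pushed = {}
    # Handle tags in order of tagged revision (then name) for stable output.
    for tag, tagrev in sorted(tags.items(), key=lambda item: (item[1], item[0])):
        # Walk the ancestors newest first; the first hit in subset wins.
        for rev in sorted(ancestors(parents, tagrev), reverse=True):
            if rev in subset:
                pushed.setdefault(rev, []).append(tag)
                break
    return pushed


# Toy history: 1 and 2 are children of 0, 3 merges 1 and 2, 4 is a child of 2.
parents = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [2]}
tags = {"v1": 2, "v2": 3, "v3": 4}   # hypothetical tag name -> tagged revision
subset = {0, 1}                      # e.g. the revisions that touched one file
result = push_tags(parents, tags, subset)
print(result)
```

In this toy history the tag on the merge (revision 3) lands on revision 1, while the tags on revisions 2 and 4 both land on revision 0.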

Once the pushtags revset function has done its work, you can extract the results using the “pushedtags” template keyword. Here’s some example usage:

$: hg log -r 'pushtags(file("setup.py"))' --template '{pad(rev, 8)} {pushedtags}\n' -l 12
0        0.4c
16       0.4d
63       0.4e 0.4f
67
72
152
155
157
188      0.5
193      0.5b
218
240

$: hg log -G -r 'pushtags(file("mercurial/util.py"))' --template '{pad(rev, 8)} {pushedtags}\n' -l 12
o    24605    tip
|\
| o  24439
|/|
| o  24236
|/|
| o    24188
|/|\
o---+  24164    3.3.2 3.3.3
 / /
| o  24155    3.3.1
|/
o  23917    3.3-rc 3.3
|
o  23899
|
o  23864
|
o  23832
|
o  23789
|
o  23543
|

I’d love to hear of a better way of doing this - I think it would be a useful feature.

Hope that helps,

Simon

# --------------------------------------------

from mercurial import revset, util, templatekw

def pushtags(repo, subset, x):
    """``pushtags(set)``
    "push" each tag in the repository down to the last revision in
    subset that is an ancestor of the tagged revision.

    This revset is useless by itself - it must be used in combination
    with the `pushedtags` template keyword.
    """
    # For each tag, walk down the ancestors of the tagged revision,
    # newest first, until we find one that is in subset; that is
    # max(::tagrev and subset).
    #
    # Answers are deliberately not cached per visited revision: with
    # merges, the revisions visited in descending order can sit on
    # different branches, so an ancestor found for one of them is not
    # necessarily an ancestor of another.
    s = revset.getset(repo, subset, x)
    cl = repo.changelog
    tags = [(cl.rev(node), tag) for (tag, node) in repo.tags().items()]
    tags.sort()
    tagmap = {}
    for tagrev, tag in tags:
        for ancestorrev in repo.revs('sort(::%d, -rev)', tagrev):
            if ancestorrev in s:
                tagmap[tag] = ancestorrev
                break

    repo._pushedtags = {}
    for (rev, tag) in tags:
        if tag in tagmap:
            repo._pushedtags.setdefault(tagmap[tag], []).append(tag)

    return s

revset.symbols.update({'pushtags': pushtags})

def pushedtagskw(**kwargs):
    repo, ctx = kwargs['repo'], kwargs['ctx']
    pushedtags = getattr(repo, '_pushedtags', None)
    if pushedtags is None:
        raise util.Abort('pushedtags template keyword requires '
                         'pushtags revset')

    tags = pushedtags.get(ctx.rev(), [])
    if tags:
        return templatekw.showlist('pushedtag', tags, **kwargs)

templatekw.keywords['pushedtags'] = pushedtagskw

# --------------------------------------------
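The original question asked about walking the history only once. A minimal sketch of that idea follows, on a plain table of parent revisions rather than Mercurial's API. The function name `closest_in_subset` and the toy data are made up for illustration, and nothing here has been tried against a real repository. It relies on parents always having lower revision numbers than their children, which is how Mercurial numbers revisions.

```python
def closest_in_subset(parents, subset):
    """For every revision, return the highest-numbered revision in subset
    that is the revision itself or one of its ancestors (None if there is none).
    """
    closest = {}
    # Visit revisions in increasing order so parents are resolved first.
    for rev in sorted(parents):
        if rev in subset:
            closest[rev] = rev
            continue
        # ancestors(rev) is the union of the parents' ancestor sets, so the
        # best candidate is the maximum of the parents' answers.
        candidates = [closest[p] for p in parents[rev] if closest[p] is not None]
        closest[rev] = max(candidates) if candidates else None
    return closest


# Toy history: 1 and 2 are children of 0, 3 merges 1 and 2, 4 is a child of 2.
parents = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [2]}
subset = {0, 1}
closest = closest_in_subset(parents, subset)

# Looking up any number of tags afterwards is one dictionary access each.
tags = {"v1": 2, "v2": 3, "v3": 4}   # hypothetical tag name -> tagged revision
placed = {tag: closest[rev] for tag, rev in tags.items()}
print(placed)
```

This trades one dictionary entry per revision for a single pass over the history, no matter how many tags, branches or bookmarks are looked up afterwards.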