[PATCH 1 of 2] templates: add {lasttag} and {lasttagdistance} keywords

Greg Ward greg-hg at gerg.ca
Fri Aug 28 07:55:30 CDT 2009


On Fri, Aug 28, 2009 at 3:38 AM, Gilles Moris <gilles.moris at free.fr> wrote:
> # HG changeset patch
> # User Gilles Moris <gilles.moris at free.fr>
> # Date 1247087563 -7200
> # Node ID 6fed19d04b0d859931837172ad974edb1e7b6b80
> # Parent  92a01636caa706bab02685ecea7a995436533f82
> templates: add {lasttag} and {lasttagdistance} keywords
>
> Useful to build nightly version names and see on which tag a rev is based on.
>
> Among tag ancestors, the one with the latest date, or with the longuest path if

Spelling: 'longuest' should be 'longest'.

> there are multiple pathes to the same tag, is chosen.

'pathes' -> 'paths'

> This way, lasttag does not depend on the push/pull order and matches the
> user's expectation, in particular in the case of a backout merge.
> Only global tags are considered. The local tags (including 'tip' and automatic
> MQ tags) are discarded.

I think 'ignored' would be better than 'discarded'.  And don't use
"The" here: just "Local tags (...) are ignored."

> +        def showlasttag(**args):
> +            return self.lasttagmap[ctx.rev()][2]
> +        def showlasttagdistance(**args):
> +            return self.lasttagmap[ctx.rev()][1]

I think I understand why you build the whole map: in case someone puts
{lasttag} in an 'hg log' command that shows many revisions.  But I
expect the common case to be 'hg tip' or 'hg parents', which usually
shows a single revision.

> +def computelasttagmap(repo):
> +    '''compute an array[rev] of latest global tag tuple, each tuple being
> +    (date of the latest tag, distance to that tag, the tag(s)).
> +    '''
> +    defvalue = 0, 0, 'null'
> +    ltmap = [defvalue] * len(repo)
> +
> +    for r in range(len(repo)):
> +        ctx = repo[r]

FAIL!  You've just imposed a huge penalty for anyone using this
feature in a large repository.  In my case, I've got 108,000
changesets.  So if I want to see {lasttag} for the tip, I have to
sit and wait while this code first builds a list with 108,000 items,
and then iterates over 108,000 changesets.  Not good.  And the memory
consumption will be considerable too.
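For what it's worth, here is a minimal sketch of the on-demand approach
I have in mind, against a toy DAG rather than a real repo object.  The
name latesttag, the tuple layout, and the dict model are all my own
illustrative assumptions, not Mercurial API, and a production version
would walk iteratively instead of recursing:

```python
# Toy model: repo maps rev -> (parents, tagdate, tagname), with
# tagdate/tagname set to None on untagged revisions.  A real
# implementation would consult repo[rev].parents() and the tag cache.

def latesttag(repo, rev, cache=None):
    """Return (date, distance, tag) for the latest global tag among
    rev's ancestors, visiting only those ancestors."""
    if cache is None:
        cache = {}
    if rev in cache:
        return cache[rev]
    parents, tagdate, tag = repo[rev]
    if tag is not None:
        result = (tagdate, 0, tag)          # tagged revision: distance 0
    elif not parents:
        result = (0, 0, 'null')             # untagged root
    else:
        # latest date wins; longest path breaks ties on the same tag,
        # matching the behavior described in the patch description
        best = max((latesttag(repo, p, cache) for p in parents),
                   key=lambda c: (c[0], c[1]))
        result = (best[0], best[1] + 1, best[2])
    cache[rev] = result
    return result

# rev 0 is tagged 1.0; rev 3 is a merge with two paths back to the tag
repo = {0: ([], 100, '1.0'),
        1: ([0], None, None),
        2: ([1], None, None),
        3: ([0, 2], None, None)}
```

With that shape, showing {lasttag} for a single revision only touches
that revision's ancestors, and 'hg log' over many revisions still
shares work through the cache.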

One final question, which is unclear from the comments and the code:
are you going by topological nearness or revision-count nearness?  I
think topological distance is the only sane way to do this, so I
really hope that's what you're doing.  The comments and docstrings
should be clearer.
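To illustrate why the distinction matters, here is a made-up example
(the DAG and the topodistance helper are mine, not from the patch):
the tag sits on rev 0, an unrelated side branch occupies revs 1-3, and
the main line continues directly at rev 4.

```python
# Toy DAG: rev -> list of parent revs.  Rev 0 carries the tag; revs 1-3
# are a side branch committed in between.
parents = {0: [], 1: [0], 2: [1], 3: [1], 4: [0]}

def topodistance(rev, target):
    """Edges on the shortest ancestor path from rev down to target."""
    if rev == target:
        return 0
    return 1 + min(topodistance(p, target) for p in parents[rev])

# Revision-count nearness calls rev 4 "4 changesets past the tag",
# but topologically it is a direct child of the tagged revision.
revcount_distance = 4 - 0             # 4
topo_distance = topodistance(4, 0)    # 1
```

Revision-count distance would make the reported number depend on how
many unrelated changesets happen to have lower revision numbers, which
is exactly the push/pull-order dependence the patch says it avoids.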

I really hope there is a way to do this that efficiently supports
large repositories.  I think it's a great idea and I'd like to see it
go in, but not if it uses O(N) time and memory (N being the number of
changesets in the whole repo).

Greg
