[PATCH 1 of 2] templates: add {lasttag} and {lasttagdistance} keywords

Mads Kiilerich mads at kiilerich.com
Fri Aug 28 09:18:28 CDT 2009


On 08/28/2009 02:55 PM, Greg Ward wrote:
> On Fri, Aug 28, 2009 at 3:38 AM, Gilles Moris<gilles.moris at free.fr>  wrote:
>    
>> # HG changeset patch
>> # User Gilles Moris<gilles.moris at free.fr>
>> # Date 1247087563 -7200
>> # Node ID 6fed19d04b0d859931837172ad974edb1e7b6b80
>> # Parent  92a01636caa706bab02685ecea7a995436533f82
>> templates: add {lasttag} and {lasttagdistance} keywords
>>      

As you guys might have seen I have worked over this several times 
together with Gilles. Matt had some initial comments which has been 
followed, so I think the patch is waiting for further comments or his 
blessing.

My primary concert is that the term "last" is used but explained as 
"latest" in comments and documentation. I would rather use two letters 
more and make it self-documenting ;-)

>> +        def showlasttag(**args):
>> +            return self.lasttagmap[ctx.rev()][2]
>> +        def showlasttagdistance(**args):
>> +            return self.lasttagmap[ctx.rev()][1]
>>      
> I think I understand why you build the whole map: in case someone puts
> {lasttag} in an 'hg log' command that shows many revisions.  But I
> expect the common case to be 'hg tip' or 'hg parents', which usually
> shows a single revision.
>
>    
>> +def computelasttagmap(repo):
>> +    '''compute an array[rev] of latest global tag tuple, each tuple being
>> +    (date of the latest tag, distance to that tag, the tag(s)).
>> +    '''
>> +    defvalue = 0, 0, 'null'
>> +    ltmap = [defvalue] * len(repo)
>> +
>> +    for r in range(len(repo)):
>> +        ctx = repo[r]
>>      
> FAIL!  You've just imposed a huge penalty for anyone using this
> feature in a large repository.  In my case, I've got 108,000
> changesets.  So if I want to see {latesttag} for the tip, I have to
> sit and wait while this code first builds a list with 108,000 items,
> and then iterates over 108,000 changesets.  Not good.  And the memory
> consumption will be considerable too.
>    
...
> I really hope there is a way to do this that efficiently supports
> large repositories.  I think it's a great idea and I'd like to see it
> go in, but not if it uses O(N) time and memory (N being the number of
> changesets in the whole repo).
>    

If your repo has N changesets then the latest tag might be N 
parent-links away, so O(N) is a real upper bound which will be reached 
in some cases.

But I share your concern. An 
O(lengh-of-longest-parent-path-until-a-tag-is-met) could be used, but 
working backwards makes it harder to gain the same performance when 
lasttag of multiple tags is requested.

I am quite sure I posted a suggestion of an algorithm which worked from 
a changeset and backwards through it parents. I can however not find it 
anywhere... IIRC the conclusion was that going backwards through parents 
were much slower than going forwards, so the current approach was preferred.

> One final question, which is unclear from the comments and the code:
> are you going by topological nearness or revision-count nearness?  I
> think topological distance is the only sane way to do this, so I
> really hope that's what you're doing.  The comments and docstrings
> should be clearer.
>    

Neither; it uses the nearest tag with the latest date, and the distance 
is the longest topological distance to that. Matt preferred 'most 
recent' in 
http://www.selenic.com/pipermail/mercurial-devel/2009-July/013866.html , 
interpreted the way that the latest tag in the example is b.

I don't like that the date is used that way, but it very seldom makes 
any difference so I really don't care. If tags are used on multiple 
branches then no algorithm will be able to decide which tag is 
considered most important anyway.

/Mads



More information about the Mercurial-devel mailing list