[PATCH 1 of 2] templates: add {lasttag} and {lasttagdistance} keywords

Greg Ward greg-hg at gerg.ca
Fri Aug 28 14:17:16 CDT 2009


On Fri, Aug 28, 2009 at 10:18 AM, Mads Kiilerich<mads at kiilerich.com> wrote:
>> I really hope there is a way to do this that efficiently supports
>> large repositories.  I think it's a great idea and I'd like to see it
>> go in, but not if it uses O(N) time and memory (N being the number of
>> changesets in the whole repo).
>>
>
> If your repo has N changesets then the latest tag might be N parent-links
> away, so O(N) is a real upper bound which will be reached in some cases.

Granted: so it has to have O(N) runtime in the worst case.  But as
coded, it uses O(N) memory in *every* case.  And I think it also uses
O(N) runtime in every case (I didn't see a break or return in that
loop over all changesets).

As I said recently, repositories with 100,000+ changesets are reality
*today*, and they are not going to get any smaller as time goes by.
And one important component of optimization is picking algorithms that
scale well and don't *need* to be optimized for large datasets.

>> One final question, which is unclear from the comments and the code:
>> are you going by topological nearness or revision-count nearness?  I
>> think topological distance is the only sane way to do this, so I
>> really hope that's what you're doing.  The comments and docstrings
>> should be clearer.
>>
>
> Neither; it uses the nearest tag with the latest date, and the distance is
> the longest topological distance to that. Matt preferred 'most recent' in
> http://www.selenic.com/pipermail/mercurial-devel/2009-July/013866.html ,
> interpreted the way that the latest tag in the example is b.

Oh yeah, I forgot about that.  You guys clearly thought long and hard
about it at the time, so I have nothing further to add.  (Except that
possibly the comments could explain the motivation for the current
algorithm, which is non-trivial.)

Greg



More information about the Mercurial-devel mailing list