[PATCH 1 of 2] templates: add {lasttag} and {lasttagdistance} keywords

Steve Borho steve at borho.org
Sun Oct 4 17:43:40 CDT 2009


On Fri, Aug 28, 2009 at 4:32 PM, Gilles Moris <gilles.moris at free.fr> wrote:

> On Fri August 28 2009 21:17:16 Greg Ward wrote:
> > On Fri, Aug 28, 2009 at 10:18 AM, Mads Kiilerich<mads at kiilerich.com>
> wrote:
> > >> I really hope there is a way to do this that efficiently supports
> > >> large repositories.  I think it's a great idea and I'd like to see it
> > >> go in, but not if it uses O(N) time and memory (N being the number of
> > >> changesets in the whole repo).
> > >>
> > >
> > > If your repo has N changesets then the latest tag might be N
> parent-links
> > > away, so O(N) is a real upper bound which will be reached in some
> cases.
> >
> > Granted: so it has to have O(N) runtime in the worst case.  But as
> > coded, it uses O(N) memory in *every* case.  And I think it also uses
> > O(N) runtime in every case (I didn't see a break or return in that
> > loop over all changesets).
> >
> > As I said recently, repositories with 100,000+ changesets are reality
> > *today*, and they are not going to get any smaller as time goes by.
> > And one important component of optimization is picking algorithms that
> > scale well and don't *need* to be optimized for large datasets.
> >
>
> Guess what is my test repo: netbeans 6.7 133k+ cset.
>
> Let's start with the time:
> I have a pretty decent processor (C2D 3Ghz, but in a VirtualBox), but this
> is
> not a high end system.
> Time for hg tip --template '{lasttag}' : around 4 sec
> Now guess, time for hg id -n: around 3sec, so I am about 25% slower. Fair
> enough, but that's not bad considering I am doing a topological search.
> Yes, everything is in the disk cache, and possibly identify could be
> improved.
>
> But this brings us to memory consumption: I quickly measured to 80 MB for
> this
> 133K+ repo.
> That's a lot, but how much size is your working directory: between 1 and 2
> GB,
> may be even compiled. Don't tell that the package maintainer has less than
> 1 GB
> of RAM and a pretty decent CPU to deal with that repo.
>
> Optimization is a difficult task: I have already spent some time on several
> optimization projects, and you never know which approach is going to be the
> best.
> Moreover, you can optimize for memory or for speed but rarely both, it is
> always a trade-off.
>
> As to use an algorithm that search for a single rev (like the the one I use
> in
> the nearest extension), you can count the time in hours if somebody use 'hg
> log'.
> You say that usually they will not use it, but actually they *will*. I have
> made
> enough customer support to know they usually use our software in a way we
> don't
> expect. Anything else than a map for a template is broken from the start.
>
> Last, did you just tried it on your 108k repo ?
> May be will find that it is not so bad and I would be interested to get
> your
> timings.
>
> > >> One final question, which is unclear from the comments and the code:
> > >> are you going by topological nearness or revision-count nearness?  I
> > >> think topological distance is the only sane way to do this, so I
> > >> really hope that's what you're doing.  The comments and docstrings
> > >> should be clearer.
> > >>
> > >
> > > Neither; it uses the nearest tag with the latest date, and the distance
> is
> > > the longest topological distance to that. Matt preferred 'most recent'
> in
> > > http://www.selenic.com/pipermail/mercurial-devel/2009-July/013866.html,
> > > interpreted the way that the latest tag in the example is b.
> >
> > Oh yeah, I forgot about that.  You guys clearly thought long and hard
> > about it at the time, so I have nothing further to add.  (Except that
> > possibly the comments could explain the motivation for the current
> > algorithm, which is non-trivial.)
> >
>
> Yes, I will try to answer your feedback and improve the doc.
>
> Regards.
> Gilles.
>
> > Greg
> >
>
> PS: I realized couple days ago that my "fast" nearest extension do not
> return the right distance. So fast but broken. I will have to figure out
> why.
> Probably because mixing heuristic and topology search.
>
>
Ping.  Did this get dropped because of bike shedding again?  I would
_really_ like to see a {lasttag} template make it in before 1.4, in some
form or another.

--
Steve Borho
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://selenic.com/pipermail/mercurial-devel/attachments/20091004/07ca2509/attachment.htm 


More information about the Mercurial-devel mailing list