Questions about revset.py

Mon Apr 9 13:19:46 CDT 2012

On Sun, 2012-04-08 at 20:12 +0200, Patrick Mézard wrote:
> On 08/04/12 18:55, Matt Mackall wrote:
> > On Sun, 2012-04-08 at 11:53 +0200, Patrick Mézard wrote:
> >> Hello,
> >>
> >> 1- Should revset.match() preserve input revisions order?
> >>
> >> orset() reorders by matched subexpressions.
> > 
> > Yeah, this isn't terribly well-defined. But in general, we should try
> > to.
> > 
> > For instance {1 2 3} or {0 1 2} = ?
> > 
> > Right now, we give {1 2 3 0}, preserving the order of the first set, and
> > appending new elements from the second set, again in order.
> > 
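
As a rough illustration of the ordering described above (this is not the
actual orset() code), an order-preserving union that also drops repeated
revisions could be sketched in plain Python. The name ordered_union is
hypothetical, and revisions are modelled as plain integers.

```python
def ordered_union(first, second):
    """Return the elements of `first` in order, followed by the elements
    of `second` that were not already seen, again in order."""
    seen = set()
    result = []
    for rev in list(first) + list(second):
        if rev not in seen:
            seen.add(rev)
            result.append(rev)
    return result


# The example from the message: {1 2 3} or {0 1 2}
print(ordered_union([1, 2, 3], [0, 1, 2]))  # [1, 2, 3, 0]
```

Because membership is tracked in a set, each revision appears at most once
in the result, which also covers the cardinality question that follows.
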
> >> 2- Should revset.match() preserve input revisions cardinality?
> >>
> >> I think rangeset() enforces uniqueness.
> > 
> > I can't think of a case where we'd want to have repeated revisions in
> > the output.
> > 
> >> 3- Should revset.match() sort input revisions itself (for performance reasons) or leave that to the caller?
> > 
> > Not sure what you mean here.
> 
> What happened is that the first implementation of the log command with
> revsets was calling revset.match() on "tip:0" instead of "0:tip".

???

If I run "time hg debugrevspec 0:tip" vs "tip:0", I get the same result.
If I hack this loop to reverse the order, I get the same result:

    func = revset.match(ui, expr)
    for c in func(repo, list(reversed(range(len(repo))))):
        ui.write("%s\n" % c)

So what precisely is being reversed where and why is it making things
slower?

>  Written that way this is pretty obvious, but with the changeset
> enumerations hidden in revset code, it took me 15 minutes to add one
> "sorted()" call and another line to preserve revision order, and
> restore performance. So the question was: should revset.match()
> do that itself?
> 
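
One possible reading of the workaround described above, as a sketch only:
evaluate the matcher on a sorted copy of the input, then walk the caller's
original sequence to restore its order. Both match_preserving_order and
the stand-in predicate even_revs are hypothetical, and the matcher here
takes only a list of revisions, unlike the real function returned by
revset.match(), which also takes the repo.

```python
def match_preserving_order(match_fn, revs):
    """Run `match_fn` on sorted, de-duplicated input, then return the
    matching revisions in the order the caller supplied them."""
    matched = set(match_fn(sorted(set(revs))))
    seen = set()
    result = []
    for rev in revs:
        if rev in matched and rev not in seen:
            seen.add(rev)
            result.append(rev)
    return result


def even_revs(subset):
    """Hypothetical stand-in for a revset predicate."""
    return [r for r in subset if r % 2 == 0]


# Caller asks in "tip:0" order; the matcher still sees ascending input.
print(match_preserving_order(even_revs, [5, 4, 3, 2, 1, 0]))  # [4, 2, 0]
```
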
> Note the order breakage mentioned in [1] might have a slight performance
> impact, but probably not the kind to bother us right now.
> 
> About the revset-based log command: I am in the process of
> benchmarking, but one thing that will not magically go away with
> revset optimization is that the "time to first output line" is,
> unsurprisingly, higher than with the regular log command, since we
> filter the whole revision set before displaying it. Did you plan to
> turn revset.match() into a massive generator (assuming this is even
> possible), or do you have other insights on this point?

It's an idea I've tinkered with. It will work with some predicates, but
obviously not all (e.g. last()), and most common queries are pretty fast
so it's not clear that it's worth the trouble.
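
A toy sketch of why that split exists, using plain Python generators
rather than anything from revset.py. The names counting_source,
author_is and last_n are hypothetical, and the author table is made-up
illustration data. A filtering predicate can yield its first match after
looking at a single revision, while a last()-style predicate has to drain
its input before it can yield anything.

```python
def counting_source(n, log):
    """Yield revisions 0..n-1, recording each one as it is consumed."""
    for rev in range(n):
        log.append(rev)
        yield rev


def author_is(revs, wanted, authors):
    """Lazy filter: yields each matching revision as soon as it is seen."""
    for rev in revs:
        if authors[rev] == wanted:
            yield rev


def last_n(revs, n=1):
    """Not lazy: must consume all of `revs` to know which come last."""
    items = list(revs)
    for rev in items[-n:]:
        yield rev


authors = {0: "a", 1: "b", 2: "a", 3: "a"}  # illustration data only

lazy_log = []
first_hit = next(author_is(counting_source(4, lazy_log), "a", authors))
print(first_hit, lazy_log)  # 0 [0]

eager_log = []
tail = list(last_n(author_is(counting_source(4, eager_log), "a", authors)))
print(tail, eager_log)  # [3] [0, 1, 2, 3]
```
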

-- 
Mathematics is the supreme nostalgia of our time.