Questions about revset.py

Matt Mackall mpm at selenic.com
Wed Apr 11 11:22:10 CDT 2012


On Wed, 2012-04-11 at 10:41 +0200, Patrick Mézard wrote:
> Le 09/04/12 20:19, Matt Mackall a écrit :
> > On Sun, 2012-04-08 at 20:12 +0200, Patrick Mézard wrote:
> >> Le 08/04/12 18:55, Matt Mackall a écrit :
> >>> On Sun, 2012-04-08 at 11:53 +0200, Patrick Mézard wrote:
> >>>> Hello,
> >>>>
> >>>> 1- Should revset.match() preserve input revisions order?
> >>>>
> >>>> orset() reorders by matched subexpressions.
> >>>
> >>> Yeah, this isn't terribly well-defined. But in general, we should try
> >>> to.
> >>>
> >>> For instance {1 2 3} or {0 1 2} = ?
> >>>
> >>> Right now, we give {1 2 3 0}, preserving the order of the first set, and
> >>> appending new elements from the second set, again in order.
> >>>
> >>>> 2- Should revset.match() preserve input revisions cardinality?
> >>>>
> >>>> I think rangeset() enforces unicity.
> >>>
> >>> I can't think of a case where we'd want to have repeated revisions in
> >>> the output.
> >>>
> >>>> 3- Should revset.match() sort input revisions itself (for performance reasons) or leave that to the caller?
> >>>
> >>> Not sure what you mean here.
> >>
> >> What happened is the first implementation of log command with revsets
> >> was calling revset.match() on "tip:0" instead of "0:tip".
> > 
> > ???
> > 
> > If I run "time hg debugrevspec 0:tip" vs "tip:0", I get the same result.
> > If I hack this loop to reverse the order, I get the same result:
> > 
> >     func = revset.match(ui, expr)
> >     for c in func(repo, list(reversed(range(len(repo))))):
> >         ui.write("%s\n" % c)
> > 
> > So what precisely is being reversed where and why is it making things
> > slower?
> 
> Exactly what you did but you need a revset actually touching the changelog. On mercurial repository, best of 3 runs:
> 
> Regular hg:
> 
>   $ time hgdev debugrevspec 'user(mpm)' | wc
>   3802    3802   19618
> 
>   real	0m1.022s
>   user	0m0.979s
>   sys	0m0.042s
> 
> 
> Patched hg:
> 
>   $ time hgdev debugrevspec 'user(mpm)' | wc
>   3802    3802   19618
> 
>   real	0m1.811s
>   user	0m1.528s
>   sys	0m0.282s
> 
> 
> My patch is similar to yours:
> 
> 
> diff --git a/mercurial/commands.py b/mercurial/commands.py
> --- a/mercurial/commands.py
> +++ b/mercurial/commands.py
> @@ -2188,7 +2188,7 @@
>          if newtree != tree:
>              ui.note(revset.prettyformat(newtree), "\n")
>      func = revset.match(ui, expr)
> -    for c in func(repo, range(len(repo))):
> +    for c in func(repo, range(len(repo)-1, -1, -1)):
>          ui.write("%s\n" % c)
> 
> 
> I have not analyzed the issue yet, my hunch is loops like:
> 
> 
> def author(repo, subset, x):
>     """``author(string)``
>     Alias for ``user(string)``.
>     """
>     # i18n: "author" is a keyword
>     n = encoding.lower(getstring(x, _("author requires a string")))
>     return [r for r in subset if n in encoding.lower(repo[r].user())]

> work much better when subset is in changelog order (which is the whole point of cmdutil.increasing_window_whatever() IIRC).

Almost certainly.

Ok. So I take it there's a case where you find it useful to pass
something other than range(len(repo)) to the matcher?

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list