[PATCH STABLE] log: fix order of revisions filtered by multiple OR options (issue5100)

Yuya Nishihara yuya at tcha.org
Tue Feb 23 10:38:57 EST 2016


On Tue, 23 Feb 2016 15:05:58 +0100, Pierre-Yves David wrote:
> On 02/15/2016 03:34 PM, Yuya Nishihara wrote:
> > # HG changeset patch
> > # User Yuya Nishihara <yuya at tcha.org>
> > # Date 1455543967 -32400
> > #      Mon Feb 15 22:46:07 2016 +0900
> > # Branch stable
> > # Node ID ae36fdcb20ae3b7b0d8b2e7f868a1c3eb104e34e
> > # Parent  8da94662afe51a836eda500652097772c34002e8
> > log: fix order of revisions filtered by multiple OR options (issue5100)
> >
> > This is the simplest workaround for the issue of the ordering of revset, which
> > is that the expression "x or y" takes over the ordering specified by the input
> > set (or the left-hand-side expression.) For example, the following expression
> >
> >    A & (x | y)
> >
> > will be evaluated as if
> >
> >    (A & x) | (A & y)
> >
> > It is wrong because revset has ordering. Ideally, we should fix the revset
> > computation, but that would require a long patch series. So, for now, this
> > patch just works around the common log cases.
> >
> > Since this change might have some impact on performance, it is enabled only
> > if the expression seems to have ' or ' expression.
> >
> > diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
> > --- a/mercurial/cmdutil.py
> > +++ b/mercurial/cmdutil.py
> > @@ -2139,9 +2139,14 @@ def getlogrevs(repo, pats, opts):
> >           # Revset matches can reorder revisions. "A or B" typically returns
> >           # returns the revision matching A then the revision matching B. Sort
> >           # again to fix that.
> > +        oldrevs = revs
> >           revs = matcher(repo, revs)
> >           if not opts.get('rev'):
> >               revs.sort(reverse=True)
> > +        elif ' or ' in expr:
> > +            # XXX "A or B" is known to change the order; fix it by filtering
> > +            # matched set again (issue5100)
> > +            revs = oldrevs & revs
> 
> uurg, this is hacky even if I understand why we need it. Should 'addset'
> have some logic to handle this case?

I don't think 'addset' can have something, but optimize() can do. Basic idea
is to tell optimize() if the current expression should define its ordering or
follow the existing ordering.

  A & (x | y)  # A defines ordering, (x | y) follows A
  =   -------

  x | y | z    # x, y, z, and (x | y | z) define ordering
  =========

  A & x(B)     # A defines ordering, x(B) follows A, but B may define ordering
  =   --=      # if B is a set expression (e.g. first(x:y))

If an expression is known to have its ordering (e.g. x:y, sort(), orset(), ..),
but if it should follow the existing ordering, wrap it.

  A & (x | y) -> A & _reorder((x | y))
                     --------
                     A.filter(addset(x, y))

> You are checking for ' or ' but what about '+' ?

_makelogrevsets() doesn't use '+'. I agree it's a hack, but I wanted to keep
a stable patch simple. But as the issue is old regression, I'm okay to drop
this patch and do a complete fix in default branch.

> Also, what about user input containing " or " string (eg: in 'desc(…)'), 
> over sorting will not introduce behavior issue but still seems a bit 
> unfortunabte,

Yep. I can change it to investigate the parsed tree.

> finally. this seems to be fixing the issue for log only. Is log the only 
> affected command (because it build revset in a strange way?) or the bug 
> more widespread?

graphlog isn't affected because it sorts the result set, but there are many ways
to get strange ordering as revset is expressive.

% hg debugrevspec '0: & 2:1' -v --optimize
(and
  (rangepost
    ('symbol', '0'))
  (range
    ('symbol', '2')
    ('symbol', '1')))
* optimized:
(and
  (range
    ('symbol', '0')
    ('string', 'tip'))
  (range
    ('symbol', '2')
    ('symbol', '1')))
* set:
<filteredset
  <spanset- 1:2>>
2
1


More information about the Mercurial-devel mailing list