Finding Merges in the Wrong Direction

Jensen, Aaron ajensen at webmd.net
Mon Nov 11 18:45:08 CST 2013


Avoiding `branch(X)` gives a *huge* speedup.  This query:

    limit(children(p2(merge() and ::X) and branch(default)) and reverse(ancestors(X)))

takes about 6 seconds.  If I put the `-2000:` clause in, it takes less than a second:

    limit(children(p2(-2000 and merge() and ::X) and branch(default)) and reverse(ancestors(X)))

However, there are no guarantees that an offending changeset could be within the last 2000 changesets.  I'd like to replace it with a clause that gives me the parent of the first changeset in a branch.  Is there a quicker way than

    descendants(parents(min(branch(X)))) 

I've tried several dozen iterations that don't use `branch(X)`, but none of them work.


-----Original Message-----
From: Matt Mackall [mailto:mpm at selenic.com] 
Sent: Monday, November 11, 2013 15:13
To: Jensen, Aaron
Cc: mercurial at selenic.com
Subject: Re: Finding Merges in the Wrong Direction

On Mon, 2013-11-11 at 21:55 +0000, Jensen, Aaron wrote:
> 	> hg log -r "limit(descendants(p1(first(branch('2013B')))) and
> reverse(p2(ancestors(branch('2013B'))) and branch('default')),1)"

Most of the slowness will be caused by branch('2013B') which will find ALL changesets on that branch. Since branches need not be connected, this is done by unpacking every changeset object and will thus take about as much time as running 'hg log'. You do this twice and Mercurial's revset engine isn't smart enough to deal with cache duplicate expressions.

So the trick is to minimize the number of changesets branch() has to look at. Something like:

'children(parents(-2000: and merge() and ::X) and branch(default)) and branch(X)'

Let's break this apart:

-2000 and merge() and ::X

        Let's look at all the interesting changes in the last 2000
        commits (the merges) that happen to be on our branch. Checking
        for merges is fast and so is just following ancestors of a
        branch head, no unpacking changesets necessary.
        
parents(...) and branch(default)

        Now we take the (small) set of interesting commits and see if
        their parents are on the default branch to get an even smaller
        (hopefully empty) set. Looking up parents is fast, and unpacking
        a small number of branch names is also fast.
        
children(...) and branch(X)

        Now we go from that small set back to the changesets we want to
        report.

This might not have the exact semantics you want, but if you play around with the above ideas, you should be able to get something that takes well under a second.

--
Mathematics is the supreme nostalgia of our time.





More information about the Mercurial mailing list