[PATCH 2 of 2] revset: make children only look at commits >= minrev

Pierre-Yves David pierre-yves.david at ens-lyon.org
Wed Sep 24 19:09:17 CDT 2014



On 09/23/2014 04:35 PM, Durham Goode wrote:
> Pierre-Yves and I have come to a stalemate in real life, so perhaps the
> community will have an opinion.
>
> I believe these changes are small enough and impactful enough to go in
> as is. They don't cause any regressions in the revset benchmarks, and
> are unlikely to cause significant regressions in any other cases.  While
> the same performance might be achievable by refactoring the underlying
> revset classes to be smarter, the current change is 3 lines, buys us a
> 40% perf improvement, and doesn't make a future refactor any more
> difficult.

This change -do- makes it more difficult to improves revset in the 
future. But making revset function less homogeneous you are making it 
harder to take advantage of future improvement in the core class. You 
also make it harder to test the impact of improvement in those core 
class. Moreover by doing dirty optimization for the common case you hide 
obvious issue, letting them being expressed again later when someone add 
new usage of revset. Finally, doing benchmark and regression testing 
gets harder too. The more differently each revsets is, the harder is it 
to get a good picture of the performance impact with testing only a few 
revsets. If all revsets function build it result in a different way we 
end needing to test tens of combination for all of them which make such 
benchmark eventually impossible.

This is not just theoretical statement. This is my very experience of 
someone that works on revset and have to revert this kind of changes to 
be able to do so.

The lazy revset implementation is young and crappy. It has been done by 
an intern in a few months and left uncompleted. (no offense intended for 
the said intern, nothing unexpectedly bad there). What this 
implementation now needs is love, not hotfix. Improving dramatically the 
current implementation is -simple- and -cheap-. We had O(n²) stupidity 
in a common code path for 5 months, and it was solvable by a 3 line 
patches to a core class.

The case you are trying to address in your patch (smallest set should be 
used as the base when doing "and" operation) have been spotted multiple 
time (parents, children), affect multiple real world command (histedit, 
rebase, most probably evolve) And a lot of other revset could benefit 
from this approach (p1, p2, etc). There is obvious way to fix it once 
and for all in the handfull of smartset classes: makes the __and__ 
function aware of some length hint and use that to swap the operand when 
it seems smart to do so. Fixing your case using this approach is not a 
huge amount of work.Detecting that this issue is a pattern and deciding 
how solve it properly were most of the work and it is already here. You 
do not have to produce a complete implementation of this idea to solve 
your idea. Just touching the couple of classes involved in the case you 
are trying to optimize are enough. And this will speedup some other 
revset throughout the code base. Doing only part of this new logic is 
okay once you have started the work in that direction other people may 
complete the other bricks over time without having to do the whole 
diagnostic process you already did.

So please send the couple of few lines patches that would fix this issue 
the right way instead of building technical debt. More time it is 
necessary to produce such patches have been already invested in this debate.

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list