[PATCH] patchcopies: backout and optimisation that backfired

Pierre-Yves David pierre-yves.david at ens-lyon.org
Fri Oct 4 09:06:57 EDT 2019


On 10/4/19 2:18 PM, Yuya Nishihara wrote:
> On Fri, 04 Oct 2019 00:05:44 -0400, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david at octobus.net>
>> # Date 1570072771 14400
>> #      Wed Oct 02 23:19:31 2019 -0400
>> # Node ID 07010e0cae5cab2ff0c375f24c1d8185ead8e957
>> # Parent  03e769278ef31f648ba5c49be719da5b73587607
>> # EXP-Topic patchcopies-regression
>> # Available At https://bitbucket.org/octobus/mercurial-devel/
>> #              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 07010e0cae5c
>> patchcopies: backout and optimisation that backfired
>>
>> In 8a0136f69027 we introduced a new "introrev" function. And in d98fb3f42f33, we
>> made it smart enough to speedup various pathological case.
>>
>> However, it turns out that is can also introduce massive slow down in some other
>> cases. To clarify the situation, I ran `perfpathcopies` on 6989 pair of revision
>> in the pypy repository (`hg perfhelper-pathcopies`. The result is massively in favor of dropping this
>> condition:
>>
>> Here are various numbers (before this changeset/after this changesets)
>>
>>              source       destination  before      after    abs-time    ratio
>> worth cases 0c90acafea5d 7a260b2c4511    1.251143 1.775874   -0.524731 1.419401
>>              795743bd02b0 daa08b2d951c    1.276092 1.722316   -0.446224 1.349680
>>              d7b4ace71d7e a4fedb1664ee    1.046802 1.491993   -0.445191 1.425287
>>              464d641a9035 4d87b3d82dc2    1.072234 1.509769   -0.437535 1.408059
>>              a305590465d6 a84f8ceb8740    1.322278 1.738900   -0.416622 1.315079
>> worse  1%   dbfbfcf077e9 89183b3a84f3    1.471880 1.735137   -0.263257 1.178858
>> worth  5%   688a132b06b6 573e529840d7    0.090268 0.118713   -0.028445 1.315117
>> worth 10%   0bfaff4207c3 dca96cba7aee    1.790686 1.795441   -0.004755 1.002655
>> worth 25%   e99ce6af254c a7dedcc55d55    0.001954 0.002108   -0.000154 1.078813
>> median      9e7c5b33e755 c6c97bf46846    0.001925 0.001901    0.000024 0.987532
>> best 25%    3f055112fae9 df1447de24d7    0.002923 0.002163    0.000760 0.739993
>> best 10%    da67e5b7be9e f2d371471588    0.473663 0.441357    0.032306 0.931795
>> best  5%    b02f6ce0d05b 9b4bd629bb44    1.664841 1.427323    0.237518 0.857333
>> best  1%    73ea83bf410b ed910c5221d9    3.005100 0.138998    2.866102 0.046254
>> best cases  80b492d79663 1456861b1ea6 1464.845562 3.942297 1460.903265 0.002691
>>              ace7255d9a26 682589af6612 1594.849940 2.467435 1592.382505 0.001547
>>              c3b14617fbd7 743a0fcaa4eb 1618.130257 2.636641 1615.493616 0.001629
>>              c3b14617fbd7 9ba6ab77fd29 1621.882135 2.670315 1619.211820 0.001646
>>              ef865da04745 421ca854cd86 1631.803875 2.646586 1629.157289 0.001622
>>
>>
>> As one can see, the average case is not really impacted. However, the worth case
>> we get after this changeset are much better than the one we had before it. We
>> have 30 pairs where improvement is > 10 minutes.
>>
>>
>> This reflect in the combined time for all pairs
>> before: 37700s
>> after:   1898s (-95%)
>>
>> If we remove these last 30 case, we still see a significant improvements:
>>
>> before: 2347s
>> after:  1898s (-20%)
>>
>> diff --git a/mercurial/copies.py b/mercurial/copies.py
>> --- a/mercurial/copies.py
>> +++ b/mercurial/copies.py
>> @@ -160,8 +160,6 @@ def _tracefile(fctx, am, basemf, limit):
>>               return path
>>           if basemf and basemf.get(path, None) == f.filenode():
>>               return path
>> -        if not f.isintroducedafter(limit):
>> -            return None
> 
> So, _tracefile() no longer stops at the limit. Maybe we can remove
> _findlimit() as well if it's worth nothing.
> 
> Alternatively, maybe the limit could be enforced by linkrev(), e.g.
> f.linkrev() < repo[limit][path].linkrev(). I don't know if it's good for
> mitigation of the worst case, and we'll probably need to adjust _findlimit()
> to make sure repo[limit][path].linkrev() is the lowest revision.

oh, right. The code using introrev was supposed to be a "better" version 
of something based on linkrev. I suppose that using linkrev will still 
yield some benefit without falling into huge slowdown.

I'll update the patch and gather new timing. Thanks


-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list