run-tests.py refactor

Gregory Szorc gregory.szorc at gmail.com
Sat May 10 14:14:54 CDT 2014


On 4/25/2014 9:30 PM, Yuya Nishihara wrote:
> On Fri, 25 Apr 2014 14:46:01 -0700, Gregory Szorc wrote:
>> On 4/25/2014 6:10 AM, Martin Geisler wrote:
>>> Gregory Szorc <gregory.szorc at gmail.com> writes:
>>>
>>>> On my i7-2600K mech HD w/ -l -j8:
>>>>
>>>> Original:
>>>>
>>>> real    6m40.558s user    31m31.405s sys     17m48.862s
>>>
>>> I normally run with -j 20 and get a real time of 4m24s. I find that
>>> there's so much IO going on that I can start that many processes in
>>> parallel to really saturate the CPU (maybe -j 15 would be the
>>> same).
>>
>> I created a tmpfs to isolate the impact of I/O from my mechanical HD.
>> My 4+4 cores were largely saturated during -j8. There were only brief
>> periods (maybe ~20s total) where idle CPU was below 5%. Raising it to
>> -j10 made the lulls go away.
>>
>> Total execution time on the tmpfs was about the same as the mechanical
>> HD. I can only conclude that the tests are CPU bottlenecked. My
>> previous measurements indicate the bulk of that is inside `hg`.
>>
>> I may very well write a feature that invokes the Python profiler on
>> all the `hg` invocations and sums the results. Could be interesting.
> 
> IIRC, slow module lookup of Python contributes to the startup time of `hg`
> process.  It might be partly an I/O bottleneck.

Further measurement indicates that spawning Python processes and
importing the Mercurial modules introduce significant overhead when
running tests.

I ran `time python -c '1'` and `time python hg version` to measure the
overhead of a new Python process and the simplest Mercurial command,
respectively. I tested on my desktop Sandy Bridge i7-2600K (a few
years old now) running Ubuntu 14.04 and on a more modern 2013 Haswell
MBP. I ran each command several times and took the median. I didn't use
an hgrc, so no extensions were loaded.

Machine   Distribution      Python Time     HG Time
desktop   cpython 2.7.6         40ms          87ms
desktop   cpython 3.3.5         60ms          N/A
MBP       cpython 2.7.6         16ms          44ms
MBP       cpython 3.3.5         27ms          N/A

(Note that 3.3.5 is slower than 2.7.6!)
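The measurement procedure above could be scripted along these lines.
This is only a sketch of my methodology, not the exact commands I ran;
`median_startup` is a name I'm inventing here, and it needs Python 3.4+
for the statistics module:

```python
import statistics
import subprocess
import sys
import time


def median_startup(cmd, runs=9):
    """Run cmd several times; return the median wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.time()
        subprocess.call(cmd, stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL)
        samples.append((time.time() - start) * 1000.0)
    return statistics.median(samples)


# Bare interpreter startup. Swap in ["python", "hg", "version"] from a
# Mercurial checkout to measure the simplest Mercurial command instead.
print("python -c '1': %.1f ms" % median_startup([sys.executable, "-c", "1"]))
```

Using the median rather than the mean keeps a single outlier run (cold
caches, scheduler noise) from skewing the result.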

We have nearly 13,000 `hg` invocations in our .t tests. If you
extrapolate things out, the overhead for new Python processes is thus:

Machine   Distribution    Python     HG
desktop   cpython 2.7.6     520s    1,131s
desktop   cpython 3.3.5     780s     N/A
MBP       cpython 2.7.6     208s     572s
MBP       cpython 3.3.5     351s     N/A
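The extrapolation is straight multiplication; a minimal sketch
(`total_seconds` is a hypothetical helper name):

```python
# Roughly 13,000 `hg` invocations across the .t tests.
INVOCATIONS = 13000


def total_seconds(per_process_ms):
    """Total overhead in seconds for INVOCATIONS processes."""
    return per_process_ms * INVOCATIONS / 1000.0


print(total_seconds(40))   # desktop 2.7.6, bare python -> 520.0
print(total_seconds(87))   # desktop 2.7.6, hg version  -> 1131.0
print(total_seconds(16))   # MBP 2.7.6, bare python     -> 208.0
```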

We can then compare these numbers against total Mercurial test suite
execution time to see what the overheads of new Python processes and
new hg processes are.

Machine  Test CPU Time   Python %    HG %
desktop     2,960s         17.6      38.2
MBP         1,855s         11.2      30.8
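The percentage columns fall out of dividing the extrapolated overhead by
total test suite CPU time; a sketch with an invented helper name:

```python
def overhead_pct(overhead_s, total_s):
    """Overhead as a percentage of total test suite CPU time."""
    return round(100.0 * overhead_s / total_s, 1)


print(overhead_pct(520, 2960))    # desktop, bare python -> 17.6
print(overhead_pct(1131, 2960))   # desktop, hg          -> 38.2
print(overhead_pct(572, 1855))    # MBP, hg              -> 30.8
```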

It certainly appears that CPython process startup, plus the cost of
"initializing" the Mercurial environment within a fresh Python
interpreter, contributes a significant amount of overhead to the
Mercurial test suite.

Inlining tests into Python instead of shell (as my patch series
attempts), or using chg, should significantly speed up the test suite.

Activating my regular hgrc file slowed down `hg version` by ~40ms on my
desktop machine. That is statistically significant.

For comparison, on my desktop machine Perl 5.18.2 starts in 15ms, Ruby
1.9.3p484 in 27ms, PHP 5.5.9 in 50ms, and Git 1.8.1.2 in 12ms.

The Python new process overhead could very well be contributing to the
perception that Mercurial is slow[er than Git]. You need <25ms latency
for commands to feel instantaneous. We are already beyond that for the
most trivial Mercurial commands.

Gregory
