Unexplained bottleneck with hg serve?

Martin Geisler mg at lazybytes.net
Tue Nov 1 07:20:46 CDT 2011


Matt Mackall <mpm at selenic.com> writes:

> On Mon, 2011-10-31 at 17:30 +0100, Na'Tosha Bard wrote:
>> Hello,
>> 
>> We have been experimenting with hg serve and have found that it has
>> some performance limitation we cannot pinpoint. We've discovered that
>> no given mercurial operation (pull or clone) will go faster than 20
>> Mb/s, despite the fact that there is plenty of available network
>> bandwidth. The host's CPU and disk are also not loaded. Running "hg
>> clone --uncompressed" results in a very fast clone (approx. 30MB/s).
>> 
>> So the normal clone (without the --uncompressed flag) was much slower
>> by comparison, but did not max out the CPU. This was tested on just a
>> normal Ubuntu Linux on a laptop.
>
> Is it maxing out a single core?

I'm here in Copenhagen trying to help with the performance problems.
We've done some more systematic tests this morning. The setup:

* Network: standard gigabit LAN

* Server: Windows machine with 8 cores, 48 GB RAM. Base load: 10% CPU,
  less than 5 Mbit/s on network.

* Clients: Linux laptops with Core i7, dual core.

We put the OpenOffice repository on the server and started "hg serve"
four times on four different ports.

* Normal (compressed) clones:

  - Server: 5% CPU per "hg serve" process.

  - 1 client, 1 machine: 50% load on one core, changelog: 5 Mbit/s
    download, manifest: 15-20 Mbit/s, file logs: 8-25 Mbit/s, avg 17
    Mbit/s

  - 2 clients, 1 machine: 50% load on two cores, 30 Mbit/s total download

  - 3 clients, 2 machines: changelog: 4 Mbit/s, manifest: 14 Mbit/s on
    third client

  - 4 clients, 3 machines: changelog: 7 Mbit/s, manifest: 20 Mbit/s

  So each "hg serve" seems to max out at ~20 Mbit/s even though the
  clients are not maxed out for CPU.

* Uncompressed clones:

  - Server: 1% CPU per "hg serve" process

  - 1 client, 1 machine: 18% CPU, avg download speed 50 Mbit/s for manifest

  - 2 clients, 2 machines: total avg download 70 Mbit/s

  - 3 clients, 3 machines: total avg 100 Mbit/s

  - 4 clients, 3 machines: total avg 130 Mbit/s

  So each "hg serve" can stream about 30 Mbit/s with --uncompressed,
  with less CPU load on both server and clients.
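The "about 30 Mbit/s" figure can be checked by dividing each total by the
number of clients. This is only a sketch of that arithmetic: it assumes
each client was talking to its own "hg serve" process, and the names
totals_mbit and per_process are made up for illustration. Note that the
single-client figure was measured for the manifest only.

```python
# Total average download rates (Mbit/s) from the uncompressed runs above,
# keyed by the number of concurrent clients.
# Assumption: one "hg serve" process per client.
totals_mbit = {1: 50, 2: 70, 3: 100, 4: 130}


def per_process(totals):
    """Return the average rate per "hg serve" process for each run."""
    return {n: total / n for n, total in totals.items()}


rates = per_process(totals_mbit)
for n, rate in sorted(rates.items()):
    print(f"{n} client(s): {rate:.1f} Mbit/s per process")
```

With two or more clients the per-process rate settles between 32 and 35
Mbit/s, which is where the estimate comes from.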

We also tested uncompressed clones between the Linux laptops and got a
speed of 14 MB/sec for the full clone. The average speed of a Linux
laptop cloning from the Windows server was 4 MB/sec.
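These two figures are in megabytes per second, while the rest of the
numbers are in megabits per second. A small sketch of the conversion,
assuming the usual 8 bits per byte (the helper name mbyte_to_mbit is
hypothetical):

```python
def mbyte_to_mbit(mbyte_per_s):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mbyte_per_s * 8


linux_to_linux = mbyte_to_mbit(14)    # laptop cloning from laptop
windows_to_linux = mbyte_to_mbit(4)   # laptop cloning from the Windows server

print(f"Linux -> Linux:   {linux_to_linux} Mbit/s")
print(f"Windows -> Linux: {windows_to_linux} Mbit/s")
print(f"ratio: {linux_to_linux / windows_to_linux:.1f}x")
```

So the laptop-to-laptop clone ran at roughly 112 Mbit/s, against roughly
32 Mbit/s from the Windows server.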


* Init plus pull:

  - Server: 6% per "hg serve" process

  - 1 client, 1 machine: 15 Mbit/s avg, 50% CPU load

  - 2 clients, 1 machine: changelog total 10 Mbit/s, manifest total 40
    Mbit/s

  - 3 clients, 2 machines: changelog total 15 Mbit/s, manifest total
    30-45 Mbit/s

  So as expected, this is very similar to the normal clone. We tested
  this case as well since the TeamCity continuous integration server
  makes its clones like this.


The overall question is why we cannot get more than 20-30 Mbit/s out of
an "hg serve" process. Btw, I ran "hg serve" directly instead of wrapping
it in Apache or IIS since I expect this to give the cleanest results:
the web servers have to hand over to the Python hgweb code anyway.

The poor performance of --uncompressed with the Windows server also
puzzles us.

-- 
Martin Geisler

aragost Trifork
Professional Mercurial support
http://aragost.com/mercurial/

