Speeding up Mercurial on NFS
Martin Geisler
mg at aragost.com
Tue Jan 11 05:11:14 CST 2011
Matt Mackall <mpm at selenic.com> writes:
> On Mon, 2011-01-10 at 14:45 +0100, Martin Geisler wrote:
>
>> timeit -q --pre 'sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"' \
>> 'walker -t $N -q'
>
> I think dropping the cache here invalidates the comparison with the
> below. The important case for NFS is when the _server_ has a warm
> cache and doesn't need to talk to the disk for a lookup. If you do a
> warm cache test here, you'll get a target number that will be the best
> you can hope to get out of the NFS case.
Right, good point. I was mostly including the local disk numbers in
order to see how single- vs multi-threaded walking compared to one
another and to see how the C program compared to Python. So I would say
the test shows that there is little to be gained by switching to C for
such an I/O-intensive program -- which is what we would expect.
> Also, it's just not very interesting. On spinning media, we know that
> we're going to have a seek bottleneck that multithreading can only
> exacerbate. On SSD, we're either going to hit a req/s barrier that's
> lower than our syscall throughput (no scaling), or we're going to
> saturate the syscall interface (scaling up to number of cores), or
> we're going to saturate the filesystem lookup locks (but not on modern
> Linux with a desktop machine!). What you have above appears to be an
> SSD and two cores, yes?
I agree with you that a single disk should max out like you describe...
but the above numbers are for a normal 7200 RPM 1 TB SATA disk and a
quad core i7 930.
> By comparison, threaded NFS lookups is all about saturating the pipe
> because the (cached on server) lookup is much faster than the request
> round trip.
>
> How many files are you walking here?
There are 70k files -- it is the working copy of OpenOffice, changeset
67e476e04669. You sometimes talk about walking a repo with 207k files,
is that a public repo?
>> Running over an artificially slow (0.2 ms delay) NFS link back to
>> localhost gives:
>>
>>   threads   pywalker   walker
>>         1      9.0 s    8.2 s
>>         2      6.3 s    4.5 s
>>         4      6.1 s    2.7 s
>>         8      5.9 s    1.5 s
>>        16      6.0 s    1.7 s
>>        32      6.0 s    1.9 s
>
> Interesting. Looks like you're getting about 6-8 parallel requests per
> round trip time. But that seems way too slow, that'd only be ~ 15k
> requests per second or 22500 - 30k files total. Or, if that .2ms is
> round-trip delay, 45k - 60k files total.
Yes, it's round-trip delay -- I add 0.1 ms delay to the link and the
ping time goes to 0.2 ms. I use
    sudo tc qdisc change dev lo root netem delay 0.1ms
to add a simple constant delay to all packets.
> You should also run this test without the delay. Again, this will give
> you a target baseline for what you can hope to get out of NFS. It
> should saturate around threads = cores, but should probably be
> marginally faster than that 1.5s number there.
Okay, here are tests without the delay -- raw speed on the local
loopback. I unmount the NFS filesystem after each test but do not clear
any other caches:
  threads   pywalker    walker
        1    2230 ms   1931 ms
        2    1857 ms   1164 ms
        4    2594 ms    818 ms
        8    2757 ms    833 ms
       16    2796 ms    991 ms
       32    2776 ms    987 ms
The eight (hyper-threaded) cores were never maxed out while I ran the
tests; they only peaked at about 50% utilization.
At first I thought this was because of how I walk the tree: each worker
thread scans a directory and inserts each subdirectory into a queue. It
then returns and grabs the next directory from the queue. This gives a
breadth-first traversal of the directory tree.
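The queue-based, breadth-first scheme described above can be sketched in
Python roughly as follows. This is my reconstruction for illustration,
not the actual walker/pywalker source; the function name walk and the
thread count are assumptions:

```python
import os
import queue
import threading

def walk(root, n_threads=4):
    """Breadth-first walk of root using n_threads worker threads.

    Each worker grabs a directory from the shared queue, scans it,
    queues any subdirectories it finds, and records the files.
    """
    dirs = queue.Queue()
    dirs.put(root)
    files = []
    lock = threading.Lock()

    def worker():
        while True:
            path = dirs.get()
            try:
                for entry in os.scandir(path):
                    if entry.is_dir(follow_symlinks=False):
                        dirs.put(entry.path)  # breadth-first: queue it
                    else:
                        with lock:
                            files.append(entry.path)
            finally:
                dirs.task_done()

    for _ in range(n_threads):
        threading.Thread(target=worker, daemon=True).start()
    dirs.join()  # returns once every queued directory has been scanned
    return files
```

Subdirectories are queued before task_done() is called on their parent,
so the join() cannot return while work is still outstanding.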
But there are 203 top-level directories in the root of the OpenOffice
working copy, so all threads will quickly get a directory to work on.
The number of directories on different levels are as follows:
  level   directories
      0           203
      1           948
      2          1611
      3          1372
      4          1144
      5          1473
      6           330
      7            71
      8            50
      9            15
     10             6
     11             2
     12             1
     13             0
I used the zsh command 'ls -d */*(/) | wc -l' with a varying number of
'*/' components to get these numbers.
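The same per-level counts can also be gathered portably with a short
Python snippet; the function name dirs_per_level is mine, chosen for
illustration:

```python
import os
from collections import Counter

def dirs_per_level(root):
    """Count directories at each depth below root.

    Level 0 is the set of directories directly inside root;
    root itself is not counted.
    """
    counts = Counter()
    root = root.rstrip(os.sep)
    base = root.count(os.sep)
    for dirpath, dirnames, _filenames in os.walk(root):
        level = dirpath.count(os.sep) - base
        counts[level] += len(dirnames)
    return dict(counts)
```

Run against the root of the working copy, this should reproduce the
table above in a single pass instead of one glob per level.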
--
Martin Geisler
aragost Trifork
Professional Mercurial support
http://aragost.com/en/services/mercurial/blog/