[PATCH] largefiles: use multiple threads for fetching largefiles remotely

Matt Mackall mpm at selenic.com
Tue Oct 14 14:45:20 CDT 2014

On Fri, 2014-10-10 at 02:59 +0200, Mads Kiilerich wrote:
> largefiles: use multiple threads for fetching largefiles remotely
> Largefiles are currently fetched with one request per file. That adds a
> constant overhead per file that gives bad network utilization.

By constant overhead, you mean round-trip and connection setup time? If
so, wouldn't that be better mitigated with some form of pipelining or
batching? How much of this is Mercurial startup time on the server?
Does this imply multiple SSH connections too?
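
As a rough illustration of the question, here is a back-of-the-envelope
model of where a fixed per-request cost goes under each approach. It is
only a sketch: the function names and the cost model are hypothetical,
nothing here is measured or taken from the patch, and the threaded case
assumes all connections share one link, so overheads overlap but
bandwidth does not multiply.

```python
def sequential_time(n_files, file_bytes, bandwidth, overhead):
    """One request per file: every file pays the fixed overhead."""
    return n_files * (overhead + file_bytes / bandwidth)


def batched_time(n_files, file_bytes, bandwidth, overhead):
    """One batched or pipelined request: the overhead is paid once."""
    return overhead + n_files * file_bytes / bandwidth


def threaded_time(n_files, file_bytes, bandwidth, overhead, threads):
    """Several connections on a shared link (assumed model).

    Each "round" of `threads` requests overlaps its setup overhead,
    while the payload still moves at the single link's bandwidth.
    """
    rounds = -(-n_files // threads)  # ceiling division
    return rounds * overhead + n_files * file_bytes / bandwidth
```

Under this model batching removes all but one overhead, while threading
only divides the overhead by the thread count, which is why the answer
depends on what the overhead actually consists of.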

For files that are _actually large_, streaming multiple files to/from
spinning disk simultaneously is suboptimal because it creates an
interleaved I/O pattern. On the write side, you might get interleaved
storage too, which persists.
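
The write-side concern can be pictured with a toy allocator that hands
out the next free block to whichever file writes next. This is a
deliberately naive sketch with hypothetical names; real filesystems
mitigate the effect to varying degrees (for example with delayed
allocation), so it shows the worst case rather than typical behaviour.

```python
from itertools import groupby


def layout(write_order):
    """Toy allocator: block i on disk belongs to the i-th writer."""
    return list(write_order)


def extents(disk, name):
    """Count the contiguous runs of blocks that belong to `name`."""
    return sum(1 for owner, _ in groupby(disk) if owner == name)


# Two 4-block files, A and B.
one_at_a_time = layout("A" * 4 + "B" * 4)  # A finishes before B starts
interleaved = layout("AB" * 4)             # A and B written concurrently

print(extents(one_at_a_time, "A"), extents(interleaved, "A"))
```

In the interleaved case each file ends up split into as many extents as
it has blocks, and that layout stays on disk after the transfer is done.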

It's also kind of frowned on to use multiple TCP streams for data
transfer as it defeats TCP's fairness model, so on-by-default might not
be the right answer.
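
The fairness point can be put in numbers with an idealized model in
which a congested link is divided equally per TCP flow. That equal
per-flow split is an assumption (it ignores differing round-trip times
and other real-world effects), and the function name is hypothetical.

```python
def bottleneck_share(own_streams, other_streams):
    """Fraction of a bottleneck link one client gets, assuming the
    link is split equally between all competing TCP flows."""
    return own_streams / (own_streams + other_streams)


# One client competing with three other single-stream users.
single = bottleneck_share(1, 3)  # behaves like everyone else
quad = bottleneck_share(4, 3)    # opens four streams instead
```

With one stream the client gets a quarter of the link; with four it
gets four sevenths, taken from the other three users.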

