RepositoryLayoutRequirements

This page does not meet our wiki style guidelines. Please help improve this page by cleaning up its formatting.

In http://selenic.com/pipermail/mercurial/2008-June/019561.html, mpm listed repository layout requirements (quote from that post):

Here are the repo layout requirements:

a) does not randomize disk layout (ie hashing)
b) avoids case collisions
c) uses only ASCII
d) handles stupidly long filenames
e) avoids filesystem quirks like reserved words and characters
f) mostly human-readable (optional)
g) reversible (optional)

(a) is important for performance. Filesystems are likely to store sequentially created files in the same directory near each other on disk. Disks and operating systems are likely to cache such data. Thus, always reading and writing files in a consistent order gives the operating system and hardware its best opportunity to optimize layout. Ideally, the store layout should roughly mirror the working dir layout.

Point (g) is interesting. If we throw out cryptographically-strong hashing because of (a), we either have to expand names to meet (b) and (c) or throw out information and risk collisions. And we don't really want that, so (g) may effectively be implied. It's also worth considering (f): it makes understanding what's going on in the store a hell of a lot easier, especially when something goes wrong. Which again means: no hashing.

Which means that adding (d) is hard, because just about any solution to (b) and (c) will blow up path lengths, especially in the presence of interesting character sets. If our absolute paths are limited to a mere 255 characters and people want to have multi-line filenames, we've got a problem.

So we probably have to find a way to write longer filenames (NTFS's real limit is 32k).

mpm on why hashing filenames doesn't solve the problem (quote from from http://selenic.com/pipermail/mercurial/2008-June/019574.html):

Let's say you've cloned repo A, which gives you some set of files F that get stored in random order. To checkout that set of files F, we can either read them out of the repo in random (pessimal) order, or write them out to the working directory in random (pessimal) order. In both cases, we end up seeking like mad. The performance hit is around an order of magnitude: not something to sneeze at.

(And that's on an FS without decent metadata read-ahead. The difference will likely be more marked on something like cmason's btrfs.)

If we're clever, we can try reading every file out of the repo into memory in hash order, then writing them out in directory order. Which works just fine until you can't fit everything in memory. Then we discover we weren't so clever after all and we're back to seeking like mad, even if we're doing fairly large subsets. If our repo is 1G and we're batching chunks of 100M, we probably still end up seeking over the entire 1G of input space (or output space) to batch up each chunk.

Mercurial originally used a hashing scheme for its store 3 years ago and as soon as I moved my Linux test repo from one partition to another, performance suddenly went to hell because the sort order got pessimized by rsync alphabetizing things. I tried all sorts of variations on sorting, batching, etc., but none of them were even close in performance to the current scheme, namely mirroring the working directory layout and doing all operations in alphabetical order.

Further, simply cloning or cp'ing generally defragments things rather than making them worse.

Notes

According to http://selenic.com/pipermail/mercurial-devel/2008-June/006916.html, a reversible encoding of path names is currently (2008-06-28) required (see also http://selenic.com/pipermail/mercurial-devel/2008-June/006930.html).

CategoryInternals