Hardlinked Clones

When doing a local clone with a plain hg clone A B, Mercurial first tries to create hardlinks for the files inside the .hg/ directory (not the working copy). This speeds up cloning and saves disk space, because the same physical file is used for two or more directory entries.
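To illustrate what "the same physical file for two or more directory entries" means, here is a minimal Python sketch. It does not use Mercurial at all; the file names are hypothetical stand-ins for a file under .hg/, and it assumes a filesystem that supports hardlinks.

```python
import os
import tempfile


def hardlink_demo():
    """Create a file plus a hardlink to it; return (link count, same file?)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical names standing in for a file inside .hg/ of two clones.
        original = os.path.join(tmp, "A-00changelog.i")
        linked = os.path.join(tmp, "B-00changelog.i")
        with open(original, "wb") as f:
            f.write(b"revlog data")
        # Add a second directory entry that refers to the same physical file.
        os.link(original, linked)
        nlink = os.stat(original).st_nlink
        same = os.path.samefile(original, linked)
        return nlink, same


if __name__ == "__main__":
    print(hardlink_demo())
```

The link count reported by os.stat is the same number Mercurial inspects before writing to a file, as described further down.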

Both Linux file systems and Windows NTFS support hardlinks (but note that NTFS limits the maximum number of hardlinks per file to 1023). For filesystems that don't support hardlinks (e.g. Windows FAT), Mercurial falls back to copying all files instead of hardlinking them.
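The fallback can be pictured with the following sketch. It is an illustration of the idea only, not Mercurial's actual code; the function name link_or_copy is hypothetical.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Try to hardlink src to dst; fall back to a full copy on failure.

    Hypothetical helper, not part of Mercurial. os.link raises OSError when
    the filesystem has no hardlink support or when src and dst are on
    different volumes.
    """
    try:
        os.link(src, dst)
        return "linked"
    except OSError:
        shutil.copy2(src, dst)
        return "copied"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.i")
        with open(src, "wb") as f:
            f.write(b"data")
        print(link_or_copy(src, os.path.join(tmp, "dst.i")))
```

Either way the destination ends up with the same content; only the disk usage and the independence of the two files differ.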

In situations where a hardlinked clone may not be ideal, users can use hg clone --pull, which will use the pull protocol for cloning and create a fully independent clone.

Cloning over http/https or ssh from a remote server implies --pull.

When committing or pushing to a repository, Mercurial checks the hardlink count of every file X inside .hg that it needs to write to. If the count is two or more, Mercurial breaks up the hardlink for X before writing to it. Breaking up the hardlink for a file X means (1) copying X to a temporary file, (2) deleting X and then (3) renaming the temporary file back to X.
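The three steps can be sketched in Python as follows. This is a simplified illustration under the assumption that the link count reported by the operating system is correct; break_hardlink is a hypothetical name and not Mercurial's own function.

```python
import os
import shutil
import tempfile


def break_hardlink(path):
    """Give `path` a private copy of its data if it has two or more links.

    Hypothetical helper. Returns True if a hardlink was broken up.
    """
    if os.stat(path).st_nlink < 2:
        return False  # no other directory entry shares this file
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    os.close(fd)
    shutil.copy2(path, tmp)  # (1) copy X to a temporary file
    os.unlink(path)          # (2) delete X
    os.rename(tmp, path)     # (3) rename the temporary file back to X
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        a, b = os.path.join(d, "a.i"), os.path.join(d, "b.i")
        with open(a, "wb") as f:
            f.write(b"rev0")
        os.link(a, b)
        print(break_hardlink(b))
```

After the call, writes to the file no longer show up in the other clone, because the two directory entries now refer to different physical files.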

1. Examples

  $ hg clone http://selenic.com/repo/hg hgcopy
  requesting all changes
  adding changesets
  adding manifests
  adding file changes
  added 12613 changesets with 24932 changes to 1936 files
  updating to branch default
  848 files updated, 0 files merged, 0 files removed, 0 files unresolved

This was a clone over http from a remote server. The resulting clone (hgcopy) thus has no hardlinks.

  $ hg clone --pull hgcopy hgcopy2
  requesting all changes
  adding changesets
  adding manifests
  adding file changes
  added 12613 changesets with 24932 changes to 1936 files
  updating to branch default
  848 files updated, 0 files merged, 0 files removed, ..

This was a clone with explicit --pull. The resulting clone (hgcopy2) thus has no hardlinks and is completely independent from hgcopy.

If Mercurial prints "adding changesets" then the resulting clone will have no hardlinks.

  $ hg clone --debug -U hgcopy2 hgcopy3
  linked 1956 files

This was a clone which uses hardlinks. The files in hgcopy2 and hgcopy3 (inside the .hg dir) are hardlinked. Mercurial versions 1.6 and later print the number of files that were hardlinked if --debug is specified.

  $ hg clone --debug -U hgcopy2 x:\hgcopy4
  copied 1956 files

This was a clone where Mercurial first tried to create hardlinks, but didn't succeed. For example, the filesystem may not support hardlinks, or the source and the destination may not be on the same volume. In this case Mercurial falls back to copying the files.
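For an existing clone, one way to check after the fact whether it still shares files with another clone is to count the files under .hg whose link count is greater than one. The sketch below is a hypothetical helper, not a Mercurial command, and it relies on the operating system reporting link counts correctly (which, as described in the next section, is not the case on Windows shares).

```python
import os
import tempfile


def count_hardlinked(hg_dir):
    """Return (files with more than one link, total files) under hg_dir."""
    linked = total = 0
    for root, _dirs, files in os.walk(hg_dir):
        for name in files:
            total += 1
            if os.stat(os.path.join(root, name)).st_nlink > 1:
                linked += 1
    return linked, total


def demo():
    """Build a tiny fake .hg directory (hypothetical layout) and count links."""
    with tempfile.TemporaryDirectory() as tmp:
        hg_dir = os.path.join(tmp, "clone", ".hg")
        os.makedirs(hg_dir)
        for name in ("shared.i", "private.i"):
            with open(os.path.join(hg_dir, name), "wb") as f:
                f.write(b"data")
        # Simulate a second clone that hardlinks only one of the files.
        os.link(os.path.join(hg_dir, "shared.i"), os.path.join(tmp, "other.i"))
        return count_hardlinked(hg_dir)


if __name__ == "__main__":
    print(demo())
```

Pointing count_hardlinked at the .hg directory of hgcopy3 from the example above would be the corresponding real-world use.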

2. Hardlinked clones on Windows shares

Mercurial versions up to and including 1.6.2 are affected by a bug that is present in nearly all Windows variants (including Windows 7). Windows computers that serve files on a share always report a count of one when asked for the number of hardlinks a file has, even if the file actually does have hardlinks and the correct count would be two or more. This means that Mercurial running on a client gets a wrong hardlink count for files which are part of a hardlinked clone that resides on a Windows network share.

A workaround for Mercurial running on Windows was first released with Mercurial 1.6.3 (see WhatsNew). The workaround unconditionally makes a full copy of each file before writing to it if the file is on a Windows network share, thus making sure that any hardlinks which may exist on that file are broken up.

Note that this workaround is only effective if Mercurial is run on Windows. There is a related bug in the Linux CIFS driver, which is still not fixed ("Linux CIFS mounts may corrupt hardlinked repos on Windows shares", see issue1866). A workaround for that Linux CIFS driver bug has been released with Mercurial 1.7.1.

All Mercurial versions prior to 1.6.3 fail to cope with this Windows bug. If such a Mercurial version is used on a client computer to commit or push to a hardlinked clone on a network share, then the target repository may be corrupted because the file modifications will erroneously appear in all clones that share these files. There is no error message reported on the respective commit or push. The resulting repository corruption is detected by a later hg verify.
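The mechanism behind this corruption can be reproduced locally without any network share. The sketch below (hypothetical file names, no Mercurial involved) appends to one of two hardlinked files without breaking up the hardlink first, which is effectively what happens when the reported link count is wrongly one.

```python
import os
import tempfile


def append_without_breaking():
    """Append via one hardlink and return what the other entry now contains."""
    with tempfile.TemporaryDirectory() as tmp:
        clone_a = os.path.join(tmp, "clone_a.i")  # stands in for a file in clone A
        clone_b = os.path.join(tmp, "clone_b.i")  # same file as seen from clone B
        with open(clone_a, "wb") as f:
            f.write(b"rev0")
        os.link(clone_a, clone_b)
        # Write to clone B's entry in place, as if the link count were 1.
        with open(clone_b, "ab") as f:
            f.write(b"rev1")
        with open(clone_a, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(append_without_breaking())  # b'rev0rev1'
```

The data appended through clone B's entry is visible through clone A's entry as well, which is how a commit to one clone silently ends up in the store of every clone that shares the file.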

3. See also

HardlinkedClones (last edited 2013-08-14 07:54:22 by MartinGeisler)