[PATCH 00 of 10] RFC: Light-Weight Copy

Sune Foldager cryo at cyanite.org
Wed Sep 8 19:48:13 UTC 2010


NOTE: This queue is not push-ready. It's posted for initial comments and feedback.

This patch queue implements light-weight copy in Mercurial. Light-weight copy prevents hg
from storing a complete version of a file when it's copied or renamed. It's a change in
the storage format; the API stays the same.

The patch queue was originally created by Vsevolod Solovyov, and I have worked on it
since. The code is essentially the same, although there have been a number of
changes, bug fixes, patch folding and optimizations. The final patch of the queue is
recent work by me.

Currently, all patches are attributed to me; I don't consider the queue to be push-ready,
so this is not necessarily finalized either. See below.

The queue applies against 5b849148b620, which is close to crew-default-tip. It likely
applies against tip as well.


I'll briefly describe how light-weight copy ("lwcopy") is implemented here:

Whenever a copy-revision is stored in the filelog, lwcopy stores, instead of the whole
file text, only a delta against the previous filelog (the one for the old file name).
Often, this delta will be empty. A flag is added to metadata to signal that this has been
done. This allows lwcopy-changesets and old-style changesets to coexist in the same
repository.
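
As a rough illustration of the metadata flag, here is a minimal sketch. It assumes the
usual filelog convention of a metadata block delimited by "\1\n" and holding
"key: value" lines (as used by the existing 'copy' and 'copyrev' keys). The helper names
and the flag value "1" are hypothetical and are not taken from the patches.

```python
META_MARKER = b"\x01\n"  # delimiter assumed to surround filelog metadata


def pack_meta(meta, body):
    """Prefix body with a metadata block of sorted 'key: value' lines."""
    lines = b"".join(b"%s: %s\n" % (k, v) for k, v in sorted(meta.items()))
    return META_MARKER + lines + META_MARKER + body


def parse_meta(data):
    """Split stored data into (meta dict, body); no metadata gives ({}, data)."""
    if not data.startswith(META_MARKER):
        return {}, data
    end = data.index(META_MARKER, 2)
    meta = {}
    for line in data[2:end].splitlines():
        key, value = line.split(b": ", 1)
        meta[key] = value
    return meta, data[end + 2:]


def is_lwcopy(data):
    """True if the revision carries the (hypothetically valued) 'lwcopy' flag."""
    meta, _ = parse_meta(data)
    return b"lwcopy" in meta
```

A reader that finds no 'lwcopy' key treats the body as the full text, which is how
old-style copy revisions would keep working next to new ones in this sketch.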

The hash values are calculated against the full text, as before, so no hashes change in a
repository, even if it's converted to lwcopy.
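
To make the point about hashes concrete, here is a small sketch. It assumes the usual
revlog scheme of a SHA-1 over the two sorted parent IDs followed by the text. The
add_revision helper and its "payload" are hypothetical stand-ins for what ends up on disk.

```python
import hashlib

NULLID = b"\0" * 20


def node_hash(text, p1=NULLID, p2=NULLID):
    """SHA-1 over the sorted parents followed by the full text."""
    a, b = sorted((p1, p2))
    s = hashlib.sha1(a + b)
    s.update(text)
    return s.digest()


def add_revision(full_text, base_text=None):
    """Return (node, payload); the node always comes from the full text."""
    node = node_hash(full_text)
    if base_text is not None and full_text == base_text:
        payload = b""         # unchanged copy: nothing beyond the source is kept
    else:
        payload = full_text   # old-style storage, or a copy that was modified
    return node, payload
```

Both storage forms produce the same node for the same text, since only the payload
differs.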

To control these things, a few methods in revlog have been split into public and private
parts, which filelog can then override.
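
The public/private split could look roughly like the sketch below. The class and method
names are made up for illustration and do not match the real revlog or filelog code. The
"delta" is a toy that only handles text appended to the copy source.

```python
class Revlog:
    """Toy revision store: a public API on top of overridable private hooks."""

    def __init__(self):
        self._entries = []

    # Public API: unchanged for callers.
    def addrevision(self, text):
        self._entries.append(self._encode(text))
        return len(self._entries) - 1

    def revision(self, rev):
        return self._decode(self._entries[rev])

    # Private hooks: the default stores the text verbatim.
    def _encode(self, text):
        return text

    def _decode(self, stored):
        return stored


class Filelog(Revlog):
    """Overrides only the private hooks; copies are stored against a source."""

    def __init__(self, copysource=None):
        super().__init__()
        self._copysource = copysource  # full text of the file copied from, if any

    def _encode(self, text):
        src = self._copysource
        if src is not None and text.startswith(src):
            # Keep only what was appended to the source (toy "delta").
            return (True, text[len(src):])
        return (False, text)

    def _decode(self, stored):
        is_lw, payload = stored
        return self._copysource + payload if is_lw else payload
```

Callers keep using addrevision() and revision() and always see full texts; only the
stored entries differ.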

Since the changes are at a pretty low level, generally in private methods, the wire protocol
never sees any changes, and old-style changegroups are simply emulated: when reading
changegroups, lwcopy changesets are expanded to the full text; when adding changegroups,
copy changesets are lwcopy-fied automatically.
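
The emulation can be pictured as a round trip in which only full texts ever travel. This
is a sketch under simplifying assumptions: the function names are hypothetical, and the
"delta" is a toy (length of the shared prefix plus the remaining bytes) rather than
Mercurial's real delta format.

```python
def compact(full_text, source_text):
    """Toy lwcopy encoding: keep only the bytes after the shared prefix."""
    n = 0
    limit = min(len(full_text), len(source_text))
    while n < limit and full_text[n] == source_text[n]:
        n += 1
    return n, full_text[n:]


def expand(stored, source_text):
    """Rebuild the full text from a (prefix length, tail) pair."""
    n, tail = stored
    return source_text[:n] + tail


def read_changegroup(stored_revs, source_text):
    """Outgoing: always emit full texts, as an old-style peer expects."""
    return [expand(s, source_text) for s in stored_revs]


def add_changegroup(full_texts, source_text):
    """Incoming: full texts are compacted again on the way into the store."""
    return [compact(t, source_text) for t in full_texts]
```

An unmodified copy compacts to an empty tail, and expanding it again yields exactly the
text an old client would have sent or received.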

The exception is stream-clone, which sends the store files directly. The last patch in the
queue addresses this by introducing a new stream-capability the client can use to decide
whether or not it understands the streaming format. If it doesn't, it'll use pull-clone.
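
The client-side decision might be sketched as follows. The capability name
"stream-lwcopy" is invented for this example; the actual name used by the patch is not
given in this message.

```python
def choose_clone(server_caps, understood_formats):
    """Return 'stream' if the client understands an advertised stream
    format, otherwise fall back to 'pull'."""
    offered = {cap for cap in server_caps if cap.startswith("stream")}
    if offered & set(understood_formats):
        return "stream"
    return "pull"


# Hypothetical capability sets for an old and a new client.
OLD_CLIENT = {"stream"}
NEW_CLIENT = {"stream", "stream-lwcopy"}
```

With a server that advertises only the new capability, choose_clone() gives "pull" for
OLD_CLIENT and "stream" for NEW_CLIENT.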


Notes on the state of the queue:

The number and division of the patches are largely as Vsevolod left them. It may make
sense to reorder and/or fold some of them, and certainly to flesh out the commit messages.

Also, no tests are added, but the queue does fix a few tests which fail due to changed
sizes of the filelogs (ironically, due to the small data sizes generally employed in
examples, the sizes are larger than before; this is due to the new 'lwcopy' metadata
element). The test suite passes.

Note that the suite doesn't test old client vs. new server or vice-versa. I have done
manual testing of various scenarios with success. This includes cloning a lwcopy repo with
an old client, making a new copy or rename and pushing the change back, etc. All this
works as expected.

Apart from the above, the patches are pretty much finished, as I see it. I would like to
make some minor changes, e.g. rename the 'lwcopy' metadata to 'copylw' to align with the
other two copy-related keys, maybe rename a few methods and similar.

Comments, questions and suggestions are welcome.

-- 
Sune Foldager

