[PATCH] prevent any file paths under .hg/store/data/ from getting too long
Jesse Glick
jesse.glick at sun.com
Thu Dec 20 07:05:28 CST 2007
Background comments about the implementation:
Before I started I assumed this would be an easy fix - simply chop off
some prefix from long store paths, maybe uniquify if required with a
hash. I assumed that the manifest would perhaps list triplets of working
(checkout) path, store path, and node ID.
As I was surprised to discover, the manifest in fact contains only pairs
of working path and node ID. The store path is computed on demand using
a translation function, which must be repeatable (hence the use of a
hash to uniquify) and must also be reversible (for streamclone.py).
Making it reversible complicates the patch, since it is then necessary
to maintain a separate .hg/store/longnames from which the working path
can be recovered.
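To make the two requirements concrete, here is a minimal Python sketch of a repeatable translation plus a side table for reversing it. It is not the code from the patch and not Mercurial's actual store encoding: the names encode_store_path, record_long_name, decode_store_path and the MAX_LEN limit are all hypothetical, the longnames table is just an in-memory dict, and the existing escaping of uppercase letters is ignored.

```python
import hashlib

MAX_LEN = 120  # hypothetical limit on store path length, not Mercurial's


def encode_store_path(working_path, max_len=MAX_LEN):
    """Repeatable translation: the same input always gives the same output."""
    if len(working_path) <= max_len:
        return working_path
    # Chop off a prefix and uniquify with a hash of the full working path.
    digest = hashlib.sha1(working_path.encode("utf-8")).hexdigest()
    keep = max_len - len(digest) - 1
    tail = working_path[-keep:].lstrip("/")
    return digest + "-" + tail


def record_long_name(longnames, working_path, max_len=MAX_LEN):
    """Encode a path and remember shortened ones so they can be reversed."""
    store_path = encode_store_path(working_path, max_len)
    if store_path != working_path:
        longnames[store_path] = working_path
    return store_path


def decode_store_path(longnames, store_path):
    """Reverse the translation; unshortened paths map to themselves."""
    return longnames.get(store_path, store_path)
```

The hash alone cannot be inverted, which is why the shortened entries need to be recorded somewhere for the reverse direction.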
A different approach (probably too ambitious for me) would be to
maintain a working -> store path mapping file (00mapping.[di]?) with
pairs like
    some/path/to/File.txt               some/path/to/_file.txt
    some/very/truly/long/path/to/a/file truly/long/path/to/a/file
    another/truly/long/path/to/a/file   2ruly/long/path/to/a/file
Hg would add an entry after creating a new revlog in storage. (Or
before? Not sure about locking semantics there.)
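A rough sketch of how such a mapping file might be read and extended, in Python. The format here is an assumption based only on the pairs above: one "working-path store-path" pair per line, separated by whitespace, with no spaces inside paths. The function names are hypothetical, and locking and the on-disk layout are left out entirely.

```python
def parse_mapping(text):
    """Parse 'working store' pairs, one per line, into a dict."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumes paths contain no whitespace (a simplification).
        working, store = line.split()
        mapping[working] = store
    return mapping


def add_entry(mapping, working, store):
    """Record the store path chosen for a newly created revlog."""
    if working in mapping:
        raise ValueError("already mapped: %s" % working)
    mapping[working] = store


def format_mapping(mapping):
    """Serialize the mapping back to text, sorted for stable output."""
    return "".join("%s %s\n" % (w, s) for w, s in sorted(mapping.items()))
```

With a persisted table like this, the translation no longer has to be computable from the working path alone, so neither the hash nor a separate reverse table would be needed.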
This seems related to the problem with repo growth after renames (filed
as #883). Again, I was surprised when I found that renaming a file
creates a new revlog; clearly this is required if the translation
function is to be repeatable. If there were a persisted mapping file,
perhaps Hg could reuse the same revlog for the renamed working path:
    original-file original-file
    new-file      original-file
.hg/store/data/original-file.[di] would then keep revisions of both
original-file and new-file, which would, I guess, permit good
compression if the file did not change much or at all during the move.
(Probably the existing metadata keys 'copy' and 'copyrev' would still be
needed for rename merges, --follow, etc. to work.)
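As a sketch of the rename idea only (not how Mercurial behaves today), the mapping could be treated as a dict from working path to store path, with a rename adding a second key that points at the existing revlog. The names rename and working_paths_for are hypothetical.

```python
def rename(mapping, old_working, new_working):
    """Point the new working path at the revlog of the old one."""
    store = mapping[old_working]
    # The old entry stays: earlier revisions still refer to the old name.
    mapping[new_working] = store
    return store


def working_paths_for(mapping, store_path):
    """List every working path whose revisions live in one revlog."""
    return sorted(w for w, s in mapping.items() if s == store_path)
```

For example, starting from {"original-file": "original-file"}, calling rename(mapping, "original-file", "new-file") leaves both names mapped to the store path original-file, matching the two pairs shown above.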
Am I off track here? Is there a reason why the current design is necessary?