[RFC] kbfiles: an extension to track binary files with less wasted bandwidth
Martin Geisler
mg at lazybytes.net
Thu Sep 22 11:37:48 CDT 2011
"Na'Tosha Bard" <natosha at unity3d.com> writes:
> So, to pick this topic up again, can we get an open punchlist of
> things that the mercurial community (and project leader) believes is
> "missing" for the largefiles extension? E.g, what is missing for it to
> be accepted into mercurial?
I guess you'll have to patchbomb it here eventually. Also, you could
describe the features in a mail here -- I found a usage.txt file in the
repository which seems relevant:
Largefiles allows for tracking large, incompressible binary files in
Mercurial without requiring excessive bandwidth for clones and pulls.
Files added as largefiles are not tracked directly by Mercurial;
rather, their revisions are identified by a checksum, and Mercurial
tracks these checksums. This way, when you clone a repository or pull
in changesets, the large files in older revisions of the repository
are not needed, and only the ones needed to update to the current
version are downloaded. This saves both disk space and bandwidth.
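As a rough illustration of the checksum idea only (this is not the extension's code; the choice of SHA-1 and the name file_checksum are assumptions made for this sketch), a large file's revision can be identified by hashing its contents in chunks, so the whole file never has to sit in memory:

```python
import hashlib

def file_checksum(path, chunk_size=1 << 20):
    """Return the hex digest of a file's contents, read in 1 MiB chunks.

    SHA-1 is an assumption for this sketch; the text above only says
    "a checksum".
    """
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Mercurial would then only need to track the short hex string, while the file contents live in a separate store keyed by that string.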
If you are starting a new repository or adding new large binary files,
using largefiles for them is as easy as adding '--large' to your hg
add command. For example:
$ dd if=/dev/urandom of=thisfileislarge count=2000
$ hg add --large thisfileislarge
$ hg commit -m 'add thisfileislarge, which is large, as a largefile'
When you push a changeset that affects largefiles to a remote
repository, its largefile revisions will be uploaded along with it.
Note that the remote Mercurial must also have the largefiles extension
enabled for this to work.
When you pull a changeset that affects largefiles from a remote
repository, nothing different from Mercurial's normal behavior
happens. However, when you update to such a revision, any largefiles
needed by that revision are downloaded and cached if they have never
been downloaded before. This means that network access is required to
update to a revision you have not previously updated to.
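The download-and-cache behaviour described above can be sketched as follows. This is a minimal sketch, not the extension's implementation: ensure_cached, cache_dir and the fetch callback are hypothetical names, and the cache layout (one file per checksum) is an assumption.

```python
import os

def ensure_cached(checksum, cache_dir, fetch):
    """Return the local path for `checksum`, downloading it only on a miss.

    `fetch(checksum, dest_path)` is a hypothetical callback that retrieves
    the file contents from the remote store and writes them to dest_path.
    """
    path = os.path.join(cache_dir, checksum)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        fetch(checksum, path)  # network access happens only here
    return path
```

Calling ensure_cached a second time with the same checksum returns the cached path without invoking fetch, which is why only the first update to a given revision needs the network.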
If you already have large files tracked by Mercurial without the
largefiles extension, you will need to convert your repository in
order to benefit from largefiles. This is done with the 'hg lfconvert'
command:
$ hg lfconvert --size 10 oldrepo newrepo
By default, in repositories that already have largefiles in them, any
new file over 10MB will automatically be added as a largefile. To
change this threshold, set [largefiles].size in your Mercurial config
file to the minimum size in megabytes to track as a largefile, or use
the --lfsize option to the add command (also in megabytes):
[largefiles]
size = 2
$ hg add --lfsize 2
The [largefiles].patterns config option allows you to specify
space-separated filename patterns (in shell glob syntax) that should
always be tracked as largefiles:
[largefiles]
patterns = *.jpg *.{png,bmp} library.zip content/audio/*
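How the size threshold and the patterns might combine can be sketched like this. It is a hedged illustration rather than the extension's logic: should_be_largefile is a hypothetical name, a megabyte is assumed to be 1024 * 1024 bytes, and files exactly at the threshold are assumed to qualify. Python's fnmatch does not expand braces, so a pattern such as *.{png,bmp} has to be written out as *.png and *.bmp here, and its * also matches across directory separators.

```python
import fnmatch

def should_be_largefile(filename, size_bytes, min_size_mb=10, patterns=()):
    """Decide whether a new file would be tracked as a largefile.

    A file qualifies if it matches any glob pattern, or if its size
    reaches the threshold (assumed: 1 MB = 1024 * 1024 bytes).
    """
    # fnmatchcase keeps matching case-sensitive on every platform.
    if any(fnmatch.fnmatchcase(filename, p) for p in patterns):
        return True
    return size_bytes >= min_size_mb * 1024 * 1024

# Patterns from the example above, with the brace group written out.
example_patterns = ["*.jpg", "*.png", "*.bmp", "library.zip", "content/audio/*"]
```

With these patterns, a small photo.jpg is selected by name alone, while notes.txt is selected only once it reaches the size threshold.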
I tried cloning the largefiles repo into the hgext folder in Mercurial
and ran
% pyflakes hgext/largefiles/*.py
hgext/largefiles/basestore.py:15: 'shutil' imported but unused
hgext/largefiles/basestore.py:17: 'error' imported but unused
hgext/largefiles/basestore.py:17: 'url_' imported but unused
hgext/largefiles/lfutil.py:39: redefinition of function 'dirstate_walk' from line 35
hgext/largefiles/localstore.py:57: undefined name 'err'
hgext/largefiles/overrides.py:13: 're' imported but unused
hgext/largefiles/overrides.py:28: 'proto' imported but unused
hgext/largefiles/overrides.py:611: local variable 'dest' is assigned to but never used
hgext/largefiles/overrides.py:662: redefinition of function 'write' from line 647
hgext/largefiles/proto.py:7: 'shutil' imported but unused
hgext/largefiles/proto.py:109: undefined name 'l'
hgext/largefiles/proto.py:126: undefined name 'capabilities_orig'
hgext/largefiles/proto.py:155: undefined name 'ssh_oldcallstream'
hgext/largefiles/proto.py:162: undefined name 'http_oldcallstream'
hgext/largefiles/remotestore.py:57: undefined name 'HTTPError'
hgext/largefiles/remotestore.py:61: undefined name 'urllib2'
hgext/largefiles/remotestore.py:86: local variable 'expect_hash' is assigned to but never used
hgext/largefiles/remotestore.py:95: undefined name 'store_path'
hgext/largefiles/remotestore.py:100: undefined name 'store_path'
hgext/largefiles/reposetup.py:15: 'httprepo' imported but unused
hgext/largefiles/reposetup.py:34: undefined name '_'
hgext/largefiles/reposetup.py:224: redefinition of unused 'node' from line 15
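You should look into those errors, in particular the 'undefined name'
ones, which will raise a NameError if that code is ever executed.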
> The main repository is living here:
> https://developers.kilnhg.com/Repo/Kiln/largefiles/largefiles
>
> (there's also a branch with some compatibility stuff that's useful for
> Kiln users, but that is not so relevant here).
>
> Cheers,
> Na'Tosha
--
Martin Geisler
Mercurial links: http://mercurial.ch/