Converting big files
Greg Ward
greg at gerg.ca
Sun Apr 11 15:32:20 CDT 2010
Hi folks --
a bit of a design debate has arisen with bfiles, and I want some
outside input. Here's the scenario: you have a Mercurial repo that
currently tracks large binary files as regular files. You want to
switch to using bfiles, i.e. get those large binary files out of
.hg/store and put them somewhere else. (Quick summary: bfiles works
by tracking a "standin file", .hgbfiles/<bigfile>, for each <bigfile>.
That's a 41-byte file containing the SHA-1 hash of the big file's
content plus a newline. The actual big files live on a central store
somewhere. The central store can be a filesystem path
(local/NFS/SMB/whatever), an HTTP URL, or an SSH URL. It's structured
like <bigfile>/<hash>, i.e. every big file is a directory containing
multiple revisions, whose filenames are just the SHA-1 hash of the
contents.)
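To make the standin format and store layout concrete, here is a minimal
sketch in Python (function names are mine, not from the bfiles source):
the standin is the 40-character SHA-1 hex digest of the big file's
contents plus a newline, and a revision lives in the store at
<bigfile>/<hash>.

```python
import hashlib

def standin_contents(bigfile_path):
    """Return the 41-byte standin contents for a big file:
    the SHA-1 hex digest of its contents plus a newline."""
    sha = hashlib.sha1()
    with open(bigfile_path, 'rb') as f:
        # Hash in 1 MB chunks so big files don't have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b''):
            sha.update(chunk)
    return sha.hexdigest() + '\n'

def store_path(store_root, bigfile, digest):
    """Store layout: <store>/<bigfile>/<hash> -- each big file is a
    directory whose entries are revisions named by content hash."""
    return '%s/%s/%s' % (store_root, bigfile, digest)
```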
So, we need a tool to convert an existing repo with some big files in
it to a new repo where those big files are replaced by standin files
plus a new central store containing the actual big files.
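For a single file at a single revision, the conversion step looks
roughly like this (a hypothetical sketch, not the actual tool -- a real
conversion would have to rewrite every changeset in history, which is
exactly the machinery being debated below):

```python
import hashlib
import os
import shutil

def convert_one(repo_root, store_root, bigfile):
    """Move one big file out of the repo: copy its contents into the
    central store under <bigfile>/<hash>, then write the standin
    .hgbfiles/<bigfile> containing the hash plus a newline."""
    src = os.path.join(repo_root, bigfile)
    with open(src, 'rb') as f:
        digest = hashlib.sha1(f.read()).hexdigest()

    # Store layout: <store>/<bigfile>/<hash>
    dest_dir = os.path.join(store_root, bigfile)
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    shutil.copy(src, os.path.join(dest_dir, digest))

    # Standin: .hgbfiles/<bigfile>, 41 bytes (40 hex chars + newline).
    standin = os.path.join(repo_root, '.hgbfiles', bigfile)
    standin_dir = os.path.dirname(standin)
    if not os.path.isdir(standin_dir):
        os.makedirs(standin_dir)
    with open(standin, 'w') as f:
        f.write(digest + '\n')
    return digest
```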
Idea #1: bfiles should wrap/extend 'hg convert' and provide a way to
specify which files are the big ones.
pro:
- use existing convert machinery, so less to reinvent
- could probably work for conversions from svn etc. as well as hg->hg
con:
- extending an extension feels fragile -- presumably an even less
stable API than core hg's
Idea #2: implement something separate from 'hg convert'
2a: new command implemented by bfiles extension (bfconvert?)
2b: new extension entirely
2c: new standalone script
pro:
- unaffected by API changes in hgext.convert (albeit still subject
to the whims of hg's and bfiles' APIs)
con:
- risk of reinventing wheels that already exist in hgext.convert
- might only work for hg -> hg conversions
Am I missing any pros or cons for these two ideas? Or is there
another way to implement this that I have not thought of? Other
thoughts, opinions, ideas?
Thanks --
Greg
More information about the Mercurial-devel mailing list