Converting big files

Dirkjan Ochtman dirkjan at ochtman.nl
Mon Apr 12 01:43:43 CDT 2010


On Sun, Apr 11, 2010 at 22:32, Greg Ward <greg at gerg.ca> wrote:
> So, we need a tool to convert an existing repo with some big files in
> it to a new repo where those big files are replaced by standin files
> plus a new central store containing the actual big files.
>
> Idea #1: bfiles should wrap/extend 'hg convert' and provide a way to
> specify what are the big files.
> pro:
>  - use existing convert machinery, so less to reinvent
>  - could probably work for conversions from svn etc. as well as hg->hg
> con:
>  - extending an extension feels fragile -- presumably even less
> stable API than core hg
>
> Idea #2: implement something separate from 'hg convert'
>  2a: new command implemented by bfiles extension (bfconvert?)
>  2b: new extension entirely
>  2c: new standalone script
> pro:
>  - unaffected by API changes in hgext.convert (albeit still subject
> to the whims of hg's and bfiles' APIs)
> con:
>  - risk of reinventing wheels that already exist in hgext.convert
>  - might only work for hg -> hg conversions
>
> Am I missing any pros or cons for these two ideas?  Or is there
> another way to implement this that I have not thought of?  Other
> thoughts, opinions, ideas?

I wouldn't worry too much about API changes in convert.

FWIW, I've been thinking about a tool to clean up hg repositories. One
big point *against* convert in hg-to-hg conversions, for me, is the
loss in fidelity of tags. That is, using convert results in a
repository that has all the tags smashed into a single commit from a
nondescript user (issue872, I think). This stems from the model hg
convert uses, which is of course very general.
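
To make the tags point concrete: .hgtags holds one "<node> <tag>" line per
tag, so a tool that records an old-to-new node map while rewriting could
fix up each historical .hgtags revision in place rather than collapsing
them. A rough sketch, assuming such a map exists; remap_hgtags and nodemap
are made-up names for illustration, not anything in hg or convert:

```python
def remap_hgtags(text, nodemap):
    """Rewrite the node in each '<node> <tag>' line of a .hgtags file.

    `nodemap` maps old 40-char hex nodes to new ones (assumed to be
    collected by the rewriting tool). Lines whose node is not in the
    map are left untouched.
    """
    out = []
    for line in text.splitlines():
        # Split on the first space only, so tag names with spaces survive.
        node, sep, tag = line.partition(" ")
        if sep and node in nodemap:
            line = nodemap[node] + " " + tag
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")
```

Because a tag line always refers to an earlier changeset, replaying history
in order means the node is already in the map by the time its tag shows up.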

For my conversions (Jython, in this case), I've been thinking about
some other way of doing this that would let me preserve tag
information by staying very close to hg's ctx/memctx architecture.
Most of the operations I'm thinking about here are filemap-like things
(that is, I want to split a repo that has been converted by
hgsubversion). I suspect such a script could be more generally useful,
but I haven't gotten started. The initial part of just pushing ctxs
through commitctx shouldn't be too hard, and your bfiles extraction
thing should be very easy to plug into that.
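
Roughly what I have in mind, as a plain-Python model rather than real hg
code: each changeset goes through a chain of filters, and user and
description are carried over untouched. Everything here is an assumption
for illustration: changesets are plain dicts, the ".standins/" directory
and the size cutoff are invented, and a standin holding the SHA-1 of the
big file is my guess at the scheme. A real version would build a memctx
and hand it to commitctx where this just appends to a list.

```python
import hashlib

BIG_THRESHOLD = 10           # bytes; hypothetical cutoff, tiny for the demo
STANDIN_DIR = ".standins/"   # hypothetical name, not bfiles' actual layout


def extract_bigfiles(files, store):
    """Return a copy of `files` with big files replaced by standins.

    `files` maps path -> bytes. Each big file's content is saved in
    `store` under its SHA-1 hex digest, and a standin containing that
    digest takes its place in the returned mapping.
    """
    out = {}
    for path, data in files.items():
        if len(data) > BIG_THRESHOLD:
            digest = hashlib.sha1(data).hexdigest()
            store[digest] = data
            out[STANDIN_DIR + path] = digest.encode("ascii") + b"\n"
        else:
            out[path] = data
    return out


def replay(changesets, filters):
    """Push each changeset through `filters` in order, keeping metadata.

    Each changeset is a dict with 'user', 'desc' and 'files' keys; this
    is a stand-in for reading a ctx and committing a rewritten one.
    """
    result = []
    for cs in changesets:
        files = cs["files"]
        for f in filters:
            files = f(files)
        result.append({"user": cs["user"], "desc": cs["desc"],
                       "files": files})
    return result
```

A filemap-like split would just be another function in the filters list,
e.g. one that drops every path outside a given prefix.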

Cheers,

Dirkjan

