Turning cvs2{svn,git} into cvs2hg

Dirkjan Ochtman dirkjan at ochtman.nl
Sun Jul 19 06:23:23 CDT 2009


On Fri, Jul 17, 2009 at 15:47, Greg Ward <greg at gerg.ca> wrote:
> I can see three ways to fix that mismatch:
>
> 1) modify the way cvs2git generates fixup commits so that
> hg-fastimport does not create pathological Mercurial repos
> 2) write a filter that turns the fastimport dump created by cvs2git
> into something that hg-fastimport + Mercurial handle nicely
> 3) modify hg-fastimport to handle those fixup commits directly
>
> I think #1 benefits the most people, since it could potentially make
> life simpler for git-fastimport as well.  (They could, in theory,
> eventually drop support for the implementation quirk that cvs2git
> takes advantage of.  Unfortunately, they promoted that implementation
> quirk to a documented part of the syntax when cvs2git started using
> it, so that seems unlikely.)  (Michael H.: this is my brief summary of
> a thread on the git mailing list that you pointed out to me a few
> months ago; if I'm summarizing inaccurately, my apologies.)

I'd be interested to hear what the quirk is. When I last looked at the
fast-import format, it looked like something that would suit Mercurial
quite well. I hear bzr has taken it up as well. It would be awesome to
have a good format to exchange data between at least the current
crop of DVCSs, e.g. bzr, git and hg. Getting a good fast-import tool
would then allow deprecation of our custom bzr and git import code.

> So now I'm thinking, to heck with fastimport.  Just write a new
> backend ("output option") for cvs2svn that directly populates a
> Mercurial repo.  Then I can use existing Mercurial tools and APIs to
> turn that intermediate repo into my final product.
>
> The benefits of this are fairly obvious:
>  * no more awkward 2-step conversions for cvs->hg
>  * conversion should run faster (and use less memory)

Sounds good to me.

> The drawbacks are more subtle:
>  * not using hg's convert extension means my proposed cvs2hg would not
>    benefit from one key feature of it, namely the toposort that produces
>    a space-optimal hg repo (OTOH, my hg-writing backend would certainly
>    depend on Mercurial's API, so I could hook in toposort somehow)

I've run Benoit's revlog reordering script on the Python repo with
great success. You should try it for your case as well. If that works
for you, I'd posit that toposorting the changelog as well wouldn't
make that much of a difference. Actually, post-facto toposorting would
probably not be extremely hard, and a script for it would be useful to
have in general.
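
To illustrate the idea, here is a minimal sketch of a post-facto
toposort, assuming the revision graph is available as a plain mapping
from revision id to parent ids. It is not Benoit's script and uses no
Mercurial API; the function name `toposort_revisions` and the `parents`
mapping are hypothetical, chosen only for this example. It walks from
each head down through its ancestors, so each line of development ends
up contiguous, which is roughly what makes a reordered revlog compress
better.

```python
def toposort_revisions(parents):
    """Order revisions so that every parent precedes its children.

    `parents` is a hypothetical mapping {rev: [parent revs]}; it stands
    in for whatever the real repository API would provide.
    """
    # Heads are revisions that are nobody's parent.
    non_heads = {p for ps in parents.values() for p in ps}
    heads = sorted(set(parents) - non_heads)

    order = []
    seen = set()
    for head in heads:
        if head in seen:
            continue
        seen.add(head)
        # Iterative depth-first walk: a revision is emitted only after
        # all of its parents have been emitted.
        stack = [(head, iter(parents[head]))]
        while stack:
            rev, remaining = stack[-1]
            for p in remaining:
                if p not in seen:
                    seen.add(p)
                    stack.append((p, iter(parents.get(p, ()))))
                    break
            else:
                stack.pop()
                order.append(rev)
    return order


# Two branches whose commits were interleaved in time.
interleaved = {0: [], 1: [0], 2: [0], 3: [1], 4: [2]}
print(toposort_revisions(interleaved))  # prints [0, 1, 3, 2, 4]
```

In the example, revisions 1 and 3 (one branch) end up adjacent instead
of being interleaved with 2 and 4 (the other branch).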

>  * cvs2svn maintainers now have to worry about maintaining 3 personalities
>    (OTOH, they already have a not-very-functional cvs2hg script + sample
>    config in their source tree)

Well, if you worry about it, they don't have to so much.

>  * hg-fastimport might go back to being a neglected and unloved extension
>    if I concentrate on cvs2hg

Maybe I ought to take it up, then. Are there any actual users out there?

Cheers,

Dirkjan


