A thought on subrepos

Matt Mackall mpm at selenic.com
Thu Apr 14 17:55:09 CDT 2011


On Thu, 2011-04-14 at 15:00 -0500, Matt Mackall wrote:
> It seems many projects with subrepos are structured like:
> 
> app/ <- main repo
>  lib/ <- a subrepo
> 
> This is perhaps the most obvious way to do things, but is not really
> ideal. A better way is:
> 
> build/ <- main repo
>  app/ <- subrepo
>  lib/ <- subrepo
> 
> For starters, this does away with most of the "I didn't mean to
> recursively commit" issues as commits at the top-level will be much less
> common.
> 
> This also greatly lowers the degree of dependence between app/ and lib/,
> but still gives you the ability to commit and tag coherent combinations
> of app and lib.
> 
> A general statement of this approach is: "if a repo contains real code,
> it shouldn't contain subrepos."

It might help to put this in perspective with a concrete example. Here's
the one that actually triggered this thought: how would Mercurial deal
with having a non-trivial dependency?

Let's say we wanted to depend on Paramiko (a Python implementation of
SSH). We wouldn't simply do:

mercurial/
 paramiko/

That, as I hinted at before, would not be good engineering practice. In
general, we're not really tightly bound to exact changesets of
Paramiko. 

And for all the devs out there who've already got Paramiko or who can
trivially 'apt-get install python-paramiko', we're not doing them any
favors by throwing subrepos at them.

So instead we'd do:

mercurial-build-env/
 mercurial/ <- our current repo, exactly as it currently is
 paramiko/ <- an independent repo
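
For reference, Mercurial records subrepos in a .hgsub file at the top of
the parent repo, one "path = source" line per subrepo. Here is a minimal
sketch in Python of generating that file for the layout above. The
source URLs are placeholders made up for illustration, not real
repository locations, and render_hgsub is a hypothetical helper, not
part of Mercurial:

```python
def render_hgsub(subrepos):
    """Return .hgsub text: one 'path = source' line per subrepo."""
    lines = []
    for path, source in sorted(subrepos.items()):
        lines.append("%s = %s" % (path, source))
    return "\n".join(lines) + "\n"


# Placeholder sources, assumed for illustration only.
BUILD_ENV = {
    "mercurial": "https://example.com/hg/mercurial",
    "paramiko": "https://example.com/hg/paramiko",
}

if __name__ == "__main__":
    # Print the text that would go into mercurial-build-env/.hgsub
    print(render_hgsub(BUILD_ENV), end="")
```

Note that mercurial/ itself needs no .hgsub at all in this arrangement;
only the top-level build repo knows about the pairing.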

Now this leaves the question: when would we ever commit at the top
level? I think the answer is:

- whenever there was a significant reason to bump to a later library
  release
- whenever we tagged a Mercurial release

At this level, using mercurial-build-env is more or less completely
optional. If you've already got a working copy of paramiko, you can skip
it.

We can trivially expand this by pulling in the things that Mercurial
already depends on, or optionally uses:

mercurial-build-env/
 Makefile   <- build everything
 mercurial/ <- our current repo, exactly as it currently is
 python/
 paramiko/
 openssl/
 python-subversion/
 svn/
 git/
 bzr/
 cvs/
 darcs/
 monotone/
 pygments/
 gpg/
 gettext/
 docutils/

This is basically what you need to run everything in the test suite. I
hope we can all agree that having git/ and friends as subrepos under
mercurial/ is... absurd[1]. But the same is true for everything else
here: these packages are all independent peers.
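
The rule of thumb from earlier in the thread ("if a repo contains real
code, it shouldn't contain subrepos") could be turned into a rough
check. This is only a sketch: the function name is hypothetical, and the
set of files a pure build repo is allowed to track (GLUE) is my
assumption, to be adjusted per project. The list of tracked files could
come from, for example, the output of 'hg manifest'.

```python
# Files a pure "build" repo may track. This set is an assumption.
GLUE = {".hgsub", ".hgsubstate", ".hgtags", ".hgignore", "Makefile", "README"}


def mixes_code_and_subrepos(tracked_files):
    """True if the repo declares subrepos and also tracks non-glue files."""
    tracked = set(tracked_files)
    has_subrepos = ".hgsub" in tracked
    real_code = tracked - GLUE
    return has_subrepos and bool(real_code)
```

A repo shaped like mercurial-build-env (Makefile plus .hgsub) passes,
while an app/ repo that tracks its own sources and also carries a .hgsub
gets flagged.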

Ok, so that's getting to be a contrived example, but it's well on the
way to how this should work for large projects. The last N companies
I've worked for have all been in the business of building an entire
operating system's worth of deliverables from countless source packages.
If you want to actually succeed at this, you need to respect the
independence of packages; otherwise you will quickly find yourself
pulled into dependency hell and/or unable to actually update anything.

[1] And if git ever got Mercurial subrepo support, we could have an
infinite regress!

-- 
Mathematics is the supreme nostalgia of our time.



