Improvement to how subrepos are found?

Martin Geisler mg at aragost.com
Wed Jan 5 03:36:35 CST 2011


Mads Kiilerich <mads at kiilerich.com> writes:

> On 01/04/2011 11:10 AM, Martin Geisler wrote:
>> Hi everybody,
>>
>> My client sent me the suggestion below, and I think it has some
>> interesting elements. The proposal is to make the right-hand side of
>> the .hgsub file a fallback location instead of being the main
>> location.
>
> I agree that trivial relative subrepo paths should be used in most
> cases.
>
> One of the main reasons for not always using trivial relative subrepo
> paths might be that hgweb doesn't have good support for serving
> subrepos. Consider for example the url for a subrepo rooted in "file".

True, that is a difficult case :)

> AFAIK the major hosting sites also don't support in situ subrepos at
> all. It would be nice to get to a point where the repo structure
> wasn't limited by unnecessary technical limitations.

Right, most hosts don't support nested repositories on the server. That
is why the fallback URL (absolute like 'http://bitbucket.org/...' or
relative like '../sub') is still needed.

> The fallback idea doesn't seem appealing. I see Mercurial as the kind
> of tool that does exactly what it is told, always does it the same way,
> and would rather fail and ask for new instructions than try to guess.

I see Mercurial the same way -- except that there is no guessing
involved here. Thanks to the cryptographic hashes, it doesn't matter
where I get a changeset from. As long as I get a changeset with the hash
stored in my .hgsubstate file, then I'm good. [Here I'm of course
ignoring SVN subrepos, where there are no hashes, and the fact that we
don't verify incoming changeset hashes...]
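
To make the hash argument concrete, here is a small Python sketch. It
relies on .hgsubstate holding one "<changeset hash> <subrepo path>" pair
per line. The function names and the two hashes are made up for
illustration; this is not Mercurial's internal API.

```python
def parse_hgsubstate(text):
    """Map each subrepo path to the changeset hash pinned for it."""
    state = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<hash> <path>"; the path may contain spaces.
        node, path = line.split(" ", 1)
        state[path] = node
    return state


def source_is_usable(available_nodes, pinned_node):
    """Any source will do as long as it contains the pinned changeset."""
    return pinned_node in available_nodes


# Hypothetical .hgsubstate content (hashes invented for the example).
substate = parse_hgsubstate(
    "1f0dee641bb7258c56bd60e93edfa2405381c41e foo\n"
    "7cf8a1b2c3d4e5f60718293a4b5c6d7e8f901234 libs/bar\n"
)
```

Where the changesets come from never enters the check: only the pinned
hash does.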

So trying the in-place location first should be acceptable. One problem
is the extra round-trip required to hit a URL that might never work on,
say, Bitbucket. That is kind of ugly.

Perhaps that suggests that this should only be attempted for local
repositories? That is, first look for the subrepo in-place when making a
local clone, then fall back to the URL or path given in .hgsub.
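
The lookup order could be sketched roughly like this in Python. This is
only an illustration of the idea, not how Mercurial resolves subrepos
today; pick_subrepo_source, is_local and the "exists" parameter are
hypothetical names, and a relative fallback is returned as written
rather than resolved.

```python
import os


def is_local(source):
    """Treat anything without a URL scheme as a path on local disk."""
    return "://" not in source


def pick_subrepo_source(parent_source, subpath, hgsub_source,
                        exists=os.path.isdir):
    """Return where to clone the subrepo from.

    parent_source: what the main repo is being cloned from
    subpath:       left-hand side of the .hgsub entry
    hgsub_source:  right-hand side of the .hgsub entry (the fallback)
    """
    if is_local(parent_source):
        in_place = os.path.join(parent_source, subpath)
        # A directory containing .hg is taken to be a repository.
        if exists(os.path.join(in_place, ".hg")):
            return in_place
    # Remote parent, or nothing found in place: use .hgsub as written.
    return hgsub_source
```

With a remote parent no extra round-trip is made; the in-place check
only ever touches the local disk.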

We're solving two things here:

a) it is silly to re-clone a subrepo from http://... when it is right
   there on disk already. Cloning from the local subrepo also means that
   hardlinks are created.

b) clones of clones do not work when subrepos use relative paths that
   go outside the main repo, that is, paths of the form '../sub'.
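
Point b) can be illustrated with a few lines of Python, assuming a
relative source is resolved against the location the parent is cloned
from. The helper name and the directory layout are invented for the
example.

```python
import posixpath


def resolve_subrepo(parent_source, hgsub_source):
    """Resolve a relative .hgsub source against the parent's location."""
    return posixpath.normpath(posixpath.join(parent_source, hgsub_source))


# Cloning from the server: /srv/repos/sub sits next to /srv/repos/main.
first = resolve_subrepo("/srv/repos/main", "../sub")

# Cloning from that clone in /home/me/main: the subrepo was checked out
# inside the working copy, so nothing lives at /home/me/sub.
second = resolve_subrepo("/home/me/main", "../sub")
```

Here "first" is /srv/repos/sub, which exists on the server, while
"second" is /home/me/sub, which nobody ever created.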

Remote repositories can be assumed to have their '../sub' subrepos in
the right place -- after all, a remote repository is meant to be used as
the source of clones. At first glance, this means that there is no need
to look for the subrepos in-place.

However, if a .hgsub file says

  foo = http://hg.company.com/foo

then an "outpost" (a remote company branch) might want to make one clone
from the central server and then serve that clone over HTTP to other
machines in the outpost. In that case it would be silly to go all the
way back to hg.company.com to fetch the subrepos when the subrepo is
already cloned to the outpost -- that is point a) above.

The guys in the company could of course solve the problem by using
relative subrepo paths and then cloning things as appropriate onto the
outpost server. Perhaps that is a simpler solution...

> More important: Changing existing behavior in subtle ways is probably
> not an option - especially considering how hard it is to recover from
> subrepo inconsistencies. I think the noble goal should be reached in
> another way.

Well, the end result is not changed here: you get the same subrepo
changesets but you might get them from somewhere else.

> I assume that most projects with external dependencies will have a
> work-flow where they prefer to have a local mirror of the repo. Very
> few users should pull new versions from the external repo. It would be
> an error if other users did it unintentionally, so there is no reason
> to make it easy for everybody to do it.

It seems that my client will make such local mirrors of their big repo
with 70+ subrepos. The subrepos are not really external dependencies,
they are more like different internal modules as far as I can tell.

> Couldn't most projects just start using simple relative paths in
> .hgsub and manage the external url by using some combination of
> documentation and local configuration of paths and subpaths? What
> extra tool support would be nice to have?

I must admit that I'm unsure exactly why my client is not using relative
subrepo paths everywhere. I'll ask them!

-- 
Martin Geisler

aragost Trifork
Professional Mercurial support
http://aragost.com/en/services/mercurial/blog/

