A case for subrepos with absolute URLs

Arne Babenhauserheide arne_bab at web.de
Sun Dec 11 15:29:53 CST 2011


Hi, 

We had some discussion on how bad absolute subrepo URLs are, but I now think that the problems blamed on them are merely implementation details seeping through, which make life hard for those who actually use subrepos. 

Specifying an absolute path for a subrepo is a very simple way to specify an upstream. Thus, if you use Mercurial with absolute-path-subrepos, it can become an advanced dependency tracking system. 

Starting to do that is as simple as can be: Clone your dependency, update it to the correct version and add it as a subrepo. Then, when someone else gets your repository, they get the same setup, including the upstream information (that’s what the subrepo path is then: upstream information in the place where you expect it). 
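
To make this concrete: the subrepo mapping lives in .hgsub as "path = source" lines. The following is a minimal Python sketch, not Mercurial's own parser, which reads such a file and tells absolute sources from relative ones. The sample paths and the example.org URL are made up for illustration, and the sketch ignores special cases like [git] or [svn] source prefixes.

```python
from urllib.parse import urlparse

# Made-up .hgsub content: one absolute source, one relative source.
SAMPLE = """\
# hypothetical example entries
libs/dep = https://example.org/hg/dep
libs/sibling = ../sibling
"""


def parse_hgsub(text):
    """Return {subrepo path: source} for simple 'path = source' lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and section headers such as [subpaths].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        path, sep, source = line.partition("=")
        if sep:
            entries[path.strip()] = source.strip()
    return entries


def is_absolute_source(source):
    """Treat URLs with a scheme and rooted filesystem paths as absolute."""
    return bool(urlparse(source).scheme) or source.startswith("/")


entries = parse_hgsub(SAMPLE)
absolute = {path for path, src in entries.items() if is_absolute_source(src)}
```

Here "libs/dep" carries its upstream with it, while "libs/sibling" only makes sense relative to wherever the parent repository happens to live.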

Problems in this scheme come from only one source: What happens if you don’t have access to the upstream right now? And the only real problem you get is that you can no longer update to revisions which use versions of the upstream you don’t have yet. 

The reason for that is that Mercurial tries to guarantee a completely consistent state across subrepos, which creates strong coupling. And that is not tied to absolute URLs. It is rather a fundamental problem of strong coupling between separate Mercurial repositories. 

Relative paths have the same problem: if we remove a subrepo and delete its relatively specified source repo a few years later, we cannot go back to those old revisions anymore. 

So the problem does not originate in absolute URLs; these just expose it. It originates in the strong coupling. 

Because of that I want to argue that Mercurial should not discourage the use of absolute URLs in subrepos, but rather reduce the consistency requirement across subrepo boundaries. A few ideas: 


* Add a way to get subrepo revisions from the parent repo on pull in the same way as we can get them when cloning.

* Try harder to find relatively specified subrepos by checking heuristics: often “subrepo” can be found at “../subrepo”. 

* Add the possibility of ignoring missing subrepos (this should make it impossible to change the corresponding substate without changing the subrepo to an existing subrepo source).

* Add the possibility of ignoring missing revisions in an existing subrepo-source. Here we’d need some way to specify a new revision for the subrepo. 

* Maybe even add a place inside the .hg where we store all subrepos which any revision depended on, so we don’t need to be able to access them when we update to a revision which needs them. Hardlinks should make the cost of this negligible in most cases.
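
How these ideas could fit together, as a rough Python sketch: try a local store first, then the declared source, then the "../subrepo" heuristic, and let the caller decide what to do if nothing is reachable. None of this exists in Mercurial; the function names, the ".hg/subrepo-store" location and the example paths are all assumptions for illustration.

```python
import posixpath


def candidate_sources(parent_source, sub_path, declared_source,
                      store_dir=".hg/subrepo-store"):
    """List places to look for a subrepo, most local first.

    store_dir is a hypothetical location inside the parent's .hg.
    """
    name = posixpath.basename(sub_path.rstrip("/"))
    candidates = [
        posixpath.join(store_dir, name),            # hypothetical local store
        declared_source,                            # what .hgsub says
        parent_source.rstrip("/") + "/../" + name,  # heuristic: sibling repo
    ]
    # Drop duplicates while keeping the order.
    seen = set()
    unique = []
    for candidate in candidates:
        if candidate not in seen:
            seen.add(candidate)
            unique.append(candidate)
    return unique


def first_available(candidates, is_reachable):
    """Return the first reachable candidate, or None if all are missing.

    is_reachable is a caller-supplied check; returning None instead of
    raising leaves room for ignoring a missing subrepo.
    """
    for candidate in candidates:
        if is_reachable(candidate):
            return candidate
    return None


cands = candidate_sources("/srv/hg/project", "libs/dep",
                          "https://example.org/hg/dep")
```

The point of returning None rather than aborting is that the parent repo stays usable even if the subrepo source is gone; only changing the substate would then need a working source.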


Especially the last three ideas should reduce the coupling between the parent repo and the source of the subrepo, so subrepos should come closer to being first-class citizens in Mercurial.

Best wishes, 
Arne

PS: This even bit me on my own systems when pushing over ssh, because I had a separate target for the subrepo (to have it in my double-backed-up filesystem tree). It also bit me when I had to reinitialize a subrepo because the old one broke → inaccessible revisions of the parent repo ⇒ Never require perfection in any part of the system.

