This is what happens: .hgsubstate isn't touched.

  $ hg init r
  $ cd r
  $ hg init sub
  $ cd sub
  $ echo a>a
  $ hg add a
  $ hg commit -m a
  $ cd ..
  $ echo sub = sub > .hgsub
  $ hg add .hgsub
  $ hg commit .hgsub -m sub # interesting line
  $ hg manifest
  .hgsub
  $ hg debugsub
  path sub
   source   sub
   revision

(hg version 1.8.1+82-48d606d7192b)
Why should Mercurial have recorded the subrepo state? You told it to only commit .hgsub, and that is what it did. Next time you make a full commit it will finish the job. What would you expect to happen? The only alternative I see is that we could forbid partial commits of .hgsub, but I don't see why that would be any better.
At least for git subrepos, an entry in .hgsub without a corresponding one in .hgsubstate is an inconsistent state that causes a crash on update. What should subrepo update do if it is not given a state? As for forbidding partial commits of .hgsub, what if you only want to commit your changes in subrepos without committing all the other dirty files in your working dir? It seems necessary to me that committing .hgsub should invoke the magic of .hgsubstate
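The inconsistent state described here (a path listed in .hgsub with no matching line in .hgsubstate) can be detected with a few lines of Python. This is only a rough sketch and not Mercurial's own parser: the function names are made up for illustration, and it assumes the simple file layouts of `path = source` lines in .hgsub and `<revision> <path>` lines in .hgsubstate, skipping comments and any bracketed section such as [subpaths].

```python
def parse_hgsub(text):
    """Return {path: source} for the top-level entries of .hgsub-style text."""
    subs = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            # Assumption: lines after a [section] header (e.g. [subpaths])
            # are not subrepo entries, so they are skipped.
            in_section = True
            continue
        if in_section or "=" not in line:
            continue
        path, source = line.split("=", 1)
        subs[path.strip()] = source.strip()
    return subs


def parse_hgsubstate(text):
    """Return {path: revision} from .hgsubstate-style text."""
    state = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        revision, path = line.split(" ", 1)
        state[path] = revision
    return state


def missing_state(hgsub_text, hgsubstate_text):
    """List subrepo paths that have a source but no recorded revision."""
    subs = parse_hgsub(hgsub_text)
    state = parse_hgsubstate(hgsubstate_text)
    return sorted(path for path in subs if path not in state)
```

For the revision created by the partial commit in the first comment, .hgsub contains `sub = sub` and .hgsubstate does not exist, so `missing_state("sub = sub\n", "")` reports `sub`.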
Mercurial subrepos do not, AFAIK, have any problems in this case, so it might be a bug in the git subrepo implementation if it can't handle it. You can commit the new subrepo sub among other dirty files by committing both .hgsub and sub.
If it is acceptable for a revision in history to contain a subrepo source without a corresponding state, I think operations on this revision should treat the subrepo as nonexistent: such as removing it on update, ignoring it on status, etc. Is there a way to implement this with the current API? I'm not completely sure, but .hgsub might be read in a different stage than .hgsubstate
To commit changes only in subrepos you can mention those subrepos explicitly on the commandline. If they're newly-added, you would need to also mention .hgsub. If there is an entry in .hgsub with no corresponding entry in .hgsubstate, wouldn't it make most sense to treat it as being at the null revision? On update, such a subrepo should then be updated to null (but _not_ removed), or perhaps even left as-is, leaving status to be computed from the null revision. Do SVN and Git allow a null working directory this way?
There is no effective difference between updating to the null revision and removing a subrepo. Neither svn nor git has a useful analog: in svn, the remove method nukes the whole directory; in git, everything but the .git directory is removed (and the repo is marked 'bare'). Importantly, the decision to update or remove a subrepo is made before it gets to the subrepo API. I think it should only make sense to call subrepo.get if it is actually getting a revision. I did not anticipate get receiving an empty state, and gitsubrepo currently crashes if you try.
There is a very important difference: changesets in a subrepo may not exist anywhere else. This is why `hg up null` in an outer repo does not remove subrepos. For SVN subrepos, we could go ahead and remove the subrepo, since it avoids the above issue by design (following the usual checks for local changes). For git subrepos, turning it into a bare subrepo if gitsubrepo.get receives 'null' as the target revision sounds workable, but keep in mind that git subrepos may also have local-only history that shouldn't be nuked by `update`.
I think we are in agreement that updating to a missing hgsubstate reference should end up calling subrepo.remove, which explicitly checks for changes, and doesn't remove the repository in the case of distributed subrepos. For the less important issue of _where_ subrepo.remove should be called, either subrepo.get checks for an empty state or somewhere higher up does it. For empty hgsubstate references in general, several subrepo actions (status, get, archive, push, etc) either need to explicitly check for an empty state, or the caller needs to not call them with empty states. (hg must interpret '' as the null revision and does the right thing already)
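The "somewhere higher up does it" option could look roughly like the sketch below. Everything in it is hypothetical: `FakeSubrepo` and `update_subrepo` are stand-ins invented for illustration, not Mercurial's subrepo classes or its actual update code. The only behaviour taken from the discussion is that an empty state is routed to remove, and that remove checks for local changes and leaves distributed subrepos in place.

```python
class FakeSubrepo:
    """Hypothetical stand-in for a subrepo; not Mercurial's API."""

    def __init__(self, path, dirty=False, distributed=True):
        self.path = path
        self.dirty = dirty              # uncommitted local changes present
        self.distributed = distributed  # hg/git style, may hold local-only history
        self.log = []                   # records what was done, for inspection

    def get(self, revision):
        # Only ever called with a real revision.
        self.log.append(("get", revision))

    def remove(self):
        # Check for changes first; never delete a distributed repository.
        if self.dirty:
            self.log.append(("keep", "local changes"))
        elif self.distributed:
            self.log.append(("keep", "may hold local-only history"))
        else:
            self.log.append(("remove", None))


def update_subrepo(sub, revision):
    """Caller-side check: an empty state means remove, never get('')."""
    if not revision:
        sub.remove()
    else:
        sub.get(revision)
```

With this arrangement, get never sees an empty state, so individual implementations such as the git one would not need their own guard; the alternative is to put the same check at the top of each get.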
Digression: I think we're going to need to eventually tackle this 'subrepos may have precious contents' by adding a dirty flag (possibly named needpush for clarity). This dirty flag would be cleared on push and also avoid the whole issue of empty pushes to read-only repos. So don't get too bogged down here by that.
--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:18 EDT --- This bug was previously known as _bug_ 2716 at http://mercurial.selenic.com/bts/issue2716
Bulk close: no activity for >2 years -> WONTFIX
Bulk change recent WONTFIX -> new, more descriptive ARCHIVED state (sorry for the spam)