Thoughts on Mercurial and Git

Brendan Cully brendan at kublai.com
Tue Mar 27 19:14:33 CDT 2007


On Tuesday, 27 March 2007 at 12:44, Theodore Tso wrote:
> The one thing that I do wonder about both of these approaches is that
> they handle the "overlay" and the "local branch" as a special case.
> This works, but in the long run I can foresee some potential problems.
> 
> For example, what if you have a changeset which is on the local
> branch, and then some or all of the branch gets pushed to the
> mainline, and then gets pulled back down into the primary, master revlog?
> Is there any way to make the changeset on the local branch go away
> without destroying and recreating the entire branch?  (This is another
> case of the only-way-you-can-remove-revisions-is-by-truncating design
> tradeoff in hg.)

No, there isn't. I guess it would be nice to have a data pool. I think
I could extend overlay fairly easily to optionally record only the
index update (64 bytes per revision) and consult a shared data
store. We could probably arrange to write to it as well.
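
A minimal sketch of that idea, for illustration only: the overlay keeps
just small index entries and reads and writes revision data through a
shared pool. The class names (SharedDataStore, IndexOnlyOverlay) are
hypothetical and not part of hg or the overlay patch, and keying data
by a plain SHA-1 of its content is a simplifying assumption (real hg
node ids also cover the parents).

```python
import hashlib


class SharedDataStore:
    """Hypothetical data pool: revision data keyed by a content hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Assumption: key by SHA-1 of the content alone.
        node = hashlib.sha1(data).hexdigest()
        self._blobs.setdefault(node, data)  # identical data is stored once
        return node

    def get(self, node: str) -> bytes:
        return self._blobs[node]

    def __len__(self) -> int:
        return len(self._blobs)


class IndexOnlyOverlay:
    """Hypothetical overlay that records only index entries locally."""

    def __init__(self, pool: SharedDataStore):
        self.pool = pool
        self.index = []  # list of (node, parent_rev); no data stored here

    def add(self, data: bytes, parent_rev: int = -1) -> int:
        node = self.pool.put(data)  # write goes to the shared pool
        self.index.append((node, parent_rev))
        return len(self.index) - 1

    def read(self, rev: int) -> bytes:
        node, _parent = self.index[rev]
        return self.pool.get(node)  # read consults the shared pool
```

Two overlays built on the same pool that add identical data end up
with one stored copy and one index entry each.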

I doubt it would save very much space for most uses though.

By the way, overlay and local-branch are at right angles. local-branch
would probably be fastest and happiest as an overlay from the main
store (O(1) branch creation), but it doesn't depend on it.

> A similar issue exists in the overlay changes.  Right now, I have a
> single git repository which mirrors Linus's kernel tree, and I never
> check anything into it. I call that my "base".  All of my other git
> trees for the kernel use "base" as git's alternate objects directory,
> so only objects that aren't in Linus's tree need to be stored in my
> local git repositories.  So far so good; this could map into the hg
> overlay model fairly easily.  But sometimes changes which are in my
> local repository will get pushed or pulled into mainline, which means
> they will appear in "base".  At that point, is there any way for the
> changeset which is in the local revlog to get garbage collected?  I
> don't think so.

No, not currently. But if I were to add the 'remote data' option, it
would probably be pretty easy to garbage collect shared revs.
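
As a rough illustration of what collecting shared revs could look
like, assuming both stores can be viewed as mappings from node id to
revision data (an assumption, not how revlogs are laid out on disk),
and using a hypothetical helper name:

```python
def collect_shared(local: dict, base: dict) -> list:
    """Drop local data for nodes that the base store already holds.

    Hypothetical helper: returns the sorted list of node ids removed
    from `local`. Only the data is dropped; a real implementation would
    keep the local index entry pointing at the shared copy.
    """
    shared = sorted(node for node in local if node in base)
    for node in shared:
        del local[node]
    return shared
```

After a local changeset has made its way into the base, one pass of
this kind would leave only the revisions the base does not have.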

> On another point, I'm curious.  Does your overlay patch allow a
> user to create an overlay, make changes in the overlay, and
> then later, after new changes are pulled from mainline into "base"
> and the overlay pulls from the base, use the new changes
> from the base?  Or is it that once you have made changes in the overlay,
> all changes must be appended to the overlay, even if that means
> you are copying them from the original base repository of the overlay?
> The overlay patch could have been implemented either way, and I'm not
> sure how it was done.

It used to just advance a pointer in the parent where possible. That
turned out to have a couple of weird corner cases due to the linkrev
field in each revlog, so it doesn't do that any more. But I like the
shared data idea and may hack it up.

> Hmm, thinking about this some more, I guess this would be my one true
> "complaint" about Hg.  Git just seems architecturally cleaner, in
> terms of being able to handle overlay repositories and local branches
> in a much cleaner way, without needing special cases.  That doesn't
> mean that Hg shouldn't try to do it, but that it will probably be more
> difficult for Hg to add these features than it was for git.  But
> that's a technological aesthetics argument, not a
> feature/functionality or performance argument.

I think hg just hasn't spent as much time as it should on designing an
external API. It's very hackable, but there aren't always clean ways
to hook in, yet. I don't think it's a core architectural problem; it
just hasn't been a priority. bzr, for example, has spent a lot more
time on API, and as a result it has all kinds of cool plugins for
things like svn or hg backends etc. It just can't seem to get quite
fast enough.

