Something changed shrink-revlog for the worse

Greg Ward greg at gerg.ca
Mon Jan 11 19:48:21 CST 2010


On Mon, Jan 11, 2010 at 11:16 AM, Benoit Boissinot
<benoit.boissinot at ens-lyon.org> wrote:
> I don't see why first parent matters. Except if you have a special case
> (like more stuff happening on trunk, that's probably what happens for
> you) either parent should work.

Which parent you follow most certainly matters.  Step away from this
new-fangled DVCS world where all branches are equal, and get into the
head of a CVS or Subversion user for a minute.  Over here in CVS-land,
branches are big, scary, rare things.  (They're scary because CVS
handles them so poorly, therefore they are rare and big.)  You only
create a branch when you really need to, e.g. for making a release to
customers.  Changes on a release branch are infrequent and small
compared to changes ongoing on the trunk.

Now translate to Mercurial, *but* to a Mercurial repo converted from a
highly branch-heavy CVS repo.  Furthermore, consider that the
conversion process takes care to convert CVS merges to Mercurial
merges.  (Of our ~104,000 changesets, ~12,000 are merges, and most of
those are because my conversion tools parse CVS commit messages and
add a second parent when appropriate.)  For any given branch B, the
manifest on the trunk quickly diverges from the manifest on that
branch.  Each new trunk changeset has a small manifest delta relative
to its first parent, but a large delta relative to the head of B.  And
that delta gets larger as time goes on.

Oh yeah, did I mention that we have release branches that live for
years?  So if I merge from a branch created in 2006 to the trunk in
late 2008, the manifest delta *relative to the second parent* is huge.
But relative to the first parent, it is as small as ever.
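
A toy model makes the asymmetry concrete. The sketch below is an illustration only, not a measurement from any real repository: it assumes a manifest is just a mapping from path to file revision, that a release branch forks once and then stays put, and that every trunk changeset touches two files nobody has touched since the fork. The names `manifest_delta` and `simulate` are made up for this example.

```python
def manifest_delta(a, b):
    """Count paths whose entry differs between two manifests.

    A manifest is modelled as a dict mapping path -> file revision id.
    """
    return sum(1 for path in a.keys() | b.keys() if a.get(path) != b.get(path))


def simulate(trunk_commits, files_per_commit=2, total_files=1000):
    """Return (delta vs. first parent, delta vs. branch head) per trunk commit."""
    base = {f"file{i}": 0 for i in range(total_files)}
    branch_head = dict(base)  # release branch forks here and is left alone
    trunk = dict(base)
    results = []
    for n in range(1, trunk_commits + 1):
        parent = dict(trunk)  # snapshot of the first parent's manifest
        # Each trunk commit touches a few files, walking through the tree.
        for k in range(files_per_commit):
            path = f"file{((n - 1) * files_per_commit + k) % total_files}"
            trunk[path] = n
        results.append(
            (manifest_delta(trunk, parent), manifest_delta(trunk, branch_head))
        )
    return results


if __name__ == "__main__":
    for n, (vs_p1, vs_branch) in enumerate(simulate(100), start=1):
        if n % 25 == 0:
            print(n, vs_p1, vs_branch)
```

In this model the delta against the first parent stays constant while the delta against the branch head grows with every trunk commit, which is the pattern described above.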

[Benoit talking to himself in separate messages]
> Did you try a simple DFS ?
>
> It's actually what you did, you don't use the order of the nodes, right?

Correct: my toposort is completely driven by topology, not revision
order.  That's deliberate and it's a big win in my case.
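
For readers who want something to poke at, here is a minimal sketch of a topology-driven sort that tries to keep each changeset next to its first parent. It is not the code in shrink-revlog; the function name `toposort_first_parent` and the input format (a dict mapping each node to a tuple of its parents, first parent first) are assumptions made for the example.

```python
def toposort_first_parent(parents):
    """Topologically sort a DAG, preferring to continue along first parents.

    parents: dict mapping node -> tuple of parent nodes, first parent first
             (an empty tuple marks a root).
    Returns a list in which every node appears after all of its parents.
    """
    children = {n: [] for n in parents}
    pending = {}  # number of parents not yet emitted, per node
    for node, ps in parents.items():
        pending[node] = len(ps)
        for p in ps:
            children[p].append(node)

    stack = [n for n, ps in parents.items() if not ps]  # start from the roots
    stack.reverse()
    order = []
    while stack:
        node = stack.pop()
        order.append(node)
        ready = []
        for child in children[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
        # Children whose first parent is `node` sort last, so they are
        # popped first and end up adjacent to `node` in the output.
        ready.sort(key=lambda c: parents[c][0] == node)
        stack.extend(ready)
    return order


if __name__ == "__main__":
    # r is the root; a, c, m, d form the trunk; b is a branch merged at m.
    graph = {
        "r": (),
        "a": ("r",),
        "b": ("r",),
        "c": ("a",),
        "m": ("c", "b"),
        "d": ("m",),
    }
    print(toposort_first_parent(graph))
```

In the example graph the merge `m` lands directly after `c`, its first parent, rather than after the branch changeset `b`. Ties between siblings are broken by dict insertion order here, which is an arbitrary choice for the sketch.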

Hmmm.  I think my first patch to shrink-revlog will be to make it
possible to select between different toposorts.  I think this
discussion will be much easier if we can add toposorts at will and
test them against each other in situ, rather than slinging patches
around.  If a clear winner emerges, we can go back to having just one.
If not, we can leave the selectability in place.  Sound good?
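
One possible shape for that selectability, sketched under assumptions: a registry of named sort functions that all take the same input. Nothing here is shrink-revlog's actual interface; `TOPOSORTS`, `register`, `run_toposort`, and the sort names "revorder" and "dfs" are hypothetical, and the graph is again a dict mapping integer revisions to tuples of parents.

```python
TOPOSORTS = {}  # hypothetical registry: name -> sort function


def register(name):
    """Decorator that adds a sort function to the registry under `name`."""
    def decorator(fn):
        TOPOSORTS[name] = fn
        return fn
    return decorator


@register("revorder")
def by_revision_order(parents):
    # Assumes integer revisions numbered so that parents precede children.
    return sorted(parents)


@register("dfs")
def dfs_postorder(parents):
    # Iterative depth-first walk; a node is emitted after all its parents.
    order, seen = [], set()
    for start in sorted(parents):
        stack = [(start, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                order.append(node)
            elif node not in seen:
                seen.add(node)
                stack.append((node, True))
                # Push the first parent last so it is visited first.
                for p in reversed(parents[node]):
                    if p not in seen:
                        stack.append((p, False))
    return order


def run_toposort(name, parents):
    """Look up a sort by name and apply it, rejecting unknown names."""
    try:
        sort = TOPOSORTS[name]
    except KeyError:
        raise ValueError(f"unknown toposort {name!r}; choose from {sorted(TOPOSORTS)}")
    return sort(parents)


if __name__ == "__main__":
    graph = {0: (), 1: (0,), 2: (0,), 3: (1, 2)}
    for name in sorted(TOPOSORTS):
        print(name, run_toposort(name, graph))
```

With a layout like this, comparing candidates means running each registered name over the same revlog and comparing the resulting sizes, and adding a new candidate is one decorated function.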

Greg

