Internal info about repository size

Dirkjan Ochtman dirkjan at ochtman.nl
Fri Sep 25 05:00:08 CDT 2009


On Fri, Sep 25, 2009 at 10:33, Berkes Adam <adam.berkes at intland.com> wrote:
> I would like to ask (not to blame, I am really interested) how it is
> possible that two repositories converted from svn differ so much in
> size. They have the same changesets and files, but one was converted
> with --datesort. More interesting still: is there any trick to reduce
> the bigger one? Is this because branches are rebased during a
> date-sorted convert, and those additional revisions result in
> duplication?

This happens because of a weakness in the revlogng structures
Mercurial uses to store history. Basically, each revision is saved as
a diff against the revision stored just before it in the same revlog,
which is not necessarily its parent. If that previous revision is
always on a different branch from the changeset being committed, every
diff has to repeat the full difference between the branches, so the
data files grow a lot. We're working on a fix for this (a scheme
affectionately called parent deltas), and there is a script in
circulation on the mailing list that keeps the order of revisions in
the repository but reorders some of the revlog files (the manifest
file is usually the one most prone to growing).
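
To make the effect concrete, here is a rough toy model, not Mercurial's
actual revlog code. It assumes a manifest is just a dict of file name to
version, two branches that each touch one new file per commit, and a
"delta size" equal to the number of entries that differ. All names
(make_history, delta_size, storage_cost) are made up for illustration.

```python
def make_history(n):
    """Build two branches ("a" and "b") of n commits each on a shared base.

    Returns {name: (parent_name, manifest)}; a manifest is a toy dict
    mapping file name to version number.
    """
    base = {f"file{i}": 0 for i in range(2 * n)}
    revs = {"base": (None, base)}
    for branch, offset in (("a", 0), ("b", n)):
        parent = "base"
        for k in range(1, n + 1):
            manifest = dict(revs[parent][1])
            # Each commit changes one file that only its branch touches.
            manifest[f"file{offset + k - 1}"] = k
            name = f"{branch}{k}"
            revs[name] = (parent, manifest)
            parent = name
    return revs


def delta_size(old, new):
    """Toy delta cost: number of manifest entries that differ."""
    keys = old.keys() | new.keys()
    return sum(1 for key in keys if old.get(key) != new.get(key))


def storage_cost(revs, order, against_parent):
    """Sum delta sizes when revisions are stored in the given order.

    against_parent=False: delta against the previously stored revision.
    against_parent=True:  delta against the revision's own parent.
    """
    total = 0
    previous = "base"
    for name in order:
        parent, manifest = revs[name]
        reference = parent if against_parent else previous
        total += delta_size(revs[reference][1], manifest)
        previous = name
    return total


n = 4
revs = make_history(n)
branch_order = [f"a{k}" for k in range(1, n + 1)] + [f"b{k}" for k in range(1, n + 1)]
date_order = [f"{b}{k}" for k in range(1, n + 1) for b in ("a", "b")]

print("previous-revision deltas, branch order:", storage_cost(revs, branch_order, False))
print("previous-revision deltas, date order:  ", storage_cost(revs, date_order, False))
print("parent deltas, date order:             ", storage_cost(revs, date_order, True))
```

In this model the interleaved (date-sorted) order costs far more than
the branch-sorted order when deltas are taken against the previously
stored revision, because every delta has to restate how the two
branches differ. With deltas against the parent, the storage order no
longer matters.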

Cheers,

Dirkjan

