[PATCH 1 of 5 V2] treemanifest: create treemanifest class

Martin von Zweigbergk martinvonz at google.com
Wed Mar 18 17:47:16 CDT 2015


On Wed, Mar 18, 2015 at 3:28 PM Matt Mackall <mpm at selenic.com> wrote:

> On Mon, 2015-03-16 at 16:27 -0700, Martin von Zweigbergk wrote:
> > # HG changeset patch
> > # User Martin von Zweigbergk <martinvonz at google.com>
> > # Date 1426435855 25200
> > #      Sun Mar 15 09:10:55 2015 -0700
> > # Node ID 0d969d6efeef08935c93afc416bf406def6b8e59
> > # Parent  567ae53657544744155897ada91f16f8af61ad8a
> > treemanifest: create treemanifest class
> >
> > There are a number of problems with large and flat manifests. Copying
> > from http://mercurial.selenic.com/wiki/ManifestShardingPlan:
> >
> >  * manifest too large for RAM
> >
> >  * manifest resolution too much CPU (long delta chains)
> >
> >  * committing is slow because entire manifest has to be hashed
> >
> >  * impossible for narrow clone to leave out part of manifest as all is
> >    needed to calculate new hash
> >
> >  * diffing two revisions involves traversing entire subdirectories
> >    even if identical
> >
> > This is a first step in a series introducing a manifest revlog per
> > directory.
> >
> > This change adds boolean configuration option
> > experimental.treemanifest. When the option is enabled, manifests are
> > parsed into a new tree data structure with one tree node per
> > directory. At this point, it is just a different data structure in
> > memory; there is still just a single manifest revlog on disk.
>
> I think the right way to do this is:
>
> - at repo create time, check the flag
>    - add treemanifest to requires file
> - at startup, check requires file
>    - unconditionally use treemanifest if set
>

I agree with the second step, but I'm not sure about the first. After the 5
patches in this series, using treemanifest is still not a final decision;
the manifest is still stored in the regular flat format. In other words,
--config experimental.treemanifest=True can be passed to any command
without affecting anything besides performance. It seems like the above
would unnecessarily make it a final decision. I was rather planning on
writing the requires flag when a treemanifest is first written to disk
using the future submanifest revlog structure. It would even be possible to
unconditionally read into the treemanifest structure (if we imagine it got
fast enough). Then the config would clearly only be about writing. What do
you think?
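
To make the alternative concrete, here is a rough sketch of what I have in
mind. It is not code from this series and it does not use Mercurial's real
APIs: the helper names (read_requires, add_requirement,
use_tree_for_reading, write_manifest) are hypothetical, the "treemanifest"
requirement string is assumed, and repo_path is just a directory standing in
for the repository's metadata directory. The point is only the timing: the
config alone never touches the requires file when reading, and the
requirement is recorded the first time a tree-format manifest is written.

```python
import os

REQUIREMENT = "treemanifest"  # assumed requirement name, for illustration


def read_requires(repo_path):
    """Return the set of requirements listed in repo_path/requires."""
    try:
        with open(os.path.join(repo_path, "requires")) as f:
            return {line.strip() for line in f if line.strip()}
    except FileNotFoundError:
        return set()


def add_requirement(repo_path, name):
    """Record a requirement, rewriting the requires file only if needed."""
    reqs = read_requires(repo_path)
    if name not in reqs:
        reqs.add(name)
        with open(os.path.join(repo_path, "requires"), "w") as f:
            for req in sorted(reqs):
                f.write(req + "\n")


def use_tree_for_reading(repo_path, config_enabled):
    """Decide whether to parse manifests into the in-memory tree.

    If the requirement is present, the tree is used unconditionally.
    Otherwise the config decides, and nothing is written to disk, so
    the choice only affects performance and is not final.
    """
    return REQUIREMENT in read_requires(repo_path) or config_enabled


def write_manifest(repo_path, config_enabled):
    """Pick the on-disk format for a (hypothetical) future manifest write.

    The requirement is added only at the moment a tree-format manifest
    is first written; from then on the repository stays in tree format.
    """
    if REQUIREMENT in read_requires(repo_path) or config_enabled:
        add_requirement(repo_path, REQUIREMENT)
        return "tree"
    return "flat"
```

With this shape, passing the config to a read-only command leaves the
requires file untouched, while the first tree-format write makes the
decision permanent for that repository.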


More information about the Mercurial-devel mailing list