Manifest compression

Matt Mackall mpm at selenic.com
Thu Aug 15 16:05:49 CDT 2013


On Fri, 2013-08-09 at 11:07 -0400, Augie Fackler wrote:
> On Thu, Aug 08, 2013 at 11:45:35PM +0000, Wojciech Lopata wrote:
> > Hello,
> >
> > My name is Wojciech Lopata, I'm an intern at Facebook. I'm going to
> > spend the next weeks implementing manifest compression, based on the
> > ideas described on the wiki:
> > http://mercurial.selenic.com/wiki/ImprovingManifestCompressionPlan. Feel
> > free to email me with any ideas and suggestions.
> 
> The one thing that would be nice to preserve would be the ability to
> parse only the deltas of a manifest and know what paths were modified,
> but I'm not sure that's actually practical. Matt, do you have a
> feeling on that? That property of manifests has been really helpful
> for hg-on-bigtable (though we're now getting annoyed with how much
> space manifests use up, and I'm planning to watch this work with great
> interest.)

This turns out to be a pretty big deal for verify as well, and for
various other code paths that automatically take advantage of it via
contexts.
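
To make the property concrete, here is a minimal sketch of reading the
touched paths straight out of a delta, without rebuilding the full
manifest. It assumes the delta is a sequence of hunks, each a 12-byte
header of three big-endian 32-bit integers (start, end, length) followed
by the replacement data, and that manifest lines look like
"path\0hexnode[flags]\n". The name changed_paths is made up for
illustration and is not a Mercurial API. Pure deletions carry no
replacement data, so this sketch cannot see them.

```python
import struct

HEADER = struct.Struct(">lll")  # assumed hunk header: start, end, length


def changed_paths(delta):
    """Return the paths found in the replacement data of a manifest delta.

    Hypothetical helper: only added or modified entries are reported,
    because a deleted line contributes no replacement data.
    """
    paths = []
    pos = 0
    while pos < len(delta):
        _start, _end, length = HEADER.unpack_from(delta, pos)
        pos += HEADER.size
        data = delta[pos:pos + length]
        pos += length
        for line in data.split(b"\n"):
            if line:
                # Everything before the first NUL is the path.
                paths.append(line.split(b"\0", 1)[0])
    return paths
```

The cost is proportional to the size of the delta rather than the size
of the manifest, which is the whole point of the property.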

The verify thing might be avoidable if we had a data-structure-aware
delta-applier that got rid of all the memcpy and parsing we end up doing
while building successive versions of manifests.
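
A rough sketch of what I mean, under the same assumed hunk format
(12-byte big-endian start/end/length header plus replacement data) and
assuming hunks are sorted, non-overlapping and aligned to line
boundaries. The manifest is kept as a list of lines and the hunks are
applied to that list, so the full text is never joined and re-split.
apply_delta_to_lines is a hypothetical name, not existing code, and this
version still recomputes offsets for every revision, which a real
implementation would want to maintain incrementally.

```python
import struct
from bisect import bisect_left

HEADER = struct.Struct(">lll")  # assumed hunk header: start, end, length


def apply_delta_to_lines(lines, delta):
    """Apply a delta to a manifest held as a list of b'...\\n' lines."""
    # offsets[i] is the byte offset at which lines[i] starts in the old
    # text; the final entry is the total length of the old text.
    offsets = [0]
    for line in lines:
        offsets.append(offsets[-1] + len(line))

    out = []
    last = 0  # index of the next old line not yet copied or replaced
    pos = 0
    while pos < len(delta):
        start, end, length = HEADER.unpack_from(delta, pos)
        pos += HEADER.size
        data = delta[pos:pos + length]
        pos += length

        i = bisect_left(offsets, start)
        j = bisect_left(offsets, end)
        if (i >= len(offsets) or j >= len(offsets)
                or offsets[i] != start or offsets[j] != end
                or i < last or j < i):
            raise ValueError("hunk is not aligned to line boundaries")
        if data and not data.endswith(b"\n"):
            raise ValueError("replacement data must be whole lines")

        out.extend(lines[last:i])  # untouched lines before the hunk
        out.extend(part + b"\n" for part in data.split(b"\n")[:-1])
        last = j
    out.extend(lines[last:])  # untouched tail
    return out
```

Unchanged entries are carried over by reference, so only the lines named
in the delta are ever looked at.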

Some other shortcuts might be possible though. For instance, if we have
three files listed in a changeset, and three nodes changed in a
manifest, we can usually infer that they're one to one (and sorted).
This won't work with merges, but I think it works most of the time.
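
Sketched in code, the shortcut is roughly the following. The function
name and arguments are invented for illustration. It assumes the changed
nodes arrive in manifest order (sorted by path as bytes) and it bails
out whenever the counts disagree or the changeset is a merge. A removed
file is listed in the changeset but produces no new node, so that case
falls out through the count check.

```python
def infer_changed_entries(changeset_files, changed_nodes, is_merge=False):
    """Pair changeset files with changed manifest nodes, or return None.

    Hypothetical helper: returns a {path: node} dict when the one-to-one
    guess looks safe, and None when the caller must fall back to reading
    the manifest properly.
    """
    if is_merge or len(changeset_files) != len(changed_nodes):
        return None
    # Manifest entries are sorted by path, so sort the file list the
    # same way before pairing.
    return dict(zip(sorted(changeset_files), changed_nodes))
```

Anything that gets None back still has to take the slow path, so this is
only a fast path for the common case.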

-- 
Mathematics is the supreme nostalgia of our time.



