On compressing revlogs

Wed Jun 6 19:08:56 CDT 2012

On Mon, Jun 4, 2012 at 2:56 PM, Matt Mackall <mpm at selenic.com> wrote:

> One possibility is to throw a disk cache at it.
>
> If we cached a manifest revision that was:
>
> a) uncompressed
> b) sufficiently close to tip to avoid the bulk of decompression on the
> working set
> c) sufficiently far from tip to encompass the working set
>
> ..then we could probably achieve a time scale close to optimal.

I gave the read side of this a try this afternoon; there's a WIP patch at
http://pastebin.com/0RNH0Ve9

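The manifest cache has to be constructed by hand right now, but this isn't
too painful. The first command writes the manifest node for changeset 337000
(the first line of its changelog entry) to the cache file, and the second
appends that manifest's uncompressed text: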

hg debugdata -c 337000 | head -1 > .hg/store/mfcache
hg debugdata -m $(cat .hg/store/mfcache) >> .hg/store/mfcache
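
For reference, the file those two commands produce is a hex manifest node on
the first line followed by the raw manifest text (one "path NUL hexnode flags"
entry per line). Here is a minimal Python sketch of reading that layout back.
The function name parse_mfcache is hypothetical, and the layout is inferred
from the commands above rather than taken from the WIP patch, so treat it as
an illustration only.

```python
def parse_mfcache(data: bytes):
    """Split a hand-built cache file into (manifest node, entries).

    Assumed layout (inferred from the shell commands, not from the patch):
      line 1:  40-character hex manifest node
      rest:    raw manifest text, one "path\\0<40 hex chars><flags>" per line
    """
    header, _, body = data.partition(b"\n")
    node = header.strip().decode("ascii")
    if len(node) != 40:
        raise ValueError("expected a 40-character hex manifest node on line 1")
    int(node, 16)  # raises ValueError if the node is not valid hex

    entries = {}
    for line in body.split(b"\n"):
        if not line:
            continue  # skip the empty piece after the final newline
        path, _, rest = line.partition(b"\0")
        # The first 40 characters are the file node; anything after is flags
        # (for example "x" for executable or "l" for symlink).
        entries[path] = (rest[:40].decode("ascii"), rest[40:].decode("ascii"))
    return node, entries
```

Keeping the node on the first line means a reader can decide whether the
cache is usable for a given target before parsing the (much larger) body.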

The tip of my manifest has a delta chain about 43,000 deltas long.

plainhg --time debugdata -m -- -1 | head -0
Time: real 1.060 secs (user 0.990+0.000 sys 0.070+0.000)

Ouch. I constructed a cache entry 1,000 revs back from the tip, and here's
what happened:

devhg --time debugdata -m -- -1 | head -0
Time: real 0.120 secs (user 0.080+0.000 sys 0.030+0.000)

With a cached manifest from 10,000 revs back, performance is still okay
(about 3x faster than with no cache), though starting to get a bit iffy:

devhg --time debugdata -m -- -1 | head -0
Time: real 0.330 secs (user 0.290+0.000 sys 0.040+0.000)

So at least the read side of this is beneficial, provided the write side is
done sensibly. I need to spend some time thinking about how that ought to
work.
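
To illustrate why the distance between the cached revision and the target
matters, here is a toy Python model of the read side. It is not Mercurial's
revlog code: manifests are plain dicts, a "delta" is a dict of changed paths,
and the names apply_delta and reconstruct are made up for this sketch. It only
counts how many deltas have to be applied, and ignores decompression and
patching costs.

```python
def apply_delta(manifest, delta):
    """Apply one delta in place: a value of None removes the path."""
    for path, node in delta.items():
        if node is None:
            manifest.pop(path, None)
        else:
            manifest[path] = node


def reconstruct(base, deltas, target, cache=None):
    """Return (manifest at revision `target`, number of deltas applied).

    base    -- full manifest for revision 0
    deltas  -- deltas[i] turns revision i into revision i + 1
    cache   -- optional (rev, manifest) snapshot; only usable when
               rev <= target, since deltas are applied forwards
    """
    start, snapshot = 0, base
    if cache is not None and cache[0] <= target:
        start, snapshot = cache
    manifest = dict(snapshot)  # copy so the base or cache is not mutated
    for delta in deltas[start:target]:
        apply_delta(manifest, delta)
    return manifest, target - start
```

In this model a snapshot 1,000 revisions behind the target costs 1,000 delta
applications no matter how long the full chain is, which matches the shape of
the timings above. A snapshot newer than the target is simply ignored, which
is the toy version of the "sufficiently far from tip" condition in the quoted
proposal.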