Compressing revlog delta chunks by reusing the zlib stream

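The current revlog format compresses each delta chunk independently of the others. The idea here is to use a single compression object for a snapshot and the delta chunks that follow it, sync-flushing the zlib stream after each chunk so that every chunk still ends on a byte boundary while sharing the compression history of the chunks before it.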
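The difference between the two approaches can be sketched with Python's standard zlib module. This is only an illustration under stated assumptions, not the code from the patch queue: the function names are made up for this example, and the "delta chunks" are stand-in manifest-like lines rather than real revlog deltas.

```python
import hashlib
import zlib


def compress_separately(chunks):
    """Compress every chunk with its own, independent zlib stream."""
    return [zlib.compress(chunk) for chunk in chunks]


def compress_shared_stream(chunks):
    """Compress all chunks through one compression object.

    After each chunk the stream is sync-flushed, so the bytes emitted so
    far form a piece that can be stored on its own, while later chunks can
    still refer back to data from earlier ones.
    """
    comp = zlib.compressobj()
    pieces = []
    for chunk in chunks:
        pieces.append(comp.compress(chunk) + comp.flush(zlib.Z_SYNC_FLUSH))
    return pieces


def decompress_shared_stream(pieces):
    """Decompress pieces in order, starting from the snapshot.

    A piece cannot be decoded in isolation: the decompressor needs the
    pieces that precede it in the same stream.
    """
    decomp = zlib.decompressobj()
    return [decomp.decompress(piece) for piece in pieces]


def make_example_chunks(n_files=100, n_deltas=20):
    """Build a fake snapshot plus small chunks that reuse its file names."""
    def line(i, rev):
        digest = hashlib.sha1(b"%d:%d" % (i, rev)).hexdigest().encode("ascii")
        return b"src/module_%03d.py\0%s\n" % (i, digest)

    snapshot = b"".join(line(i, 0) for i in range(n_files))
    deltas = [line(i, 1) for i in range(n_deltas)]
    return [snapshot] + deltas


if __name__ == "__main__":
    chunks = make_example_chunks()
    separate = compress_separately(chunks)
    shared = compress_shared_stream(chunks)
    assert decompress_shared_stream(shared) == chunks
    print("separate:", sum(len(p) for p in separate), "bytes")
    print("shared:  ", sum(len(p) for p in shared), "bytes")
```

The trade-off visible in the sketch is that reading one chunk from a shared stream means feeding the decompressor every piece since the snapshot, which is roughly what reconstructing a revision from a delta chain already requires.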

Test results

Mercurial crew (http://hg.intevation.org/mercurial/crew) - 10395 revisions (head: 4612cded5176)

Sizes are in KB; relative sizes are normalized so that plain == 1.

configuration                          total size  relative  00manifest.d size  00manifest.d relative
plain                                       25708      1.00               8632                   1.00
zlibstream applied                          18400      0.72               4104                   0.48
plain, shrunk manifest                      21740      0.85               4668                   0.54
zlibstream applied, shrunk manifest         16856      0.66               2560                   0.30

Linux kernel (http://www.kernel.org/hg/index.cgi/linux-2.6/) - 181609 revisions (head: 64ef3e5df11e)

Sizes are in KB; relative sizes are normalized so that plain == 1.

configuration                          total size  relative  00manifest.d size  00manifest.d relative
plain                                     1418724      1.00             895624                   1.00
zlibstream applied                        1343864      0.95             874396                   0.98
plain, shrunk manifest                     752248      0.53             231112                   0.26
zlibstream applied, shrunk manifest        681380      0.48             211912                   0.24

Patch queue

Warning: the code does not yet properly set and check the requires file, and the changes to revlog chunks take effect immediately for any new revisions. Take care when running this on existing repositories!

http://bitbucket.org/wbruna/hg-zlibstream/
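For context on the warning above: a repository lists the features it depends on, one name per line, in .hg/requires, so that clients that do not understand a feature refuse to open the repository. The sketch below shows roughly what such a check could look like. It reads the file directly instead of going through Mercurial's own code, and the requirement name "zlibstream" is a hypothetical placeholder, not something the patch queue is known to define.

```python
import os

# Hypothetical requirement name, used here only for illustration.
HYPOTHETICAL_REQUIREMENT = "zlibstream"


def read_requirements(repo_root):
    """Return the set of entries listed in <repo_root>/.hg/requires."""
    path = os.path.join(repo_root, ".hg", "requires")
    try:
        with open(path, "r") as fp:
            return set(line.strip() for line in fp if line.strip())
    except FileNotFoundError:
        # Very old repositories may have no requires file at all.
        return set()


def can_use_shared_stream(repo_root):
    """Only write shared-stream chunks if the repository opted in."""
    return HYPOTHETICAL_REQUIREMENT in read_requirements(repo_root)
```

Without a guard of this kind, new revisions written by the patched code would be unreadable by an unpatched Mercurial, which is the risk the warning describes.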

Mailing list threads

http://www.selenic.com/pipermail/mercurial-devel/2010-February/018564.html