[PATCH] Add script to rewrite manifest to workaround lack of parent deltas
Benoit Boissinot
benoit.boissinot at ens-lyon.org
Tue Aug 25 07:40:31 CDT 2009
On Fri, Aug 21, 2009 at 06:17:17PM -0400, Greg Ward wrote:
> On Fri, Aug 21, 2009 at 5:58 PM, I wrote:
> >> Maybe it's cleaner to pop from the end
> >> i = visit.pop()
> > [...]
> >>> + if len(parents_with_child) == 0:
> >>> + next.append(c)
> >>> + visit = next + visit
> >> if you pop from the end, then you can do:
> >> visit += next
> >
> > Not only is it cleaner, it massively improves the shrinkage factor on
> > my test repo: from 8.5x smaller to 16.9x smaller. Specifically,
> > pop(0) with "visit = next + visit" shrank a 56.1 MB manifest to 6.6
> > MB; fiddling with the order shrank it to 3.3 MB instead. Wow! I
> > suppose I should test it for correctness though. ;-)
>
> Damn. Should have known it was too good to be true; this makes the
> sort rather unstable. Example: with the original, visit.pop(0) and
> 'visit = next + visit' version, repeated runs of shrink-manifest look
> like this:
>
> $ ~/src/hg-crew/contrib/shrink-manifest.py
> reading 15043 revs ................
> sorting ...
> writing 15043 revs ................
> old file size: 58830219 bytes ( 56.1 MiB)
> new file size: 6929734 bytes ( 6.6 MiB)
> shrinkage: 88.2% (8.5x)
> $ rm .hg/store/*.old
> $ ~/src/hg-crew/contrib/shrink-manifest.py
> reading 15043 revs ................
> sorting ...
> writing 15043 revs ................
> old file size: 6929734 bytes ( 6.6 MiB)
> new file size: 6929734 bytes ( 6.6 MiB)
> shrinkage: 0.0% (1.0x)
>
> Good: stable and predictable.
>
> But with visit.pop() and "visit += next", it's not so good:
>
> $ ~/src/hg-crew/contrib/shrink-manifest.py
> reading 15043 revs ................
> sorting ...
> writing 15043 revs ................
> old file size: 58830219 bytes ( 56.1 MiB)
> new file size: 3472373 bytes ( 3.3 MiB)
> shrinkage: 94.1% (16.9x)
> $ rm .hg/store/*.old
> $ ~/src/hg-crew/contrib/shrink-manifest.py
> reading 15043 revs ................
> sorting ...
> writing 15043 revs ................
> old file size: 3472373 bytes ( 3.3 MiB)
> new file size: 17573799 bytes ( 16.8 MiB)
> shrinkage: -406.1% (0.2x)
> $ rm .hg/store/*.old
> $ ~/src/hg-crew/contrib/shrink-manifest.py
> reading 15043 revs ................
> sorting ...
> writing 15043 revs ................
> old file size: 17573799 bytes ( 16.8 MiB)
> new file size: 5870243 bytes ( 5.6 MiB)
> shrinkage: 66.6% (3.0x)
>
> Yuck: I don't like that behaviour. It should be OK to run
> shrink-manifest repeatedly on an already-shrunk manifest. Reverting
> to the original algorithm.
I have an almost stable algorithm that stays around 3.3 MiB over repeated
runs. Do you think it should be more stable? (I can play with it more if
needed.)
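
For reference, here is a minimal sketch of the two visit orders discussed
above, run on a toy DAG. It is not the attached shrink.py: the names
(toposort, renumber, is_stable) and the six-revision graph are made up for
illustration, and it only looks at ordering, not at delta sizes. Whether
this effect accounts for the size swings on the real manifest is an
assumption.

```python
def toposort(parents, pop_from_end=False):
    """Topologically sort revisions.

    parents maps rev -> list of parent revs (hypothetical toy input,
    assumed to contain no duplicate parents).
    pop_from_end=False: visit.pop(0) with "visit = next + visit".
    pop_from_end=True:  visit.pop()  with "visit += next".
    """
    children = {rev: [] for rev in parents}
    roots = []
    for rev in sorted(parents):
        for p in parents[rev]:
            children[p].append(rev)
        if not parents[rev]:
            roots.append(rev)

    emitted = set()
    visit = list(roots)
    order = []
    while visit:
        rev = visit.pop() if pop_from_end else visit.pop(0)
        order.append(rev)
        emitted.add(rev)
        # A child becomes ready once all of its parents have been emitted.
        ready = [c for c in children[rev]
                 if all(p in emitted for p in parents[c])]
        if pop_from_end:
            visit += ready
        else:
            visit = ready + visit
    return order


def renumber(parents, order):
    """Rewrite the graph as if revisions had been stored in 'order'."""
    new_of = {old: new for new, old in enumerate(order)}
    return {new_of[rev]: [new_of[p] for p in parents[rev]]
            for rev in parents}


def is_stable(parents, pop_from_end):
    """True if sorting an already-sorted graph leaves its order unchanged."""
    first = toposort(parents, pop_from_end)
    second = toposort(renumber(parents, first), pop_from_end)
    return second == list(range(len(second)))


# Toy history: two branches off rev 0, merged again in rev 5.
dag = {0: [], 1: [0], 2: [0], 3: [1], 4: [2], 5: [3, 4]}

print(toposort(dag))                     # pop(0) variant
print(toposort(dag, pop_from_end=True))  # pop() variant
print(is_stable(dag, False), is_stable(dag, True))
```

On this toy graph the pop() variant visits the two branches in the opposite
order on every run, so a second pass undoes the first. The pop(0) variant
reproduces its own output.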
regards,
Benoit
PS: Dirkjan, you may want to play with it to see if it makes a
difference on the Python repo.
--
:wq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shrink.py
Type: text/x-python
Size: 6065 bytes
Desc: not available
Url : http://selenic.com/pipermail/mercurial-devel/attachments/20090825/8d76f2ba/attachment.py