Extreme unbundling performance difference between none-v2 and zstd-v2 bundle

Gregory Szorc gregory.szorc at gmail.com
Sun Dec 31 17:30:58 EST 2017


On Sat, Dec 30, 2017 at 11:45 AM, Stefan Ring <stefanrin at gmail.com> wrote:

> Hi,
>
> I have a repository (in generaldelta format) that when fed to hg
> bundle --all results in a 462 MB file with -t none-v2 and a 75 MB file
> with -t zstd-v2. Bundling is reasonably fast. It is unbundling where
> things fall apart.
>
> $ time hg clone -U zstd-bundle.hg test-fast
> requesting all changes
> adding changesets
> adding manifests
> adding file changes
> added 57372 changesets with 106051 changes to 15184 files (+7 heads)
> new changesets 194d1d9b4b58:2dc5d8355c7e
>
> real    1m18.643s
> user    1m14.319s
> sys     0m4.231s
>
> $ time hg clone -U uncompressed-bundle.hg test-slow
> requesting all changes
> adding changesets
> adding manifests
> adding file changes
> added 57372 changesets with 106051 changes to 15184 files (+7 heads)
> new changesets 194d1d9b4b58:2dc5d8355c7e
>
> real    25m34.647s
> user    25m2.263s
> sys     0m30.693s
>
> Any ideas?
>
> Tested with 4.2.3 and the current hg-stable head
> (4.4.2+9-058c725925e3). With 4.2.3 it was even more pronounced
> (40min), but this was also using a different Python build, so it is
> not directly comparable. The measurements above are using the same
> Python and Mercurial versions (current hg-stable).
>

Using `hg clone` with a bundle source appears to go through a suboptimal
(read: slow) code path. If you use `hg init` followed by `hg unbundle`
instead, it should be much faster.

Under the hood, `hg clone <bundle> <dest>` will decompress the source
bundle to a temporary file and then initiate an `hg clone` from that
decompressed bundle. But the code accessing repos from bundles isn't as
efficient as accessing repos from their normal storage format. Furthermore,
when you clone from this "bundle repo," it will effectively produce a new
bundle. `hg unbundle` skips all of this and just "streams" the bundle file
into the destination repository.
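
For reference, a minimal sketch of the `hg init` + `hg unbundle` route.
The function name `unbundle_into` and its two arguments are only
placeholders for illustration; `hg init`, `hg unbundle` and the global
`-R` option are the stock commands:

```shell
# Sketch: populate a fresh repository directly from a bundle file
# instead of going through `hg clone <bundle> <dest>`.
# `unbundle_into` is a hypothetical helper name, not part of Mercurial.
unbundle_into() {
    bundle=$1   # path to an existing bundle file
    dest=$2     # directory for the new repository (must not exist yet)

    # Create an empty repository, then apply the bundle to it.
    # -R selects the repository to operate on; the bundle path is
    # still resolved relative to the current directory.
    hg init "$dest" &&
    hg -R "$dest" unbundle "$bundle"
}
```

Something like `unbundle_into uncompressed-bundle.hg test-direct` should
then be roughly equivalent to the `hg clone -U` above: `hg unbundle` does
not update the working directory unless you pass `-u`.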

Why the uncompressed source was slower than zstd, I'm not sure. That
doesn't make much sense to me!

IMO we should recognize the bundle source case in the `hg clone` code and
have it be fast by default (by skipping the "bundle repo" layer).