Clone performance for files consisting of many zeros differs significantly on Linux and Windows.

Matt Mackall mpm at selenic.com
Tue Nov 6 16:42:26 CST 2012


On Mon, 2012-11-05 at 17:53 +0000, Schueler Nikolaus (LQKG IT RDS)
wrote:
> Hi Matt,
> 
> > -----Original Message-----
> > From: Matt Mackall [mailto:mpm at selenic.com]
> > Sent: Freitag, 2. November 2012 14:43
> > To: Schueler Nikolaus (LQKG IT RDS)
> > Cc: mercurial-devel at selenic.com
> > Subject: Re: Clone performance for files consisting of many zeros differs
> > significantly on Linux and Windows.
> > 
> > On Fri, 2012-11-02 at 11:24 +0000, Schueler Nikolaus (LQKG IT RDS)
> > wrote:
> > > Hi all,
> > >
> > > there seems to be a problem with cloning big files consisting only of
> > > zeros. Cloning times for Linux and Windows differ quite a lot. Any
> > > idea about that?
> > 
> > Try --profile.
> > 

[by deleting the original message, you credit me with far too good a
memory - I've had to go dig around the archives for the original
timings]

> here's the output from profiling on Windows:
> 
> D:\Users\schueler\tmp>"c:\Program Files\Mercurial\hg.exe" clone https://mtstest.
> lantiq.com/~hgtest/repo/HGSUP-779/manyzeros --profile
> *** failed to import extension hglock: No module named hglock
> http authorization required
> realm: hgtest team repos
> user: schueler
> password:
> destination directory: manyzeros
> requesting all changes
> adding changesets
> adding manifests
> adding file changes
> added 1 changesets with 1 changes to 1 files
> updating to branch default
> 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>    CallCount    Recursive     Total(s)    Inline(s) module:lineno(function)
>           15            0    137.0120    134.8801   <mercurial\util.pyc>:897(rea
> d)
>        +1005            0      2.1255      0.0088   +<mercurial\util.pyc>:884(sp
> litbig)
>        +2021            0      0.0027      0.0027   +<len>
>        +1004            0      0.0019      0.0019   +<method 'append' of 'collec
> tions.deque' objects>
>        +1017            0      0.0016      0.0016   +<method 'popleft' of 'colle
> ctions.deque' objects>
>          +14            0      0.0000      0.0000   +<method 'appendleft' of 'co
> llections.deque' objects>

Almost all the time is being spent here:

http://www.selenic.com/hg/file/df84bec19dce/mercurial/util.py#l898

This is interesting. Here's what happens:

- client requests a bundle
- server DEcompresses each individual delta to be sent
- then REcompresses the stream of deltas to increase overall compression
ratio
- client decompresses the stream
- client recompresses the deltas individually and stores them

All this decompression/recompression is actually fairly cheap as you can
see on the Linux side. But the streaming behavior here is supposed to be
evened out on both ends by this chunkiter class. This normally works
quite well, but in your case, the large block of zeros is getting
compressed by around 10000 to 1. This puts a huge bubble in the pipeline
on the client side and is thus exposing what is probably O(N**2)
inefficiency in Python's string implementation as used by our buffering
algorithm. Not sure why it's only affecting Windows so drastically,
probably something to do with realloc().

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list