Clone performance for files consisting of many zeros differs significantly on Linux and Windows.

Matt Mackall mpm at selenic.com
Fri Nov 9 11:52:00 CST 2012


On Fri, 2012-11-09 at 09:56 +0000, Schueler Nikolaus (LQKG IT RDS)
wrote:
> Hi Matt,
> 
> is there something we can do about this in the near future or does
> that need deeper research and rework for the Windows implementation?
> (I fear the second alternative may be more probable). If could assist
> in doing research or even fix, I would be glad to help.

It's probably the sort of thing you can fix in an afternoon of
tinkering. The usual way to deal with this is to switch from a style
like this:

a = ""
while len(a) < wanted:
    a += more()
return a

where += implies a quadratic amount of copying to something like this:

a = []
l = 0:
while l < wanted:
    a.append(more())
    l += len(a[-1])
return ''.join(a)

..which doesn't.

> > -----Original Message-----
> > From: Matt Mackall [mailto:mpm at selenic.com]
> > Sent: Dienstag, 6. November 2012 23:42
> > To: Schueler Nikolaus (LQKG IT RDS)
> > Cc: mercurial-devel at selenic.com
> > Subject: RE: Clone performance for files consisting of many zeros differs
> > significantly on Linux and Windows.
> > 
> > On Mon, 2012-11-05 at 17:53 +0000, Schueler Nikolaus (LQKG IT RDS)
> > wrote:
> > > Hi Matt,
> > >
> > > > -----Original Message-----
> > > > From: Matt Mackall [mailto:mpm at selenic.com]
> > > > Sent: Freitag, 2. November 2012 14:43
> > > > To: Schueler Nikolaus (LQKG IT RDS)
> > > > Cc: mercurial-devel at selenic.com
> > > > Subject: Re: Clone performance for files consisting of many zeros
> > > > differs significantly on Linux and Windows.
> > > >
> > > > On Fri, 2012-11-02 at 11:24 +0000, Schueler Nikolaus (LQKG IT RDS)
> > > > wrote:
> > > > > Hi all,
> > > > >
> > > > > there seems to be a problem with cloning big files consisting only
> > > > > of zeros. Cloning times for Linux and Windows differ quite a lot.
> > > > > Any idea about that?
> > > >
> > > > Try --profile.
> > > >
> > 
> > [by deleting the original message, you credit me with far too good a memory
> > - I've had to go dig around the archives for the original timings]
> > 
> > > here's the output from profiling on Windows:
> > >
> > > D:\Users\schueler\tmp>"c:\Program Files\Mercurial\hg.exe" clone
> > https://mtstest.
> > > lantiq.com/~hgtest/repo/HGSUP-779/manyzeros --profile
> > > *** failed to import extension hglock: No module named hglock http
> > > authorization required
> > > realm: hgtest team repos
> > > user: schueler
> > > password:
> > > destination directory: manyzeros
> > > requesting all changes
> > > adding changesets
> > > adding manifests
> > > adding file changes
> > > added 1 changesets with 1 changes to 1 files updating to branch
> > > default
> > > 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
> > >    CallCount    Recursive     Total(s)    Inline(s) module:lineno(function)
> > >           15            0    137.0120    134.8801   <mercurial\util.pyc>:897(rea
> > > d)
> > >        +1005            0      2.1255      0.0088   +<mercurial\util.pyc>:884(sp
> > > litbig)
> > >        +2021            0      0.0027      0.0027   +<len>
> > >        +1004            0      0.0019      0.0019   +<method 'append' of 'collec
> > > tions.deque' objects>
> > >        +1017            0      0.0016      0.0016   +<method 'popleft' of 'colle
> > > ctions.deque' objects>
> > >          +14            0      0.0000      0.0000   +<method 'appendleft' of 'co
> > > llections.deque' objects>
> > 
> > Almost all the time is being spent here:
> > 
> > http://www.selenic.com/hg/file/df84bec19dce/mercurial/util.py#l898
> > 
> > This is interesting. Here's what happens:
> > 
> > - client requests a bundle
> > - server DEcompresses each individual delta to be sent
> > - then REcompresses the stream of deltas to increase overall compression
> > ratio
> > - client decompresses the stream
> > - client recompresses the deltas individually and stores them
> > 
> > All this decompression/recompression is actually fairly cheap as you can see
> > on the Linux side. But the streaming behavior here is supposed to be evened
> > out on both ends by this chunkiter class. This normally works quite well, but
> > in your case, the large block of zeros is getting compressed by around 10000
> > to 1. This puts a huge bubble in the pipeline on the client side and is thus
> > exposing what is probably O(N**2) inefficiency in Python's string
> > implementation as used by our buffering algorithm. Not sure why it's only
> > affecting Windows so drastically, probably something to do with realloc().
> > 
> > --
> > Mathematics is the supreme nostalgia of our time.
> > 
> 


-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list