clone performance of experimental new http client library.

Antoine Pitrou solipsis at pitrou.net
Wed Oct 19 11:21:53 CDT 2011


On Wed, 19 Oct 2011 10:39:22 -0500
Augie Fackler <durin42 at gmail.com> wrote:
> > 
> > It seems as if the cloning goes slower and slower. For example, it starts off pulling the manifests at a good clip and then gets slower and slower till the progress arrow is moving with agonizing slowness by the end. Same for file changes. The progress prediction is always way off.
> 
> Can you get me a public repo to test against? I'm happy to spend some time in a profiler and speed things up, but I need a way to test.

Without knowing too much about the code, this sounds like a classic case
of quadratic behaviour caused by repeated string concatenation.
And indeed, in httpclient/__init__.py there's the following code:

        if self._chunked:
            self._chunked_parsedata(data)
            return
        elif self._body is not None:
            self._body += data
            return

where self._body apparently never gets reinitialized until the whole
response is received.
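
To make the cost concrete, here is a rough sketch of the worst case, in
which every += copies the whole accumulated body into a fresh string.
The function names and the equal-sized chunks are assumptions made for
illustration; nothing here is taken from httpclient itself.

```python
def bytes_copied_by_concat(num_chunks, chunk_size):
    """Worst case: each += writes the whole accumulated body again."""
    total = 0
    body_len = 0
    for _ in range(num_chunks):
        body_len += chunk_size
        total += body_len  # the new string of body_len bytes is written in full
    return total


def bytes_copied_by_join(num_chunks, chunk_size):
    """A single join at the end copies each chunk exactly once."""
    return num_chunks * chunk_size
```

With n chunks of k bytes each, the first count is k * n * (n + 1) / 2
and the second is k * n, so doubling the size of the response roughly
quadruples the copying work in the first case.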

Do note that repeated string concatenation can still be fast on OSes
where realloc() is smart enough to grow the buffer without copying the
data (like Linux, I assume), so the slowdown may not show up everywhere.

Intuitively, you should probably use a StringIO or the fast ''.join()
idiom instead. I just took a look at httplib and it avoids repeated
concatenation (for instance, HTTPResponse._read_chunked() uses
''.join()).
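
A minimal sketch of the ''.join() idiom, assuming the body arrives as a
series of byte strings. BodyBuffer and its methods are hypothetical
names used only for illustration; they are not part of httpclient or
httplib.

```python
class BodyBuffer(object):
    """Hypothetical helper: collect chunks cheaply, join them once."""

    def __init__(self):
        self._chunks = []

    def append(self, data):
        # Appending to a list does not copy the data received so far.
        self._chunks.append(data)

    def getvalue(self):
        # One pass over all chunks, performed only when the body is needed.
        return b''.join(self._chunks)


buf = BodyBuffer()
for piece in (b'HTTP ', b'body ', b'data'):
    buf.append(piece)
body = buf.getvalue()
```

The receiving code would then call append(data) where it currently does
self._body += data, and call getvalue() once the response is complete.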

Regards

Antoine.




More information about the Mercurial-devel mailing list