Performance with binary-heavy repositories

Christoph.Spiel at partner.bmw.de
Thu Aug 2 02:13:30 CDT 2007


Hi all!

        We are evaluating Mercurial for use in our department.
Our typical projects contain a medium number of files (5000) and have
a moderate size (200MB).  However, the projects are _very_ binary
heavy, that is, around 2000 of the 5000 files contain pure binary
data.

We are constrained to use WinXP.
Our version of Mercurial is 0.9.4.

Here are two profiles, one of a "ci" and one of a "push" operation,
both to a drive mounted over the network.


Commit

CallCount     Total(s)    Inline(s) module:lineno(function)
     2075   7364.3450   7364.3450   <mercurial.bdiff.bdiff>
     2155     17.0886     17.0886   <zlib.compress>
     2188   7409.2225      8.3419   <mercurial\filelog.pyc>:49(add)
    +2188   7400.8761      0.0562   +<mercurial\revlog.pyc>:985(addrevision)
    +2188      0.0045      0.0045   +<method 'startswith' of 'str' objects>
       37      5.7518      5.7518   <win32file.FlushFileBuffers>
     8273      5.5554      5.5554   <zlib.decompress>
    14269      4.3511      4.3511   <win32file.ReadFile>
    12960      2.5756      2.5756   <method 'update' of '_hashlib.HASH' objects>
     9716      2.2877      2.2877   <win32file.CreateFile>
     5792      0.9962      0.9962   <win32file.WriteFile>
    17050      0.7550      0.7550   <method 'write' of 'cStringIO.StringO' objects>


Push

CallCount     Total(s)    Inline(s) module:lineno(function)
     2202   7218.6231   7218.6231   <mercurial.bdiff.bdiff>
    20099     51.4482     51.4482   <win32file.ReadFile>
     2160     16.4540     16.4540   <zlib.compress>
    10922     15.1340     15.1340   <win32file.CreateFile>
     5793     10.5107     10.5107   <win32file.WriteFile>
    10447      7.3514      7.3514   <zlib.decompress>
        1   7346.4351      5.4895   <mercurial\localrepo.pyc>:1753(addchangegroup)
    +2186   7293.6029      0.4730   +<mercurial\revlog.pyc>:1121(addgroup)
    +2185      0.0847      0.0265   +<mercurial\changegroup.pyc>:13(getchunk)
    +2184     47.1566      0.0181   +<mercurial\localrepo.pyc>:391(file)
    +4370      0.0158      0.0117   +<mercurial\revlog.pyc>:473(count)
    +2184      0.0042      0.0042   +<mercurial\ui.pyc>:405(debug)
     4671      3.8661      3.8661   <nt.stat>
     2432      2.0169      2.0169   <win32file.GetFileInformationByHandle>
     8800      1.4157      1.4157   <method 'update' of '_hashlib.HASH' objects>


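Obviously "bdiff" is the bottleneck: in both profiles it accounts for
more than 98% of the total time of the enclosing call ("add" for the
commit, "addchangegroup" for the push).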
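
For anyone who wants to check the proportions, here is a minimal
sketch in Python.  It assumes that the largest Total(s) entry in each
profile ("add" for the commit, "addchangegroup" for the push) is a
fair stand-in for the overall run time, which the profiles do not
state directly.  The names "profiles", "bdiff_share" and
"seconds_per_call" are made up for illustration.

```python
# Figures copied from the two profiles above (seconds, call counts).
# "enclosing_s" is the largest Total(s) entry in each profile and is
# only an assumed stand-in for the overall run time.
profiles = {
    "commit": {"bdiff_s": 7364.3450, "bdiff_calls": 2075,
               "enclosing_s": 7409.2225},   # filelog add
    "push":   {"bdiff_s": 7218.6231, "bdiff_calls": 2202,
               "enclosing_s": 7346.4351},   # addchangegroup
}


def bdiff_share(p):
    """Percentage of the enclosing call's total time spent inside bdiff."""
    return 100.0 * p["bdiff_s"] / p["enclosing_s"]


def seconds_per_call(p):
    """Average duration of one bdiff call, in seconds."""
    return p["bdiff_s"] / p["bdiff_calls"]


for name in ("commit", "push"):
    p = profiles[name]
    print("%-6s  bdiff share %.1f%%  avg %.2f s/call"
          % (name, bdiff_share(p), seconds_per_call(p)))
```

With these numbers a single bdiff call takes a bit over 3 seconds on
average in both operations.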

We are not complaining about Mercurial's performance.  We just
find these results interesting and want to share our experience.


Cheers,
        Chris

PS: I am not subscribed to this list.  So please CC me if you want me
    to answer your questions.

--
Dr. Christoph L. Spiel
BMW Forschungs- und Innovationszentrum, EA-410
Lauchstaedterstrasse 5, 80995 Muenchen


