[Bug 4750] New: Computing files list for progress during addchangegroup takes a long time

mercurial-bugs at selenic.com
Thu Jul 9 19:52:40 UTC 2015


http://bz.selenic.com/show_bug.cgi?id=4750

          Priority: normal
            Bug ID: 4750
                CC: mercurial-devel at selenic.com
          Assignee: bugzilla at selenic.com
           Summary: Computing files list for progress during
                    addchangegroup takes a long time
          Severity: bug
    Classification: Unclassified
                OS: All
          Reporter: gregory.szorc at gmail.com
          Hardware: All
            Status: UNCONFIRMED
           Version: 3.4
         Component: Mercurial
           Product: Mercurial

An automated process on an under-powered EC2 instance at Mozilla kept failing
to clone https://hg.mozilla.org/mozilla-central because the network connection
kept dropping. On closer inspection, the root cause is a delay of several
seconds while computing the number of files in
mercurial.changegroup.addchangegroup(). On this under-powered machine the
delay is long enough to trigger a socket/network inactivity timeout somewhere
(exactly where is yet to be determined). Although aggressive timeouts are to
blame for the failure, I believe the performance problem is significant enough
to warrant reporting.

The following lines from changegroup.py are responsible for the slowdown:

    efiles = set()
    for c in xrange(clstart, clend):
        efiles.update(repo[c].files())

On my 2014 MBP with an SSD, this iteration over the changelog takes a full 10s!
That is roughly 3% of the CPU time needed to clone mozilla-central.

The only use of efiles is for progress reporting. It lets the files progress
bar work properly and keeps the N/M counter in --debug output sane.
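
To illustrate why the total matters, here is a minimal sketch of an N/M
counter. format_progress() is a hypothetical helper written for this note, not
Mercurial's progress API, and the output format is an assumption:

```python
def format_progress(topic, pos, total=None):
    """Render 'topic: pos/total (pct%)' when the total is known.

    Without a total only the running position can be shown, which is
    why the expected file count is computed up front.
    """
    if not total:
        # Unknown (or zero) total: no denominator, no percentage.
        return "%s: %d" % (topic, pos)
    return "%s: %d/%d (%d%%)" % (topic, pos, total, 100 * pos // total)


# Stand-in for the set built by the changelog loop.
efiles = {"a.c", "b.c", "c.c", "d.c"}
with_total = format_progress("files", 1, len(efiles))
without_total = format_progress("files", 1)
```

With the set available the counter can report a position out of a known
total; without it, only the running count is available.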

I'm going to take a stab at moving the computation of the files list to
_addrevision time (if needed, of course). We'll still pay the cost of parsing
each revision blob, but we'll avoid an extra changelog traversal, and the cost
will be spread across the revision adds instead of showing up as a long pause
between the changesets and manifests phases.
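
A rough sketch of the idea, under stated assumptions: FileCollector,
parse_files() and add_changesets() are hypothetical names invented for
illustration, not Mercurial code, and the changeset text layout (manifest id,
user, date, one file name per line, blank line, description) is a simplified
stand-in for the real revision blob.

```python
def parse_files(text):
    """Return the file names listed in a simplified changeset text.

    Assumed layout: three header lines (manifest id, user, date), then
    one file name per line, then a blank line, then the description.
    """
    header, _sep, _description = text.partition("\n\n")
    return header.split("\n")[3:]


class FileCollector(object):
    """Accumulates touched file names as each changeset is added."""

    def __init__(self):
        self.files = set()
        self.revisions = 0

    def on_revision_added(self, text):
        # The blob is parsed once, at add time, while it is already in hand.
        self.files.update(parse_files(text))
        self.revisions += 1


def add_changesets(blobs, collector):
    """Simulate adding incoming changesets, feeding the collector as we go."""
    stored = []
    for text in blobs:
        stored.append(text)  # stand-in for writing the revision
        collector.on_revision_added(text)
    return stored


blobs = [
    "abc123\nalice\n0 0\nfoo.c\nbar.c\n\nfirst commit",
    "def456\nbob\n1 0\nbar.c\nbaz.c\n\nsecond commit",
    "789abc\ncarol\n2 0\n\nempty commit",
]
collector = FileCollector()
add_changesets(blobs, collector)
expected_file_count = len(collector.files)
```

By the time the last changeset has been added the set of expected files is
already complete, so no second pass over the changelog is needed before the
next phase starts.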


