[issue3360] dirstate handling gets slow with medium-sized manifest
Bryan O'Sullivan
bugs at mercurial.selenic.com
Mon Apr 9 18:44:50 CDT 2012
New submission from Bryan O'Sullivan <bos at serpentine.com>:
Running on a synthetic repo with 120,000 files in the tip rev.
# hg --time add a.txt
Time: real 1.750 secs (user 1.670+0.000 sys 0.070+0.000)
Run a second time, to guarantee that the dirstate will not be written:
# hg --time add a.txt
a.txt already tracked!
Time: real 1.250 secs (user 1.190+0.000 sys 0.070+0.000)
I interpret this as "read takes 1.2 seconds, write takes 0.5 seconds extra".
All the time seems to be related to case folding checks. A statprof run:
  %   cumulative      self
 time    seconds   seconds  name
16.79       0.19      0.19  utf_8.py:16:decode
13.14       0.83      0.15  scmutil.py:57:__init__
12.04       0.13      0.13  utf_8.py:15:decode
10.22       0.11      0.11  dirstate.py:218:__getitem__
 9.49       0.11      0.11  encoding.py:176:lower
 9.49       0.11      0.11  dirstate.py:208:__getitem__
 8.03       0.09      0.09  encoding.py:171:lower
 7.66       0.09      0.09  encoding.py:179:lower
 4.74       0.37      0.05  encoding.py:174:lower
 2.55       0.03      0.03  encoding.py:168:lower
 1.82       0.02      0.02  dirstate.py:225:__iter__
That second entry is from casecollisionauditor.
If I turn casecollisionauditor into a no-op, time for the expected-collision
case goes back down to something reasonable:
# hg --time add a.txt
a.txt already tracked!
Time: real 0.110 secs (user 0.070+0.000 sys 0.040+0.000)
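To illustrate why the auditor dominates the read path, here is a minimal sketch in Python. It is not Mercurial's actual code: the class names are hypothetical, and plain str.lower() stands in for the encoding-aware lower() seen in the profile. The eager variant folds every tracked name in its constructor, which is the shape the profile suggests. The lazy variant is one possible way to defer that cost until a path that is not already tracked shows up.

```python
class EagerCollisionAuditor:
    """Hypothetical: lowercases every tracked name up front (O(n) per command)."""

    def __init__(self, tracked):
        self._tracked = set(tracked)
        # This loop runs even when only a single path is ever audited.
        self._folded = {name.lower() for name in self._tracked}

    def __call__(self, path):
        if path in self._tracked:
            return
        folded = path.lower()
        if folded in self._folded:
            raise ValueError("possible case-folding collision for %s" % path)
        self._folded.add(folded)
        self._tracked.add(path)


class LazyCollisionAuditor:
    """Hypothetical: builds the folded set only when it is first needed."""

    def __init__(self, tracked):
        self._tracked = tracked   # any container supporting "in" and iteration
        self._new = set()         # paths accepted during this run
        self._folded = None       # not computed yet

    def __call__(self, path):
        # An exact match is already tracked, so it cannot be a new collision.
        if path in self._tracked or path in self._new:
            return
        if self._folded is None:
            self._folded = {name.lower() for name in self._tracked}
        folded = path.lower()
        if folded in self._folded:
            raise ValueError("possible case-folding collision for %s" % path)
        self._folded.add(folded)
        self._new.add(path)
```

With the lazy variant, auditing a path that is already tracked (the "a.txt already tracked!" case above) never touches the other 120,000 names.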
Writing remains expensive:
# hg --time revert a
Time: real 1.500 secs (user 1.360+0.000 sys 0.140+0.000)
# hg --time add a.txt
Time: real 0.780 secs (user 0.730+0.000 sys 0.060+0.000)
This time, it's iterating through the dirstate to write it out that's causing
problems. Here's a statprof profile of "hg add" with the case-folding stuff
killed:
  %   cumulative      self
 time    seconds   seconds  name
17.45       0.14      0.14  dirstate.py:486:write
17.02       0.14      0.14  dirstate.py:30:_finddirs
15.32       0.51      0.13  dirstate.py:120:_dirs
12.34       0.10      0.10  dirstate.py:38:_incdirs
 5.96       0.05      0.05  dirstate.py:32:_finddirs
 5.53       0.05      0.05  dirstate.py:488:write
 4.26       0.24      0.03  dirstate.py:36:_incdirs
 3.40       0.03      0.03  dirstate.py:119:_dirs
 3.40       0.03      0.03  dirstate.py:40:_incdirs
 2.55       0.02      0.02  dirstate.py:487:write
 2.55       0.02      0.02  dirstate.py:471:write
 2.13       0.02      0.02  dirstate.py:33:_finddirs
 2.13       0.02      0.02  dirstate.py:118:_dirs
 1.28       0.01      0.01  dirstate.py:35:_incdirs
 1.28       0.01      0.01  dirstate.py:470:write
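
The profile above is consistent with a directory-count table being rebuilt from every tracked path. The following is a simplified sketch of that pattern. The helper names echo _finddirs, _incdirs and _dirs from the profile, but the bodies are assumptions for illustration and may differ from what dirstate.py does.

```python
def finddirs(path):
    """Yield each ancestor directory of a '/'-separated path, deepest first."""
    pos = path.rfind('/')
    while pos != -1:
        yield path[:pos]
        pos = path.rfind('/', 0, pos)


def incdirs(dirs, path):
    """Count path under its nearest known ancestor, creating missing ones."""
    for base in finddirs(path):
        if base in dirs:
            dirs[base] += 1
            return  # shallower ancestors were created when base first appeared
        dirs[base] = 1


def build_dirs(tracked_paths):
    """Rebuild the whole table: touches every tracked path once."""
    dirs = {}
    for path in tracked_paths:
        incdirs(dirs, path)
    return dirs


counts = build_dirs(["a/b/c.txt", "a/b/d.txt", "a/e.txt", "top.txt"])
```

Each call to incdirs is cheap, but build_dirs is linear in the number of tracked files, so with 120,000 entries the rebuild alone is a measurable share of a command that only adds one file.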
----------
messages: 19551
nosy: bos
priority: bug
status: unread
title: dirstate handling gets slow with medium-sized manifest
____________________________________________________
Mercurial issue tracker <bugs at mercurial.selenic.com>
<http://mercurial.selenic.com/bts/issue3360>
____________________________________________________