Bug 3181 - Degraded performance on a project with many files
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial
Version: unspecified
Hardware: All All
Importance: wish feature
Assignee: Bugzilla
Reported: 2012-01-04 06:16 UTC by Stanislav Spiridonov
Modified: 2012-05-13 05:06 UTC

Description Stanislav Spiridonov 2012-01-04 06:16 UTC
I am working on a project that was started one year ago, and my local
repository now contains 90,665 files and 45,113 folders. Performance has
dropped significantly compared with the start of the project. This is
especially visible in MQ operations: each one takes up to tens of seconds.
Comment 1 Laurens Holst 2012-01-04 08:29 UTC
What extensions do you have enabled?

Also, try to run an operation that takes a long time with the --profile
command line option and paste the result here.
Comment 2 Matt Mackall 2012-01-04 11:48 UTC
Averaging two files per directory is not a recipe for filesystem performance.

What operating system are you using?
Comment 3 Stanislav Spiridonov 2012-01-04 11:58 UTC
Yes, but this is a typical layout for a Java project.
 
** Mercurial version (2.0.2).  TortoiseHg version (2.2.2)
** Command: 
** CWD: E:\Development\AGF3-BE
** Encoding: cp1251
** Extensions loaded: transplant, hggit, rebase, svn, fold, mq, fetch, convert, histedit
** Python version: 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)]
** Windows version: (6, 1, 7601, 2, 'Service Pack 1')
** Processor architecture: x64
** Qt-4.7.4 PyQt-4.8.
Comment 4 Matt Mackall 2012-01-04 13:34 UTC
Give us the output of:

a) hg --time forget nosuchfile.zzzz
b) hg --time locate "**.zzzz"
c) hg --time status -mard
d) hg --time status

What sort of filesystem is "** CWD: E:\Development\AGF3-BE"? Is it a local disk?
Comment 5 Stanislav Spiridonov 2012-01-05 02:52 UTC
I have just noticed that Mercurial Queues operations (pop, push) take much
more time than they did at the beginning. Building the visual tree in
TortoiseHg also takes some time, but I am not sure that is an hg issue.

NTFS, USB 2.0

E:\Development\AGF3-BE>hg --time forget nosuchfile.zzzz
nosuchfile.zzzz: The system cannot find the file specified
Time: real 0.154 secs (user 0.078+0.000 sys 0.078+0.000)

E:\Development\AGF3-BE>hg --time locate "**.zzzz"
Time: real 0.949 secs (user 0.672+0.000 sys 0.281+0.000)

E:\Development\AGF3-BE>hg --time status -mard
Time: real 0.480 secs (user 0.234+0.000 sys 0.250+0.000)

E:\Development\AGF3-BE>hg --time status
Time: real 0.856 secs (user 0.484+0.000 sys 0.375+0.000)
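
To compare runs like the four above, the "Time:" lines can be parsed with a
small script. This is only a sketch: the function name is made up for
illustration, and it assumes that the value after each "+" is the time spent
in child processes.

```python
import re

# Matches lines such as:
#   Time: real 0.154 secs (user 0.078+0.000 sys 0.078+0.000)
TIME_RE = re.compile(
    r"Time: real (?P<real>[\d.]+) secs "
    r"\(user (?P<user>[\d.]+)\+(?P<cuser>[\d.]+) "
    r"sys (?P<sys>[\d.]+)\+(?P<csys>[\d.]+)\)"
)


def parse_time_line(line):
    """Return (real, user, sys) in seconds, or None if the line does not match.

    Assumption: the numbers after '+' are child-process times, so they are
    added to the user and sys totals.
    """
    m = TIME_RE.search(line)
    if m is None:
        return None
    real = float(m.group("real"))
    user = float(m.group("user")) + float(m.group("cuser"))
    sys_time = float(m.group("sys")) + float(m.group("csys"))
    return real, user, sys_time


if __name__ == "__main__":
    # Timing lines copied from the output above.
    samples = {
        "forget": "Time: real 0.154 secs (user 0.078+0.000 sys 0.078+0.000)",
        "locate": "Time: real 0.949 secs (user 0.672+0.000 sys 0.281+0.000)",
        "status -mard": "Time: real 0.480 secs (user 0.234+0.000 sys 0.250+0.000)",
        "status": "Time: real 0.856 secs (user 0.484+0.000 sys 0.375+0.000)",
    }
    for name, line in samples.items():
        real, user, sys_time = parse_time_line(line)
        print("%-14s real=%.3f user=%.3f sys=%.3f" % (name, real, user, sys_time))
```

Comparing real time against user plus sys gives a rough idea of how much of
each run was spent computing rather than waiting on the drive.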
Comment 6 Laurens Holst 2012-01-05 04:55 UTC
The USB 2.0 connection sounds like the culprit. You could try the command
"dir /s *.zzzz" for comparison. If that also takes about a second, you’re
hitting a system limitation. You should be able to speed the repository up
significantly by moving it to a local (ATA/SATA connected) drive.
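
For a comparison that is easy to repeat on both drives, the same kind of
directory scan can be timed with a short Python script. This is a rough
sketch, not a benchmark: the function name and its parameters are made up
for illustration, and it only measures how long a plain directory walk takes.

```python
import fnmatch
import os
import sys
import time


def timed_walk(root, pattern="*.zzzz", skip_dirs=()):
    """Walk `root` and return (matches, files, dirs, seconds).

    `skip_dirs` is a hypothetical convenience: pass (".hg",) to leave the
    repository metadata out of the scan.
    """
    start = time.perf_counter()
    matches = files = dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        dirs += len(dirnames)
        files += len(filenames)
        matches += len(fnmatch.filter(filenames, pattern))
    return matches, files, dirs, time.perf_counter() - start


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    matches, files, dirs, seconds = timed_walk(root)
    print("%d matches, %d files, %d folders in %.3f secs"
          % (matches, files, dirs, seconds))
```

Run it once against the USB drive and once against a copy on a local drive.
A second run over the same tree is usually faster because of operating
system caching, so compare like with like.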
Comment 7 Stanislav Spiridonov 2012-01-05 06:06 UTC
Yes, it takes time too, and I should say that hg is faster :) But I see
the following points:

1. MQ operations. How can I test them?
2. The .hg storage. It contains ~66,000 files and 32,000 folders. Maybe it
makes sense to think about changing the storage format to reduce the number
of files and avoid such problems? I clearly understand that slow handling of
so many files is an issue of my file system and cannot be solved by hg.
But these files are produced by hg, so it is a real hg issue :)
Comment 8 Laurens Holst 2012-01-05 06:27 UTC
Hg is probably faster because dir /s also searches into the .hg directory.

As for the storage, you’re just hitting a random access bandwidth limit on
USB 2.0, as shown by your test with dir. Bypassing the file system’s storage
structures by creating your own file system inside one big file (not gonna
happen btw, for compatibility reasons) will not help, the underlying
bottleneck remains. Also obviously you can’t do this for the working copy.

I would say the only solution is to stop working with such a huge repository
on a USB-attached drive and move the repository to a local one. Why is that
a problem for you?
Comment 9 Stanislav Spiridonov 2012-01-05 06:44 UTC
OK, the USB drive is an external thing for me and I can change it. Copying the
local repository from/to USB each day is not a solution because it takes much
more time than all hg operations combined. Also, I can't use hg to synchronize
the repo on the USB drive with a possible repo on a local drive, because the
USB drive repository is an SVN clone and each push performs a rebase.

So only one thing remains open: MQ operations. How can I test them? Maybe they
are not as efficient as the core hg commands.
Comment 10 Laurens Holst 2012-01-05 06:57 UTC
Refer to hg help mq and just perform some operations like qimport, qpop,
qpush, qfinish on the command line... You can use the --time option on them
too. It may be that TortoiseHg adds some extra status queries in between (mq
in TortoiseHg often seems to be rather slow in general, also on relatively
small repositories like Mercurial’s own source code).
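
As a rough sketch, the MQ timing could be scripted like this. The helper
names are made up for illustration. It assumes hg is on the PATH, that the mq
extension is enabled, and that the repository has a patch queue. Note that
qpop -a unapplies all patches and qpush -a applies them again, so run it only
on a clean working copy.

```python
import subprocess
import sys
import time


def hg_command(args, repo=".", hg="hg"):
    """Build an hg command line that uses the global --time option."""
    return [hg, "--repository", repo, "--time"] + list(args)


def time_hg(args, repo=".", hg="hg"):
    """Run one hg command; return (wall_seconds, exit_code, stderr_text)."""
    cmd = hg_command(args, repo, hg)
    start = time.perf_counter()
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    elapsed = time.perf_counter() - start
    return elapsed, proc.returncode, proc.stderr


if __name__ == "__main__":
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    # Pop every applied patch, then push them all back.
    for args in (["qpop", "-a"], ["qpush", "-a"]):
        elapsed, code, err = time_hg(args, repo)
        print("hg %s: %.3f secs wall clock (exit code %d)"
              % (" ".join(args), elapsed, code))
        # The "Time:" line that hg --time reports is expected on stderr.
        print(err.strip())
```

If the numbers from the command line are much lower than what you see in
TortoiseHg, the extra time is probably spent in the GUI and not in mq itself.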
Comment 11 Bugzilla 2012-05-12 09:26 UTC

--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:26 EDT  ---

This bug was previously known as _bug_ 3180 at http://mercurial.selenic.com/bts/issue3180