[PATCH 3 of 3] use per-directory clustered stat calls even in cases where known tree is walked

Christian Boos cboos at neuf.fr
Wed Oct 15 02:08:28 CDT 2008


Matt Mackall wrote:
> On Tue, 2008-10-14 at 17:34 +0200, Benoit Boissinot wrote:
>   
>> On Mon, Oct 6, 2008 at 12:59 PM, Petr Kodl <petrkodl at gmail.com> wrote:
>>     
>>>> Wouldn't it be cleaner to call normpath on those filenames ?
>>>>         
>>> normcase is probably what you want in this case. I submitted a patch based
>>> on Benoit's suggestions a while ago, but here it is again - just in case .
>>> It is reasonably clean except for one thing - the error handling of errors
>>> coming from os.listdir - result of some incompatibilities in Python versions
>>>       
>> Matt, do you have any objection to this patch ? conceptual or anything else ?
>>
>> It really helps windows for big trees.
>>     
>
> I'm a little worried that it's going to make things worse on Unix when
> doing diff with widely-scattered changes. But only a little worried.
>   


I did some testing at the end of last month, and indeed I found that 
on Linux there can be some performance degradation when there are lots 
of unversioned files in the working dir.

With no extra unversioned files, the patch made no measurable 
difference, but performance degrades linearly with the number of 
extra unversioned files.
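
To illustrate why the cost would scale with unversioned files, here is a 
minimal sketch of the two strategies. It is not the code from the patch: 
the function names `stat_tracked` and `stat_clustered` are hypothetical, 
and it assumes the clustered variant stats every entry of each directory 
that contains a tracked file.

```python
import os


def stat_tracked(root, tracked):
    """Baseline sketch: one lstat per tracked file.

    Unversioned files are never touched, so their number does not matter.
    """
    results = {}
    for rel in tracked:
        try:
            results[rel] = os.lstat(os.path.join(root, rel))
        except OSError:
            results[rel] = None  # missing or unreadable
    return results


def stat_clustered(root, tracked):
    """Clustered sketch (assumption, not the actual patch).

    Lists each directory holding tracked files and stats every entry in it,
    tracked or not. Returns the stat results for the tracked files and the
    total number of directory entries that were touched.
    """
    dirs = {os.path.dirname(rel) for rel in tracked}
    cache = {}
    touched = 0
    for d in dirs:
        full = os.path.join(root, d)
        try:
            names = os.listdir(full)
        except OSError:
            continue  # directory vanished or is unreadable
        for name in names:
            touched += 1
            rel = os.path.join(d, name) if d else name
            try:
                cache[rel] = os.lstat(os.path.join(full, name))
            except OSError:
                cache[rel] = None
    return {rel: cache.get(rel) for rel in tracked}, touched
```

In this sketch the baseline does work proportional to the number of 
tracked files only, while the clustered variant does work proportional 
to tracked plus unversioned entries, which would be consistent with the 
linear growth measured below.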

Timings for `hg stat -q` on a repo with 23k versioned files (the timings 
for `hg diff` were similar):

 Extra files (x 23k) || base  || cluster patch
----------------------------------------------
 0                   || 0.73s || 0.72s
 1                   || 0.73s || 1.02s
 2                   || 0.73s || 1.33s
 3                   || 0.74s || 1.65s
 4                   || 0.74s || 1.95s

Here 1 means 23k extra unversioned files, 2 means 46k extra unversioned 
files, and so on.
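
As a rough check of the "linear" claim, a least-squares line can be 
fitted to the cluster patch column. This is only a sketch over the five 
timings in the table; the variable names are mine.

```python
# Timings from the table: x = extra unversioned files in units of 23k,
# y = `hg stat -q` wall time in seconds with the cluster patch applied.
extra_units = [0, 1, 2, 3, 4]
patched_secs = [0.72, 1.02, 1.33, 1.65, 1.95]

n = len(extra_units)
mean_x = sum(extra_units) / n
mean_y = sum(patched_secs) / n

# Ordinary least-squares slope and intercept.
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(extra_units, patched_secs)
) / sum((x - mean_x) ** 2 for x in extra_units)
intercept = mean_y - slope * mean_x

# Cost per single extra unversioned file, in microseconds.
per_file_us = slope / 23000 * 1e6

print("slope: %.3f s per 23k extra files" % slope)
print("intercept: %.3f s" % intercept)
print("per extra file: %.1f us" % per_file_us)
```

The fit gives a slope of about 0.31s per 23k extra files (roughly 13 
microseconds per extra file) and an intercept of about 0.72s, close to 
the base timing.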

This is a pruned version of table D from the original mail; see 
http://marc.info/?l=mercurial-devel&m=122279886808661 for details.

On the other hand, on Windows we need 460k extra files to see a 
performance degradation.  Though I didn't try on the same hardware, I 
have the impression that with this patch, the performance could be 
better on Windows than on Linux (or perhaps it's just that my more 
recent laptop disk now outperforms our aging SCSI disks on the server).

-- Christian






