[PATCH 3 of 3] use per-directory clustered stat calls even in cases where known tree is walked

Christian Boos cboos at neuf.fr
Mon Oct 6 05:36:16 CDT 2008


Benoit Boissinot wrote:
> On Wed, Oct 01, 2008 at 11:21:53AM +0200, Benoit Boissinot wrote:
>   
>> On Wed, Oct 01, 2008 at 10:51:34AM +0200, Christian Boos wrote:
>>     
>>> Benoit Boissinot wrote:
>>>       
>>>> On Tue, Sep 30, 2008 at 10:48:29PM -0400, Petr Kodl wrote:
>>>> ...
>>>> I was thinking something more like this, could you test on windows?
>>>> (the testsuite runs ok on linux)
>>>> - some doc needs be added to statfiles()
>>>> - the correct exception (ENOENT?) should be catched when listdir fails
>>>>
>>>> diff --git a/mercurial/dirstate.py b/mercurial/dirstate.p
>>>>         
>>> (snip)
>>>
>>> Just a quick note to mention that the patch seemed to works fine - at  
>>> first.
>>>       
>> Ok I know what went wrong.
>>
>> Please change:
>>     ls = dict([(f.lower(), st) for
>>                f, st in osutil.listdir(dir, stat=True)])
>> except:
>>
>> to (listdir returns a triplet, not a couple):
>>     ls = dict([(f.lower(), st) for
>>                f, kind, st in osutil.listdir(dir, stat=True)])
>> except OSError:
>>
>> I really shouldn't have catched all errors.
>>     
>
> There was another problem (it should remove the '/' when splitting, maybe I could
> add strutil.rsplit()).
>
> The following is better tested (I tried the same function minus the .lower() calls
> on linux):
>   

Nearly there, I ended up with a much reduced list of pseudo-deleted 
files, actually only those at the top-level.
Tracking the problem down to:

> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -1162,6 +1162,25 @@
>          except NameError:
>              pass
>  
> +    def statfiles(files):
> +        dircache = {}
> +        for fn in files:
> +            pos = fn.rfind('/')
>   

Here we'd need to look for '\\' as well, otherwise if `fn` is something 
like  'C:\\something\\repo\\file' (which is what happens for top-level 
files), the `ls` dict will only contain the 'file' key and therefore 
stafiles will yield None in this case.

> +            if pos != -1:
> +                dir, base = fn[:pos].lower(), fn[pos+1:].lower()
> +            else:
> +                dir, base = '', fn.lower()
> +            try:
> +                ls = dircache[dir]
> +            except KeyError:
> +                try:
> +                    ls = dict([(f.lower(), st) for
> +                               f, kind, st in osutil.listdir(dir, stat=True)])
> +                except OSError:
> +                    ls = {}
> +                dircache[dir] = ls
> +            yield ls.get(base, None)
> +
>   

i.e. with the following change to your patch, everything is nice and 
fast, AFAICT

diff -r 802f5d806a9d mercurial/util.py
--- a/mercurial/util.py Mon Oct 06 11:52:32 2008 +0200
+++ b/mercurial/util.py Mon Oct 06 12:19:40 2008 +0200
@@ -1166,6 +1166,8 @@
         dircache = {}
         for fn in files:
             pos = fn.rfind('/')
+            if pos == -1:
+                pos = fn.rfind('\\')
             if pos != -1:
                 dir, base = fn[:pos].lower(), fn[pos+1:].lower()
             else:


-- Christian



More information about the Mercurial-devel mailing list