[PATCH 3 of 3] use per-directory clustered stat calls even in cases where known tree is walked

Benoit Boissinot bboissin at gmail.com
Wed Oct 1 02:38:20 CDT 2008


On Tue, Sep 30, 2008 at 10:48:29PM -0400, Petr Kodl wrote:
> I submitted a modified patch, but there are some "not-so-clean" consequences.
> 
> 1) The path folding knowledge has to be propagated to util.py - or assume
> that win32 folds - which is what the patch does
> 
> 2) Exception handling for pretty much anything coming out of os.listdir has
> to be incorporated correctly to distinguish between a nonexistent file and
> other errors, or there has to be an extra file-exists test per directory
> 
> 3) The whole tree has to be cached again - the file sort order is not
> sufficient to guarantee correct directory sort order so all the
> osutil.listdir results have to be kept around
> 
> Performance of the new patch is comparable to the old one - in some
> pathological cases it should actually perform slightly better (large tree
> with very small percentage of files tracked)
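
To make point 3 concrete: in a sorted list of file paths, files in a
subdirectory sort between the files of the parent directory, so the parent
is visited, left, and visited again. A cache that holds only the most recent
listing would have to list the parent twice. A minimal Python 3 sketch
(directory_visits is a hypothetical helper, not part of either patch):

```python
import posixpath


def directory_visits(sorted_files):
    """Return the parent directories met while walking sorted_files,
    collapsing consecutive repeats of the same directory."""
    visits = []
    for fn in sorted_files:
        parent = posixpath.dirname(fn)
        if not visits or visits[-1] != parent:
            visits.append(parent)
    return visits


files = sorted(['a/c', 'a/a', 'a/b/x'])
print(files)                    # ['a/a', 'a/b/x', 'a/c']
print(directory_visits(files))  # ['a', 'a/b', 'a']
```

Directory 'a' shows up twice in the visit order, which is why the listings
are kept in a dictionary keyed by directory rather than in a single slot.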

I was thinking of something more like this. Could you test it on Windows?
(the test suite runs OK on Linux)
- some documentation needs to be added to statfiles()
- the correct exception (ENOENT?) should be caught when listdir fails
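
On the second item, here is a standalone sketch of what catching only the
"missing" errors could look like. It is not part of the patch below: it is
written for Python 3, uses os.scandir as a stand-in for
osutil.listdir(dir, stat=True), and statfiles_sketch is a hypothetical name.
It assumes ENOENT and ENOTDIR are the errnos to swallow, matching the
lstat-based fallback in the patch.

```python
import errno
import os

# Assumption: these two errnos mean "the path is simply not there".
MISSING = (errno.ENOENT, errno.ENOTDIR)


def statfiles_sketch(files):
    """Yield an lstat-style result for each path in files, or None if the
    path does not exist. Each parent directory is listed at most once."""
    dircache = {}  # normalized directory -> {normalized name: stat result}
    for fn in files:
        dirname, base = os.path.split(fn)
        dirname = dirname or os.curdir
        key = os.path.normcase(dirname)
        ls = dircache.get(key)
        if ls is None:
            ls = {}
            try:
                with os.scandir(dirname) as entries:
                    for entry in entries:
                        try:
                            st = entry.stat(follow_symlinks=False)
                        except OSError as inst:
                            # entry vanished between listing and stat
                            if inst.errno not in MISSING:
                                raise
                            continue
                        ls[os.path.normcase(entry.name)] = st
            except OSError as inst:
                # a missing directory means every file in it is missing;
                # anything else (e.g. EACCES) is propagated to the caller
                if inst.errno not in MISSING:
                    raise
            dircache[key] = ls
        yield ls.get(os.path.normcase(base))
```

os.path.normcase only folds case on Windows, so the sketch does not need to
hard-code the assumption that win32 folds paths.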

diff --git a/mercurial/dirstate.py b/mercurial/dirstate.py
--- a/mercurial/dirstate.py
+++ b/mercurial/dirstate.py
@@ -532,17 +532,13 @@
                         results[nf] = None
 
         # step 3: report unseen items in the dmap hash
-        visit = [f for f in dmap if f not in results and match(f)]
-        for nf in util.sort(visit):
-            results[nf] = None
-            try:
-                st = lstat(join(nf))
+        visit = util.sort([f for f in dmap if f not in results and match(f)])
+        for nf, st in zip(visit, util.statfiles([join(f) for f in visit])):
+            if st is not None:
                 kind = getkind(st.st_mode)
-                if kind == regkind or kind == lnkkind:
-                    results[nf] = st
-            except OSError, inst:
-                if inst.errno not in (errno.ENOENT, errno.ENOTDIR):
-                    raise
+                if kind != regkind and kind != lnkkind:
+                    st = None
+            results[nf] = st
 
         del results['.hg']
         return results
diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -1162,6 +1162,25 @@
         except NameError:
             pass
 
+    def statfiles(files):
+        dircache = {}
+        for fn in files:
+            pos = fn.rfind('/')
+            if pos != -1:
+                dir, base = fn[:pos].lower(), fn[pos + 1:].lower()
+            else:
+                dir, base = '', fn.lower()
+            try:
+                ls = dircache[dir]
+            except KeyError:
+                try:
+                    ls = dict([(f.lower(), st) for
+                               f, kind, st in osutil.listdir(dir, stat=True)])
+                except: # FIXME: catch only ENOENT ?
+                    ls = {}
+                dircache[dir] = ls
+            yield ls.get(base, None)
+
     try:
         # override functions with win32 versions if possible
         from util_win32 import *
@@ -1334,6 +1353,15 @@
 
     def set_signal_handler():
         pass
+
+    def statfiles(files):
+        for fn in files:
+            try:
+                yield os.lstat(fn)
+            except OSError, inst:
+                if inst.errno not in (errno.ENOENT, errno.ENOTDIR):
+                    raise
+                yield None
 
 def find_exe(name, default=None):
     '''find path of an executable.

-- 
:wq

