[PATCH 1 of 8] Add a new function, fspath

Matt Mackall mpm at selenic.com
Thu May 1 17:11:09 CDT 2008


On Thu, 2008-05-01 at 21:51 +0100, Paul Moore wrote:
> 2008/5/1 Matt Mackall <mpm at selenic.com>:
> >  First, I haven't yet spotted any users of the optional args. We probably
> >  shouldn't have any if they're not used. And if they're always used, they
> >  shouldn't be optional.
> 
> Both are used by localrepo.testcase, and cmdutil.addremove uses root.
> 
> The root arg is needed when passing in a full pathname where we only
> want to normalise the case of the part below the repo root (walking
> all the way up to the filesystem root is both unnecessary extra cost
> and fiddly to get the endcases right on Windows, where you need to
> worry about drive letters and UNC filenames).
> 
> The default arg was suggested by Patrick Mézard, and avoids an extra
> file existence check.
> 
> Personally, I'm comfortable with the optional arguments, although I'd
> give up the default arg if a second os.exists in localrepo.testcase
> was deemed acceptable.
> 
> >  Second, this function could be tons faster if it instead took a list of
> >  files. Then it could use an internal cache of filename components to
> >  avoid repeated directory scans.
> >
> >  As we primarily will be using this on filenames supplied as a list on
> >  the command line, this should work out nicely.
> >
> >  Alternately, we could add a cache dict argument to reuse a cache across
> >  calls.
> 
> That sounds reasonable, but at this point I'm struggling to follow the
> logic in dirstate.statwalk. The logic and the arguments to that
> function are so complex that I can't guarantee not breaking anything
> if I try to do anything more complicated than a simple lookup of one
> filename at a time.
> 
> Adding a static cache of previously-encountered filenames to fspath
> might be reasonable, but I'd like to see some timing tests proving it
> was needed - my intuition says we won't get many repeated arguments to
> fspath in a typical command.

It's not about repeated arguments, silly. It's about "a/a a/b a/c"
checking "a/" three times. There are bugs reports where people complain
about trouble with >10K arguments because they've passed in a huge
number of file names with `find`...

-- 
Mathematics is the supreme nostalgia of our time.



More information about the Mercurial-devel mailing list