[PATCH 3 of 8] Add filesystem path to dirstate.statwalk return value

Matt Mackall mpm at selenic.com
Wed May 7 22:17:32 CDT 2008


On Wed, 2008-05-07 at 22:57 +0100, Paul Moore wrote:
> 2008/5/1 Matt Mackall <mpm at selenic.com>:
> >  Guess what? I've always thought this pattern of passing a ton of
> >  arguments and getting back a ton of results was horrible, so I've been
> >  working on changing it. This pattern now looks like this in my repo:
> >
> >  m = cmdutil.match(repo, pats, opts)
> >  for abs in repo.walk(m):
> >     if m.exact(abs):
> >     ...
> >
> >  ..so I'm afraid this is an untimely step in the wrong direction.
> 
> If I follow this example, repo.walk returns pathnames, and it's these
> that are likely in some cases to need to be converted to the
> filesystem case on case-folding systems. Assuming that's right, what
> sort of paths are involved? Specifically:

I don't think that's right.

Let's consider:

$ touch A
$ hg add a  # should add 'A' - disk trumps command line
$ rename A a
$ hg ci a # should check in 'A' - dirstate trumps command line and disk 

So first let's look at:

m = cmdutil.match(repo, pats, opts)

This gives us back a 'match object' that has lots of information about
the match we're doing. For instance, we can do match.exact(f) to see if
f matches an explicit filename given in the patterns. We can also do
match.rel(f) to convert f to a relative pathname. So far, so good.
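
To make those two calls concrete, here is a toy stand-in for a match
object. It is only a sketch: the class name SimpleMatch and its
constructor arguments are made up for illustration and are not what
cmdutil.match actually builds. Only the exact() and rel() behaviour
described above is modelled, and patterns are left out entirely.

```python
import posixpath


class SimpleMatch:
    """Hypothetical stand-in for a match object (not Mercurial's code)."""

    def __init__(self, cwd, files):
        # cwd: current directory relative to the repo root ('' at the root)
        # files: explicit filenames, already relative to the repo root
        self._cwd = cwd
        self._files = set(files)

    def exact(self, f):
        # True only if f was named explicitly, not matched by a pattern
        return f in self._files

    def rel(self, f):
        # Convert a root-relative name to one relative to the cwd
        return posixpath.relpath(f, self._cwd or '.')


m = SimpleMatch('src', ['src/a.py'])
print(m.exact('src/a.py'), m.rel('src/a.py'), m.rel('README'))
```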

The second piece is:

for abs in repo.walk(m)

which means "return a list of paths in the working directory matching
m". Ideally, the filenames returned are in a 'normalized' form so we
don't have to do any extra work on them, and that means converted from
whatever they were on the command line or on disk to what they are (or
should be) in the dirstate.

So somewhere between passing off the command line patterns ('pats') to
cmdutil.match and getting filenames back from repo.walk, we've got to
match the command line arguments with whatever's on disk, and return
them in a form that matches what's in the dirstate (if there is
something in the dirstate). 
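
The precedence from the two shell examples above (dirstate trumps disk,
disk trumps command line) can be sketched as a pure function. The name
normalize_name and its arguments are hypothetical, and str.lower() is
only an assumed stand-in for whatever folding the filesystem really
does.

```python
def normalize_name(name, on_disk, in_dirstate):
    """Pick the spelling to report for a name typed on the command line.

    Hypothetical helper: the dirstate spelling wins, then the spelling on
    disk, and only then the name as the user typed it.
    """
    key = name.lower()  # assumption: lower() approximates case folding
    dirstate_fold = {f.lower(): f for f in in_dirstate}
    disk_fold = {f.lower(): f for f in on_disk}
    if key in dirstate_fold:
        return dirstate_fold[key]
    if key in disk_fold:
        return disk_fold[key]
    return name


# 'hg add a' with 'A' on disk and nothing tracked yet
print(normalize_name('a', on_disk=['A'], in_dirstate=[]))
# 'hg ci a' after the file was renamed to 'a' but 'A' is tracked
print(normalize_name('a', on_disk=['a'], in_dirstate=['A']))
```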

You'll notice that we don't actually know what we're going to be using a
'match object' for until we actually use it (i.e. by calling
repo.walk/status/etc on a dirstate or manifest) so we have to delay the
matching a bit. And I think we've already figured out that if we're
doing an operation like 'hg cat' that operates on the manifest, we
should do it in a case-sensitive manner.

So that means we only have to worry about the dirstate. In particular,
dirstate.statwalk. This function is a big mess, but it's got basically
two pieces: explicit matching and directory walking. First we take all
the files explicitly listed on the command line and check for them. Here
we need to do the slow fspath conversion. Then we do walking, which
inherently gives us the case on disk.

But for every file we find here (in both pieces) that we already know
about, we have to convert the name back to its case in the dirstate.
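
Roughly, the shape described above looks like the sketch below. This is
not the real dirstate.statwalk and not the fspath from the patch:
disk_case and walk_sketch are hypothetical names, the dirstate is just a
list of names, folding is approximated with str.lower(), and pattern
matching and ignore handling are left out.

```python
import os


def disk_case(name, root):
    """Spell a root-relative, '/'-separated name the way it is on disk.

    Slow on purpose: one os.listdir per path component.
    """
    parts = []
    directory = root
    for part in name.split('/'):
        try:
            entries = os.listdir(directory)
        except OSError:
            entries = []
        found = part
        if part not in entries:
            # no exact hit, so look for a spelling differing only in case
            for entry in entries:
                if entry.lower() == part.lower():
                    found = entry
                    break
        parts.append(found)
        directory = os.path.join(directory, found)
    return '/'.join(parts)


def walk_sketch(root, explicit, dirs, dirstate):
    fold = {f.lower(): f for f in dirstate}
    found = set()
    # Piece 1: explicitly listed files need the slow conversion
    for name in explicit:
        actual = disk_case(name, root)
        if os.path.isfile(os.path.join(root, *actual.split('/'))):
            found.add(actual)
    # Piece 2: walking inherently yields the case on disk
    for d in dirs:
        top = os.path.join(root, *d.split('/'))
        for dirpath, dirnames, filenames in os.walk(top):
            for fn in filenames:
                rel = os.path.relpath(os.path.join(dirpath, fn), root)
                found.add(rel.replace(os.sep, '/'))
    # Both pieces: known files go back to their case in the dirstate
    return sorted({fold.get(f.lower(), f) for f in found})
```

Note that neither function knows what a 'repo' is; each only takes a
root directory and plain lists of names.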

> - are they always relative pathnames, relative to the repo root?
> - can they be absolute paths referring to files under the repo root?
> - can they be arbitrary user-supplied pathnames?
> - can they be relative pathnames, relative to the cwd rather than to
> the repo root?

I believe they'll always be normalized and relative to the repo root.
But nothing in util.py should have a notion of what a 'repo' is, so be
careful to avoid adding such notions.

-- 
Mathematics is the supreme nostalgia of our time.


