Exceeding the Windows API MAX_PATH limit

Sat Apr 21 12:49:15 CDT 2012

On Fri, 2012-04-20 at 23:20 +0200, Angel Ezquerra wrote:
> On Fri, Apr 20, 2012 at 10:45 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Fri, 2012-04-20 at 22:06 +0200, Angel Ezquerra wrote:
> >> On Apr 20, 2012 8:33 PM, "Matt Mackall" <mpm at selenic.com> wrote:
> >> >
> >> > On Fri, 2012-04-20 at 14:14 +0200, Noel Grandin wrote:
> >> > >
> >> > > On 2012-04-20 14:04, Adrian Buehlmann wrote:
> >> > > > So I'd recommend to make sure that *all* your other tools can handle
> >> > > > long paths _first_ before proposing to change mercurial's working copy
> >> > > > handling on Windows to use the long path api's.
> >> > >
> >> > > The point is that these kinds of problems ARE popping up, and we (being
> >> > > the users) don't need ALL of our tools to support long paths.
> >> >
> >> > This is not even remotely worth the trouble on our end, where we'll have
> >> > to handle bug reports like "I can't delete this file Mercurial created
> >> > without reinstalling Windows, you guys suck!"
> >> >
> >> > Ask again when at least Explorer can handle these paths.
> >>
> >> I think you are right. However, I think that Mercurial's error message could
> >> be better in this case. A little example:
> >>
> >> c:\>mkdir short
> >> c:\>cd short
> >> c:\short>hg init
> >> c:\short>echo 1 >
> >> dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd.txt
> >> c:\short>hg add
> >> dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd.txt
> >> c:\short>hg commit -m "long file added"
> >> c:\short>cd ..
> >> c:\>hg clone short a_longer_clone_name
> >> updating to branch default
> >> abort:
> >> c:\a_longer_clone_name\dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd.txt:
> >> The system cannot find the path specified
> >>
> >> "The system cannot find the path specified" is an unclear error message IMHO.
> >> I know that is the message that Windows gives when you try to handle a file
> >> with a long filename, but I think Mercurial could give a better error
> >> message. For example:
> >>
> >> abort:
> >> c:\a_longer_clone_name\dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd.txt:
> >> The file name exceeds windows' max filename length of 255 characters
> >
> > So.. your proposal is that we should take the error number (ie 2) that
> > Windows gives back and then guess why it happened from the large
> > possible space of reasons, via what's effectively a mini expert system?
> 
> But at some level Mercurial must already be translating some of the
> errors it encounters to make them understandable to the user, mustn't
> it? There is a wide range between showing raw error numbers and
> creating an expert system...

Here's all of the handling we do on IOError:

http://www.selenic.com/hg/file/cbf2ea2f5ca1/mercurial/dispatch.py#l164

To the extent that we translate numbers to strings, that's a) done via
the standard errno table and b) done for us by Python.
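
As a rough illustration of what "done for us by Python" means, here is a
minimal sketch. It is not the code at the URL above; the helper name
describe_ioerror and the exact output format are assumptions made for this
example. The only point is that the message text comes straight from the
exception, with no interpretation added.

```python
import errno
import os


def describe_ioerror(err):
    """Format an IOError/OSError using only what the OS reported.

    Hypothetical helper for illustration; not Mercurial's actual code.
    """
    # err.strerror is filled in by Python from the platform's error table;
    # fall back to os.strerror() if it is missing.
    message = err.strerror or os.strerror(err.errno)
    if err.filename:
        return "abort: %s: %s" % (err.filename, message)
    return "abort: %s" % message


if __name__ == "__main__":
    try:
        # The parent directory is assumed not to exist, so this fails.
        open(os.path.join("no_such_directory_for_demo", "file.txt"), "w")
    except (IOError, OSError) as e:
        print(describe_ioerror(e))
```

Nothing in this path looks at why the call failed; it only relays the
number-to-string mapping.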

> > I don't consider that sort of thing to be good engineering practice.
> > First, it's directing effort to the wrong place: once such an error has
> > occurred, you've already lost. Second: you are liable to get your
> > diagnostic wrong, at which point you've probably stymied diagnosis by
> > humans.
> 
> If Mercurial tries to open a file for writing and it fails because the
> file name is too long, is it helpful to tell the user "cannot find the
> path specified", even if that is what Windows reports? Does it help the
> human in diagnosing the problem? I think it doesn't. I think it is
> misleading and could make it hard to diagnose the problem.

And what if we decide the name is too long, when in fact there is some
other problem at play, like a missing directory? What if we say "oh, the
name is only 65 characters, just fine", when in fact we're on a
filesystem type like VFAT where that breaks? Now a user who has seen
'name too long' from Mercurial will prematurely rule out that
possibility when we have a false negative.
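
A small sketch of the false-positive half of that, under some loud
assumptions: naive_diagnosis is a made-up heuristic (nobody has proposed
exactly this), the 260 threshold is just the classic MAX_PATH value, and the
demo assumes a system that accepts 300+ character paths (a typical Linux
box, say), so that path length is demonstrably not the real problem.

```python
import errno
import os
import shutil
import tempfile

MAX_PATH = 260  # classic Windows API limit, used here only as a threshold


def naive_diagnosis(err):
    """Hypothetical heuristic: guess the cause from errno and path length."""
    if err.errno == errno.ENOENT and len(err.filename or "") >= MAX_PATH:
        return "name too long"
    return "path not found"


def try_create(path):
    """Return the OSError raised while creating path, or None on success."""
    try:
        with open(path, "w"):
            pass
    except OSError as err:
        return err
    return None


def demo():
    base = tempfile.mkdtemp()
    try:
        # Each component is short, but the full path exceeds MAX_PATH.
        path = os.path.join(base, "a" * 100, "b" * 100, "c" * 100 + ".txt")
        err = try_create(path)              # fails: directories are missing
        guess = naive_diagnosis(err)
        os.makedirs(os.path.dirname(path))  # fix the actual problem
        retry = try_create(path)            # None if the create now works
        return guess, retry
    finally:
        shutil.rmtree(base)


if __name__ == "__main__":
    print(demo())
```

The heuristic blames the length, yet creating the missing directories is
what makes the same path work. The error number alone cannot tell the two
cases apart.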

This is just another case of the perennial "tell me why permission was
denied" wish. To analyze why we get an EACCES error, we -must duplicate
the entirety of the kernel's permission-checking logic-, walking paths
from the top down, often without access to information we don't even
know we need (ACLs, SELinux state, filesystem type, locks held by other
processes, pending deletes, concurrent operations...). In general, it
can't be done and doing it partially will give misleading results.
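
For a sense of what "doing it partially" looks like, here is a deliberately
incomplete sketch of a top-down path walk. The function name
first_problem_component is invented for this example. It only sees missing
components and whatever os.access() reports, so it knows nothing about
locks held by other processes, pending deletes, or anything that changes
between the check and the real operation.

```python
import os
import tempfile


def first_problem_component(path):
    """Walk path from the top down and return (prefix, reason) for the
    first component that looks wrong, or None if nothing looks wrong.

    Hypothetical and deliberately partial: a None result does NOT mean
    the real operation would succeed.
    """
    path = os.path.abspath(path)
    drive, rest = os.path.splitdrive(path)
    prefix = drive + os.sep
    parts = [p for p in rest.split(os.sep) if p]
    for i, part in enumerate(parts):
        prefix = os.path.join(prefix, part)
        if not os.path.lexists(prefix):
            return prefix, "missing"
        is_last = i == len(parts) - 1
        # Traversing a directory needs search (execute) permission.
        if not is_last and not os.access(prefix, os.X_OK):
            return prefix, "no search permission"
    return None


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    try:
        print(first_problem_component(os.path.join(base, "missing", "f.txt")))
    finally:
        os.rmdir(base)
```

Even this much duplicates logic the kernel already ran, and every case it
does not model is a chance to hand the user a confident wrong answer.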

-- 
Mathematics is the supreme nostalgia of our time.