[PATCH] Allow manipulating files with long names on Windows

Aaron Cohen aaron at assonance.org
Mon Jan 24 00:39:07 CST 2011


On Thu, Jan 20, 2011 at 7:55 PM, Mads Kiilerich <mads at kiilerich.com> wrote:
> Aaron Cohen wrote, On 01/20/2011 11:06 PM:
>>> But it seems like a bad idea that chdir doesn't do a chdir. That will most
>>> likely have unexpected consequences. Wouldn't it be better to just fail if
>>> cwd is too long?
>>
>> That would be possible I guess, there aren't that many places in hg
>> where os.chdir is used, and it's almost always to repo.root. The most
>> important usages seem to be in the record extension and the patch
>> command.
>>
>> I originally just documented that the root of the repo shouldn't be
>> more than 244 characters long, but figured I'd give it my best shot to
>> fix it "right".
>
> FWIW I think it would be more right to just give a nice error message.
>
> I guess that fixing it right would involve expanding relative paths in calls
> to for example os.open with absolute paths starting with the latest chdir
> value.

This happens automagically, because all wrapped functions go through
unc() which causes all paths to be abs'ed before going to the library
functions.
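
To illustrate the idea, here is a minimal sketch. It is not the
extension's actual code: the PathWrapper class and its chdir() and wrap()
methods are hypothetical names, and the body of unc() is only an assumption
about what such a helper might do. It uses ntpath so the path arithmetic
follows Windows rules on any platform. The normpath() call matters because
the \\?\ prefix turns off Windows' own handling of ".." and "/".

```python
import ntpath

# Extended-length prefix that lifts the MAX_PATH limit for the wide APIs.
PREFIX = '\\\\?\\'


class PathWrapper(object):
    """Hypothetical helper: tracks a cwd and absolutizes every path."""

    def __init__(self, cwd):
        self.cwd = cwd

    def chdir(self, path):
        # Record the new directory instead of calling os.chdir(), which
        # cannot accept a too-long working directory.
        self.cwd = ntpath.normpath(ntpath.join(self.cwd, path))

    def unc(self, path):
        # Already in extended-length form: leave untouched.
        if path.startswith(PREFIX):
            return path
        # Resolve relative paths against the recorded cwd and normalize,
        # since prefixed paths are passed to the filesystem as-is.
        full = ntpath.normpath(ntpath.join(self.cwd, path))
        if full.startswith('\\\\'):
            # Network share: \\server\share\x becomes \\?\UNC\server\share\x
            return PREFIX + 'UNC\\' + full[2:]
        return PREFIX + full

    def wrap(self, func):
        # Return a version of func whose first argument goes through unc().
        def wrapper(path, *args, **kwargs):
            return func(self.unc(path), *args, **kwargs)
        return wrapper


if __name__ == '__main__':
    w = PathWrapper('C:\\repo')
    show = w.wrap(lambda p: p)
    w.chdir('sub')
    print(show('..\\data/file.txt'))
```

Because every wrapped call sees an absolute path, a relative path is always
resolved against the most recently recorded directory, which is the behaviour
Mads describes above.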


> Yes, your way of doing it isn't any worse than win32mbcs - and I doubt there
> are any more elegant solutions to this ugly problem at all. But having two
> different implementations where one of them is buggy is also ugly. I don't
> know how that could be solved.

I've been reading up over the past week, trying to figure out exactly
where the various gotchas are.

The problem is that Mercurial wants a "get filename as byte string"
function, and Windows just doesn't provide one.

The current usage of the single-byte-wide APIs approximates it, but
it goes against Microsoft's recommendation to use the Unicode
APIs. It also fails in surprising ways sometimes. For instance,
currently

>> From a quick readthrough, I would expect the two extensions to
>> actually be able to coexist, but only if mine loads first.
>
> So it would make sense to fail if win32mbcs has been loaded first?

Actually, I do my setup in uisetup and win32mbcs does its work in
extsetup, so it seems to be OK. I'd need someone who has Shift-JIS to
verify that though.

> I don't think anybody disagrees with that - _if_ it can be done sufficiently
> efficiently, reliably and elegantly.
>
> If your approach works fine then it might even be better to do it directly
> in core Mercurial than in an extension.

I think there are still enough gotchas with long paths on Windows that
I like having the insulation of requiring the user to opt in like this.

> (I think more people are annoyed by the interoperability problems between
> for example UCS16 based and UTF8 based systems with unicode file names.
> Fixing that is however more controversial, but I think compatibility with
> fixutf8 is important for your audience.)

I've been looking into the general Unicode issue over the past week or
so; it's a fairly big mess.

It seems pretty much impossible to use the Unicode APIs on Windows and
guarantee round-tripping.
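
As a rough illustration of what I mean by round-tripping, here is a small
sketch. It assumes the concern is that a stored byte-string filename has to
be decoded to Unicode and later encoded back to the same bytes. The
roundtrips() helper is a made-up name, and Python's codecs only stand in
for the conversion Windows performs, which behaves differently in detail.

```python
def roundtrips(raw, encoding):
    """Return True if raw survives bytes -> unicode -> bytes unchanged."""
    try:
        return raw.decode(encoding).encode(encoding) == raw
    except UnicodeError:
        # Some byte in raw has no meaning in this encoding, so there is
        # no Unicode string that maps back to the original bytes.
        return False


if __name__ == '__main__':
    # Plain ASCII names are safe in any of the usual code pages.
    print(roundtrips(b'readme.txt', 'cp1252'))
    # UTF-8 bytes for a hiragana character; byte 0x81 is undefined in
    # Python's cp1252 codec, so the decode step fails.
    print(roundtrips(b'\xe3\x81\x82', 'cp1252'))
```

Any filename whose bytes fall outside the local code page hits the second
case, and there is then no Unicode name to hand to the wide APIs.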

I'm planning to keep looking into it though.

>>> Extensions might be accepted in Mercurial if they have proven that they are
>>> stable and widely used and actively maintained.
>>>
>>> I suggest you publish this extension somewhere (for example on bitbucket)
>>> and add it to http://mercurial.selenic.com/wiki/UsingExtensions . Time will
>>> tell if it would be better to distribute it with Mercurial.
>>
>> Alright
>
> If you do that you don't have to listen to any stupid review comments ;-)

Well, I've found the review very valuable; I doubt the extension would
be anywhere near as good as it currently is if I'd just stuck it in
some repo somewhere. ;)

Thanks again,
Aaron

