[PATCH 00 of 11 RFC] consistent and more reliable URL parsing

Mads Kiilerich mads at kiilerich.com
Sun Mar 27 18:19:45 CDT 2011


Adrian Buehlmann wrote, On 03/28/2011 12:32 AM:
> On 2011-03-26 07:29, Brodie Rao wrote:
>> This patch series implements a forgiving but reliable URL parser and
>> refactors URL manipulation across hg.
>>
>> The parser is based on RFC 2396, but splits URLs into their
>> constituent parts primarily by delimiters, placing few restrictions on
>> the contents of each part.
>>
>> The first patch adds the url.url class. Further patches progressively
>> replace manual URL manipulation with url.url in minor areas. The final
>> patch updates commands like clone, pull, push, etc. to use
>> url.url. It's the most invasive of the patches, and slightly changes
>> the behavior of how bundle:// URLs are handled.
>>
>> The final patch also implements "drive promotion" for URLs on
>> Windows. This means that url.localpath('file://D:/foo') returns
>> 'D:/foo'.
> On Windows 7 SP1 x64:
>
>    $ hg version -q
>    Mercurial Distributed SCM (version 1.8.1+151-46c3043253fb)
>    $ hg clone a a1
>    updating to branch default
>    1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>    $ cd a1
>    $ hg out
>    comparing with C:%5CUsers%5Cadi%5Chgrepos%5Ctests%5Ca
>    searching for changes
>    no changes found
>
> That path there should be "C:\Users\adi\hgrepos\tests\a"
>
> Backslash ('\') should probably not be encoded as %5C for file paths.

I think the real problem is that the URL parser doesn't recognize
Windows file paths and instead treats 'C' as the scheme/protocol. Given
that misreading, it makes sense that it urlencodes the backslashes. It
wouldn't have encoded them if it had recognized the string as a local
file path.
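
To illustrate the misreading, here is a minimal sketch. It is not the
url.url code from the patch series; naive_split and naive_display are
hypothetical names, and splitting at the first ':' is an assumption about
how a purely delimiter-based parser would behave.

```python
from urllib.parse import quote


def naive_split(url):
    """Split at the first ':' and treat whatever precedes it as the scheme.

    Hypothetical helper for illustration only, not part of Mercurial.
    """
    scheme, sep, rest = url.partition(':')
    if not sep:
        return None, url
    return scheme, rest


def naive_display(url):
    """Rebuild the string for display, percent-encoding the non-scheme part."""
    scheme, rest = naive_split(url)
    if scheme is None:
        # No scheme found: treat as a local path and leave it untouched.
        return url
    # Backslash is not a safe URL character, so it becomes %5C.
    return scheme + ':' + quote(rest, safe='/')


print(naive_display(r'C:\Users\adi\hgrepos\tests\a'))
# prints C:%5CUsers%5Cadi%5Chgrepos%5Ctests%5Ca
```

With the drive letter taken as a scheme, the encoding step itself is
behaving consistently; the fix belongs in scheme detection.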

I wonder how many different kinds of Windows paths it has to recognize.
Would the following constraints on schemes cover everything?
* the scheme must not contain \ (or /, which would also ensure that
  './bundle:foo' is seen as a local file)
* the scheme must be more than one character
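
The two constraints above could be sketched roughly as follows. This is
only an illustration under my own assumptions: split_scheme is a
hypothetical helper, not the parser from the patches, and it deliberately
checks nothing beyond the two rules.

```python
def split_scheme(path):
    """Return (scheme, rest), or (None, path) if path looks like a local file.

    Hypothetical helper applying only the two proposed constraints:
    - the scheme must not contain '\\' or '/'
    - the scheme must be longer than one character
    """
    scheme, sep, rest = path.partition(':')
    if not sep:
        # No ':' at all, so there is no scheme candidate.
        return None, path
    if len(scheme) < 2:
        # A single letter before ':' is taken to be a Windows drive.
        return None, path
    if '/' in scheme or '\\' in scheme:
        # A path separator before ':' means this is a file path.
        return None, path
    return scheme, rest


for candidate in (r'C:\Users\adi\hgrepos\tests\a', 'D:/foo',
                  './bundle:foo', 'bundle:foo', 'file://D:/foo',
                  'http://example.com/repo'):
    print(candidate, '->', split_scheme(candidate))
```

Drive-letter paths and './bundle:foo' come back as local files, while
'bundle:foo', 'file://D:/foo' and 'http://...' keep their schemes. UNC
paths such as \\server\share contain no ':' and so also fall through as
local. Whether that covers every kind of Windows path is still the open
question.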

/Mads

