[PATCH 19 of 19] util: don't encode ':' in url paths
Mads Kiilerich
mads at kiilerich.com
Mon Nov 7 07:56:51 CST 2011
On 11/07/2011 09:04 AM, Maxim Dounin wrote:
> Hello!
>
> On Mon, Nov 07, 2011 at 03:41:09AM +0100, Mads Kiilerich wrote:
>
>> # HG changeset patch
>> # User Mads Kiilerich<mads at kiilerich.com>
>> # Date 1320632710 -3600
>> # Node ID 4fe69cfd994abbc0a1cf00e26bc3e48037923bcb
>> # Parent f88984c9f46c77e21500cc7e4c50bb100789a83f
>> util: don't encode ':' in url paths
>>
>> ':' has no special meaning in paths, so there is no need for encoding it.
>
> This isn't really true:
>
> ... In addition, a URI reference
> (Section 4.1) may be a relative-path reference, in which case the
> first path segment cannot contain a colon (":") character.
>
> (from RFC 3986, http://tools.ietf.org/html/rfc3986#section-3.3)
>
> The colon is critical to distinguish the first path segment of a
> relative reference from an absolute URI starting with scheme
> ("mailto:something" is an URI in the "mailto" scheme, while
> "mailto%3Asomething" is a relative-path reference).
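To make the distinction above concrete, here is a minimal sketch using Python 3's standard urllib.parse.urlsplit. This is an assumption made for illustration only: it is a generic RFC-style splitter, not Mercurial's url class, and the variable names are hypothetical.

```python
from urllib.parse import urlsplit

# A literal ':' after a run of scheme characters is read as a scheme separator.
with_colon = urlsplit('mailto:something')

# With the colon percent-encoded there is no scheme; the whole string is a path.
encoded_colon = urlsplit('mailto%3Asomething')

# The same rule makes a Windows drive letter look like a one-letter scheme.
drive_letter = urlsplit('c:/foo/bar')

for parts in (with_colon, encoded_colon, drive_letter):
    print(repr(parts.scheme), repr(parts.path))
```

The last case is the one that matters for plain filenames: a splitter that follows the RFC rule cannot tell 'c:/foo/bar' apart from a URL in a scheme called 'c'.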
Mercurial's url class is not intended to be a strict implementation of
RFC 3986. It is more important that it remains backward compatible and
can handle plain filenames (as described in hg help urls), including on
Windows.
The url class can thus handle all of these cases:
>>> url(r'c:\foo\bar')
<url path: 'c:\\foo\\bar'>
>>> url('c:foo/bar')
<url path: 'c:foo/bar'>
>>> url('c://foo/bar')
<url path: 'c://foo/bar'>
>>> url('c:/foo/bar')
<url path: 'c:/foo/bar'>
>>> url('file:c:/foo/bar')
<url scheme: 'file', path: 'c:/foo/bar'>
>>> url('file:///c:/foo/bar')
<url scheme: 'file', path: 'c:/foo/bar'>
but converting back to a string currently gives
>>> str(url('file:c:/foo/bar'))
'file:c%3A/foo/bar'
>>> str(url('file:///c:/foo/bar'))
'file:c%3A/foo/bar'
With this change we would get
>>> str(url('file:c:/foo/bar'))
'file:c:/foo/bar'
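The effect of the change can be sketched with Python 3's standard urllib.parse.quote, assuming the difference comes down to whether ':' is in the set of characters left unencoded. This is only an illustration: the safe sets and variable names here are assumptions, not Mercurial's actual code.

```python
from urllib.parse import quote

path = 'c:/foo/bar'

# Only '/' is treated as safe, so ':' is percent-encoded.
colon_encoded = quote(path, safe='/')

# ':' is treated as safe as well, so the path passes through unchanged.
colon_kept = quote(path, safe='/:')

print('file:' + colon_encoded)
print('file:' + colon_kept)
```

Either form decodes back to the same path; the difference is only in how readable the string form is and in how a strict parser would read a first path segment that contains a colon.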
Do you see any real-world examples where this change would be bad for
Mercurial's use of URLs?
(For the record: we have a known issue with the handling of encoded '/'
in paths. I admit that this change could be seen as taking an extra step
in the wrong direction.)
It would be nice to have a better overview of the ways in which
Mercurial URLs differ from RFC 3986 URLs.
/Mads