urls and %nn encoding

Matt Mackall mpm at selenic.com
Wed Jan 18 17:06:55 CST 2012


On Wed, 2012-01-18 at 14:39 +0100, Mads Kiilerich wrote:
> On 01/18/2012 06:08 AM, Kevin Bullock wrote:
> > On 17 Jan 2012, at 7:21 PM, Mads Kiilerich wrote:
> >
> >> Following up on http://markmail.org/message/olyscnnw7nn5e7sd :
> >>
> >> What kinds of 'urls' should %nn encoding apply to?
> >>
> >> urls.txt does not give a clear answer, and we do not strictly follow
> >> the RFCs.
> >>
> >> It would be reasonable to expect that 'hg clone
> >> http://server/foo%20bar' created './foo bar/' locally (issue3145).
> >
> > Yes.
> >
> >> It is perhaps also reasonable that 'hg clone /tmp/foo%20bar' creates
> >> './foo%20bar/'.
> >
> > Yes, since that's not a URL. The URL form would of course be
> > 'file:///tmp/foo%2520bar/'. ;)
> >
> >> But what path should 'hg clone file:///tmp/foo%20bar' clone from and to?
> >
> > Ideally, it should clone from '/tmp/foo bar' to './foo bar' (per RFC
> > 1738). I say 'ideally' because I'm not sure what the impact on current
> > users would be. But if we can reasonably make our URL handling follow
> > the RFC, I'm of the opinion that we should.
> 
> The implicit question is: What is the _safe_ way to pass a local 
> repository path to Mercurial, a path that literally can contain 'file:', 
> '//' and '%'? What is the right (and feasible) thing to do in a script?
>
> I got the impression that simply prepending 'file:' perhaps was intended 
> for escaping from further special processing. An evil path such as 
> 'file://%20' (aka 'file:/%20') could be specified as 'file:file://%20'. 
> (It would however not work for paths with two leading slashes, 
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_11 
> .)
> 
> This is however all further complicated by the possibility of having '#' 
> in paths and how '#' sometimes can be used to specify revisions.

Ok, let's throw a couple invariants in here:

1. Anything prefixed with http:// obviously gets percent-encoding and
'#' support
2. Anything that doesn't start with foo: obviously is a native path and
doesn't apply percent-encoding
3. If it supports # today for a given syntax, we have to keep it that
way

My inclination is to apply percent-encoding to anything that starts
with file:, with or without slashes.
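
For illustration only, the rules above could be sketched roughly as follows.
This is not Mercurial's actual URL handling: the function name
interpret_source and its return shape are hypothetical, only the http://
and file: prefixes are modelled, and '#' is split off only for http://
because invariant 3 depends on what each syntax supports today.

```python
from urllib.parse import unquote


def interpret_source(source):
    """Classify a clone source and decode it per the rules in this mail.

    Hypothetical helper. Returns (kind, location, fragment).
    """
    if source.startswith("http://"):
        # Invariant 1: http:// gets percent-decoding and '#' support.
        rest = source[len("http://"):]
        rest, _, fragment = rest.partition("#")
        return ("http", unquote(rest), fragment or None)

    if source.startswith("file:"):
        # Proposed rule: percent-decode after file:, with or without
        # the '//' that introduces an (empty) authority part.
        rest = source[len("file:"):]
        if rest.startswith("//"):
            rest = rest[2:]
        # '#' is left alone here; whether file: honours it is an open
        # question this sketch does not answer.
        return ("file", unquote(rest), None)

    # Invariant 2 (simplified): no recognised prefix, so treat it as a
    # native path and leave '%' untouched.
    return ("native", source, None)


print(interpret_source("http://server/foo%20bar"))
print(interpret_source("file:///tmp/foo%20bar"))
print(interpret_source("file:/tmp/foo%20bar"))
print(interpret_source("/tmp/foo%20bar"))
```

Under these assumptions a script that wants to pass a literal '%' through
the file: form has to write it as '%25', which matches the
'file:///tmp/foo%2520bar/' example quoted above.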

-- 
Mathematics is the supreme nostalgia of our time.



