RFC: pattern handling in subpaths

Mads Kiilerich mads at kiilerich.com
Mon Apr 9 06:34:58 CDT 2012


FUJIWARA Katsunori wrote, On 04/09/2012 10:30 AM:
> Hi, devels.
>
> Current '[subpaths]' implementation doesn't allow you to map some URL
> into the path which has the path component starting with digit on Win
> environment (e.g.: 'C:\1st\2nd'), because of automatic escaping below:

First: Why not just use 'C:/1st/2nd'?

Mercurial does come from the Unix world, where \ is not a directory
separator but a valid character in file names, and is also often used
to escape special characters in various languages. Windows does support
/ as a directory separator in all(?) APIs, so Windows users would be
well advised to avoid using \ as a directory separator.
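
As a rough illustration, Python's ntpath module (the Windows flavour of
os.path, importable on any platform) treats both separators alike. This
only shows Python's own path handling, not the Windows APIs themselves,
and the variable names are just for the example:

```python
import ntpath  # Windows path rules, usable on any platform

forward = 'C:/1st/2nd'
backward = 'C:\\1st\\2nd'

# Both spellings normalise to the same path ...
same = ntpath.normpath(forward) == ntpath.normpath(backward)
# ... and / is recognised as a separator when splitting.
last = ntpath.basename(forward)

print(same, last)
```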

(http://mercurial.selenic.com/bts/issue2644 discusses a related issue in 
.hgsub .)

>
>              # Turn r'C:\foo\bar' into r'C:\\foo\\bar' since re.sub
>              # does a string decode.
>              repl = repl.encode('string-escape')
>              # However, we still want to allow back references to go
>              # through unharmed, so we turn r'\\1' into r'\1'. Again,
>              # extra escapes are needed because re.sub string decodes.
>              repl = re.sub(r'\\\\([0-9]+)', r'\\\1', repl)
>
> # introduced by f3075ffa6b30

Yes, that use of string-escape seems to be incorrect. re.sub will not do
a string decode - sre_parse.parse_template will decode in a way that is
closer to what re.escape encodes ... but different. The current code
will, for example, fail if repl contains ' .
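
A minimal Python 3 sketch of what the template parser does with an
unescaped path containing digits. The pattern and URL are made up for
illustration, and string-escape is left out because that codec only
exists in Python 2:

```python
import re

pattern = r'http://server/(.*)'   # hypothetical left-hand side
source = 'http://server/repo'     # hypothetical source URL

# \1 is consumed as a reference to group 1, not as a path component.
one_group = re.sub(pattern, r'C:\1st', source)

# \2 names a group the pattern does not have, so re.sub raises re.error.
try:
    re.sub(pattern, r'C:\1st\2nd', source)
    failed = False
except re.error:
    failed = True

print(one_group, failed)
```

Here one_group comes out as 'C:repost', and the second call fails with
an invalid group reference.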

But I must say that I don't like this, even if the right kind of 
escaping was used. It defines a very special and apparently inconsistent 
syntax, and trying to patch something that has been escaped without 
parsing it fully will fail in some corner cases.

I think it would be better to consider this a bug, remove the special
handling of \, and improve the documentation to make it clear that the
right-hand side _is_ a regexp replacement template and that \ thus has
to be escaped as \\ ... and that Windows users probably should prefer to
use / instead of \.
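
Treated as a plain replacement template, both spellings would work as
expected. A small sketch, assuming the right-hand side is passed to
re.sub unmodified (the pattern and URL are again hypothetical):

```python
import re

pattern = r'http://server/(.*)'   # hypothetical left-hand side
source = 'http://server/repo'     # hypothetical source URL

# Forward slashes need no escaping; \1 is the only special sequence.
with_slashes = re.sub(pattern, r'C:/1st/2nd/\1', source)

# Each literal backslash is written as \\ in the template, so the
# trailing \\\1 is one backslash followed by group 1.
with_escapes = re.sub(pattern, r'C:\\1st\\2nd\\\1', source)

print(with_slashes)
print(with_escapes)
```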

/Mads

