[PATCH] Accept file://localhost/ style urls as local

Jonathan S. Shapiro shap at eros-os.com
Wed Sep 12 07:39:22 CDT 2007


On Wed, 2007-09-12 at 13:50 +0200, Christian Ebert wrote:

> Accept
> file:///path
> file:/path
> file://localhostname/path

>From memory, I believe that file:///path is a malformed URI and should
not be accepted. The general pattern for file:// and http[s]:// URI's
is:

  file:[//[user[:password]@]host[:port]]path

where path defaults to / if not present. Note that the host portion is
not optional if the leading // is present.

Similarly for http[s] URI's

For other schemes, the layout of everything after the : is left to be
defined by the scheme. As a matter of wide practice, ssh: URI's should
obey the rule given above. In mercurial they do not, which is a bug.

In general the URI scheme does not permit for the specification of
relative paths at all, because relative paths do not make sense in a
distributed name space. The "extra slash" hack that mercurial is
currently using to distinguish relative remote file names is
incompatible with the generally accepted format of the ssh: scheme, and
as a practical matter it simply isn't necessary at all.

Unfortunately, there is no good means to deal simultaneously with the
union-like mechanism of hg-ssh and fully rooted filenames -- the problem
here is that we need to overload "/". In my opinion, the best way to
deal with this (within the intent of the URI scheme) is to view the list
of directories on the ssh line as forming a union directory that defines
"/" in the path portion of the URI and stop trying to overload these two
things. Unfortunately, this breaks current mercurial conventions.

Aside: the entire concept of an ssh: URI is malformed within the
conception of URIs. ssh is a transport, not a protocol. Various attempts
have been made to generalize URI's to deal with this, all of them pretty
ugly. For example, some systems use a tweaked scheme that might take the
form:

  hg+ssh://...

My recollection of the URI specification may not be quite right, but
here is my point: if we are going to do URIs, we should do them in
accordance with the specification and widely accepted common practice.
Making up our own meanings and our own conventions is merely a way to
impose future, ad-hoc compatibility requirements on ourselves. It is
also brittle: we don't want to be in a situation where the URI spec
moves forward and we cannot because we hacked the URI concept.

Summary: quit making things up as you go and read the spec!

shap

> 
> Do not accept
> file:path/not/starting/with/slash
> file://invalidlocalhostname/path
> 
> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -14,7 +14,7 @@ platform-specific details from the core.
>  
>  from i18n import _
>  import cStringIO, errno, getpass, popen2, re, shutil, sys, tempfile, strutil
> -import os, stat, threading, time, calendar, ConfigParser, locale, glob
> +import os, stat, threading, time, calendar, ConfigParser, locale, glob, socket
>  
>  try:
>      set = set
> @@ -1683,12 +1683,30 @@ def bytecount(nbytes):
>              return format % (nbytes / float(divisor))
>      return units[-1][2] % nbytes
>  
> +def _localhostnames():
> +    hl = []
> +    for h in socket.gethostbyaddr(socket.gethostname()):
> +        if isinstance(h, str):
> +            h = [h]
> +        hl += h
> +    return hl
> +
>  def drop_scheme(scheme, path):
>      sc = scheme + ':'
>      if path.startswith(sc):
>          path = path[len(sc):]
> +        if scheme == 'file' and not path.startswith('/'):
> +            raise Abort(_('invalid file url'))
>          if path.startswith('//'):
>              path = path[2:]
> +            if scheme == 'file' and not path.startswith('/'):
> +                if path[:10] in ('localhost/', '127.0.0.1/'):
> +                    return path[9:]
> +                for h in _localhostnames():
> +                    if path.startswith(h + '/'):
> +                        return path[len(h):]
> +                inv = path.split('/', 1)[0]
> +                raise Abort(_('%s: invalid local hostname') % inv)
>      return path
>  
>  def uirepr(s):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
-- 
Jonathan S. Shapiro
Managing Director
The EROS Group, LLC
www.coyotos.org, www.eros-os.org



More information about the Mercurial-devel mailing list