[PATCH] hgweb: make raw file download configurable and disabled (BC) (issue2923)

Augie Fackler lists at durin42.com
Mon Aug 1 15:43:23 CDT 2011


On Mon, Aug 1, 2011 at 3:39 PM, Matt Mackall <mpm at selenic.com> wrote:
> On Sun, 2011-07-31 at 02:02 +0200, Mads Kiilerich wrote:
>> # HG changeset patch
>> # User Mads Kiilerich <mads at kiilerich.com>
>> # Date 1312069612 -7200
>> # Branch stable
>> # Node ID cfa2db1db62e2602c97dff06829000dab1a0d8d8
>> # Parent  192e02680d094dc22cf856e70f07348bd6de18d1
>> hgweb: make raw file download configurable and disabled (BC) (issue2923)
>>
>
> Here's my rewrite of this based on my earlier comments:
>
> # HG changeset patch
> # User Matt Mackall <mpm at selenic.com>
> # Date 1312069612 -7200
> # Branch stable
> # Node ID d06b9c55ddabed66f7e1b5f4193534957232de95
> # Parent  dd74cd1e5d49cdedfd2a0cf142a9ce1178e4b748
> hgweb: raw file mimetype guessing configurable, off by default (BC) (issue2923)
>
> Before: hgweb made it possible to download file content with a content type
> detected from the file extension. It would serve .html files as text/html and
> could thus cause XSS vulnerabilities if the web site had any kind of session
> authorization and the repository content wasn't fully trusted.
>
> Now: all files default to "application/binary", which all important
> browsers will refuse to treat as text/html. See the table here:
>
> https://code.google.com/p/browsersec/wiki/Part2#Survey_of_content_sniffing_behaviors

LGTM, seems like it should be plenty safe based on the table in the link.
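
For anyone skimming the thread, here is a minimal standalone sketch of the
content-type selection that the webcommands.py hunk in the quoted patch
implements. It is not the hgweb code itself: pick_content_type and
looks_binary are hypothetical names, the charset argument stands in for
encoding.encoding, and the NUL-byte check is only an assumption about what
the binary() helper does.

```python
import mimetypes


def looks_binary(data: bytes) -> bool:
    # Assumption: treat content as binary if it contains a NUL byte.
    # This stands in for the binary() helper used in the patch.
    return b"\0" in data


def pick_content_type(path: str, data: bytes,
                      guessmime: bool = False,
                      charset: str = "ascii") -> str:
    # Safe default from the patch: a type browsers should not render as HTML.
    mt = "application/binary"
    if guessmime:
        # Opt-in behaviour: guess the type from the file extension.
        mt = mimetypes.guess_type(path)[0]
        if mt is None:
            # Unknown extension: fall back on a content check.
            mt = "application/binary" if looks_binary(data) else "text/plain"
    if mt.startswith("text/"):
        # Text types get an explicit charset, as in the patch.
        mt += '; charset="%s"' % charset
    return mt


if __name__ == "__main__":
    html = b"<script>alert(1)</script>"
    # Default (guessmime off): the .html file is not served as text/html.
    print(pick_content_type("evil.html", html))
    # Opt-in (guessmime on): the extension decides, so this is text/html.
    print(pick_content_type("evil.html", html, guessmime=True))
```

With guessmime off, the path and the content are never consulted, which is
what makes the default safe for untrusted repositories.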

>
> diff -r dd74cd1e5d49 -r d06b9c55ddab mercurial/help/config.txt
> --- a/mercurial/help/config.txt Mon Aug 01 09:48:10 2011 +0200
> +++ b/mercurial/help/config.txt Sun Jul 31 01:46:52 2011 +0200
> @@ -1154,6 +1154,13 @@
>     be present in this list. The contents of the allow_push list are
>     examined after the deny_push list.
>
> +``guessmime``
> +    Control MIME types for raw download of file content.
> +    Set to True to let hgweb guess the content type from the file
> +    extension. This will serve HTML files as ``text/html`` and might
> +    allow cross-site scripting attacks when serving untrusted
> +    repositories. Default is False.
> +
>  ``allow_read``
>     If the user has not already been denied repository access due to
>     the contents of deny_read, this list determines whether to grant
> diff -r dd74cd1e5d49 -r d06b9c55ddab mercurial/hgweb/webcommands.py
> --- a/mercurial/hgweb/webcommands.py    Mon Aug 01 09:48:10 2011 +0200
> +++ b/mercurial/hgweb/webcommands.py    Sun Jul 31 01:46:52 2011 +0200
> @@ -32,6 +32,8 @@
>         return changelog(web, req, tmpl)
>
>  def rawfile(web, req, tmpl):
> +    guessmime = web.configbool('web', 'guessmime', False)
> +
>     path = webutil.cleanpath(web.repo, req.form.get('file', [''])[0])
>     if not path:
>         content = manifest(web, req, tmpl)
> @@ -50,9 +52,11 @@
>
>     path = fctx.path()
>     text = fctx.data()
> -    mt = mimetypes.guess_type(path)[0]
> -    if mt is None:
> -        mt = binary(text) and 'application/octet-stream' or 'text/plain'
> +    mt = 'application/binary'
> +    if guessmime:
> +        mt = mimetypes.guess_type(path)[0]
> +        if mt is None:
> +            mt = binary(text) and 'application/binary' or 'text/plain'
>     if mt.startswith('text/'):
>         mt += '; charset="%s"' % encoding.encoding
>
> diff -r dd74cd1e5d49 -r d06b9c55ddab tests/test-hgweb-raw.t
> --- a/tests/test-hgweb-raw.t    Mon Aug 01 09:48:10 2011 +0200
> +++ b/tests/test-hgweb-raw.t    Sun Jul 31 01:46:52 2011 +0200
> @@ -22,6 +22,28 @@
>   $ sleep 1 # wait for server to scream and die
>   $ cat getoutput.txt
>   200 Script output follows
> +  content-type: application/binary
> +  content-length: 157
> +  content-disposition: inline; filename="some \"text\".txt"
> +
> +  This is just some random text
> +  that will go inside the file and take a few lines.
> +  It is very boring to read, but computers don't
> +  care about things like that.
> +  $ cat access.log error.log
> +  127.0.0.1 - - [*] "GET /?f=a23bf1310f6e;file=sub/some%20%22text%22.txt;style=raw HTTP/1.1" 200 - (glob)
> +
> +  $ rm access.log error.log
> +  $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid \
> +  > --config web.guessmime=True
> +
> +  $ cat hg.pid >> $DAEMON_PIDS
> +  $ ("$TESTDIR/get-with-headers.py" localhost:$HGPORT '/?f=a23bf1310f6e;file=sub/some%20%22text%22.txt;style=raw' content-type content-length content-disposition) >getoutput.txt &
> +  $ sleep 5
> +  $ kill `cat hg.pid`
> +  $ sleep 1 # wait for server to scream and die
> +  $ cat getoutput.txt
> +  200 Script output follows
>   content-type: text/plain; charset="ascii"
>   content-length: 157
>   content-disposition: inline; filename="some \"text\".txt"
>
>
>
> --
> Mathematics is the supreme nostalgia of our time.
>
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>