[PATCH] hgweb: make raw file download configurable and disabled (BC) (issue2923)

Matt Mackall mpm at selenic.com
Mon Aug 1 14:39:30 CDT 2011


On Sun, 2011-07-31 at 21:51 +0200, Mads Kiilerich wrote:
> My own comments and objections to this patch:
> 
> Mads Kiilerich wrote, On 07/31/2011 02:02 AM:
> > # HG changeset patch
> > # User Mads Kiilerich<mads at kiilerich.com>
> > # Date 1312069612 -7200
> > # Branch stable
> > # Node ID cfa2db1db62e2602c97dff06829000dab1a0d8d8
> > # Parent  192e02680d094dc22cf856e70f07348bd6de18d1
> > hgweb: make raw file download configurable and disabled (BC) (issue2923)
> >
> > Before: hgweb made it possible to download file content with a content type
> > detected from the file extension. It would serve .html files as text/html and
> > could thus cause XSS vulnerabilities if the web site had any kind of session
> > authorization and the repository content wasn't fully trusted.
> >
> > Now: Serving of raw file content is now made configurable with the web.allowraw
> > config setting.
> >
> > Note: Raw file download is now disabled by default. Sites that need this
> > feature and know what they are doing must enable it in the site configuration.
> 
> > The hgweb menu entries for raw is not removed.
> 
> That there is a menu entry that doesn't work is annoying, but fixing 
> that is out of my scope. It would be nice to have, but it requires a lot 
> of template editing. Alternatively a better error page could be nice.

Fine for now.

> > diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> > --- a/mercurial/help/config.txt
> > +++ b/mercurial/help/config.txt
> > @@ -1154,6 +1154,17 @@
> >       be present in this list. The contents of the allow_push list are
> >       examined after the deny_push list.
> >
> > +``allowraw``
> 
> Underscore or not? We are very inconsistent ...

Indeed. Just like in source, we really need to pick one style. I prefer
no underbars. But we should document just one style and -accept- both.
That's obviously a patch for another day though.

The name isn't ideal, as we'll still allow other raw things (changesets,
etc.). See below..

> > +    Control raw download of file revision content.
> > +    Set to ``guess`` to let hgweb guess the content type from the file
> > +    extension. Note: This will serve html as ``text/html`` and might thus
> > +    be vulnerable to XSS attacks.
> 
> Is 'guess' good? Should it be 'mime'? Or is mime just an implementation 
> detail?

It's not great, I'd prefer a binary option. I think we should either
trust the repository contents entirely (guess mime types) or not at all.
Also, I think we should spell out XSS here.

> Should we mention that this is the pre-1.9.1 behaviour?

Yes.

> > +    Set to ``plain`` to serve text files as ``text/plain`` and binary
> > +    files as ``application/octet-stream``. Note: Some browsers might
> > +    ignore the content type and might thus be vulnerable to XSS attacks anyway.
> 
> 'plain' is not a good descriptive name. Is 'simple' better?
> 
> Alternatively we could blacklist or transform the known bad types from 
> mime detection (especially text/html), but I neither like blacklisting 
> nor userfriendly guessing in a security sensitive area. It is probably 
> also not very useful to have automatic detection of fancy content types 
> if we don't have html.
> 
> Is this fix sufficient? How many browsers ignore content types these 
> days? I know old IEs did, but do any relevant browsers still do that?
> 
> > +    Set to empty to disable raw download.
> > +    Default is empty.
> 
> So now we have 3 options. Are they all needed?

On further thought, I would say no.

I'm not convinced we should care about blatant design bugs in MIME type
handling in old IE that aren't really attached to XSS. XSS is a problem
with web content, and isn't _browser_-specific. If we say something is
"text/plain" and you decide to treat it as HTML anyway and -then- get
bitten by an XSS bug, well that's a bug in your browser and not an XSS
bug.

And if you're running one of these old IEs that ignores MIME types and
decides that it's ok to -run- random binaries, well, that's REALLY a
browser bug and the whole internet can kill you and Mercurial is the
least of your worries.

So I think we should have two modes:

- guess mime type (current behavior)
- report application/binary (new, safe default)

According to the table here:

https://code.google.com/p/browsersec/wiki/Part2#Survey_of_content_sniffing_behaviors

...this is safe on all interesting browsers. That same table says we
-can't- serve up text/plain as lots of browsers will upgrade it to HTML
automatically.

And it seems like that option should be named 'guessmime'.

This still leaves open the possibility for malware to use hg servers as
a vector for downloading updates or whatever, but that's not an XSS
issue.

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list