[PATCH] hgweb: specify a charset when sending raw text files

Matt Mackall mpm at selenic.com
Tue Jun 8 16:07:53 CDT 2010

On Sun, 2010-06-06 at 12:00 -1000, Julian Cowley wrote:
> Hi,
>
> Here is a patch that adds a MIME charset parameter to any text file served
> by hgweb, as long as the encoding parameter is set in the "[web]" section
> of the appropriate hgrc file.
>
> This came about when I was using "hg serve" and clicked on a raw file
> download link.  The file was encoded in UTF-8, but the web server didn't
> specify the charset and my browser didn't auto-detect it either.
>
> Note that this doesn't inherit from mercurial.encoding.encoding, since
> that applies more to web pages served by Mercurial itself rather than the
> encoding of the underlying repo.

I don't know if that's the right approach. When we present text from a
repository in the web interface, we report it as being in the detected
or configured encoding. If you've got a repo that predominantly contains,
e.g., Latin-1 or UTF-8, you should have either your system-wide or your
hgweb encoding configured to match.

I don't see why rawfile should behave differently.

Otherwise, I think the notion of adding an encoding header is good.

> # HG changeset patch
> # User Julian Cowley <julian at lava.net>
> # Date 1275857889 36000
> # Node ID 5084ac94f3980caaeb318e863b605ac8608cee41
> # Parent  0e5ce2325795325e41f6df9203373d2858e88f88
> hgweb: specify a charset when sending raw text files
>
> Gets the charset from web.encoding parameter, but if unset, leaves
> the charset unspecified.  Does not inherit from encoding.encoding,
> which means the charset will only be sent if the user explicitly
> lists it in the config.
>
> diff --git a/mercurial/hgweb/webcommands.py b/mercurial/hgweb/webcommands.py
> --- a/mercurial/hgweb/webcommands.py
> +++ b/mercurial/hgweb/webcommands.py
> @@ -51,6 +51,10 @@
>       mt = mimetypes.guess_type(path)[0]
>       if mt is None:
>           mt = binary(text) and 'application/octet-stream' or 'text/plain'
> +    if mt.startswith('text/'):
> +        charset = web.config('web', 'encoding')
> +        if charset is not None:
> +            mt += '; charset="%s"' % charset
>       req.respond(HTTP_OK, mt, path, len(text))
>       return [text]
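
To make the difference between the two approaches concrete, here is a
minimal standalone sketch, not the actual hgweb code. The helper names
(looks_binary, raw_content_type) and the default_encoding parameter are
hypothetical; default_encoding stands in for whatever detected or
system-wide encoding would be used, and the NUL-byte test is only an
assumed approximation of binary(). Passing default_encoding=None gives the
patch's behaviour (charset only when web.encoding is set); passing a value
gives the fallback suggested above.

```python
import mimetypes


def looks_binary(data):
    # Hypothetical stand-in for the binary() helper used in the patch:
    # treat any content containing a NUL byte as binary.
    return b"\0" in data


def raw_content_type(path, data, web_encoding=None, default_encoding=None):
    """Build a Content-Type value for a raw file download.

    web_encoding     -- the explicitly configured [web] encoding, if any
    default_encoding -- assumed fallback (detected or system-wide encoding)
    """
    mt = mimetypes.guess_type(path)[0]
    if mt is None:
        mt = "application/octet-stream" if looks_binary(data) else "text/plain"
    if mt.startswith("text/"):
        # Prefer the explicit web setting, then fall back to the default.
        charset = web_encoding or default_encoding
        if charset:
            mt += '; charset="%s"' % charset
    return mt
```

With only web_encoding="UTF-8" set, a .txt file is reported as
text/plain; charset="UTF-8". With neither value set, no charset parameter
is added, and binary content is never given one.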

Mathematics is the supreme nostalgia of our time.
