[PATCH] mercurial ignores setlocale and uses ascii instead of utf8
Yuya Nishihara
yuya at tcha.org
Sun Oct 30 06:33:15 EDT 2016
(please CC the list)
On Sat, 29 Oct 2016 21:04:14 +0300, Eugene Maslovich wrote:
> There is some repo with several commits. Each commit has a description
> containing unicode letters.
> I'm trying to output a detailed log using mercurial as a library via Flask.
>
> If I do this:
>
> ---------------------------------------------------------
> # -*- coding: utf-8 -*-
> import locale
> from flask import Flask
> from mercurial import ui, hg, commands
> app = Flask(__name__)
> @app.route('/')
> def hello():
>     hgUi = ui.ui()
>     output = ''
>     hgRepo = hg.repository(hgUi, '/var/www/hg/repos/web/test1/')
>     hgUi.pushbuffer()
>     commands.log(hgUi, hgRepo, verbose=True, stat=True)
>     output += hgUi.popbuffer()
>     return output
> ---------------------------------------------------------
>
> Then I will receive something like this:
>
> ---------------------------------------------------------
> changeset: 100:7b8e1b1d9714
> branch: outgoing
> tag: tip
> user: ehpc
> date: Sat Oct 29 10:33:50 2016 +0300
> summary: ??????? ?? ???????
> ---------------------------------------------------------
For this problem, you can simply set encoding.encoding = 'UTF-8' (encoding
being the mercurial.encoding module). The recommended way would probably be
to use hglib to isolate the Mercurial process, but that might require more
work in your Flask app to manage the hg process.
https://www.mercurial-scm.org/wiki/PythonHglib
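
To illustrate where the "?" characters come from: as far as I know, Mercurial keeps commit descriptions as UTF-8 and converts them to the local encoding for output, replacing anything that encoding cannot represent. The sketch below is not Mercurial's code; to_local and the sample summary are made up for illustration and only reproduce the same kind of lossy conversion with plain Python.

```python
# -*- coding: utf-8 -*-
# Illustration only: NOT Mercurial's implementation, just the same kind of
# lossy re-encoding done with plain Python string methods.

def to_local(utf8_bytes, local_encoding):
    """Re-encode UTF-8 bytes for output, replacing unencodable characters."""
    text = utf8_bytes.decode("utf-8")
    return text.encode(local_encoding, "replace")

# Hypothetical commit summary, stored as UTF-8 bytes.
stored = u"Привет, мир".encode("utf-8")

as_ascii = to_local(stored, "ascii")  # every Cyrillic letter becomes "?"
as_utf8 = to_local(stored, "UTF-8")   # bytes come back unchanged
```

With the local encoding set to UTF-8 nothing is lost, which is why forcing encoding.encoding to 'UTF-8' makes the log readable again.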
> All unicode characters are replaced with "?".
> Well that may be ok. But then I do this:
>
> ---------------------------------------------------------
> # -*- coding: utf-8 -*-
> import locale
> locale.setlocale(locale.LC_ALL, 'ru_RU.utf8')
> locale.setlocale(locale.LC_CTYPE, 'ru_RU.utf8')
> from flask import Flask
> from mercurial import ui, hg, commands
> app = Flask(__name__)
> @app.route('/')
> def hello():
>     hgUi = ui.ui()
>     output = ''
>     hgRepo = hg.repository(hgUi, '/var/www/hg/repos/web/test1/')
>     hgUi.pushbuffer()
>     commands.log(hgUi, hgRepo, verbose=True, stat=True)
>     output += hgUi.popbuffer()
>     return output
> ---------------------------------------------------------
>
> I still receive "?".
>
> That was kind of unclear because I looked at encoding.py:
>
> ---------------------------------------------------------
> try:
>     encoding = environ.get("HGENCODING")
>     if not encoding:
>         encoding = locale.getpreferredencoding() or 'ascii'
>         encoding = _encodingfixers.get(encoding, lambda: encoding)()
> except locale.Error:
>     encoding = 'ascii'
> encodingmode = environ.get("HGENCODINGMODE", "strict")
> fallbackencoding = 'ISO-8859-1'
> ---------------------------------------------------------
>
> "encoding" was still "ascii". It was unexpected.
>
> There is a solution to the problem: setting environment variable
> HGENCODING to "UTF-8".
> But it is kind of unclear why setting the actual locale doesn't
> result in proper symbols instead of "?".
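
The encoding.py excerpt quoted above is module-level code, so it runs once, when mercurial is first imported; HGENCODING therefore has to be in the environment before that import. A likely reason the setlocale() calls have no effect (this is an assumption, not something confirmed in this thread) is that on Python 2 locale.getpreferredencoding() by default switches LC_CTYPE temporarily to the default locale from the process environment, ignoring what the program set earlier, and under a web server that default is often "C". The sketch below mirrors only the lookup order from the excerpt; pick_encoding is a hypothetical helper and the _encodingfixers step is left out.

```python
import locale
import os

def pick_encoding(environ):
    """Mirror the lookup order of the quoted encoding.py excerpt."""
    try:
        encoding = environ.get("HGENCODING")
        if not encoding:
            # No override given: fall back to Python's preferred encoding.
            encoding = locale.getpreferredencoding() or "ascii"
    except locale.Error:
        encoding = "ascii"
    return encoding

# An explicit HGENCODING wins over whatever the locale reports.
forced = pick_encoding({"HGENCODING": "UTF-8"})

# Without the override, the result depends on the process environment.
detected = pick_encoding(dict(os.environ))

# In a real app the variable would be set before the first mercurial import:
# os.environ["HGENCODING"] = "UTF-8"
```

Setting the variable from inside the application this way should be equivalent to exporting HGENCODING in the web server's environment, as long as it happens before the import.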