[PATCH] mercurial: i18n-ja: extra white spaces at line ending
Martin Geisler
mg at lazybytes.net
Mon Nov 1 03:56:07 CDT 2010
FUJIWARA Katsunori <foozy at lares.dti.ne.jp> writes:
Everybody: please always send Mercurial-related mails to the mailing list
and not to me privately. In this case, other languages may be affected
too so I want us to discuss this in public. If you have a patch in an
area that I happen to have played with, then send it to the *list* and
add me in CC.
If you must write to me off-list, then say so explicitly in your mail,
otherwise I'll try to get the discussion back on the list when I reply.
> On Mon, 01 Nov 2010 09:05:35 +0100, Martin Geisler <mg at lazybytes.net> wrote:
>> <foozy at lares.dti.ne.jp> writes:
>
>>> These white spaces were inserted to increase the number of
>>> line-wrapping points for UTF-8 encoding, by mechanically replacing
>>> Japanese comma/period characters.
>>>
>>> UTF-8 requires 3 BYTES per Japanese character even though each one
>>> occupies only 2 COLUMNS, and Japanese text often has no white space
>>> within a full line's length, so line-wrapping in UTF-8 was not so
>>> good :-<
>>
>> Aha, are we wrapping the UTF-8 encoded bytestrings and not the
>> decoded Unicode strings? That sounds like a bug -- please send a mail
>> to mercurial-devel if that is true, then I'll have a look.
>
> No, it is a problem in Python's built-in textwrap module.
>
> It counts BYTES rather than COLUMNS when deciding the line-fill limit,
> so text is filled into fewer columns than expected in the many
> languages whose characters need more bytes than columns.
Okay, but does our mercurial.util.MBTextWrapper class not deal with
that? I can see it uses the east_asian_width function to compute the
width of each character.
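
To make the bytes-versus-columns point concrete, here is a minimal sketch
using only the standard library. The helper name display_width is
hypothetical and this is not the MBTextWrapper code; it assumes that
characters classified as Wide or Fullwidth take two columns and that
everything else takes one, which ignores combining and ambiguous-width
characters.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns needed to show text.

    Assumption: East Asian Wide (W) and Fullwidth (F) characters take
    two columns; every other character is counted as one column.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


sample = "日本語"
byte_count = len(sample.encode("utf-8"))  # length of the UTF-8 byte string
column_count = display_width(sample)      # columns under the assumption above
print(byte_count, column_count)
```

A wrapper that measures the encoded byte string runs out of "width" well
before the terminal line is actually full, whereas one that measures
columns does not.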
> In addition, ordinary Japanese text has no white space at the points
> where a line could be split, so in many cases textwrap cannot pull
> text up from the following line to fill out the current one.
>
> In my last translation, I introduced white spaces after Japanese
> full-width comma/period characters, even though such white spaces are
> not used in ordinary text.
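
The workaround described in the quoted text can be sketched roughly as
follows. This is only an illustration with the standard textwrap module
on Python 3 strings, not the script used for the translation; the helper
name add_break_points and the sample sentence are made up. Note that
textwrap measures width in characters here, not columns.

```python
import textwrap


def add_break_points(text, marks="、。"):
    """Insert an ASCII space after each full-width comma/period.

    textwrap only breaks at white space, so the added spaces give it
    places to wrap. The final rstrip() drops the space that would
    otherwise be left dangling at the very end of the text.
    """
    for mark in marks:
        text = text.replace(mark, mark + " ")
    return text.rstrip()


sentence = "これは例です。次の文です。"

# Without spaces the whole sentence is one "word", so textwrap can only
# cut it at the width limit (break_long_words is on by default).
plain = textwrap.wrap(sentence, width=10)

# With spaces after the punctuation, the break falls between sentences.
spaced = textwrap.wrap(add_break_points(sentence), width=10)

print(plain)
print(spaced)
```

The cost of this approach is that the inserted spaces can end up at the
end of a line in the catalog, which is the kind of extra white space the
patch is about.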
Okay. Let me know if you think we can handle this in a nice way somehow.
Otherwise I'll let you two guys figure out how to best work around the
textwrap bugs/deficiencies.
--
Martin Geisler
aragost Trifork
Professional Mercurial support
http://aragost.com/mercurial/