[PATCH i18n stable] i18n: fix untranslated prompts with translated responses (issue3936)

Wed May 22 11:56:06 CDT 2013

At Tue, 21 May 2013 15:17:55 -0500,
Matt Mackall wrote:
> 
> On Tue, 2013-05-21 at 21:50 +0200, Martin Geisler wrote:
> > Matt Mackall <mpm at selenic.com> writes:
> > 
> > > On Mon, 2013-05-20 at 19:51 -0300, Wagner Bruna wrote:
> > >> Em 16-05-2013 15:58, Matt Mackall escreveu:
> > >> > On Wed, 2013-05-15 at 21:37 -0500, Kevin Bullock wrote:
> > >> >> # HG changeset patch
> > >> >> # User Kevin Bullock <kbullock at ringworld.org>
> > >> >> # Date 1368671779 18000
> > >> >> #      Wed May 15 21:36:19 2013 -0500
> > >> >> # Branch stable
> > >> >> # Node ID fc1c4221dd82de958b9be5f05c57622679625d21
> > >> >> # Parent  278057693a1ddb93f95fa641e30e7a966ac98434
> > >> >> i18n: fix untranslated prompts with translated responses (issue3936)
> > >> > 
> > >> > Queued for stable, thanks.
> > >> > 
> > >> > I'd like to see a consensus among translators on how to do this.
> > >> 
> > >> IMHO we should always accept the English keys, even with translated prompts
> > >> (to help with muscle memory). And, ideally, detect conflicts at build_mo time.
> > >> 
> > >> > I also think we should unify all the string args used by prompt() into a
> > >> > single string so that translators have the full context.
> > >> 
> > >> Seems like a good approach (as long as we ensure there's no paragraph boundary
> > >> on that single string, of course).
> > >
> > > Ok, here's a typical use:
> > >
> > >             elif repo.ui.promptchoice(
> > >                 _("local changed %s which remote deleted\n"
> > >                   "use (c)hanged version or (d)elete?") % f,
> > >                 (_("&Changed"), _("&Delete")), 0):
> > >
> > > How shall we format this string to include all the components? Perhaps:
> > >
> > > _("local changed %s which remote deleted\n"
> > >   "use (c)hanged version or (d)elete?"
> > >   "\f&Changed"
> > >   "\f&Delete")
> > >   
> > > I think something like \f or \b (not \t) will work well as a separator
> > > but I have no idea if our l10n tools will agree with me.
> > 
> > Gettext will warn when it sees an escape sequence other than \n and \t.
> > It writes:
> > 
> >   warning: internationalized messages should not contain the `\f' escape
> >   sequence
> 
> Bah.
> 
> > This was apparently introduced around 2005:
> > 
> >   http://lists.gnu.org/archive/html/bug-gnu-utils/2005-05/msg00065.html
> > 
> > I found another mail that explains the rationale a bit more: POT files
> > are supposed to contain just \n and xgettext will apparently turn \r\n
> > into \n in the POT file:
> > 
> >   http://lists.gnu.org/archive/html/bug-gnu-utils/2010-08/msg00021.html
> > 
> > When I quickly tested this, I found that a string with \f wasn't
> > translated, despite showing up in the hg.pot file.
> 
> Huh.
> 
> > Would \t not be an easier choice here or are you afraid that it will be
> > difficult for translators to produce a TAB character in the translation?
> 
> Mostly, I thought there was a possibility we might actually want to use
> \t as a real tab, whereas I'm quite sure we'll never want to use \f.
> Frankly, my first thought was \0.
> 
> I also don't want to preclude using any of the printing ASCII characters
> either.
> 
> But now I'm reminded that we have Shift-JIS out there, which means we
> can't reliably split translated strings on single ASCII bytes anyway.

In fact, in the Japanese message file, the strings for choices are
intentionally left untranslated, because switching the keyboard layout
to answer a question is unfriendly to users, as Nikolaj mentioned in
his reply.

So splitting on single ASCII bytes will work with Shift-JIS, too.

OK, I understand that you use "Shift-JIS" as a stand-in for encodings
that use ordinary ASCII byte values (backslash among them) as the
second or later byte of a multi-byte character :-)
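
To make the concern concrete, here is a minimal sketch in Python 3
(not Mercurial code; the sample character is picked only for
illustration). It shows a Shift-JIS character whose second byte is the
ASCII backslash, and what a byte-level split does to it:

```python
# U+8868 is a common kanji; in Shift-JIS it is the two bytes 0x95 0x5C.
encoded = "\u8868".encode("shift_jis")
assert encoded == b"\x95\x5c"      # 0x5C is ASCII backslash


def decodes(raw):
    """Return True if raw is valid Shift-JIS, False otherwise."""
    try:
        raw.decode("shift_jis")
        return True
    except UnicodeDecodeError:
        return False


# Splitting the raw bytes on backslash cuts the character in half:
# what remains is a lone lead byte that no longer decodes.
halves = encoded.split(b"\\")
```

As far as I know, Shift-JIS second bytes fall in 0x40-0x7E and
0x80-0xFC, so '$' (0x24) and '&' (0x26) themselves should not collide
there, but I would not assume the same for every MBCS encoding.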

> Perhaps '$$' as the separator?

Do you mean a translatable message like the one below?

  _("local changed %s which remote deleted\n"
    "use (c)hanged version or (d)elete?"
    "$$&Changed $$&Delete")

Picking up the single byte that follows '&', as below, may cause
problems with MBCS strings, because it splits a multi-byte sequence.

    resps = [s[s.index('&') + 1].lower() for s in choices]
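
A minimal sketch of the failure in Python 3, assuming the translated
choice arrives as Shift-JIS encoded bytes (the sample choice string is
made up for illustration):

```python
# '&' followed by two hiragana, encoded as the locale would deliver it.
choice = "&\u306f\u3044".encode("shift_jis")

i = choice.index(b"&")
picked = choice[i + 1:i + 2]       # the single byte after '&'


def decodes(raw):
    """Return True if raw is valid Shift-JIS, False otherwise."""
    try:
        raw.decode("shift_jis")
        return True
    except UnicodeDecodeError:
        return False


# Decoding first and indexing by character keeps the letter whole.
text = choice.decode("shift_jis")
letter = text[text.index("&") + 1]
```

Here picked is only the lead byte of a two-byte character, so it
cannot be decoded or compared with what the user types, while letter
is the complete character.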

So, IMHO, we also need a way to delimit "the symbolic letter". What
about surrounding it with '&'?

  _("local changed %s which remote deleted\n"
    "use (c)hanged version or (d)elete?"
    "$$&C&hanged $$&D&elete")

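A rough sketch of how such a message could be taken apart, in Python 3
on already-decoded text. The function name parse_prompt is
hypothetical and this is not an existing Mercurial API; it only
illustrates the '$$' separator combined with '&...&' around the
response key:

```python
import re


def parse_prompt(msg):
    """Split 'prompt$$&K&ey $$...' into (prompt, [(key, label), ...])."""
    parts = msg.split("$$")
    prompt = parts[0]
    choices = []
    for part in parts[1:]:
        part = part.strip()
        m = re.search(r"&(.+?)&", part)
        if m is None:
            raise ValueError("no &...& marker in %r" % part)
        key = m.group(1).lower()
        # Drop the two markers but keep the key text in the label.
        label = part[:m.start()] + m.group(1) + part[m.end():]
        choices.append((key, label))
    return prompt, choices


prompt, choices = parse_prompt(
    "use (c)hanged version or (d)elete?$$&C&hanged $$&D&elete")
```

Because the key is whatever lies between the two '&' markers, it can
be a whole multi-byte character rather than a single byte. A literal
'&' inside a label would need some escaping rule, which this sketch
does not address.
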
----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp