[PATCH i18n stable] i18n: fix untranslated prompts with translated responses (issue3936)

Matt Mackall mpm at selenic.com
Wed May 22 17:37:05 CDT 2013


On Thu, 2013-05-23 at 01:56 +0900, FUJIWARA Katsunori wrote:
> At Tue, 21 May 2013 15:17:55 -0500,
> Matt Mackall wrote:
> > 
> > On Tue, 2013-05-21 at 21:50 +0200, Martin Geisler wrote:
> > > Matt Mackall <mpm at selenic.com> writes:
> > > 
> > > > On Mon, 2013-05-20 at 19:51 -0300, Wagner Bruna wrote:
> > > >> Em 16-05-2013 15:58, Matt Mackall escreveu:
> > > >> > On Wed, 2013-05-15 at 21:37 -0500, Kevin Bullock wrote:
> > > >> >> # HG changeset patch
> > > >> >> # User Kevin Bullock <kbullock at ringworld.org>
> > > >> >> # Date 1368671779 18000
> > > >> >> #      Wed May 15 21:36:19 2013 -0500
> > > >> >> # Branch stable
> > > >> >> # Node ID fc1c4221dd82de958b9be5f05c57622679625d21
> > > >> >> # Parent  278057693a1ddb93f95fa641e30e7a966ac98434
> > > >> >> i18n: fix untranslated prompts with translated responses (issue3936)
> > > >> > 
> > > >> > Queued for stable, thanks.
> > > >> > 
> > > >> > I'd like to see a consensus among translators on how to do this.
> > > >> 
> > > >> IMHO we should always accept the English keys, even with translated prompts
> > > >> (to help with muscle memory). And, ideally, detect conflicts at build_mo time.
> > > >> 
> > > >> > I also think we should unify all the string args used by prompt() into a
> > > >> > single string so that translators have the full context.
> > > >> 
> > > >> Seems like a good approach (as long as we ensure there's no paragraph boundary
> > > >> on that single string, of course).
> > > >
> > > > Ok, here's a typical use:
> > > >
> > > >             elif repo.ui.promptchoice(
> > > >                 _("local changed %s which remote deleted\n"
> > > >                   "use (c)hanged version or (d)elete?") % f,
> > > >                 (_("&Changed"), _("&Delete")), 0):
> > > >
> > > > How shall we format this string to include all the components? Perhaps:
> > > >
> > > > _("local changed %s which remote deleted\n"
> > > >   "use (c)hanged version or (d)elete?"
> > > >   "\f&Changed"
> > > >   "\f&Delete")
> > > >   
> > > > I think something like \f or \b (not \t) will work well as a separator
> > > > but I have no idea if our l10n tools will agree with me.
> > > 
> > > Gettext will warn when it sees an escape sequence other than \n and \t.
> > > It writes:
> > > 
> > >   warning: internationalized messages should not contain the `\f' escape
> > >   sequence
> > 
> > Bah.
> > 
> > > This was apparently introduced around 2005:
> > > 
> > >   http://lists.gnu.org/archive/html/bug-gnu-utils/2005-05/msg00065.html
> > > 
> > > I found another mail that explains the rationale a bit more: POT files
> > > are supposed to contain just \n and xgettext will apparently turn \r\n
> > > into \n in the POT file:
> > > 
> > >   http://lists.gnu.org/archive/html/bug-gnu-utils/2010-08/msg00021.html
> > > 
> > > When I quickly tested this, I found that a string with \f wasn't
> > > translated, despite showing up in the hg.pot file.
> > 
> > Huh.
> > 
> > > Would \t not be an easier choice here or are you afraid that it will be
> > > difficult for translators to produce a TAB character in the translation?
> > 
> > Mostly, I thought there was a possibility we might actually want to use
> > \t as a real tab, whereas I'm quite sure we'll never want to use \f.
> > Frankly, my first thought was \0.
> > 
> > I also don't want to preclude using any of the printing ASCII characters
> > either.
> > 
> > But now I'm reminded that we have Shift-JIS out there, which means we
> > can't reliably split translated strings on single ASCII bytes anyway.
> 
> In fact, in the Japanese message file, the strings for choices are
> intentionally left untranslated, because switching keyboard layouts to
> answer a question is not friendly for users, as Nikolaj mentioned in
> his reply.
> 
> So, splitting by single ASCII bytes will work on Shift-JIS, too.
> 
> OK, I understand that you use "Shift-JIS" as a stand-in for encodings
> that use normal ASCII bytes (maybe backslash, too) as the 2nd or later
> byte of multi-byte characters :-)
> 
> 
> > Perhaps '$$' as the separator?
> 
> Do you mean a translatable message like the one below?
> 
>   _("local changed %s which remote deleted\n"
>     "use (c)hanged version or (d)elete?"
>     "$$&Changed $$&Delete")

Yep. I've pushed a patch based on this.
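
For illustration only, here is a minimal sketch of how a single
'$$'-separated prompt string of the form quoted above could be split
back into its message and choices. It is not the pushed patch: the
helper name split_prompt is hypothetical, and it assumes plain
single-byte text with exactly one '&' marker per choice.

```python
def split_prompt(prompt):
    """Split 'message$$&First $$&Second' into (message, choices, keys).

    Hypothetical helper for illustration; assumes every choice
    contains an '&' followed by its response key.
    """
    parts = prompt.split('$$')
    msg = parts[0]
    # Drop the padding that separates one choice from the next.
    choices = [p.strip() for p in parts[1:]]
    # The response key is the character right after '&'.
    keys = [c[c.index('&') + 1].lower() for c in choices]
    return msg, choices, keys


msg, choices, keys = split_prompt(
    "local changed %s which remote deleted\n"
    "use (c)hanged version or (d)elete?"
    "$$&Changed $$&Delete")
```

With this input, msg is the two-line question, choices is
['&Changed', '&Delete'] and keys is ['c', 'd']. A translator sees the
question and its answers as one unit, which is the point of unifying
the arguments.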

> Picking up the single byte following '&', as below, may cause problems
> with strings in MBCS, because it breaks multi-byte sequences.
> 
>     resps = [s[s.index('&') + 1].lower() for s in choices]
> 
> 
> So, IMHO, a way to delimit "the symbolic letter" is also needed.
> What about surrounding it with '&'?
> 
>   _("local changed %s which remote deleted\n"
>     "use (c)hanged version or (d)elete?"
>     "$$&C&hanged $$&D&elete")

I'm tempted to worry about this part of the problem when we encounter
it.
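
For reference, a small sketch of the concern quoted above, written for
Python 3 with byte strings standing in for locally encoded text. The
translated choice "&はい" and both helper names are assumptions made
up for the example; nothing here is taken from the Mercurial source.

```python
def byte_after_amp(choice):
    """Return the single byte that follows '&' in an encoded choice."""
    i = choice.index(b"&")
    return choice[i + 1:i + 2]


def is_complete_char(raw, encoding):
    """Report whether raw decodes cleanly in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


ascii_choice = b"&Changed"
# Hypothetical translated choice whose key is a two-byte character.
mbcs_choice = "&はい".encode("shift_jis")

ascii_ok = is_complete_char(byte_after_amp(ascii_choice), "shift_jis")
mbcs_ok = is_complete_char(byte_after_amp(mbcs_choice), "shift_jis")

# Separately: a Shift-JIS trail byte can equal an ASCII byte. The
# katakana "ソ" ends in 0x5c, the ASCII backslash.
trail_is_backslash = "ソ".encode("shift_jis").endswith(b"\\")
```

Here ascii_ok is True while mbcs_ok is False: the byte after '&' in the
second choice is only the lead byte of a two-byte character, so using
it as the response key would cut the character in half. An explicit
closing delimiter, as proposed with '&C&hanged', would avoid having to
guess the character's length.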

-- 
Mathematics is the supreme nostalgia of our time.



