[PATCH 2 of 2 STABLE] help: search section of help topic by translated section name correctly
FUJIWARA Katsunori
foozy at lares.dti.ne.jp
Mon May 16 02:20:28 EDT 2016
At Sun, 15 May 2016 20:44:46 -0700,
timeless wrote:
>
> The changes seem OK, but this is begging for a pair of check code tests
> (one for lower, and one for upper-- each objecting to anything that isn't
> encoding.)
Yes, I also think that detection by check-code.py would be better.
But there are many '.lower()'/'.upper()' invocations in the Mercurial
source code, and I don't have a good idea for picking out only the
invalid ones (or a low-cost trick to hide the false-positive cases).
Do you have any good ideas?
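
For concreteness, the naive check would look roughly like the sketch
below. It is only an illustration in plain Python, not check-code.py's
actual pattern format; the function name find_suspect_calls and the
"# ascii-only" marker comment are hypothetical. It objects to every
bare '.lower()'/'.upper()' call unless the line carries the marker, so
each of the many legitimate ASCII-only calls would need annotating,
which is exactly the cost in question.

```python
import re

# Matches bare str-style calls such as "s.lower()" or "s.upper()".
# "encoding.lower(s)" does not match because its parentheses are not empty.
_BARE_CALL = re.compile(r"\.(?:lower|upper)\(\)")

# Hypothetical per-line marker for calls known to handle ASCII-only data.
_MARKER = "# ascii-only"


def find_suspect_calls(source):
    """Return (line number, stripped line) for each unmarked bare call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        if _BARE_CALL.search(line) and _MARKER not in line:
            hits.append((lineno, line.strip()))
    return hits
```
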
> On May 12, 2016 7:12 PM, "FUJIWARA Katsunori" <foozy at lares.dti.ne.jp> wrote:
>
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > # Date 1463091599 -32400
> > # Fri May 13 07:19:59 2016 +0900
> > # Branch stable
> > # Node ID aaabed77791a75968a12b8c43ad263631a23ee81
> > # Parent 9d38a2061fd8a7a4fd80ead8d5798f38b359bfe3
> > help: search section of help topic by translated section name correctly
> >
> > Before this patch, "hg help topic.section" might show unexpected
> > section of help topic in some encoding.
> >
> > It applies str.lower() instead of encoding.lower(str) on translated
> > message to search section case-insensitively, but some encoding uses
> > 0x41(A) - 0x5a(Z) as the second or later byte of multi-byte character
> > (for example, ja_JP.cp932), and str.lower() causes unexpected result.
> >
> > To search section of help topic by translated section name correctly,
> > this patch replaces str.lower() by encoding.lower(str) for both query
> > string (in commands.help()) and translated help text (in
> > minirst.getsections()).
> >
> > diff --git a/mercurial/commands.py b/mercurial/commands.py
> > --- a/mercurial/commands.py
> > +++ b/mercurial/commands.py
> > @@ -4590,7 +4590,7 @@ def help_(ui, name=None, **opts):
> >      subtopic = None
> >      if name and '.' in name:
> >          name, section = name.split('.', 1)
> > -        section = section.lower()
> > +        section = encoding.lower(section)
> >          if '.' in section:
> >              subtopic, section = section.split('.', 1)
> >      else:
> > diff --git a/mercurial/minirst.py b/mercurial/minirst.py
> > --- a/mercurial/minirst.py
> > +++ b/mercurial/minirst.py
> > @@ -724,7 +724,7 @@ def getsections(blocks):
> >              x = b['key']
> >          else:
> >              x = b['lines'][0]
> > -        x = x.lower().strip('"')
> > +        x = encoding.lower(x).strip('"')
> >          if '(' in x:
> >              x = x.split('(')[0]
> >          return x
> > diff --git a/tests/test-help.t b/tests/test-help.t
> > --- a/tests/test-help.t
> > +++ b/tests/test-help.t
> > @@ -1524,6 +1524,78 @@ Test section lookup
> > files List of strings. All files modified, added, or
> > removed by
> > this changeset.
> >
> > +Test section lookup by translated message
> > +
> > +str.lower() instead of encoding.lower(str) on translated message might
> > +make message meaningless, because some encoding uses 0x41(A) - 0x5a(Z)
> > +as the second or later byte of multi-byte character.
> > +
> > +For example, "\x8bL\x98^" (translation of "record" in ja_JP.cp932)
> > +contains 0x4c (L). str.lower() replaces 0x4c(L) by 0x6c(l) and this
> > +replacement makes message meaningless.
> > +
> > +This tests that section lookup by translated string isn't broken by
> > +such str.lower().
> > +
> > + $ python <<EOF
> > + > def escape(s):
> > + >     return ''.join('\u%x' % ord(uc) for uc in s.decode('cp932'))
> > + > # translation of "record" in ja_JP.cp932
> > + > upper = "\x8bL\x98^"
> > + > # str.lower()-ed section name should be treated as different one
> > + > lower = "\x8bl\x98^"
> > + > with open('ambiguous.py', 'w') as fp:
> > + >     fp.write("""# ambiguous section names in ja_JP.cp932
> > + > u'''summary of extension
> > + >
> > + > %s
> > + > ----
> > + >
> > + > Upper name should show only this message
> > + >
> > + > %s
> > + > ----
> > + >
> > + > Lower name should show only this message
> > + >
> > + > subsequent section
> > + > ------------------
> > + >
> > + > This should be hidden at "hg help ambiguous" with section name.
> > + > '''
> > + > """ % (escape(upper), escape(lower)))
> > + > EOF
> > +
> > + $ cat >> $HGRCPATH <<EOF
> > + > [extensions]
> > + > ambiguous = ./ambiguous.py
> > + > EOF
> > +
> > + $ python <<EOF | sh
> > + > upper = "\x8bL\x98^"
> > + > print "hg --encoding cp932 help -e ambiguous.%s" % upper
> > + > EOF
> > + \x8bL\x98^ (esc)
> > + ----
> > +
> > + Upper name should show only this message
> > +
> > +
> > + $ python <<EOF | sh
> > + > lower = "\x8bl\x98^"
> > + > print "hg --encoding cp932 help -e ambiguous.%s" % lower
> > + > EOF
> > + \x8bl\x98^ (esc)
> > + ----
> > +
> > + Lower name should show only this message
> > +
> > +
> > + $ cat >> $HGRCPATH <<EOF
> > + > [extensions]
> > + > ambiguous = !
> > + > EOF
> > +
> > Test dynamic list of merge tools only shows up once
> > $ hg help merge-tools
> > Merge Tools
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel at mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> >
>
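
The failure mode described in the quoted commit message can be
reproduced in a few lines. This is a minimal sketch for Python 3 (the
patch itself targets Python 2 str); decoding_lower is a hypothetical
helper that only approximates the idea behind encoding.lower(), namely
lowercasing decoded characters rather than raw bytes, and it is not
Mercurial's implementation.

```python
# Translation of "record" in ja_JP.cp932, as used in the test above.
# Both 0x4c ("L") and 0x5e ("^") are trail bytes of two-byte characters.
upper = b"\x8bL\x98^"

# Byte-wise lowercasing rewrites 0x4c to 0x6c inside the first character,
# so the result is a different byte string (and a different section name).
naive = upper.lower()


def decoding_lower(data, enc):
    """Hypothetical helper: lowercase characters, not bytes."""
    return data.decode(enc).lower().encode(enc)


print(naive == upper)
print(decoding_lower(upper, "cp932") == upper)
print(decoding_lower(b"Record", "cp932"))
```

Comparing the first two printed values shows the difference: the
byte-wise result no longer equals the original name, while lowercasing
after decoding leaves the multi-byte characters untouched and still
folds plain ASCII.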
----------------------------------------------------------------------
[FUJIWARA Katsunori] foozy at lares.dti.ne.jp