[PATCH 2 of 2 STABLE] help: search section of help topic by translated section name correctly
timeless
timeless at gmail.com
Mon May 16 11:37:48 EDT 2016
I'm pretty sure it's possible to write a regular expression that looks
for any instance of them that doesn't follow "encoding". I think a negative
lookbehind should work. I'll try it in an hour or so.
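
A rough sketch of the kind of pattern I have in mind, written as standalone
Python 3 rather than as an actual check-code.py rule (the names CASE_CALL and
find_suspect_calls are made up for illustration). It flags any .lower( or
.upper( call whose receiver is not spelled exactly "encoding", so call sites
that only ever see plain ASCII would still show up as false positives:

```python
import re

# Hypothetical pattern: match ".lower(" or ".upper(" unless the eight
# characters right before the dot are "encoding" (negative lookbehind).
CASE_CALL = re.compile(r"(?<!encoding)\.(lower|upper)\(")


def find_suspect_calls(source):
    """Return (line_number, stripped_line) pairs for flagged lines."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        if CASE_CALL.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = """\
section = section.lower()
section = encoding.lower(section)
x = encoding.upper(x)
name = name.upper()
"""

suspects = find_suspect_calls(sample)
for lineno, line in suspects:
    print("%d: %s" % (lineno, line))
```

This is purely textual: it would also accept something like
myencoding.lower(...), and it cannot tell whether a flagged string can ever
hold non-ASCII bytes.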
On May 16, 2016 2:20 AM, "FUJIWARA Katsunori" <foozy at lares.dti.ne.jp> wrote:
>
> At Sun, 15 May 2016 20:44:46 -0700,
> timeless wrote:
> >
> > The changes seem OK, but this is begging for a pair of check-code tests
> > (one for lower and one for upper, each objecting to anything that isn't
> > encoding.)
>
> Yes, I also think that detection by check-code.py is better.
>
> But there are many '.lower()'/'.upper()' invocations in the Mercurial
> source code, and I don't have a good way to pick out only the invalid
> '.lower()'/'.upper()' invocations (or a low-cost trick to hide
> false positives).
>
> Do you have any good ideas?
>
>
> > On May 12, 2016 7:12 PM, "FUJIWARA Katsunori" <foozy at lares.dti.ne.jp> wrote:
> >
> > > # HG changeset patch
> > > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > > # Date 1463091599 -32400
> > > # Fri May 13 07:19:59 2016 +0900
> > > # Branch stable
> > > # Node ID aaabed77791a75968a12b8c43ad263631a23ee81
> > > # Parent 9d38a2061fd8a7a4fd80ead8d5798f38b359bfe3
> > > help: search section of help topic by translated section name correctly
> > >
> > > Before this patch, "hg help topic.section" might show unexpected
> > > section of help topic in some encoding.
> > >
> > > It applies str.lower() instead of encoding.lower(str) on translated
> > > message to search section case-insensitively, but some encoding uses
> > > 0x41(A) - 0x5a(Z) as the second or later byte of multi-byte character
> > > (for example, ja_JP.cp932), and str.lower() causes unexpected result.
> > >
> > > To search section of help topic by translated section name correctly,
> > > this patch replaces str.lower() by encoding.lower(str) for both query
> > > string (in commands.help()) and translated help text (in
> > > minirst.getsections()).
> > >
> > > diff --git a/mercurial/commands.py b/mercurial/commands.py
> > > --- a/mercurial/commands.py
> > > +++ b/mercurial/commands.py
> > > @@ -4590,7 +4590,7 @@ def help_(ui, name=None, **opts):
> > >      subtopic = None
> > >      if name and '.' in name:
> > >          name, section = name.split('.', 1)
> > > -        section = section.lower()
> > > +        section = encoding.lower(section)
> > >          if '.' in section:
> > >              subtopic, section = section.split('.', 1)
> > >          else:
> > > diff --git a/mercurial/minirst.py b/mercurial/minirst.py
> > > --- a/mercurial/minirst.py
> > > +++ b/mercurial/minirst.py
> > > @@ -724,7 +724,7 @@ def getsections(blocks):
> > >              x = b['key']
> > >          else:
> > >              x = b['lines'][0]
> > > -        x = x.lower().strip('"')
> > > +        x = encoding.lower(x).strip('"')
> > >          if '(' in x:
> > >              x = x.split('(')[0]
> > >          return x
> > > diff --git a/tests/test-help.t b/tests/test-help.t
> > > --- a/tests/test-help.t
> > > +++ b/tests/test-help.t
> > > @@ -1524,6 +1524,78 @@ Test section lookup
> > > files List of strings. All files modified, added, or
> > > removed by
> > > this changeset.
> > >
> > > +Test section lookup by translated message
> > > +
> > > +str.lower() instead of encoding.lower(str) on translated message might
> > > +make message meaningless, because some encoding uses 0x41(A) - 0x5a(Z)
> > > +as the second or later byte of multi-byte character.
> > > +
> > > +For example, "\x8bL\x98^" (translation of "record" in ja_JP.cp932)
> > > +contains 0x4c (L). str.lower() replaces 0x4c(L) by 0x6c(l) and this
> > > +replacement makes message meaningless.
> > > +
> > > +This tests that section lookup by translated string isn't broken by
> > > +such str.lower().
> > > +
> > > + $ python <<EOF
> > > + > def escape(s):
> > > + >     return ''.join('\u%x' % ord(uc) for uc in s.decode('cp932'))
> > > + > # translation of "record" in ja_JP.cp932
> > > + > upper = "\x8bL\x98^"
> > > + > # str.lower()-ed section name should be treated as different one
> > > + > lower = "\x8bl\x98^"
> > > + > with open('ambiguous.py', 'w') as fp:
> > > + >     fp.write("""# ambiguous section names in ja_JP.cp932
> > > + > u'''summary of extension
> > > + >
> > > + > %s
> > > + > ----
> > > + >
> > > + > Upper name should show only this message
> > > + >
> > > + > %s
> > > + > ----
> > > + >
> > > + > Lower name should show only this message
> > > + >
> > > + > subsequent section
> > > + > ------------------
> > > + >
> > > + > This should be hidden at "hg help ambiguous" with section name.
> > > + > '''
> > > + > """ % (escape(upper), escape(lower)))
> > > + > EOF
> > > +
> > > + $ cat >> $HGRCPATH <<EOF
> > > + > [extensions]
> > > + > ambiguous = ./ambiguous.py
> > > + > EOF
> > > +
> > > + $ python <<EOF | sh
> > > + > upper = "\x8bL\x98^"
> > > + > print "hg --encoding cp932 help -e ambiguous.%s" % upper
> > > + > EOF
> > > + \x8bL\x98^ (esc)
> > > + ----
> > > +
> > > + Upper name should show only this message
> > > +
> > > +
> > > + $ python <<EOF | sh
> > > + > lower = "\x8bl\x98^"
> > > + > print "hg --encoding cp932 help -e ambiguous.%s" % lower
> > > + > EOF
> > > + \x8bl\x98^ (esc)
> > > + ----
> > > +
> > > + Lower name should show only this message
> > > +
> > > +
> > > + $ cat >> $HGRCPATH <<EOF
> > > + > [extensions]
> > > + > ambiguous = !
> > > + > EOF
> > > +
> > > Test dynamic list of merge tools only shows up once
> > > $ hg help merge-tools
> > > Merge Tools
> > > _______________________________________________
> > > Mercurial-devel mailing list
> > > Mercurial-devel at mercurial-scm.org
> > > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> > >
> >
>
> ----------------------------------------------------------------------
> [FUJIWARA Katsunori] foozy at lares.dti.ne.jp
>
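
For reference, the failure mode the patch describes can be reproduced in a few
lines. This is a Python 3 sketch that uses the byte string from the test above;
safe_lower is a hypothetical stand-in that lowercases by character, not
Mercurial's encoding.lower() implementation:

```python
# Byte string taken from the patch's test: the cp932 translation of "record".
raw = b"\x8bL\x98^"

# Byte-wise lowering rewrites 0x4c ('L'), which here is the second byte of a
# two-byte character, so the result encodes different text.
bytewise = raw.lower()


def safe_lower(data, enc):
    """Hypothetical helper: decode, lowercase by character, re-encode."""
    return data.decode(enc).lower().encode(enc)


charwise = safe_lower(raw, "cp932")

print("byte-wise lower changed the bytes:", bytewise != raw)
print("character-wise lower changed the bytes:", charwise != raw)
```

The two section names in the test differ only in that one byte, which is why a
byte-wise lower() makes them indistinguishable.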