[PATCH 4 of 4 STABLE] i18n: use "encoding.lower()" to normalize string in hgweb search query

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Sat Dec 24 08:31:14 CST 2011


# HG changeset patch
# User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
# Date 1324735129 -32400
# Branch stable
# Node ID 2ff1bde4b23d0ebc3f1f35eb9aed9934f480be97
# Parent  f51cde5621601b93245da68c3ba3723730bac188
i18n: use "encoding.lower()" to normalize string in hgweb search query

Some problematic encodings (e.g. cp932) use bytes in the ASCII
alphabet range inside the byte sequences of multibyte characters.

"str.lower()" on such a byte sequence may turn one character into a
different one, so distinct characters compare as equal and the search
returns unexpected log matches.

This patch uses "encoding.lower()" instead of "str.lower()" to
normalize strings for comparison.
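
The collision described above can be reproduced with a short sketch.
It makes some assumptions: it targets Python 3, where "bytes.lower()"
stands in for the byte-wise "str.lower()" of Python 2, and
"decoded_lower" is a hypothetical helper written for this
illustration, not the actual implementation of "encoding.lower()".

```python
# Illustration only: bytes.lower() here plays the role of Python 2's
# byte-wise str.lower(); decoded_lower() is a hypothetical helper.
ENCODING = "cp932"  # assumed local encoding for this example


def naive_lower(raw: bytes) -> bytes:
    """Lowercase byte by byte, ignoring multibyte character boundaries."""
    return raw.lower()


def decoded_lower(raw: bytes, encoding: str = ENCODING) -> bytes:
    """Decode to text, lowercase the text, then re-encode."""
    return raw.decode(encoding).lower().encode(encoding)


# Two distinct katakana whose cp932 trail bytes are ASCII "A" and "a".
katakana_a = "ア".encode(ENCODING)
katakana_di = "ヂ".encode(ENCODING)

# Byte-wise lowering rewrites the trail byte and yields another character.
collides = naive_lower(katakana_a) == katakana_di

# Decoding first leaves the caseless character unchanged.
preserved = decoded_lower(katakana_a) == katakana_a

print(collides, preserved)
```

With byte-wise lowering, a query for the first character would match
a description that only contains the second one, which is the
unexpected log matching this patch is meant to avoid.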

diff -r f51cde562160 -r 2ff1bde4b23d mercurial/hgweb/webcommands.py
--- a/mercurial/hgweb/webcommands.py	Sat Dec 24 22:58:49 2011 +0900
+++ b/mercurial/hgweb/webcommands.py	Sat Dec 24 22:58:49 2011 +0900
@@ -124,7 +124,8 @@
 
     def changelist(**map):
         count = 0
-        qw = query.lower().split()
+        lower = encoding.lower
+        qw = lower(query).split()
 
         def revgen():
             for i in xrange(len(web.repo) - 1, 0, -100):
@@ -139,9 +140,9 @@
         for ctx in revgen():
             miss = 0
             for q in qw:
-                if not (q in ctx.user().lower() or
-                        q in ctx.description().lower() or
-                        q in " ".join(ctx.files()).lower()):
+                if not (q in lower(ctx.user()) or
+                        q in lower(ctx.description()) or
+                        q in lower(" ".join(ctx.files()))):
                     miss = 1
                     break
             if miss:

