D6788: hgweb: fix websub regex flag syntax on Python 3

sheehan (Connor Sheehan) phabricator at mercurial-scm.org
Fri Sep 6 11:43:57 EDT 2019


sheehan created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The `websub` config section for hgweb is broken under Python 3
  when using the regex flags syntax (i.e. the optional `i` in the example
  from `hg help config.websub`):
  
    patternname = s/SEARCH_REGEX/REPLACE_EXPRESSION/[i]
  
  Flags are pulled out of the specified byte-string using a regular
  expression, and uppercased. The flags are then iterated over and
  passed to the `re` module using `re.__dict__[item]`, to get the
  object attribute of the same name from the `re` module. So on Python
  2 if the `il` flags are passed, this transition looks like:
  
    `'il'` -> `'IL'` -> `'I'` -> `re.__dict__['I']` -> `re.I`
  
  However on Python 3, these are bytes objects. When we iterate over
  a bytes object in Python 3, instead of getting the individual characters
  in the string as string objects of length one, we get the integer
  value corresponding to that byte. So the same transition looks like:
  
    `b'il'` -> `b'IL'` -> `73` -> `re.__dict__[73]` -> `KeyError`
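  
  As a rough illustration of the failure, the following standalone sketch
  (Python 3 only, not code from hgweb; the variable names are made up for
  this example) shows what iterating over the uppercased flags yields:

```python
import re

# Hypothetical stand-in for the flags captured from the websub pattern.
flags_text = b"il".upper()  # b'IL'

# On Python 3, iterating over bytes yields integers, not 1-character strings.
items = list(flags_text)

# Looking up an integer key in the module namespace fails, because the
# namespace is keyed by attribute names (strings).
try:
    re.__dict__[items[0]]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```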
  
  This commit fixes the type mismatch by decoding the bytes to an
  ascii string before iterating over each element to pass to `re`.
  The transition will now look like:
  
    `b'il'` -> `u'IL'` -> `u'I'` -> `re.__dict__[u'I']` -> `re.I`
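  
  A minimal sketch of the fixed lookup, pulled out into a hypothetical
  helper (`parse_regex_flags` is an illustrative name and does not exist in
  `webutil.py`; it assumes the flags arrive as an ASCII byte-string or None):

```python
import re


def parse_regex_flags(flagin):
    """Combine single-letter regex flags given as bytes (hypothetical helper)."""
    flags = 0
    if flagin:
        # Decode first so that iteration yields one-character strings,
        # which match the attribute names in the `re` module namespace.
        for flag in flagin.upper().decode('ascii'):
            flags |= re.__dict__[flag]
    return flags


# Example: a case-insensitive bytes pattern built from the parsed flags.
pattern = re.compile(br'ticket(\d+)', parse_regex_flags(b'i'))
result = pattern.sub(br'Ticket\1', b'see TICKET456')
```

  A flag letter with no matching attribute in `re` would still raise
  `KeyError` in this sketch; it does not attempt to validate the input.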
  
  In addition we expand `test-websub.t` to cover the regex flag case
  (for both the `websub` section and `interhg`).

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D6788

AFFECTED FILES
  mercurial/hgweb/webutil.py
  tests/test-websub.t

CHANGE DETAILS

diff --git a/tests/test-websub.t b/tests/test-websub.t
--- a/tests/test-websub.t
+++ b/tests/test-websub.t
@@ -11,16 +11,18 @@
   > 
   > [websub]
   > issues = s|Issue(\d+)|<a href="http://bts.example.org/issue\1">Issue\1</a>|
+  > tickets = s|ticket(\d+)|<a href="http://ticket.example.org/issue\1">Ticket\1</a>|i
   > 
   > [interhg]
   > # check that we maintain some interhg backwards compatibility...
   > # yes, 'x' is a weird delimiter...
   > markbugs = sxbugx<i class="\x">bug</i>x
+  > problems = sxPROBLEMx<i class="\x">problem</i>xi
   > EOF
 
   $ touch foo
   $ hg add foo
-  $ hg commit -d '1 0' -m 'Issue123: fixed the bug!'
+  $ hg commit -d '1 0' -m 'Issue123: fixed the bug! Ticket456 and problem789 too'
 
   $ hg serve -n test -p $HGPORT -d --pid-file=hg.pid -A access.log -E errors.log
   $ cat hg.pid >> $DAEMON_PIDS
@@ -28,7 +30,7 @@
 log
 
   $ get-with-headers.py localhost:$HGPORT "rev/tip" | grep bts
-  <div class="description"><a href="http://bts.example.org/issue123">Issue123</a>: fixed the <i class="x">bug</i>!</div>
+  <div class="description"><a href="http://bts.example.org/issue123">Issue123</a>: fixed the <i class="x">bug</i>! <a href="http://ticket.example.org/issue456">Ticket456</a> and <i class="x">problem</i>789 too</div>
 errors
 
   $ cat errors.log
diff --git a/mercurial/hgweb/webutil.py b/mercurial/hgweb/webutil.py
--- a/mercurial/hgweb/webutil.py
+++ b/mercurial/hgweb/webutil.py
@@ -791,7 +791,7 @@
         flagin = match.group(3)
         flags = 0
         if flagin:
-            for flag in flagin.upper():
+            for flag in flagin.upper().decode('ascii'):
                 flags |= re.__dict__[flag]
 
         try:



To: sheehan, #hg-reviewers
Cc: mercurial-devel

