[PATCH STABLE V2] demandimport: avoid infinite recursion at actual module importing (issue5304)

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Sat Jul 30 16:46:34 EDT 2016

# HG changeset patch
# User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
# Date 1469911199 -32400
#      Sun Jul 31 05:39:59 2016 +0900
# Branch stable
# Node ID 3ee78885a361e06d1729101c5b1bd68b782bf08e
# Parent  491ee264b9f6e32b6e4dfe34180fb48226fc1641
demandimport: avoid infinite recursion at actual module importing (issue5304)

Before this patch, importing a C module on Windows causes infinite
recursion if py2exe is used with the -b2 option.

When importing a C module "a.b", extra hooking by zipextimporter of
py2exe leads to infinite recursion as follows:

  0. assumption before accessing "b" of "a":

     - built-in module object is created for "a",
       (= "a" is actually imported)
     - _demandmod is created for "a.b" as a proxy object, and
       (= "a.b" is not yet imported)
     - an attribute "b" of "a" is initialized by the latter

  1. invocation of __import__ via _hgextimport() in _demandmod._load()
     for "a.b" implies _demandimport() for "a.b"

     This is unintentional, because _demandmod might be returned by
     _hgextimport() instead of built-in module object.

  2. _demandimport() at (1) is invoked not in the context of "a", but
     in the context of zipextimporter

     Just after the invocation of _hgextimport() in _demandimport(),
     the attribute "b" of the built-in module object for "a" is still
     bound to the proxy object for "a.b", because the context of "a"
     isn't updated by actually importing "a.b", even though the
     built-in module object for "a.b" already appears in sys.modules.

     Therefore, chainmodules() returns _demandmod for "a.b", which is
     obtained from the attribute "b" of "a".

  3. processfromitem() on "a.b" causes _demandmod._load() for "a.b"

     _demandimport() takes context of "a" in this case.

     Therefore, attributes below are bound to built-in module object
     for "a.b", as expected:

     - "b" of built-in module object for "a"
     - _module of _demandmod for "a.b"

  4. but _demandimport() invoked at (1) returns _demandmod object

     because _demandimport() just returns the object returned by
     chainmodules() at (3) above.

  5. then, _demandmod._load() causes infinite recursion

     _demandimport() returns _demandmod for "a.b", and it is "self" at
     (1) above.
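
The re-entry in step 5 can be modelled with a toy proxy. This is only
a sketch and not Mercurial's code: LazyProxy, try_access and the
lambda importers are hypothetical names, and reading mod.__name__
merely stands in for whatever follow-up setup touches the import
result before the proxy's _module is bound.

```python
import types


class LazyProxy(object):
    """Hypothetical stand-in for _demandmod; not Mercurial's code."""

    def __init__(self, name, importer):
        self._name = name
        self._importer = importer
        self._module = None

    def _load(self):
        if self._module is None:
            mod = self._importer(self._name)
            # Follow-up setup reads the result before self._module is
            # bound. If mod is this proxy, the read re-enters _load().
            mod.__name__
            self._module = mod

    def __getattr__(self, attr):
        # Called only for attributes that normal lookup does not find.
        self._load()
        return getattr(self._module, attr)


def try_access(proxy, attr):
    """Return the attribute value, or "recursion" if loading never ends."""
    try:
        return getattr(proxy, attr)
    except RuntimeError:  # RecursionError subclasses this on Python 3
        return "recursion"


real = types.ModuleType("a.b")
real.answer = 42

# A well-behaved importer returns the real module.
good = LazyProxy("a.b", lambda name: real)

# A hooked importer hands the proxy itself back, as in step 4.
bad = LazyProxy("a.b", lambda name: bad)
```

With these definitions, try_access(good, "answer") resolves through
the real module, while try_access(bad, "answer") exhausts the
interpreter's recursion limit.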

To avoid infinite recursion at actual module importing, this patch
uses self._module, if _hgextimport() returns _demandmod itself. If
_demandmod._module isn't yet bound at this point, execution should be
aborted, because actual importing failed.
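
A minimal sketch of this guard on a toy proxy follows. It assumes a
nested load has already bound _module by the time the proxy itself
comes back from the import. GuardedProxy, hooked_importer,
failing_importer and load_fails are hypothetical names used for
illustration only.

```python
import types


class GuardedProxy(object):
    """Hypothetical stand-in for _demandmod with the guard applied."""

    def __init__(self, name, importer):
        self._name = name
        self._importer = importer
        self._module = None

    def _load(self):
        if self._module is None:
            mod = self._importer(self)
            if mod is self:
                # The import handed back this proxy. A nested load must
                # already have bound self._module; otherwise give up.
                mod = self._module
                assert mod is not None and mod is not self, self._name
                return
            self._module = mod

    def __getattr__(self, attr):
        # Called only for attributes that normal lookup does not find.
        self._load()
        return getattr(self._module, attr)


REAL = types.ModuleType("a.b")
REAL.answer = 42


def hooked_importer(proxy):
    # Simulates steps 3 and 4: a nested load binds the real module,
    # yet the outer call still returns the proxy.
    proxy._module = REAL
    return proxy


def failing_importer(proxy):
    # Returns the proxy without binding anything: the import failed.
    return proxy


def load_fails(proxy):
    """Report whether _load() aborts with an assertion failure."""
    try:
        proxy._load()
    except AssertionError:
        return True
    return False
```

Here GuardedProxy("a.b", hooked_importer).answer resolves through the
module bound by the nested load, and load_fails() reports True for
failing_importer as long as assertions are enabled (no -O).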

In this patch, _demandmod._module is examined not on _demandimport()
side, but on _demandmod._load() side, because:

  - the former has several exit points
  - apart from _demandimport() itself, only the latter uses
    _hgextimport()

BTW, this issue occurs only in the code path for non .py/.pyc files in
zipextimporter (strictly speaking, in _memimporter) of py2exe.

Even if zipextimporter is enabled, .py/.pyc files are handled by
zipimporter, and it doesn't imply unintentional _demandimport() at
invocation of __import__ via _hgextimport().

diff --git a/mercurial/demandimport.py b/mercurial/demandimport.py
--- a/mercurial/demandimport.py
+++ b/mercurial/demandimport.py
@@ -94,6 +94,23 @@ class _demandmod(object):
         if not self._module:
             head, globals, locals, after, level, modrefs = self._data
             mod = _hgextimport(_import, head, globals, locals, None, level)
+            if mod is self:
+                # In this case, _hgextimport() above should imply
+                # _demandimport(). Otherwise, _hgextimport() never
+                # returns _demandmod. This isn't intentional behavior,
+                # in fact. (see also issue5304 for detail)
+                #
+                # If self._module is already bound at this point, self
+                # should be already _load()-ed while _hgextimport().
+                # Otherwise, there is no way to import actual module
+                # as expected, because (re-)invoking _hgextimport()
+                # should cause same result.
+                # This is reason why _load() returns without any more
+                # setup but assumes self to be already bound.
+                mod = self._module
+                assert mod and mod is not self, "%s, %s" % (self, mod)
+                return
             # load submodules
             def subload(mod, p):
                 h, t = p, None
