[PATCH 6 of 6 import-refactor V3] mercurial: support loading modules from zipimporter

Yuya Nishihara yuya at tcha.org
Fri Dec 4 07:06:15 CST 2015


On Thu, 03 Dec 2015 21:51:12 -0800, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1449206705 28800
> #      Thu Dec 03 21:25:05 2015 -0800
> # Node ID 782783ea499c7e03253b55f0b0c221b9946356e2
> # Parent  dd8b23489d2a5f5580905c995333d139fe457d1c
> mercurial: support loading modules from zipimporter
> 
> The previous refactor to module importing broke module loading when
> mercurial.* modules were loaded from a zipfile (using a zipimporter).
> This scenario is likely encountered when using py2exe.
> 
> Supporting zipimporter and the traditional importer side-by-side
> turns out to be quite a pain. In Python 2.x, the standard, file-based
> import mechanism is partially implemented in C. The sys.meta_path
> and sys.path_hooks hook points exist to allow custom importers in
> Python/userland. zipimport.zipimporter and our "hgimporter" class
> from earlier in this patch series are 2 of these.
> 
> In a standard Python installation (no matter if running in py2exe
> or similar or not), zipimport.zipimporter appears to be registered
> in sys.path_hooks. This means that as each sys.path entry is
> consulted, it will ask zipimporter if it supports that path and
> zipimporter will be used if that entry is a zip file. In a
> py2exe environment, sys.path contains an entry with the path to
> the zip file containing the Python standard library along with
> Mercurial's Python files.
> 
> The way the importer mechanism works is the first importer that
> declares knowledge of a module (via find_module() returning an
> object) gets to load it. Since our "hgimporter" is registered
> in sys.meta_path and returns an interest in specific mercurial.*
> modules, the zipimporter registered on sys.path_hooks never comes
> into play for these modules. So, we need to be zipimporter aware
> and call into zipimporter to load modules.
> 
> This patch teaches "hgimporter" how to call out into zipimporter
> when necessary. We detect the necessity of zipimporter by looking
> at the loader for the "mercurial" module. If it is a zipimporter
> instance, we load via zipimporter.
> 
> The behavior of zipimporter is a bit wonky.
> 
> You appear to need separate zipimporter instances for each directory
> in the zip file. I'm not sure why this is. I suspect it has
> something to do with the low-level importing mechanism (implemented
> in C) operating on a per-directory basis. PEP-302 makes some
> references to this. I was not able to get a zipimporter to
> import modules outside of its immediate directory no matter how
> I specified the module name. This is why we use separate
> zipimporter instances for the ".zip/mercurial" and
> ".zip/mercurial/pure" locations.
> 
> The zipimporter documentation for Python 2.7 explicitly states that
> zipimporter does not import dynamic modules (C extensions). Yet from
> a py2exe distribution on Windows - where the .pyd files are *not*
> in the zip archive - zipimporter imported these dynamic modules
> just fine! I'm not sure if dynamic modules can't be imported from
> *inside* the zip archive or whether zipimporter looks for dynamic
> modules outside the zip archive. All I know is zipimporter does
> manage to import the .pyd files on Windows and this patch makes
> our new importer compatible with py2exe.
> 
> In the ideal world, We'd probably reimplement or fall back to parts
> of the built-in import mechanism instead of handling zipimporter
> specially. After all, if we're loading Mercurial modules via
> something that isn't the built-in file-based importer or zipimporter,
> our custom importer will likely fail because it doesn't know how to
> call into it. I'd like to think that we'll never encounter this
> in the wild, but you never know. If we do encounter it, we can
> come up with another solution.
> 
> It's worth nothing that Python 3 has moved a lot of the importing
> code from C to Python. Python 3 gives you near total control over
> the import mechanism. So in the very distant future when Mercurial
> drops Python 2 support, it's likely that our custom importer code
> can be refactored to something a bit saner.
> 
> diff --git a/mercurial/__init__.py b/mercurial/__init__.py
> --- a/mercurial/__init__.py
> +++ b/mercurial/__init__.py
> @@ -5,16 +5,17 @@
>  # This software may be used and distributed according to the terms of the
>  # GNU General Public License version 2 or any later version.
>  
>  from __future__ import absolute_import
>  
>  import imp
>  import os
>  import sys
> +import zipimport
>  
>  __all__ = []
>  
>  # Rules for how modules can be loaded. Values are:
>  #
>  #    c - require C extensions
>  #    allow - allow pure Python implementation when C loading fails
>  #    py - only load pure Python modules
> @@ -55,16 +56,46 @@ class hgimporter(object):
>  
>      def load_module(self, name):
>          mod = sys.modules.get(name, None)
>          if mod:
>              return mod
>  
>          mercurial = sys.modules['mercurial']
>  
> +        # The zip importer behaves sufficiently differently from the default
> +        # importer to warrant its own code path.
> +        loader = getattr(mercurial, '__loader__', None)
> +        if isinstance(loader, zipimport.zipimporter):
> +            def ziploader(*paths):
> +                """Obtain a zipimporter for a directory under the main zip."""
> +                path = os.path.join(loader.archive, *paths)
> +                zl = sys.path_importer_cache.get(path)
> +                if not zl:
> +                    zl = zipimport.zipimporter(path)
> +                return zl
> +
> +            try:
> +                if modulepolicy == 'py':
> +                    raise ImportError()
> +
> +                zl = ziploader('mercurial')
> +                mod = zl.load_module(name)
> +                # Unlike imp, ziploader doesn't expose module metadata that
> +                # indicates the type of module. So just assume what we found
> +                # is OK (even though it could be a pure Python module).
> +            except ImportError:
> +                if modulepolicy == 'c':
> +                    raise
> +                zl = ziploader('mercurial', 'pure')
> +                mod = zl.load_module(name)
> +
> +            sys.modules[name] = mod
> +            return mod

Wow, great. My poor idea was to disable our importer if frozen assuming that
it would use the c modules.


More information about the Mercurial-devel mailing list