[PATCH 1 of 5 import-refactor] hg: implement import hook for handling C/Python modules

Gregory Szorc gregory.szorc at gmail.com
Tue Nov 24 15:47:37 CST 2015


On Sun, Nov 22, 2015 at 8:31 PM, Yuya Nishihara <yuya at tcha.org> wrote:

> On Sat, 21 Nov 2015 22:14:03 -0800, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc <gregory.szorc at gmail.com>
> > # Date 1442115288 25200
> > #      Sat Sep 12 20:34:48 2015 -0700
> > # Node ID 3d32b988b6d49f95d507e456b3e627f27e816ba8
> > # Parent  df9b73d2d444ae82fe8d3fe6cf682a93b2c4a7ef
> > hg: implement import hook for handling C/Python modules
> >
> > There are a handful of modules that have both pure Python and C
> > extension implementations. Currently, setup.py copies files from
> > mercurial/pure/*.py to mercurial/ during the install process if C
> > extensions are not available. This way, "import mercurial.X" will
> > work whether C extensions are available or not.
> >
> > This approach has a few drawbacks. First, there aren't run-time checks
> > verifying the C extensions are loaded when they should be. This could
> > lead to accidental use of the slower pure Python modules. Second, the
> > C extensions aren't compatible with PyPy and running Mercurial with
> > PyPy requires installing Mercurial - you can't run ./hg from a source
> > checkout. This makes developing while running PyPy somewhat difficult.
> >
> > This patch implements a PEP-302 import hook for finding and loading the
> > modules with both C and Python implementations. When a module with dual
> > implementations is requested for import, its import is handled by our
> > import hook.
> >
> > This patch should be backwards compatible: we are merely reimplementing
> > what Python does behind the scenes when an import occurs.
> >
> > Future patches will add functionality to the import hook to plug some
> > of the aforementioned deficiencies with the import mechanism.
> >
> > diff --git a/hg b/hg
> > --- a/hg
> > +++ b/hg
> > @@ -2,16 +2,17 @@
> >  #
> >  # mercurial - scalable distributed SCM
> >  #
> >  # Copyright 2005-2007 Matt Mackall <mpm at selenic.com>
> >  #
> >  # This software may be used and distributed according to the terms of the
> >  # GNU General Public License version 2 or any later version.
> >
> > +import imp
> >  import os
> >  import sys
> >
> >  if os.environ.get('HGUNICODEPEDANTRY', False):
> >      reload(sys)
> >      sys.setdefaultencoding("undefined")
> >
> >
> > @@ -29,15 +30,59 @@ try:
> >      from mercurial import demandimport; demandimport.enable()
> >  except ImportError:
> >      import sys
> >      sys.stderr.write("abort: couldn't find mercurial libraries in [%s]\n" %
> >                       ' '.join(sys.path))
> >      sys.stderr.write("(check your install and PYTHONPATH)\n")
> >      sys.exit(-1)
> >
> > +# Install a PEP-302 custom module finder and loader that knows how to
> > +# import modules with implementations in both Python and C.
> > +
> > +# List of modules that have both Python and C implementations. See also the
> > +# set of .py files under mercurial/pure/.
> > +dualmodules = set([
> > +    'mercurial.base85',
> > +    'mercurial.bdiff',
> > +    'mercurial.diffhelpers',
> > +    'mercurial.mpatch',
> > +    'mercurial.osutil',
> > +    'mercurial.parsers',
> > +])
> > +
> > +import mercurial
> > +
> > +class hgimporter(object):
> > +    def find_module(self, name, path=None):
> > +        # We only care about modules that have both C and pure implementations.
> > +        if name in dualmodules:
> > +            return self
> > +        return None
> > +
> > +    def load_module(self, name):
> > +        mod = sys.modules.get(name, None)
> > +        if mod:
> > +            return mod
> > +
> > +        # imp.find_module doesn't support submodules (modules with ".").
> > +        # Instead you have to pass the parent package's __path__ attribute
> > +        # as the path argument.
> > +        stem = name.split('.')[-1]
> > +
> > +        # C extensions are available under mercurial.*.
> > +        # Pure Python available under mercurial.* if they are installed there
> > +        # or mercurial.pure.* if they aren't installed.
> > +        modinfo = imp.find_module(stem, mercurial.__path__)
> > +
> > +        mod = imp.load_module(name, *modinfo)
> > +        sys.modules[name] = mod
> > +        return mod
> > +
> > +sys.meta_path.insert(0, hgimporter())
>
> Should it be defined in "hg" script? I guess the hgimporter will be necessary
> for other entry scripts such as hgweb.wsgi.
>

Yuya, you are spot on here. I missed a regression: part 5 stops copying
mercurial/pure/* to mercurial/* during install, so hgweb can no longer run
from a pure install.

Mercurial WSGI apps typically do something like:

from mercurial import demandimport; demandimport.enable()
from mercurial.hgweb import hgweb
application = hgweb('/path/to/config/file')

For backwards compatibility with these deployed WSGI files, we'll need to
figure out a way to inject the custom module finder/loader into
mercurial/hgweb/__init__.py. I may end up putting the shared code in
demandimport.py. That seems like the best place for shared init-time code,
since we probably don't want to add a new file/module to the startup
sequence.
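
A minimal sketch of what such shared init-time code could look like, so that
both the hg script and hgweb can call it safely. It is not the code from this
series: DualModuleFinder and install_importer are hypothetical names, the
"pure" subdirectory fallback is an assumed layout, and it is written against
Python 3's importlib (find_spec) rather than the imp module used in the patch,
since imp has been removed from current Python releases.

```python
import os
import sys
from importlib.machinery import PathFinder

# Same names as the dualmodules set in the patch.
DUAL_MODULES = frozenset([
    'mercurial.base85',
    'mercurial.bdiff',
    'mercurial.diffhelpers',
    'mercurial.mpatch',
    'mercurial.osutil',
    'mercurial.parsers',
])


class DualModuleFinder(object):
    """Hypothetical meta path finder for modules with C and pure versions."""

    def __init__(self, dualmodules):
        self.dualmodules = frozenset(dualmodules)

    def find_spec(self, name, path=None, target=None):
        # Ignore everything except the dual-implementation submodules.
        if name not in self.dualmodules or path is None:
            return None
        # Prefer whatever lives directly in the package (e.g. a C extension).
        spec = PathFinder.find_spec(name, path)
        if spec is None:
            # Assumed layout: pure Python fallbacks in a "pure" subdirectory.
            purepath = [os.path.join(p, 'pure') for p in path]
            spec = PathFinder.find_spec(name, purepath)
        return spec


def install_importer(dualmodules=DUAL_MODULES):
    """Install the finder once; later calls return the existing instance."""
    for finder in sys.meta_path:
        if isinstance(finder, DualModuleFinder):
            return finder
    finder = DualModuleFinder(dualmodules)
    sys.meta_path.insert(0, finder)
    return finder
```

Because install_importer() is idempotent, every entry point (hg, a WSGI file,
hgweb itself) could call it without caring whether another one already did.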

I was then thinking about refactoring mercurial/hgweb/__init__.py to defer
mercurial.* module imports until hgweb() or hgwebdir() call time. Then we
can load/call demandimport at hgweb() and hgwebdir() call time.
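
The deferral idea can be illustrated with a small, generic sketch. The names
deferred and _ensure_initialized are hypothetical and not part of Mercurial;
the init callable stands in for whatever startup work is needed (enabling
demandimport, installing the import hook).

```python
import importlib

_initialized = False


def _ensure_initialized(init):
    """Run the one-time startup callable exactly once per process."""
    global _initialized
    if not _initialized:
        init()
        _initialized = True


def deferred(modname, attr, init=lambda: None):
    """Return a callable that imports modname only when it is called."""
    def factory(*args, **kwargs):
        # Startup work happens before the real module is imported.
        _ensure_initialized(init)
        mod = importlib.import_module(modname)
        return getattr(mod, attr)(*args, **kwargs)
    return factory
```

Under this scheme, mercurial/hgweb/__init__.py would expose hgweb and hgwebdir
as such deferred callables, so that importing the package pulls in no other
mercurial.* modules until a WSGI file actually calls hgweb(...).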

Since you have been hacking on hgweb, I'm curious what your thoughts are.
Are there any other Python entry points we need to be concerned about?