[PATCH 1 of 2] match: delay import of fileset module

Matt Mackall mpm at selenic.com
Thu Apr 12 08:19:21 CDT 2012


On Thu, 2012-04-12 at 11:24 +0200, Angel Ezquerra wrote:
> On Wed, Apr 11, 2012 at 9:31 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Wed, 2012-04-11 at 21:07 +0200, Angel Ezquerra wrote:
> >> # HG changeset patch
> >> # User Angel Ezquerra <angel.ezquerra at gmail.com>
> >> # Date 1334141104 -7200
> >> # Node ID 2c32ff4561f6594957596cb711fe853c42f9bd38
> >> # Parent  063c5b7ded78b03f7b9b765dde2dade81fcfd6ac
> >> match: delay import of fileset module
> >
> > You've made a trivial 2-line patch into a difficult to review 50-line
> > patch by combining your change with unrelated code movement. Don't do
> > that. When you combine code movement or whitespace changes with
> > substantive changes, the former completely obscure the latter.
> 
> I replied to this yesterday but for some reason I don't see it on my
> Sent Mail folder. I hope I am not replying twice to this.
> 
> The change on the first patch on the series seemed too small to split
> it in two. I forgot that you receive and review tons of patches
> everyday! Sorry about that!
> 
> Assuming that the patch were correct I should have split it into two,
> first move the "import fileset" statement into _expandsets() and then
> move _expandsets() to the end of the file, right?

I have no idea why you're moving the code, but yes, if there were a
reason to do that, it needs to be a different patch.

> > This is probably not the right way to do this, as _expandsets will get
> > called on basically every operation involving files.
> 
> I find module importing in python quite mysterious. I thought that
> modules were loaded when python first parsed the function containing
> the import statement, not when it was executed.

Python is quite different from a traditional language like C in this
regard. First, the code is parsed, then the code is compiled, then the
code is executed. And the execute phase includes evaluating all the
statements in column 1, including things you might have thought were
purely declarative, like imports, defs, and class statements, in
whatever order they appear. Note that the compile phase isn't doing any
initialization, not even building function objects!

Imports statements inside of functions do nothing here, they simply get
bound inside of a function like everything else in a function body, and
then the import gets run every single time the function gets invoked and
bound to a variable inside the function scope.

Fortunately, Python actually caches imports, so the actual
initialization of modules typically only happens once.

Now we can see why circular imports break:

        -- foo.py --
        
        print "importing bar"
        import bar
        
        print "define baz"
        class baz(object):
            pass
        
        -- bar.py --
        
        print "importing foo"
        import foo
        print "define quux"
        class quux(foo.baz):
            pass
        
If we import foo first, we get:

        importing bar
        importing foo  <- no-op, foo is cached
        define quux <- exception: foo.baz not defined yet!
        
But if we import bar first, we get:

    importing foo
    importing bar <- no-op
    define baz
    define quux

Because both class statements and imports are actually executed, we can
get a nasty surprise if we load these modules in the wrong order. And
because Mercurial hacks the importer to delay imports until use.. both
orders may work in testing!

> > It's better to
> > leave match.py alone and add a delayed import of match in the
> > infrequently-used predicate in fileset.py. This also more closely
> > matches the "real" import hierarchy.
> 
> I think that is what I did on the second patch on this series. I moved
> the "import match as matchopt" statement into the subset() function.

We only need to break the loop in one place.

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list