Limit which classes in a .py file are importable from elsewhere - python

I have a python source file with a class defined in it, and a class from another module imported into it. Essentially, this structure:
from parent import SuperClass
from other import ClassA
class ClassB(SuperClass):
    def __init__(self): pass
What I want to do is look in this module for all the classes defined in there, and only to find ClassB (and to overlook ClassA). Both ClassA and ClassB extend SuperClass.
The reason for this is that I have a directory of plugins which are loaded at runtime, and I get a full list of the plugin classes by introspecting on each .py file and loading the classes which extend SuperClass. In this particular case, ClassB uses the plugin ClassA to do some work for it, so is dependent upon it (ClassA, meanwhile, is not dependent on ClassB). The problem is that when I load the plugins from the directory, I get 2 instances of ClassA, as it gets one from ClassA's file, and one from ClassB's file.
For packages there is the approach:
__all__ = ['module_a', 'module_b']
to explicitly list the modules that you can import, but this lives in the __init__.py file, and each of the plugins is a .py file not a directory in its own right.
The question, then, is: can I limit access to the classes in a .py file, or do I have to make each one of them a directory with its own init file? Or, is there some other clever way that I could distinguish between these two classes?

You meant "for packages there is the approach...". Actually, that works for every module (__init__.py is a module, just with special semantics). Use __all__ inside the plugin modules and that's it.
But remember: __all__ only limits what you import using from xxxx import *; you can still access the rest of the module, and there's no way to avoid that using the standard Python import mechanism.
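A minimal sketch of that limitation, using a throwaway in-memory module instead of a real plugin file. The module name demo_plugin and its two classes are hypothetical and exist only for illustration:

```python
import sys
import types

# Hypothetical stand-in for a plugin file that declares __all__.
demo = types.ModuleType("demo_plugin")
exec(
    "__all__ = ['ClassB']\n"
    "class ClassA(object):\n"
    "    pass\n"
    "class ClassB(object):\n"
    "    pass\n",
    demo.__dict__,
)
sys.modules["demo_plugin"] = demo  # make it importable by name

# A star-import only copies the names listed in __all__ ...
namespace = {}
exec("from demo_plugin import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))
print(star_names)  # ['ClassB']

# ... but the other class is still reachable as a module attribute.
print(hasattr(demo, "ClassA"))  # True
```

So __all__ helps with from xxxx import *, but a loader that walks the module namespace will still see ClassA.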
If you're using some kind of active introspection technique (eg. exploring the namespace in the module and then importing classes from it), you could check if the class comes from the same file as the module itself.
You could also implement your own import mechanism (using importlib, for example), but that may be overkill...
Edit: for the "check if the class comes from the same module":
Say that I have two modules, mod1.py:
class A(object):
    pass
and mod2.py:
from mod1 import A
class B(object):
    pass
Now, if I do:
from mod2 import *
I've imported both A and B. But...
>>> A
<class 'mod1.A'>
>>> B
<class 'mod2.B'>
As you can see, the classes carry information about where they originated. And you can actually check it right away:
>>> A.__module__
'mod1'
>>> B.__module__
'mod2'
Using that information you can discriminate them easily.
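Applied to the plugin loader from the question, the check could look roughly like the sketch below. The helper name classes_defined_in and the in-memory modules plugin_a and plugin_b are assumptions made for this example; a real loader would pass the modules it imported from the plugin directory.

```python
import inspect
import types


def classes_defined_in(module, base):
    """Return subclasses of `base` that were defined in `module` itself."""
    return [
        obj
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base)
        and obj is not base
        and obj.__module__ == module.__name__  # skip classes imported from elsewhere
    ]


class SuperClass(object):
    pass


# Hypothetical plugin modules, built in memory to keep the sketch self-contained.
plugin_a = types.ModuleType("plugin_a")
plugin_a.SuperClass = SuperClass
exec("class ClassA(SuperClass):\n    pass\n", plugin_a.__dict__)

plugin_b = types.ModuleType("plugin_b")
plugin_b.SuperClass = SuperClass
plugin_b.ClassA = plugin_a.ClassA  # mimics "from other import ClassA"
exec("class ClassB(SuperClass):\n    pass\n", plugin_b.__dict__)

print([c.__name__ for c in classes_defined_in(plugin_a, SuperClass)])  # ['ClassA']
print([c.__name__ for c in classes_defined_in(plugin_b, SuperClass)])  # ['ClassB']
```

With this filter each plugin class is reported once, by the module that defines it.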

Related

Python 3.5 "ImportError: cannot import name 'SomeName'"

I am trying to implement a small library for Python 3.5 but keep struggling with how to correctly handle the structuring of the packages/modules and how to get the imports to work.
I keep running into the problem where python complains of being unable to import some name with an error like
ImportError: cannot import name 'SubClass1'
This seems to happen when "SubClass1" needs to import some other module but that other module also needs to know about SubClass1 (a cyclic import).
I need the cyclic import in my library because the base class has a factory method that creates the proper subclass instances (there are also other situations where cyclic imports are needed, e.g. checking the type of a function argument needs the import of where that type is defined, but that module may itself need the class where that check is done: another cyclic dependency!)
Here is example code:
The root directory contains the subdirectory dir1. The directory dir1 contains an empty file __init__.py, a file baseclass.py and a file subclass1.py.
The file ./dir1/subclass1.py contains:
from . baseclass import BaseClass
class SubClass1(BaseClass):
    pass
The file ./dir1/baseclass.py contains:
from . subclass1 import SubClass1
class BaseClass(object):
    def make(self,somearg):
        # .. some logic to decide which subclass to create
        ret = SubClass1()
        # .. which gets eventually returned by this factory method
        return ret
The file ./test1.py contains:
from dir1.subclass1 import SubClass1
sc1 = SubClass1()
This results in the following error:
Traceback (most recent call last):
File "test1.py", line 1, in <module>
from dir1.subclass1 import SubClass1
File "/data/johann/tmp/python1/dir1/subclass1.py", line 1, in <module>
from . baseclass import BaseClass
File "/data/johann/tmp/python1/dir1/baseclass.py", line 1, in <module>
from . subclass1 import SubClass1
ImportError: cannot import name 'SubClass1'
What is the standard/best way to solve this problem, ideally in a way that is backwards compatible to python 2.x and python 3 up to version 3.2?
I have read elsewhere that importing the module instead of something from a module may help here but I do not know how to just import the module (e.g. subclass1) in a relative way because "import . subclass1" or similar does not work.
Your issue is caused by a circular import. The baseclass module is trying to import SubClass1 from the subclass1 module, but subclass1 is trying to import BaseClass right back. You get an ImportError because the classes haven't been defined yet when the import statements are running.
There are a few ways to solve the issue.
One option would be to change your style of import. Instead of importing the classes by name, just import the modules and look up the names as attributes later on.
from . import baseclass
class SubClass1(baseclass.BaseClass):
    pass
And:
from . import subclass1
class BaseClass:
    def make(self,somearg):
        # ...
        ret = subclass1.SubClass1()
Because SubClass1 needs to be able to use BaseClass immediately at definition time, this code may still fail if the baseclass module is imported before subclass1. So it's not ideal.
Another option would be to change baseclass to do its import below the definition of BaseClass. This way the subclass module will be able to import the name when it needs to:
class BaseClass:
    def make(self,somearg):
        # .. some logic to decide which subclass to create
        ret = SubClass1()
from .subclass1 import SubClass1
This is not ideal because the normal place to put imports is at the top of the file. Putting them elsewhere makes the code more confusing. You may want to put a comment up at the top of the file explaining why you're delaying the import if you go this route.
Another option may be to combine your two modules into a single file. Python doesn't require each class to have its own module like some other languages do. When you have tightly coupled classes (like the ones in your example), it makes a lot of sense to put them all in one place. This lets you avoid the whole issue, since you don't need any imports at all.
Finally, there are some more complicated solutions, like dependency injection. Rather than the base class needing to know about the subclasses, each subclass could register itself by calling some function and passing a reference to itself. For example:
# no imports of subclasses!
class BaseClass:
    subclasses = []
    def make(self, somearg):
        for sub in self.subclasses:
            if sub.accepts(somearg):
                return sub()
        raise ValueError("no subclass accepts value {!r}".format(somearg))
    @classmethod
    def register(cls, sub):
        cls.subclasses.append(sub)
        return sub # return the class so it can be used as a decorator!
And in subclass1.py:
from .baseclass import BaseClass
@BaseClass.register
class SubClass1(BaseClass):
    @classmethod
    def accepts(cls, somearg):
        # put logic for picking this subclass here!
        return True
This style of programming is a bit more complicated, but it can be nice since it's easier to extend than a version where BaseClass needs to know about all of the subclasses up front. There are a variety of ways you can implement this style of code; using a register function is just one of them. One nice thing about it is that it doesn't strictly require inheritance (so you could register a class that doesn't actually inherit from BaseClass if you wanted to). If you are only dealing with actual inheriting subclasses, you might want to consider using a metaclass that does all the registration of subclasses for you automatically.
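A rough sketch of the metaclass variant, written in Python 3 syntax (so it is not the Python 2 compatible form the question asks about). The metaclass name RegisteringMeta is hypothetical, and make is turned into a classmethod here purely to keep the example short:

```python
class RegisteringMeta(type):
    """Metaclass that records every subclass of the first class it builds."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if not hasattr(cls, "subclasses"):
            # First class created with this metaclass: the root. Start the list.
            cls.subclasses = []
        else:
            # Any later class inherits the list and adds itself to it.
            cls.subclasses.append(cls)


class BaseClass(metaclass=RegisteringMeta):
    @classmethod
    def make(cls, somearg):
        for sub in cls.subclasses:
            if sub.accepts(somearg):
                return sub()
        raise ValueError("no subclass accepts value {!r}".format(somearg))


class SubClass1(BaseClass):
    @classmethod
    def accepts(cls, somearg):
        # placeholder rule, assumed for the example
        return somearg == 1


print(BaseClass.subclasses)             # [<class '__main__.SubClass1'>] when run as a script
print(type(BaseClass.make(1)).__name__)  # SubClass1
```

The base module still never imports its subclasses; each subclass only has to be imported somewhere before make is called.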

Define cross-module/global variable from __init__.py

I am trying to simplify my package by making importing straightforward (like the requests module does).
For that, I thought using __init__.py would be the best choice, since __init__.py is run when the user imports my package. I then add a little code inside it, which imports an object from a specific module.
Please imagine that my package and the class object both have the same name:
# __init__.py
from index import packagename # This is class object
myclass = myclass # This is just example to be substituted, I know it has no effect
print(myclass)
Whenever the package is imported, Python prints myclass:
<class 'packagename.index.packagename'> # "packagename" in the beginning is my package, "packagename" in the end is class object
However, the package name is not bound to an instance of the class; it is still the module:
<module 'packagename' from 'packagename/__init__.pyc'>
From my research I couldn't find any reliable answer to my question yet (apologies if I missed something). In short: how can I define a variable in __init__.py so that it can be used by the user?
That is, whenever the user imports packagename, the variable packagename should be an instance of the class object and not a module.
Thanks!
Found it out.
The requests package uses the same technique as I did. Its functions are imported in __init__.py right away; for example:
from .api import request, get, head, post, patch, put, delete, options
However, these functions are only imported into the package's namespace, so they still have to be called through the package:
requests.post
So in my case the package and the class object have the same name, but unfortunately I don't think one can replace the other in a neat way.
Suppose my package were called Apple and the class object inside the index.py module were called Apple too.
If I imported the Apple class object in __init__.py, I would call it like this:
Apple.Apple (where the first Apple is the name of the package and the second Apple is the name of the class object)
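The Apple layout can be tried out with a throwaway package written to a temporary directory, as in this sketch. The file contents are assumptions chosen to mirror the description (an index.py holding the class, and an __init__.py that re-exports it):

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    package_dir = os.path.join(root, "Apple")
    os.mkdir(package_dir)
    # index.py holds the class that shares the package's name.
    with open(os.path.join(package_dir, "index.py"), "w") as handle:
        handle.write("class Apple(object):\n    pass\n")
    # __init__.py re-exports the class, in the style of requests.
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write("from .index import Apple\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()

    import Apple as apple_package           # the package (a module object)
    from Apple import Apple as AppleClass   # the class re-exported by __init__.py

    sys.path.remove(root)

print(apple_package.Apple is AppleClass)  # True
print(AppleClass.__module__)              # Apple.index
```

The name Apple bound by import Apple is always the package; the class is one attribute lookup away, or can be bound directly with from Apple import Apple.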

How does Python inheritance detection work?

I have a base class and several subclasses that inherit from it. I am trying to detect dynamically which subclasses inherit from the base class. I am currently doing it by dynamically importing all the subclasses in the base class __init__(), and then using the __subclasses__() method.
I have the following file structure:
proj/
|-- __init__.py
|-- base.py
`-- sub
    |-- __init__.py
    |-- sub1.py
    |-- sub2.py
    `-- sub3.py
base.py:
import importlib
class Base(object):
    def __init__(self):
        importlib.import_module('sub.sub1')
        importlib.import_module('sub.sub2')
        importlib.import_module('sub.sub3')
    @classmethod
    def inheritors(cls):
        print(cls.__subclasses__())
b = Base()
b.inheritors()
sub1.py:
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from base import Base
class Sub1(Base):
    pass
sub2.py:
import sys
import os
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from base import Base
class Sub2(Base):
    pass
and finally sub3.py:
import sys
import os
class Sub3(object):
    pass
You will notice that sub.sub1.Sub1 and sub.sub2.Sub2 both inherit from base.Base while sub.sub3.Sub3 does not.
When I open IPython3, and run import base I get the following output:
In [1]: import base
[<class 'sub.sub1.Sub1'>, <class 'sub.sub2.Sub2'>]
The output above is exactly as I would expect it to be. It gets weird when I run base.py using Python command line:
python3 base.py
[<class 'sub.sub2.Sub2'>]
[]
Now I think that I understand that there are two prints in the second case because the Python importer initially does not see base.py in the sys.modules global variable, so when a subclass is imported it will import base.py again and the code will be executed a second time. This explanation does not explain why the first time it prints [<class 'sub.sub2.Sub2'>] and not [<class 'sub.sub1.Sub1'>] as sub.sub1.Sub1 is imported first, and it does not explain why only sub.sub2.Sub2 appears in the __subclasses__() while sub.sub1.Sub1 does not.
Any explanation that would help me understand how Python works in this regard will be greatly appreciated!
EDIT: I would like to run the module using python base.py, so maybe I can be pointed in the correct direction for that?
You made a knot.
A complicated, unneeded knot. I could figure it out, but I don't know if I can keep it in mind to explain what is going on in a clear way :-)
But one thing first: this has less to do with "inheritance detection" and everything to do with the import system, which you tied in a complicated knot.
So, you get the unexpected result because when you do python base.py, the contents of base are recorded as the module named __main__ in sys.modules.
Ordinarily, Python will never import the same module and run its code twice: upon finding an import statement that tries to import an already-loaded module, it just creates a new variable pointing to the existing module. If that module has not finished executing its body yet, not all of its classes or variables will be visible at the point of the second import statement. Calls to importlib do no better; they just don't automate the variable binding part. When you do circular imports, change the import path, and import a module named base from another file, Python does not know this is the same base that is __main__. So the new one gets a fresh import and a second entry in sys.modules, as base.
If you just print the __class__ in your inheritors method, it will be clear:
    @classmethod
    def inheritors(cls):
        print("At class {}. Subclasses: {}".format(__class__, cls.__subclasses__()))
Then you will see that "base.Base" has the "sub2" subclass and __main__.Base has no subclasses.
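The "same file, two module names, two distinct classes" effect can be reproduced in isolation. The sketch below loads one temporary source file under two names with importlib.util; the names script_as_main and base are stand-ins chosen for this example, not what Python itself uses internally:

```python
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "base.py")
    with open(path, "w") as handle:
        handle.write("class Base(object):\n    pass\n")

    def load_as(name):
        """Execute the file at `path` as a fresh module called `name`."""
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module

    as_main = load_as("script_as_main")  # stands in for running the file as a script
    as_base = load_as("base")            # stands in for "from base import Base"

# Same source code, but two unrelated class objects.
print(as_main.Base is as_base.Base)  # False
print(as_main.Base.__module__)       # script_as_main
print(as_base.Base.__module__)       # base
```

A subclass of one Base is not a subclass of the other, which is why the two __subclasses__() lists differ.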
Now, let me try to put the timeline for it:
1. base.py is imported as __main__ and runs up to the line b = Base(). At this point the __init__ method of Base will import the submodules.
2. Submodule sub1 is run, changes sys.path, and re-imports base.py as the base module.
3. The contents of the base module are run until the __init__ method in base.Base is met; therein, it imports sub.sub1, and Python finds out this module has already been imported and is in sys.modules. Its code has not been completed, and the Sub1 class is not yet defined, though.
4. Inside the sub1 import of base, __init__ tries to import sub.sub2. That is a new module to Python, so it is imported.
5. On the import of sub2, when import base is met, Python recognizes the module as imported already (although, again, not all the initialization code is complete); it just brings the name alias into sub2's globals, and keeps on.
6. Sub2 is defined as a subclass of base.Base.
7. The sub.sub2 import finishes, and Python resumes the __init__ method from step (4); Python imports sub.sub3 and resumes to the b.inheritors() call (from base, not from __main__). At this point the only subclass of base.Base is Sub2, and that is printed.
8. The importing of base.py as base finishes, and Python resumes executing the body of sub.sub1: class Sub1 is defined as a subclass of base.Base.
9. Python resumes the __main__.Base.__init__ execution and imports sub.sub2, but it has already been run; the same goes for sub.sub3.
10. __main__.Base.inheritors is called in __main__, and prints no subclasses.
And that is the end of a complicated history.
What you should be doing
First: if you need to do the sys.path.append trickery, there is something wrong with your package. Let your package be proj, and point proj.__init__ to import base if you want that to be run (and dynamically import the other modules), but stop fiddling with sys.path to find things in your own package.
Second: the cls.__subclasses__ call is of little use, as it will only tell you about the immediate subclasses of cls; if there is a grandchild subclass, it will go unnoticed.
The most usual pattern is to keep a registry of the subclasses of your Base, and as they are created, just add the new classes to this record. This can be done with a metaclass in Python < 3.6, or with the __init_subclass__ method on Python 3.6 and later.
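A minimal sketch of the __init_subclass__ registry (Python 3.6+). The attribute name registry and the GrandChild class are assumptions added for illustration:

```python
class Base:
    registry = []  # every direct or indirect subclass ends up here

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Base.registry.append(cls)


class Sub1(Base):
    pass


class Sub2(Base):
    pass


class GrandChild(Sub1):
    pass


# __subclasses__() only reports the immediate children ...
print([c.__name__ for c in Base.__subclasses__()])  # ['Sub1', 'Sub2']
# ... while the registry also contains the grandchild.
print([c.__name__ for c in Base.registry])          # ['Sub1', 'Sub2', 'GrandChild']
```

The registry still only knows about classes whose modules have actually been imported, so the plugin modules must be imported before it is consulted.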

what does importing a module in python mean?

I am new to python and found that I can import a module without importing any of the classes inside it. I have the following structure --
myLib/
    __init__.py
    A.py
    B.py
driver.py
Inside driver.py I do the following --
import myLib
tmp = myLib.A()
I get the following error trying to run it.
AttributeError: 'module' object has no attribute A
Eclipse does not complain when I do this, in fact the autocomplete shows A when I type myLib.A.
What does it mean when I import a module and not any of the classes inside it?
Thanks
P
Python is not Java. A and B are not classes. They are modules. You need to import them separately. (And myLib is not a module but a package.)
The modules A and B might themselves contain classes, which might or might not be called A and B. You can have as many classes in a module as you like - or even none at all, as it is quite possible to write a large Python program with no classes.
To answer your question though, importing myLib simply places the name myLib inside your current namespace. Anything in __init__.py will be executed: if that file itself defines or imports any names, they will be available as attributes of myLib.
If you do from myLib import A, you have now imported the module A into the current namespace. But again, any of its classes still have to be referenced via the A name: so if you do have a class A there, you would instantiate it via A.A().
A third option is to do from myLib.A import A, which does import the class A into your current namespace. In this case, you can just call A() to instantiate the class.
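The three cases can be checked against a throwaway copy of the layout. This sketch writes a temporary myLib package with an empty __init__.py and an A.py that is assumed to contain a class named A (the question does not show the file's contents):

```python
import importlib
import os
import sys
import tempfile
import types

with tempfile.TemporaryDirectory() as root:
    package_dir = os.path.join(root, "myLib")
    os.mkdir(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()  # empty __init__.py
    with open(os.path.join(package_dir, "A.py"), "w") as handle:
        handle.write("class A(object):\n    pass\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()

    import myLib
    has_a_before = hasattr(myLib, "A")   # nothing in __init__.py defines A yet

    from myLib import A as module_a      # imports the submodule myLib/A.py
    from myLib.A import A as class_a     # imports the class inside that module

    sys.path.remove(root)

print(has_a_before)                               # False
print(isinstance(module_a, types.ModuleType))     # True
print(module_a.A is class_a)                      # True
```

Once the submodule has been imported it is also set as an attribute on the package, so myLib.A resolves afterwards; it refers to the module, not the class.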
You need to do
from myLib import A
because A is not an attribute of __init__.py inside myLib.
When you do import myLib, it imports __init__.py.

Python - can a class act like a module?

I'm considering a package implementation set up like this:
wordproc
    __init__.py
    _generic.py
    gedit.py
    oofice.py
    word.py
_generic.py would have a class like this:
class WordProc (object):
    def __init__ (self):
        pass
    def createNewDoc (self):
        print("createNewDoc unimplemented in current interface")
    def getWordCount (self):
        print("getWordCount unimplemented in current interface")
    # etc...
These could print out as shown, or raise errors. App-specific modules would just be copies of _generic.py with the WordProc classes deriving from _generic.WordProc. In this way, functionality could be implemented iteratively over time, with messages about unimplemented things simply raising alerts.
I'm imagining that __init__.py could look for the following things (listed in order) to figure out which module to use:
1. a wordproc module variable
2. a settings file in the path
3. a wordproc environment variable
4. a function that attempts to determine the environment
5. a default in __init__.py (probably _generic.py)
I think 3 could be a function in each app's module, or these could go into folders with particularly named environment test scripts (e.g. env.py), and __init__.py could loop over them.
I'd like then in any libraries that want to use wordproc to simply be able to do this:
import wordproc as wp
wp.createNewDoc()
etc...
What I don't know is how to have wp resolve to the proper class in the proper module as determined by __init__.py. It doesn't make sense to do this:
import wordproc.gedit as wp
This destroys the point of having __init__.py determine which module in wordproc to use. I need something like class inheritance, but on the module level.
You can achieve your desired effect by writing __init__.py like this:
1. Import the appropriate module first. See the Python docs on importlib.import_module or __import__ for help on dynamic imports.
2. Instantiate the class from which you want to export methods.
3. Assign the instance methods to locals().
# import appropriate module as mod depending on settings, environment
# using importlib.import_module, or __import__
__all__ = []
_instance = mod.WordProc()
for attr in dir(_instance):
    if not attr.startswith('_') and callable(getattr(_instance, attr)):
        locals()[attr] = getattr(_instance, attr)
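The same idea can be exercised outside a real package. In this sketch the WordProc class, the helper export_public_methods and the module name wordproc_demo are all hypothetical stand-ins; the helper copies bound methods into a namespace dict and returns their names so they can also populate __all__:

```python
import types


class WordProc(object):
    """Stand-in for the class chosen by the backend selection logic."""

    def createNewDoc(self):
        return "new document"

    def getWordCount(self):
        return 0

    def _helper(self):
        return "private"


def export_public_methods(instance, namespace):
    """Copy the instance's public bound methods into `namespace`."""
    exported = []
    for attr in dir(instance):
        value = getattr(instance, attr)
        if not attr.startswith("_") and callable(value):
            namespace[attr] = value
            exported.append(attr)
    return exported


# Stand-in for the wordproc package namespace (globals() inside __init__.py).
wp = types.ModuleType("wordproc_demo")
wp.__all__ = export_public_methods(WordProc(), wp.__dict__)

print(wp.createNewDoc())  # new document
print(wp.__all__)         # ['createNewDoc', 'getWordCount']
```

Because the copied attributes are bound methods, they keep operating on the single shared instance, so callers can use wp.createNewDoc() as if it were a plain module function.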
