Say I use a python package with the following structure:
package/
    bar.py
    foo.py
    __init__.py
bar.py contains the class bar and foo.py contains the function foo.
When I want to import the function/class do I have to write
from package.bar import bar
from package.foo import foo
or can I write
from package import bar
from package import foo
More generally asked:
Can I always omit the class/function name, when I import a module with the same name as the class/function?
No, you can't omit the module or object name. There is no mechanism that'll implicitly do such imports.
From the Zen of Python:
Explicit is better than implicit.
Note that importing the module itself should always be a valid option too. If from package import bar imported the package.bar.bar object instead, then you'd have to go out of your way to get access to package.bar module itself.
Moreover, such implicit behaviour (auto-importing the object contained in a module rather than the module itself) leads to confusing inconsistencies.
What does import package.bar add to your namespace? Would referencing package.bar be the module or the contained object?
What should happen to code importing such a name, when you rename the contained object? Does from package import bar then give you the module instead? Some operations will still succeed, leading to weird, hard to debug errors, instead of a clear ImportError exception.
Generally speaking, Python modules rarely contain just one thing. Python is not Java, modules consist of closely related groups of objects, not just one class or function.
Now, there is an inherent namespace collision in packages: from package import foo can refer either to a name set on the package module or to a nested module name. Python will first look at the package namespace in that case.
This means you can make an explicit decision to provide the foo and bar objects at the package level, in package/__init__.py:
# in package/__init__.py
from .foo import foo
from .bar import bar
Now from package import foo and from package import bar will give you those objects, masking the nested modules.
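A minimal sketch of that masking, assuming a throwaway package written to a temporary directory. The name demo_pkg is hypothetical and only chosen to avoid clashing with anything installed:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (demo_pkg is a hypothetical name).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "foo.py").write_text("def foo():\n    return 'foo called'\n")
(pkg / "bar.py").write_text("class bar:\n    pass\n")
(pkg / "__init__.py").write_text("from .foo import foo\nfrom .bar import bar\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_pkg import foo, bar

# The package attributes now name the function and the class, not the modules.
print(callable(foo), isinstance(bar, type))

# The masked submodule is still reachable through sys.modules.
foo_module = sys.modules["demo_pkg.foo"]
print(foo_module.foo is foo)
```

The submodules are not gone, only hidden behind the package-level names, which is one reason masking can surprise readers of the code.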
The general mechanism of importing objects from submodules into the package namespace is a common method of composing your public API whilst still using internal modules to group your code logically. For example, the json.JSONDecodeError exception in the Python standard library is defined in the json.decoder module, then imported into json/__init__.py. I would generally discourage masking submodules, however; put foo and bar into a module with a different name.
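The re-export pattern can be checked against the standard library itself. This is a small sketch that relies on CPython's json package, where the exception class reports json.decoder as its defining module:

```python
import json
import json.decoder

# The class records the module it was defined in.
print(json.JSONDecodeError.__module__)

# json/__init__.py re-exports the very same object under the package name.
print(json.JSONDecodeError is json.decoder.JSONDecodeError)
```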
My tree looks like
parent/
|--__init__.py
\--a.py
And the content of __init__.py is
import parent.a as _a
a = 'some string'
When I open a Python interpreter at the top level and import parent.a, I get the string instead of the module. For example, after import parent.a as the_a, type(the_a) == str.
So I assumed that import was resolving the name from the parent namespace, where it has now been overridden, and that I could instead do import parent._a as a_module. But this doesn't work: there is "No module named _a".
This is very confusing. A function can override a module with the same name, but a module cannot take on a new name and "reexport".
Is there an explanation I'm not aware of? Or is this a documented feature?
Even more confusing, if I remove the import statement in __init__.py, everything is back to normal again (import parent.a; type(parent.a) is module). But why is this different? The a name in the parent namespace is still a string.
(I ran on Python 3.5.3 and 2.7.13 with the same results)
In an import statement, the module reference never uses attribute lookups. The statements
import parent.a # as ...
and
from parent.a import ... # as ...
will always look for parent.a in the sys.modules namespace before trying to further initiate module loading from disk.
However, for from ... import name statements, Python does look at attributes of the resolved module to find name, before looking for submodules.
Module globals and the attributes on a module object are the same thing. On import, Python adds submodules as attributes (so globals) to the parent module, but you are free to overwrite those attributes, as you did in your code. However, when you then use an import with the parent.a module path, attributes do not come into play.
From the Submodules section of the Python import system reference documentation:
When a submodule is loaded using any mechanism [...] a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
Your import parent.a as _a statement adds two names to the parent namespace; first a is added pointing to the parent.a submodule, and then _a is also set, pointing to the same object.
Your next line replaces the name a with a binding to the 'some string' object.
The Searching section of the same documentation details how Python goes about finding a module when you import:
To begin the search, Python needs the fully qualified name of the module [...] being imported.
[...]
This name will be used in various phases of the import search, and it may be the dotted path to a submodule, e.g. foo.bar.baz. In this case, Python first tries to import foo, then foo.bar, and finally foo.bar.baz. If any of the intermediate imports fail, a ModuleNotFoundError is raised.
then further on
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if foo.bar.baz was previously imported, sys.modules will contain entries for foo, foo.bar, and foo.bar.baz. Each key will have as its value the corresponding module object.
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. [...] If the module name is missing, Python will continue searching for the module.
So when trying to import parent.a all that matters is that sys.modules['parent.a'] exists. sys.modules['parent'].a is not consulted.
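A short sketch of the sys.modules cache described in those quotes, using the standard library's xml.dom purely as a convenient dotted module:

```python
import sys
import xml.dom  # any dotted import would do

# Every intermediate package gets its own entry in the cache.
print("xml" in sys.modules, "xml.dom" in sys.modules)

# The parent attribute and the cache entry start out as the same object,
# but only the cache entry is consulted when resolving the module path.
print(sys.modules["xml"].dom is sys.modules["xml.dom"])
```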
Only from module import ... would ever look at attributes. From the import statement documentation:
The from form uses a slightly more complex process:
1. find the module specified in the from clause, loading and initializing it if necessary;
2. for each of the identifiers specified in the import clauses:
   1. check if the imported module has an attribute by that name
   2. if not, attempt to import a submodule with that name and then check the imported module again for that attribute
   [...]
So from parent import _a would work, as would from parent import a, and you'd get the parent.a submodule and the 'some string' object, respectively.
Note that sys.modules is writable. If you must have import parent._a work, you can alter sys.modules directly:
import sys
import parent.a  # make sure the submodule is loaded and cached first
sys.modules['parent._a'] = sys.modules['parent.a']  # make parent._a an alias for parent.a
import parent._a  # works now
I think I have a coherent understanding of this problem now, just documenting my findings in case others run into this.
What Martijn said above is mostly true. Expanding on that answer: import parent.a as _a is a two-step process. The first step is the module lookup of parent.a, which never goes through attribute lookup; the module is registered in sys.modules and then bound to the attribute a on parent. In fact, this is all you get if you only use import parent.a. This part is described thoroughly by the previous answer.
The second part, as _a, does an attribute lookup of parent.a and binds the result to the name _a. So to answer my original question: when I start an interactive Python interpreter outside the package, parent.a has already been overwritten with the string in __init__.py, so import parent.a as the_a; the_a gets me the string. In fact, this is the same as import parent.a; parent.a. Both the_a and parent.a are the results of attribute lookup. I can still get the submodule through parent._a or sys.modules["parent.a"].
To answer my follow up question:
Even more confusing, if I remove the import statement in __init__.py, everything is back to normal again (import parent.a; type(parent.a) is module). But why is this different? The a name in the parent namespace is still a string.
This is because when I import parent.a in the outside interactive Python interpreter, Python first evaluates __init__.py, which sets parent.a to a string. But the import hasn't finished yet: it goes on to import the submodule parent.a, and since we are still inside the import machinery, no attribute lookups happen, so the correct submodule is found. When all this is done, Python binds the submodule to a on parent, overwriting the string that had shadowed the submodule and making it all correct again.
This sounds very confusing, but remember (https://docs.python.org/3/reference/import.html#submodules):
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
An import parent.a first runs all the module set-up code, and then binds the name.
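A sketch of the follow-up case, where __init__.py only assigns the string and does not import the submodule. The package name demo_parent_plain is hypothetical and the files live in a temporary directory:

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "demo_parent_plain"
pkg.mkdir()
(pkg / "a.py").write_text("")
(pkg / "__init__.py").write_text("a = 'some string'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_parent_plain

# Only __init__.py has run so far, so the attribute is the string.
before = type(demo_parent_plain.a).__name__
print(before)

import demo_parent_plain.a

# Loading the submodule rebinds the attribute on the parent package.
after = isinstance(demo_parent_plain.a, types.ModuleType)
print(after)
```

The ordering is the whole story here: whichever binding happens last wins, and in this layout the submodule binding happens after __init__.py has finished.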
If I create a package named foo that imports bar, why is bar visible under foo as foo.bar when I import foo in another module? Is there a way to prevent this; to keep bar hidden so as not to clutter the namespace?
Import bar wherever you use it, rather than globally
If bar is used only inside a function, import it there:
def func():
    import bar
    ...
Or even,
if __name__ == '__main__':
    import bar
    my_main(bar)
Or if you love classes,
class Fubar():
    def __init__(self):
        import bar
        self.bar = bar
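A runnable sketch of the function-local variant, assuming a hypothetical module foo that is built in memory with types.ModuleType instead of being loaded from a file, and using math as a stand-in for bar:

```python
import types

# Source of a hypothetical module "foo" whose only import is inside a function.
source = '''
def area(radius):
    import math  # bound as a local name, only while area() runs
    return math.pi * radius ** 2
'''
foo = types.ModuleType("foo")
exec(source, foo.__dict__)

print(foo.area(1.0))

# The module namespace never gains a "math" attribute.
print(hasattr(foo, "math"))
```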
Imports in Python are really just another form of name assignment. There is really no difference between an object that has been imported into foo and one that has been defined in foo - they are both visible internally and externally in exactly the same way. So no, there is no way to prevent this.
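To illustrate the point about name assignment, here is a sketch with a hypothetical module foo built in memory, again using math as a stand-in for bar:

```python
import math
import types

source = '''
import math

def area(radius):
    return math.pi * radius ** 2
'''
foo = types.ModuleType("foo")
exec(source, foo.__dict__)

# The imported module and the defined function are both plain attributes.
print(foo.math is math)
public_names = sorted(name for name in vars(foo) if not name.startswith("__"))
print(public_names)
```

A common convention, though not an access restriction, is to import under a leading-underscore alias such as import math as _math, which at least keeps the name out of from foo import * when no __all__ is defined.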
I don't really see how this is cluttering the namespace, though. You've still only imported one name, foo, into your other module.
TL;DR: Python Imports create named bindings for pieces of code so they can be referenced and used.
An import is essentially binding a piece of code to a name. So the namespace should always reflect what has been imported. If you hide that you may end up causing unexpected problems for someone else or yourself.
If you are importing the wrong modules, importing modules you don't use, or have a ton of imports because you have 10 classes in one file, you should consider fixing the underlying issue(s) rather than trying to hide them by messing with how modules are imported.
I have a python source file with a class defined in it, and a class from another module imported into it. Essentially, this structure:
from parent import SuperClass
from other import ClassA
class ClassB(SuperClass):
    def __init__(self): pass
What I want to do is look in this module for all the classes defined in there, and only to find ClassB (and to overlook ClassA). Both ClassA and ClassB extend SuperClass.
The reason for this is that I have a directory of plugins which are loaded at runtime, and I get a full list of the plugin classes by introspecting on each .py file and loading the classes which extend SuperClass. In this particular case, ClassB uses the plugin ClassA to do some work for it, so is dependent upon it (ClassA, meanwhile, is not dependent on ClassB). The problem is that when I load the plugins from the directory, I get 2 instances of ClassA, as it gets one from ClassA's file, and one from ClassB's file.
For packages there is the approach:
__all__ = ['module_a', 'module_b']
to explicitly list the modules that you can import, but this lives in the __init__.py file, and each of the plugins is a .py file not a directory in its own right.
The question, then, is: can I limit access to the classes in a .py file, or do I have to make each one of them a directory with its own init file? Or, is there some other clever way that I could distinguish between these two classes?
You meant "for packages there is the approach...". Actually, that works for every module (__init__.py is a module, just with special semantics). Use __all__ inside the plugin modules and that's it.
But remember: __all__ only limits what you import using from xxxx import *; you can still access the rest of the module, and there's no way to avoid that using the standard Python import mechanism.
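A small sketch of that limitation. The module demo_plugin is hypothetical; it is built in memory and registered in sys.modules only so that the star import can find it:

```python
import sys
import types

# A hypothetical plugin module built in memory for illustration.
plugin = types.ModuleType("demo_plugin")
exec(
    "__all__ = ['ClassB']\n"
    "class ClassA: pass\n"
    "class ClassB: pass\n",
    plugin.__dict__,
)
sys.modules["demo_plugin"] = plugin  # make it importable by name

namespace = {}
exec("from demo_plugin import *", namespace)

# Only the names listed in __all__ arrive through the star import.
print("ClassB" in namespace, "ClassA" in namespace)

# Explicit attribute access is unaffected by __all__.
print(plugin.ClassA)
```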
If you're using some kind of active introspection technique (e.g. exploring the namespace in the module and then importing classes from it), you could check if the class comes from the same file as the module itself.
You could also implement your own import mechanism (using importlib, for example), but that may be overkill...
Edit: for the "check if the class comes from the same module" part:
Say that I have two modules, mod1.py:
class A(object):
    pass
and mod2.py:
from mod1 import A
class B(object):
    pass
Now, if I do:
from mod2 import *
I've imported both A and B. But...
>>> A
<class 'mod1.A'>
>>> B
<class 'mod2.B'>
As you can see, the classes carry information about where they originated. And you can check it right away:
>>> A.__module__
'mod1'
>>> B.__module__
'mod2'
Using that information you can discriminate them easily.
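One way that check could look in a plugin loader, as a sketch only. The helper classes_defined_in is a hypothetical name, and mod1 and mod2 are built in memory to mirror the two files above:

```python
import inspect
import types

# Two hypothetical modules built in memory, mirroring mod1.py and mod2.py.
mod1 = types.ModuleType("mod1")
exec("class A(object):\n    pass\n", mod1.__dict__)

mod2 = types.ModuleType("mod2")
mod2.A = mod1.A  # stands in for "from mod1 import A"
exec("class B(object):\n    pass\n", mod2.__dict__)


def classes_defined_in(module):
    """Return the classes whose __module__ matches the module's own name."""
    return [
        obj
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if obj.__module__ == module.__name__
    ]


print([cls.__name__ for cls in classes_defined_in(mod2)])
```

For the plugin scenario in the question, the same filter could be combined with an issubclass check against SuperClass, so that only locally defined plugin classes are collected.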
I am new to python and found that I can import a module without importing any of the classes inside it. I have the following structure --
myLib/
__init__.py
A.py
B.py
driver.py
Inside driver.py I do the following --
import myLib
tmp = myLib.A()
I get the following error trying to run it.
AttributeError: 'module' object has no attribute A
Eclipse does not complain when I do this, in fact the autocomplete shows A when I type myLib.A.
What does it mean when I import a module and not any of the classes inside it?
Thanks
P
Python is not Java. A and B are not classes. They are modules. You need to import them separately. (And myLib is not a module but a package.)
The modules A and B might themselves contain classes, which might or might not be called A and B. You can have as many classes in a module as you like - or even none at all, as it is quite possible to write a large Python program with no classes.
To answer your question though, importing myLib simply places the name myLib inside your current namespace. Anything in __init__.py will be executed: if that file itself defines or imports any names, they will be available as attributes of myLib.
If you do from myLib import A, you have now imported the module A into the current namespace. But again, any of its classes still have to be referenced via the A name: so if you do have a class A there, you would instantiate it via A.A().
A third option is to do from myLib.A import A, which does import the class A into your current namespace. In this case, you can just call A() to instantiate the class.
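The three forms can be compared side by side. This sketch assumes a throwaway package named demo_mylib (a hypothetical stand-in for myLib) with an empty __init__.py and a module A.py that defines a class A:

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "demo_mylib"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "A.py").write_text("class A:\n    pass\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_mylib

# An empty __init__.py binds nothing, so the package has no attribute A yet.
had_a = hasattr(demo_mylib, "A")
print(had_a)

from demo_mylib import A as module_A    # the module A
from demo_mylib.A import A as class_A   # the class A inside that module

print(isinstance(module_A, types.ModuleType))
print(module_A.A is class_A)
```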
You need to do
from myLib import A
because A is not an attribute defined by __init__.py inside myLib.
When you do import myLib, only __init__.py is executed.
See my answer.
About packages
I thought about this for a while and can't think of a better title, sorry.
I'm newish to Python, and (like many others, it seems) I just can't get my head around import.
I think I understand 'modules' and 'packages', classes and attributes and all that. It's one specific behavior I need clarified.
Say I have a file, foo.py. It has one line in it:
x = 1
If, in another file, I import foo, I can reference x. And, wonderfully, in yet another file I can import foo and now those two files can share x. Leaving classes out of the discussion for simplicity, I believe this is the pythonic way to share attributes between files.
Here's the question: Is it fair to say, when I import foo, that foo.py itself is (for lack of a better metaphor) secretly instantiated by the interpreter?
I realize that if I define a class in a module, it follows the traditional rules and only becomes instantiated if I explicitly do so. But the Python interpreter (via the import statement) instantiating an instance of my module in the global namespace is the only way I can explain the attribute-sharing behavior.
Is this true? Semi-true? Or am I wandering with the Sleestaks in the Land of the Lost?
When you import a module:
if the module has not been previously imported, the file is parsed into a module object, which is added to sys.modules under a key that is the import path from the pythonpath to your module
that module object (or some member thereof) is aliased in the importing namespace, the alias and object being referenced being determined by the specific form of import you used
So when you import foo, the interpreter checks sys.modules for something registered with the name foo. If it finds it, it provides a label foo in the local namespace for the foo module. If it doesn't, it searches down the pythonpath until it finds a foo module, parses that to a module object, adds that object to sys.modules, and adds a label in the local namespace for that module object.
import foo as foof does the same thing, only the local namespace label created is foof. from foo import x follows the same process up to the point of creating a label and reference in the local namespace, instead providing a label x in the namespace for the attribute x from the foo module. from foo import x as foox just combines the 2 ideas.
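A sketch of those import forms and of the sharing behaviour from the question. To stay self-contained, the stand-in for foo.py is a module object registered in sys.modules under the hypothetical name demo_foo:

```python
import sys
import types

# Stand-in for foo.py containing "x = 1", registered under a hypothetical name.
demo_foo = types.ModuleType("demo_foo")
demo_foo.x = 1
sys.modules["demo_foo"] = demo_foo

import demo_foo as first
import demo_foo as second
from demo_foo import x as foox

# Every import of the same name yields the one cached module object.
print(first is second is sys.modules["demo_foo"])

# Rebinding the attribute is visible through every reference to the module,
# but "from ... import" copied the old binding, so foox does not follow.
first.x = 2
print(second.x, foox)
```

This is why sharing state through import foo and foo.x works across files, while from foo import x gives each importer its own name that no longer tracks later rebinding.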
With classes, you can actually poke around this whole system by crawling up and down the tree using the __module__ attribute.
The import creates an instance of a "module" object. It is worth knowing that this is created only the first time the module is imported. On subsequent imports you get a reference to the original. You can create your own module objects on the fly with a bit of introspection.
import glob # Import any python module
moduleType = type(glob)
onTheFly = moduleType("OnTheFly", "Docstring for this module")
Although there isn't much benefit to creating these.
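The same type is also exposed directly as types.ModuleType, which avoids borrowing it from an existing module. A minimal sketch, with on_the_fly as a hypothetical module name:

```python
import sys
import types

on_the_fly = types.ModuleType("on_the_fly", "Docstring for this module")
on_the_fly.x = 1  # attributes can be set like on any other module

print(type(on_the_fly) is type(sys))
print(on_the_fly.__name__, on_the_fly.__doc__)

# The object is not importable by name unless it is added to sys.modules.
print("on_the_fly" in sys.modules)
```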
Yes, indeed it's true. If you execute import foo, a module object foo is instantiated, and the contents of your file, e.g. a class bar, are added as members of that object.