Why are relative imports so restrictive? - python

With this directory layout:
app/
    sub1/
        __init__.py
        module1.py
    sub2/
        __init__.py
        test.py
What I imagine importing a module does is:
create a scope (or a thread?)
run module.py in that scope
from ..sub1 import module1 is invalid when test.py is run as the top-level script,
but open('../sub1/module1.py', 'r') works!
So the file is readable, but not importable.
Start with something similar to from module import *:
exec(open('../sub1/module1.py', 'r').read())
Go further by executing the script in a specific scope, and naming that scope.
A class provides a scope, and accessing class variables is similar to accessing module variables.
# emulates: import module1 as cus
class Module:
    exec(open('../sub1/module1.py', 'r').read(), locals(), locals())
cus = Module()
cus.function_inside_module1()
The function exec(object[, globals[, locals]]) runs object with globals as its global scope and stores variables into locals (I guess).
Since the globals and locals arguments are both the locals() of class Module, it behaves like what I imagine import does.
If this works properly, a module under a module could be written as a nested class, I guess.
What kind of problems would this odd way of importing cause?
If none, why is a file readable but not importable (given the top-level restriction)?
Edit
@user2357112 sorry, I don't know how to write a multiline comment:
Would this give the behavior you asked for, loading the parent package?
class sub1:
    exec(open('../sub1/__init__.py', 'r').read(), locals(), locals())
    class Module:
        exec(open('../sub1/module1.py', 'r').read(), locals(), locals())
cus = sub1.Module()
del sub1

Relative imports are not a directory traversal mechanism. from ..a import b does not mean "go up a directory, enter the a directory, and load b.py". It means "import the b member of the a submodule of the current package's parent package". This usually looks a lot like what the directory traversal would do, but it is not the same, especially for cases involving namespace packages, custom module loaders, or sys.modules manipulation.
sub2 has no parent package. Trying to refer to a nonexistent parent package is an error. Also, if you ran test.py directly by file name, sub2 is not even considered a package at all.
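To see that the leading dots operate on package names rather than on directories, the resolution step can be reproduced with importlib.util.resolve_name. This is only a minimal sketch: the package names "app.sub2" and "sub2" stand in for the package that test.py would belong to in each situation, which is an assumption about how the script is run.

```python
import importlib.util

# test.py imported as part of the package "app": its package is "app.sub2",
# so "..sub1" names the sub1 submodule of the parent package "app".
resolved = importlib.util.resolve_name("..sub1", package="app.sub2")
print(resolved)

# sub2 treated as a top-level package: there is no parent package to climb to.
# Python 3.9+ raises ImportError here; older versions raise ValueError.
try:
    importlib.util.resolve_name("..sub1", package="sub2")
    beyond_top_level = False
except (ImportError, ValueError) as exc:
    beyond_top_level = True
    print(type(exc).__name__, exc)
```

No file system access happens in either call; only the dotted names are manipulated.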

Related

Is it bad practice to define a __dir__ function at a python module level?

I have a main function that dynamically imports a module in another file using importlib and then uses the dir built-in function to look at the attributes of that module. Is it bad to manually define the __dir__() magic method in the module (file) itself?
main.py
import importlib
def main():
    module = importlib.import_module("foo")
    attributes = dir(module)
    print(attributes)
foo.py
def __dir__():
    return [s]
s = "bar"
Is it looked down upon in the Python community to define __dir__ at the module level as I have shown above?
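For reference, a module-level __dir__ function is honoured by dir() from Python 3.7 onwards (PEP 562). The sketch below shows only the mechanism, not a recommendation; it builds the foo module in memory with types.ModuleType instead of importing a file, which is a simplification made here to keep the example self-contained.

```python
import types

# Source of the hypothetical foo module from the question.
source = '''
def __dir__():
    return [s]

s = "bar"
'''

foo = types.ModuleType("foo")
exec(source, foo.__dict__)   # run the module body in the module's namespace

# dir() calls the module-level __dir__ and sorts whatever it returns.
attributes = dir(foo)
print(attributes)
```

Note that the customised listing hides everything else in the module, including s itself under its real name.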

Python import module sharing name with a function in __init__.py

My tree looks like
parent/
|--__init__.py
\--a.py
And the content of __init__.py is
import parent.a as _a
a = 'some string'
When I open a Python interpreter at the top level and import parent.a, I get the string instead of the module. For example import parent.a as the_a; type(the_a) == str.
So I thought: OK, import is probably picking up the name from the parent namespace, which has now been overridden. So I figured I could do import parent._a as a_module. But this doesn't work, as there is "No module named _a".
This is very confusing. A function can override a module with the same name, but a module cannot take on a new name and "reexport".
Is there any explanation I'm not aware of? Or is this a documented feature?
Even more confusing, if I remove the import statement in __init__.py, everything is back to normal again (import parent.a; type(parent.a) is module). But why is this different? The a name in parent namespace is still a string.
(I ran on Python 3.5.3 and 2.7.13 with the same results)
In an import statement, the module reference never uses attribute lookups. The statements
import parent.a # as ...
and
from parent.a import ... # as ...
will always look for parent.a in the sys.modules namespace before trying to further initiate module loading from disk.
However, for from ... import name statements, Python does look at attributes of the resolved module to find name, before looking for submodules.
Module globals and the attributes on a module object are the same thing. On import, Python adds submodules as attributes (so globals) to the parent module, but you are free to overwrite those attributes, as you did in your code. However, when you then use an import with the parent.a module path, attributes do not come into play.
From the Submodules section of the Python import system reference documentation:
When a submodule is loaded using any mechanism [...] a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
Your import parent.a as _a statement adds two names to the parent namespace; first a is added pointing to the parent.a submodule, and then _a is also set, pointing to the same object.
Your next line replaces the name a with a binding to the 'some string' object.
The Searching section of the same documentation details how Python goes about finding a module when you import:
To begin the search, Python needs the fully qualified name of the module [...] being imported.
[...]
This name will be used in various phases of the import search, and it may be the dotted path to a submodule, e.g. foo.bar.baz. In this case, Python first tries to import foo, then foo.bar, and finally foo.bar.baz. If any of the intermediate imports fail, a ModuleNotFoundError is raised.
then further on
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if foo.bar.baz was previously imported, sys.modules will contain entries for foo, foo.bar, and foo.bar.baz. Each key will have as its value the corresponding module object.
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. [...] If the module name is missing, Python will continue searching for the module.
So when trying to import parent.a all that matters is that sys.modules['parent.a'] exists. sys.modules['parent'].a is not consulted.
Only from module import ... would ever look at attributes. From the import statement documentation:
The from form uses a slightly more complex process:
    find the module specified in the from clause, loading and initializing it if necessary;
    for each of the identifiers specified in the import clauses:
        check if the imported module has an attribute by that name
        if not, attempt to import a submodule with that name and then check the imported module again for that attribute
        [...]
So from parent import _a would work, as would from parent import a, and you'd get the parent.a submodule and the 'some string' object, respectively.
Note that sys.modules is writable; if you must have import parent._a work, you can always alter sys.modules directly:
import sys
sys.modules['parent._a'] = sys.modules['parent.a'] # make parent._a an alias for parent.a
import parent._a # works now
I think I have a coherent understanding of this problem now, just documenting my findings in case others run into this.
What Martijn said above is mostly true. Expanding on that answer: import parent.a as _a is a two-step process. The first step is the module lookup of parent.a, which never goes through attribute lookup; it then binds the module in sys.modules and binds it to the attribute a of parent. In fact, this is all you get if you only use import parent.a. This part is described thoroughly by the previous answer.
The second part, as _a, does an attribute lookup of parent.a and binds the result to the name _a. So to answer my original question: if I now start an interactive Python interpreter outside the package, parent.a has been overwritten with the string in __init__.py, and import parent.a as the_a; the_a gets me the string. In fact, this is the same as import parent.a; parent.a. Both the_a and parent.a are the results of attribute lookup. I can still get the submodule via parent._a or sys.modules["parent.a"].
To answer my follow-up question:
Even more confusing, if I remove the import statement in __init__.py, everything is back to normal again (import parent.a; type(parent.a) is module). But why is this different? The a name in parent namespace is still a string.
This is because when I import parent.a in the outside interactive interpreter, it first evaluates __init__.py, which sets parent.a to a string. But the import hasn't finished yet: it goes on to import the submodule parent.a, and since we are still inside the import machinery, no attribute lookups are done, so the correct submodule is found. When all this is done, it binds the submodule to the attribute a of parent, thus overwriting the string that was shadowing the submodule, and making it all correct again.
This sounds very confusing, but remember (https://docs.python.org/3/reference/import.html#submodules):
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
An import parent.a first runs all the module set-up code, and then binds the name.
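The ordering can be checked with a small experiment. This is a rough sketch, assuming the variant of the package where __init__.py only assigns the string and contains no import; the temporary directory is just scaffolding for the demonstration.

```python
import sys
import tempfile
import types
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "parent"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("a = 'some string'\n")
    (pkg / "a.py").write_text("")
    sys.path.insert(0, tmp)
    try:
        import parent
        before = parent.a       # only __init__.py has run: still the string
        import parent.a
        after = parent.a        # loading the submodule rebinds the attribute
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(tmp)
        sys.modules.pop("parent.a", None)
        sys.modules.pop("parent", None)

print(repr(before))                          # 'some string'
print(isinstance(after, types.ModuleType))   # True
```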

Why does this import-related code work in __init__.py but not in a different .py file?

Let's have this __init__.py in a Python3 package:
from .mod1 import *
from .mod2 import *
from .mod3 import *
__all__ = mod1.__all__ + mod2.__all__ + mod3.__all__
The code looks quite simple and does what is expected: it imports from modules mod1, mod2 and mod3 all symbols that these modules have put into their __all__ list and then a summary of all three __all__ lists is created.
I tried to run the very same code in a module, i.e. not in the __init__.py. It imported the three modules, but mod1, mod2 and mod3 were undefined variables.
(BTW, if you run pylint on the original __init__.py, you will get this error too.)
The same statement from .mod1 import * creates a mod1 object when executed in the __init__.py, but does not create it elsewhere. Why?
__init__.py is a special file, but till now, I thought only its name was special.
According to the documentation, this is expected behaviour:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
In other words, when you do a from .whatever import something within a module, the package magically gets a whatever attribute bound to the submodule. Naturally, you can access the package's own attributes within __init__.py as if they were defined as variables there. When you are in another module, you cannot do that. In this sense __init__.py is special indeed.
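The difference is easy to observe side by side. The following is a minimal sketch under stated assumptions: demo_pkg, mod1, other and the function f are hypothetical names, and the package is generated in a temporary directory only so that the example runs on its own.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "mod1.py").write_text("__all__ = ['f']\ndef f():\n    return 1\n")
    # Works: the submodule binding lands in the package namespace,
    # which is the namespace __init__.py runs in.
    (pkg / "__init__.py").write_text("from .mod1 import *\n__all__ = mod1.__all__\n")
    # Same import in a sibling module: the binding still goes to demo_pkg,
    # not to this module's globals.
    (pkg / "other.py").write_text("from .mod1 import *\nhas_mod1 = 'mod1' in globals()\n")
    sys.path.insert(0, tmp)
    try:
        import demo_pkg
        import demo_pkg.other
        init_sees_mod1 = "mod1" in vars(demo_pkg)
        other_sees_mod1 = demo_pkg.other.has_mod1
        exported = demo_pkg.__all__
    finally:
        sys.path.remove(tmp)

print(init_sees_mod1)    # True
print(other_sees_mod1)   # False
print(exported)          # ['f']
```

In a non-__init__ module, an explicit from . import mod1 would make the name available.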

What does importing a module in Python mean?

I am new to Python and found that I can import a module without importing any of the classes inside it. I have the following structure --
myLib/
    __init__.py
    A.py
    B.py
driver.py
Inside driver.py I do the following --
import myLib
tmp = myLib.A()
I get the following error trying to run it.
AttributeError: 'module' object has no attribute A
Eclipse does not complain when I do this, in fact the autocomplete shows A when I type myLib.A.
What does it mean when I import a module but not any of the classes inside it?
Thanks
P
Python is not Java. A and B are not classes. They are modules. You need to import them separately. (And myLib is not a module but a package.)
The modules A and B might themselves contain classes, which might or might not be called A and B. You can have as many classes in a module as you like - or even none at all, as it is quite possible to write a large Python program with no classes.
To answer your question though, importing myLib simply places the name myLib inside your current namespace. Anything in __init__.py will be executed: if that file itself defines or imports any names, they will be available as attributes of myLib.
If you do from myLib import A, you have now imported the module A into the current namespace. But again, any of its classes still have to be referenced via the A name: so if you do have a class A there, you would instantiate it via A.A().
A third option is to do from myLib.A import A, which does import the class A into your current namespace. In this case, you can just call A() to instantiate the class.
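The three forms can be compared directly. This is a sketch under assumptions: it recreates myLib in a temporary directory with an empty __init__.py and supposes that A.py defines a class that is also named A, which the question does not actually state.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "myLib"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "A.py").write_text("class A:\n    pass\n")   # assumed content of A.py
    sys.path.insert(0, tmp)
    try:
        import myLib
        # An empty __init__.py defines no names, so myLib.A does not exist yet.
        has_A_before = hasattr(myLib, "A")

        from myLib import A as A_module     # the module myLib/A.py
        from myLib.A import A as A_class    # the class inside that module
        obj1 = A_module.A()                 # instantiate through the module
        obj2 = A_class()                    # instantiate the class directly
    finally:
        sys.path.remove(tmp)

print(has_A_before)             # False
print(A_class is A_module.A)    # True
```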
You need to do
from myLib import A
because A is not an attribute of the __init__.py inside myLib.
When you do import myLib, it imports __init__.py.

Python: relative import imports whole package

I just noticed that a relative import like this:
from .foo import myfunc
print(myfunc) # ok
print(foo) # ok
imports both foo and myfunc. Is this behaviour documented anywhere? Can I disable it?
-- Update
Basically, the problem is the following.
bar/foo/__init__.py:
__all__ = ['myfunc']
def myfunc(): pass
bar/__init__.py:
from .foo import *
# here I expect that there is only myfunc defined
main.py:
import foo
from bar import * # this import shadows original foo
I can add __all__ to the bar/__init__.py as well, but that way I have to repeat names in several places.
I am assuming your package layout is
my_package/
    __init__.py
        from .foo import myfunc
    foo.py
        def myfunc(): pass
The statement from .foo import myfunc first imports the module foo, generally without introducing any names into the local scope. After this first step, myfunc is imported into the local namespace.
In this particular case, however, the first step also imports the module into the local namespace: sub-modules of packages are put in the package's namespace upon importing, regardless of where they are imported from. Since __init__.py is also executed in the package's namespace, this happens to coincide with the local namespace.
You cannot reasonably disable this behaviour. If you don't want the name foo in your package's namespace, my advice is to rename the module to _foo to mark it as internal.
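The effect of the rename can be sketched as follows. This is an illustration under assumptions: it uses a plain module _foo.py inside bar rather than the sub-package from the question, and relies on the fact that a star import without __all__ skips names starting with an underscore.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "bar"
    pkg.mkdir()
    (pkg / "_foo.py").write_text("__all__ = ['myfunc']\ndef myfunc():\n    pass\n")
    (pkg / "__init__.py").write_text("from ._foo import *\n")
    sys.path.insert(0, tmp)
    try:
        namespace = {}
        exec("from bar import *", namespace)   # what main.py would do
        bar_has_submodule = hasattr(sys.modules["bar"], "_foo")
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(tmp)
        sys.modules.pop("bar._foo", None)
        sys.modules.pop("bar", None)

# Names that the star import actually brought in (ignoring __builtins__).
public = sorted(name for name in namespace if not name.startswith("__"))
print(public)               # ['myfunc']
print(bar_has_submodule)    # True: the binding still exists, it is just not exported
```

The submodule attribute is still set on the package, so nothing is disabled; the underscore only keeps it out of star imports, which is enough to stop it shadowing a foo imported earlier in main.py.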
