The Python documentation for the import statement (link) contains the following:
The public names defined by a module are determined by checking the module’s namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module.
The Python documentation for modules (link) contains what is seemingly a contradictory statement:
if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered.
It then gives an example where an __init__.py file imports nothing, and simply defines __all__ to be some of the names of modules in that package.
I have tested both ways of using __all__, and both seem to work; indeed one can mix and match within the same __all__ value.
For example, consider the directory structure
foopkg/
    __init__.py
    foo.py
Where __init__.py contains
# Note no imports
def bar():
    print("BAR")

__all__ = ["bar", "foo"]
NOTE: I know one shouldn't define functions in an __init__.py file. I'm just doing it to illustrate that the same __all__ can export both names that do exist in the current namespace, and those which do not.
The following code runs, seemingly auto-importing the foo module:
>>> from foopkg import *
>>> dir()
[..., 'bar', 'foo']
Why does the __all__ attribute have this strange double-behaviour?
The docs seem really unclear on how it is supposed to be used, only mentioning one of its two sides in each place I linked. I understand the overall purpose is to explicitly set the names imported by a wildcard import, but am confused by the additional, seemingly auto-importing behaviour. Is this just a magic shortcut that avoids having to write the import out as well?
The documentation is a bit hard to parse because it does not mention that packages generally also have the behavior of modules, including their __all__ attribute. The behavior of packages is necessarily a superset of the behavior of modules, because packages, unlike modules, can have sub-packages and sub-modules. Behaviors not related to that feature are identical between the two as far as the end-user is concerned.
The Python docs can be minimalistic at times. They did not bother to mention that:
A package's __init__.py performs all the module-like work for the package, including support for star-import of its direct attributes via __all__, just like a module does.
Modules support all the features of a package __init__.py, except that they can't have a sub-package or sub-module.
It goes without saying that to make a name refer to a sub-module, the sub-module has to be imported, hence the apparent, but not actual, double standard.
Update: how does from M import * actually work?
The __all__ in the __init__.py of the folder foopkg works the same way as __all__ in a foopkg.py module would.
You can see why it auto-imports foo here: https://stackoverflow.com/a/54799108/12565014
The most important thing is to look at the CPython implementation: https://github.com/python/cpython/blob/fee552669f21ca294f57fe0df826945edc779090/Python/ceval.c#L5152
In effect, the import machinery loops through __all__. For a package, any listed name that is not already an attribute of the package is tried as a submodule import, and each name is then copied into the importing namespace.
That is why it auto-imports foo and also achieves whitelisting.
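The behaviour described above can be checked with a small experiment. This is only a sketch: it builds a throwaway package in a temporary directory, and the name demo_foopkg is made up for illustration. It suggests that a plain import leaves the submodule unloaded, while the star-import loads it because its name appears in __all__.

```python
import importlib
import pathlib
import sys
import tempfile

# Build a throwaway package; "demo_foopkg" is a made-up name for illustration.
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "demo_foopkg"
pkg.mkdir()
(pkg / "__init__.py").write_text(
    "def bar():\n"
    "    return 'BAR'\n"
    "\n"
    "# 'foo' is a submodule that __init__.py never imports itself\n"
    "__all__ = ['bar', 'foo']\n"
)
(pkg / "foo.py").write_text("VALUE = 1\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_foopkg

# A plain import runs __init__.py only; the submodule is not loaded yet.
loaded_before = "demo_foopkg.foo" in sys.modules

# A star-import is only legal at module level, so run it in a fresh namespace.
namespace = {}
exec("from demo_foopkg import *", namespace)

loaded_after = "demo_foopkg.foo" in sys.modules
star_names = sorted(name for name in namespace if not name.startswith("__"))

print(loaded_before, loaded_after, star_names)
```

Running the star-import through exec is just a convenience for inspecting the resulting namespace; at the top level of a script the effect is the same.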
Related
I want to define a constant that should be available in all of the submodules of a package. I've thought that the best place would be in the __init__.py file of the root package. But I don't know how to do this. Suppose I have a few subpackages and each with several modules. How can I access that variable from these modules?
Of course, if this is totally wrong, and there is a better alternative, I'd like to know it.
You should be able to put them in __init__.py. This is done all the time.
mypackage/__init__.py:
MY_CONSTANT = 42
mypackage/mymodule.py:
from mypackage import MY_CONSTANT
print("my constant is", MY_CONSTANT)
Then, import mymodule:
>>> from mypackage import mymodule
my constant is 42
Still, if you do have constants, it would be reasonable (best practices, probably) to put them in a separate module (constants.py, config.py, ...) and then if you want them in the package namespace, import them.
mypackage/__init__.py:
from mypackage.constants import *
Still, this doesn't automatically include the constants in the namespaces of the package modules. Each of the modules in the package will still have to import constants explicitly either from mypackage or from mypackage.constants.
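To make the last point concrete, here is a rough sketch using a temporary package. The names demo_constpkg and other.py are assumptions made for the demonstration, not part of the original layout. One submodule imports the constant explicitly and can use it; the other does not, and fails with NameError.

```python
import importlib
import pathlib
import sys
import tempfile

# "demo_constpkg" and its files are hypothetical names used only for this demo.
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "demo_constpkg"
pkg.mkdir()
(pkg / "constants.py").write_text("MY_CONSTANT = 42\n")
# Re-export the constants in the package namespace.
(pkg / "__init__.py").write_text("from demo_constpkg.constants import *\n")
(pkg / "mymodule.py").write_text(
    "from demo_constpkg import MY_CONSTANT\n"
    "\n"
    "def describe():\n"
    "    return 'my constant is %d' % MY_CONSTANT\n"
)
(pkg / "other.py").write_text(
    "def describe():\n"
    "    return MY_CONSTANT  # never imported in this module\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_constpkg import mymodule, other

message = mymodule.describe()
try:
    other.describe()
    other_sees_constant = True
except NameError:
    # The re-export in __init__.py does not leak into sibling modules.
    other_sees_constant = False

print(message, other_sees_constant)
```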
You cannot do that. You will have to explicitly import your constants into each individual module's namespace. The best way to achieve this is to define your constants in a "config" module and import it everywhere you require it:
# mypackage/config.py
MY_CONST = 17
# mypackage/main.py
from mypackage.config import *
You can define global variables from anywhere, but it is a really bad idea. Import the __builtin__ module (named builtins in Python 3) and modify or add attributes to this module, and suddenly you have new builtin constants or functions. In fact, when my application installs gettext, I get the _() function in all my modules, without importing anything. So this is possible, but of course only for application-type projects, not for reusable packages or modules.
And I guess no one would recommend this practice anyway. What's wrong with a namespace? Said application has the version module, so that I have "global" variables available like version.VERSION, version.PACKAGE_NAME etc.
Just wanted to add that constants can be employed using a config.ini file and parsed in the script using the configparser library. This way you could have constants for multiple circumstances. For instance if you had parameter constants for two separate url requests just label them like so:
mymodule/config.ini
[request0]
conn = 'admin#localhost'
pass = 'admin'
...
[request1]
conn = 'barney#localhost'
pass = 'dinosaur'
...
I found the documentation on the Python website very helpful. I am not sure if there are any differences between Python 2 and 3 so here are the links to both:
For Python 3: https://docs.python.org/3/library/configparser.html#module-configparser
For Python 2: https://docs.python.org/2/library/configparser.html#module-configparser
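A minimal sketch of reading such a file in Python 3, assuming the two sections shown above with the elided "..." lines left out. The helper name request_settings is made up for the example. Note that configparser keeps the quote characters as part of each value, so the sketch strips them.

```python
import configparser

# The sample sections from above, minus the elided "..." lines.
INI_TEXT = """
[request0]
conn = 'admin#localhost'
pass = 'admin'

[request1]
conn = 'barney#localhost'
pass = 'dinosaur'
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)


def request_settings(section):
    """Return (conn, password) for a section, without the surrounding quotes."""
    values = parser[section]
    # configparser does not interpret quotes, so remove them by hand.
    return values["conn"].strip("'"), values["pass"].strip("'")


print(request_settings("request0"))
print(request_settings("request1"))
```

For a real file on disk, parser.read("config.ini") would take the place of read_string. Leaving the quotes out of the .ini file altogether would make the strip step unnecessary.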
Python Double-Underscore methods are hiding everywhere and behind everything in Python! I am curious about how this is specifically working with the interpreter.
import some_module as sm
From my current understanding:
Import searches for requested module
It binds result to the local assignment (if given)
It utilizes the __init__.py . . . ???
There seems to be something going on that is larger than my scope of understanding. I understand we use __init__() for class initialization. It is functioning as a constructor for our class.
I do not understand how calling import is then utilizing the __init__.py.
What exactly is happening when we run import?
How is __init__.py different from other dunder methods?
Can we manipulate this dunder method (if we really wanted to)?
import some_module is going to look for one of two things. It's either going to look for a some_module.py in the search path or a some_module/__init__.py. Only one of those should exist. The only thing __init__.py means when it comes to modules is "this is the module that represents this folder". So consider this folder structure.
foo/
    __init__.py
    module1.py
bar.py
Then the three modules available are foo (which corresponds to foo/__init__.py), foo.module1 (which corresponds to foo/module1.py), and bar (which corresponds to bar.py). By convention, foo/__init__.py will usually import important names from module1.py and reexport some of them for convenience, but this is by no means a requirement.
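The mapping from module names to files can be observed directly. The sketch below recreates the layout in a temporary directory under the made-up names demo_foo and demo_bar, and assumes an __init__.py that re-exports one name from module1 (the convention mentioned above, not a requirement).

```python
import importlib
import pathlib
import sys
import tempfile

# Recreate the layout with hypothetical names so nothing real is shadowed.
root = pathlib.Path(tempfile.mkdtemp())
(root / "demo_foo").mkdir()
(root / "demo_foo" / "__init__.py").write_text(
    "from demo_foo.module1 import helper  # re-export for convenience\n"
)
(root / "demo_foo" / "module1.py").write_text(
    "def helper():\n"
    "    return 'from module1'\n"
)
(root / "demo_bar.py").write_text("NAME = 'bar'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_foo
import demo_foo.module1
import demo_bar

# Which file backs each module name?
files = {
    name: pathlib.Path(sys.modules[name].__file__).name
    for name in ("demo_foo", "demo_foo.module1", "demo_bar")
}
# Only packages carry a __path__ attribute.
is_package = (hasattr(demo_foo, "__path__"), hasattr(demo_bar, "__path__"))

print(files)
print(is_package)
print(demo_foo.helper())
```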
consider this:
/
    test.py
    lib/
        __init__.py
        x/
            __init__.py
            p.py
with p.py:
class P():
    pass

p1 = P()
With test.py:
import sys
import os
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), "lib"))
import lib.x.p
import x.p
print(id(lib.x.p.p1))
print(id(x.p.p1))
Here I get different object IDs even though I am importing the same object from the same package/module. Can someone please explain this behaviour? It is very confusing, and I did not find any documentation about it.
Thanks!
Modules are cached in the dictionary sys.modules using their dotted names as keys. Since you are importing the same module by two different dotted names, you end up with two copies of this module, and also with two copies of everything inside them.
The solution is easy: Don't do this, and try to avoid messing around with sys.path.
x.p and lib.x.p aren't the same module. They come from the same file, but Python doesn't determine a module's identity by its file; a module's identity is based on its package-qualified name. The module search logic may have found the same file for both modules, but they're still loaded and executed separately, and objects created in one module are distinct from objects created in another.
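The effect is easy to reproduce. The following sketch rebuilds the layout in a temporary directory, using the made-up names demo_lib and demo_x in place of lib and x, and puts both the root and the inner directory on sys.path. The same file then ends up cached under two keys in sys.modules.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical stand-ins for lib/ and x/ from the question.
root = pathlib.Path(tempfile.mkdtemp())
inner = root / "demo_lib" / "demo_x"
inner.mkdir(parents=True)
(root / "demo_lib" / "__init__.py").write_text("")
(inner / "__init__.py").write_text("")
(inner / "p.py").write_text("class P:\n    pass\n\np1 = P()\n")

# Both the project root and demo_lib/ are importable roots, as in test.py.
sys.path.insert(0, str(root))
sys.path.insert(0, str(root / "demo_lib"))
importlib.invalidate_caches()

import demo_lib.demo_x.p
import demo_x.p

same_file = demo_lib.demo_x.p.__file__ == demo_x.p.__file__
same_module = demo_lib.demo_x.p is demo_x.p
same_object = demo_lib.demo_x.p.p1 is demo_x.p.p1
# One source file, two entries in the module cache.
cache_keys = sorted(key for key in sys.modules if key.endswith("demo_x.p"))

print(same_file, same_module, same_object)
print(cache_keys)
```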
Let's assume I have a main script, main.py, that imports another python file with import coolfunctions and another: import chores
Now, suppose coolfunctions also uses stuff from chores, hence I declare import chores inside coolfunctions.
Since both main.py, and coolfunctions import chores ~ is this redundant? Is there any other way of doing this? Am I doing it correctly?
I'm confused about how python projects should be structured in general. I have a "conf.py" file, that I import for a bunch of variables ~ is this a module or not? I load this conf file in multiple places as well.
If two modules want to use chores, then each one must import chores (or some equivalent import). Each import creates a name binding only in the namespace of the module that does the import; that is, import's namespace effect is local to a module's namespace.
This is good, because by looking at a module's code you can (barring pathological cases) know where each name is bound to by the import statements that explicitly bind modules or module attributes to names. Imports made in other modules won't affect this module's namespace.
Each module X should import all (and only) the modules Y, Z, T, ... whose functionality it requires, without any worry about what other modules Fee, Fie, Foo ... (if any) may have already done part or all of those imports, or may be going to do so in the future.
It would make a module extremely fragile (indeed, it would be the very opposite of modularity!) if each module had to worry about such subtle, "covert-channel" effects.
What other modules Y, Z, T, ..., each module X chooses to import (if any) is part of X's implementation details, and shouldn't concern anybody except the developers who are coding, testing, or maintaining X.
In order to ensure that this is the case, and that this clearly-best strategy of decoupling can and will fully be followed by sane code, Python "caches" modules as they get imported: a module is "loaded" only once per run of a program, the first time anybody imports it (or anything from inside it) -- all other imports use the same object obtained by that first loading, which Python keeps in a cache (which is specified as being the dict sys.modules, but you need to know that detail only for somewhat-advanced programming techniques... don't worry about it, 98.7% of the time -- just remember that "import is cheap"!-).
Sure, a conf.py that you use from several other modules via import conf is definitely a module (you may think you're loading it multiple times, but you aren't unless you're using pretty advanced and deliberate techniques indeed for the purpose) -- why shouldn't it be?
No, this isn't redundant - it's fine to import chores in both the main module and coolfunctions.
The exact import mechanics of Python are complex (for example, module imports are only done once, meaning in your case that the actual parsing and loading of the chores module will only happen once, which is a nice optimization) but in general you shouldn't worry about it because it just works.
Each Python file is a module, so your conf.py is also a module.
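As a rough check of the "loaded only once" claim, the sketch below writes two throwaway modules, demo_chores and demo_coolfunctions (hypothetical stand-ins for chores and coolfunctions). The body of demo_chores appends a line to a log file each time it actually executes, so counting the lines shows how often the module was really loaded.

```python
import importlib
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
# The module body appends to loads.log every time it is actually executed.
(root / "demo_chores.py").write_text(
    "import pathlib\n"
    "with open(pathlib.Path(__file__).with_name('loads.log'), 'a') as log:\n"
    "    log.write('loaded\\n')\n"
    "\n"
    "def sweep():\n"
    "    return 'swept'\n"
)
(root / "demo_coolfunctions.py").write_text(
    "import demo_chores\n"
    "\n"
    "def cool():\n"
    "    return demo_chores.sweep().upper()\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_coolfunctions  # loads demo_chores as a side effect
import demo_chores         # served from sys.modules; the body is not re-run

load_count = len((root / "loads.log").read_text().splitlines())
shared = demo_coolfunctions.demo_chores is demo_chores

print(load_count, shared, demo_coolfunctions.cool())
```

Both importers end up holding a reference to the same module object, which is why the second import costs little more than a dictionary lookup.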
It is always the best practice to import all necessary modules in the file that uses them. Take for example:
A.py contains: import coolfunctions
B.py contains: import A
Main.py contains: import B and uses functions that are defined in A.py (this works only indirectly, through B.A, because importing B binds just the name B in Main.py; A is reachable only as an attribute of B)
If in the future you change B.py so that it no longer needs A.py and therefore remove the import A, then Main.py will break, because B.A no longer exists and Main.py never imported A itself.
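A small sketch of why relying on another module's imports is fragile, using the made-up module names demo_a and demo_b for A.py and B.py. The import of demo_b is executed in a fresh dictionary that stands in for Main.py's namespace.

```python
import importlib
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
(root / "demo_a.py").write_text(
    "def helper():\n"
    "    return 'helper from A'\n"
)
(root / "demo_b.py").write_text("import demo_a\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# This dictionary plays the role of Main.py's global namespace.
main_namespace = {}
exec("import demo_b", main_namespace)

# Importing demo_b binds only the name demo_b here, not demo_a.
a_bound_in_main = "demo_a" in main_namespace
# demo_a is reachable only as an attribute of demo_b, for as long as
# demo_b keeps its own import.
via_b = main_namespace["demo_b"].demo_a.helper()

print(a_bound_in_main, via_b)
```

If Main.py needs demo_a, importing it there directly removes the dependency on what demo_b happens to import.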