__init__.py code called twice and its significance with package import - python

I have a simple Python project for learning with two files, __init__.py and __main__.py.
When I execute python -m pkg_name
it runs both __init__.py and __main__.py.
When I execute python -m pkg_name.__init__.py
it invokes __init__.py twice.
I want to know why __init__.py is called twice when I call __init__.py.
Is it like static code in Java, where all the code in a static block is triggered automatically when the class is used?
What is the relevance of __init__.py in Python, and what is the benefit of it being executed when a package is imported, loaded, or run?
Please help me understand the concepts better.
"""Run a sequence of programs, testing python code __main__ variable
Each program (except the first) receives standard output of the
previous program on its standard input, by default. There are several
alternate ways of passing data between programs.
"""
def _launch():
    print('Pipeline Launched')
if __name__ == '__main__':
    print('This module is running as the main module!')
    _launch()
> __init__.py
"""This is the __init__.py file of pipleline package
Used for testing the import statements.
"""
print(__name__)
print('This is the __init__.py file of pipleline package')
print('Exiting __init__ of pipeline package after all initialization')

The following command is used to execute a Python module or package:
python -m module
where module is the name of the module or package, without the .py extension.
If the name matches a script, it is byte-compiled and executed.
If the name matches a directory containing an __init__.py file and a __main__.py file, the directory is treated as a Python package and is loaded first. Then the __main__.py script is executed.
If the name contains dots, e.g. "mylib.mypkg.mymodule", Python walks through each package and sub-package in the dotted name and loads it. Then it executes the last module (or the last package, which must contain a __main__.py file).
A short description is given in the official documentation: 29.4. __main__ — Top-level script environment.
Your problem
If you run this command:
python -m pkg_name
It loads (and runs) __init__.py and __main__.py: this is the normal behavior.
If you run this command:
python -m pkg_name.__init__.py
It should fail if you keep the ".py" extension.
If it runs, the command loads the pkg_name package first, which executes __init__.py. Then it runs __init__.py again, this time as the requested module.
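The double execution can be checked with a small experiment. The sketch below is not from the question: it builds a throwaway package in a temporary directory (the name pkg_demo and the helper run_m are made up for illustration), has both files print their __name__, and runs the package with -m in a subprocess. It assumes Python 3 and that the dotted form is given without the ".py" suffix.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_m(target, cwd):
    """Run `python -m target` from directory cwd and return the printed lines."""
    result = subprocess.run(
        [sys.executable, "-m", target],
        cwd=cwd, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "pkg_demo"  # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("print('__init__.py as', __name__)\n")
    (pkg / "__main__.py").write_text("print('__main__.py as', __name__)\n")

    package_run = run_m("pkg_demo", tmp)          # python -m pkg_demo
    init_run = run_m("pkg_demo.__init__", tmp)    # python -m pkg_demo.__init__

print(package_run)
print(init_run)
```

The first run should show __init__.py once under the package name, followed by __main__.py. The second should show __init__.py twice: once as pkg_demo (the package import) and once as __main__ (the module actually being run).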

__init__.py is used to define a folder as a package, which contains the required modules and resources.
You can leave it as an empty file, add documentation about the package, or set up initial conditions for the package.
Please check out the Python documentation.
Also, as mentioned by Natecat, __init__.py gets executed whenever you load a package. That's why when you explicitly call __init__.py, it loads the package (1st load) then executes __init__.py (2nd load).

Related

Under which circumstances __init__.py runs?

I have an __init__.py file in the current directory.
I need a complete list of circumstances, under which this file will run.
First case is
import __init__
written in the script.py in the same directory and this file runs.
What are other cases?
An __init__.py file is run when the package that corresponds to it is imported. So a file some_package\__init__.py is executed when you import some_package. When you import a submodule from a package, the package is loaded first. So import aa.bb.cc will load aa (and thus execute aa/__init__.py) before loading aa.bb and aa.bb.cc.
The folder some_package must be discoverable, which means that it must exist in one of the sys.path folders. This includes the current directory.
If you simply run a script (python some_script.py) and there happens to be an __init__.py file in the same folder, this means nothing, since the current folder is not itself a package (unless, of course, the script you execute happens to reside inside a package).
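The loading order described above can be observed directly. This is only a sketch under assumptions: the package names demo_aa, demo_bb and demo_cc are invented stand-ins for aa.bb.cc, each file just prints its own __name__, and the temporary folder is made discoverable by putting it on sys.path.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_aa/__init__.py, demo_aa/demo_bb/__init__.py,
    # demo_aa/demo_bb/demo_cc.py. Every file prints its own __name__.
    sub = Path(tmp) / "demo_aa" / "demo_bb"
    sub.mkdir(parents=True)
    line = "print('executing', __name__)\n"
    (sub.parent / "__init__.py").write_text(line)
    (sub / "__init__.py").write_text(line)
    (sub / "demo_cc.py").write_text(line)

    sys.path.insert(0, tmp)  # make the top-level folder discoverable
    try:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            importlib.import_module("demo_aa.demo_bb.demo_cc")
            # A repeated import is served from sys.modules; nothing runs again.
            importlib.import_module("demo_aa.demo_bb.demo_cc")
    finally:
        sys.path.remove(tmp)

import_order = buffer.getvalue().splitlines()
print(import_order)
```

Each __init__.py should appear exactly once, outermost package first, followed by the submodule itself.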

run python module by just typing folder name

I have __init__.py in a folder called test_module, with the code below in it. However, when I try to execute it from the parent folder of test_module with the command python test_module, I get the error can't find '__main__' module in 'test_module'. Is this not possible? Or will I have to run python test_module/__init__.py?
def main():
    print('test')
if __name__ == '__main__':
    main()
The __init__.py module is executed when the package is imported. The purpose of __init__.py files per the documentation is as follows:
The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.
In order for a Python package to be directly executed, it needs to have an entry point, designated by a module within the package named __main__.py. Thus the error can't find '__main__' module in 'test_module': You have attempted to directly execute the package, but Python cannot locate an entry point to begin executing top-level code.
Consider the following package structure:
test_module/
    __init__.py
    __main__.py
Where __init__.py contains the following:
print("Running: __init__.py")
Where __main__.py contains the following:
print("Running: __main__.py")
When we execute the test_module package with the command python test_module, we get the following output:
> python test_module
Running: __main__.py
However, if we enter the Python shell and import test_module, the output is as follows:
>>> import test_module
Running: __init__.py
Thus in order to get the behavior you want when attempting to directly execute test_module, simply create a new __main__.py file within test_module and transfer the code from __init__.py to the new __main__.py.
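To reproduce the three situations in one go, here is a rough sketch, assuming Python 3. It recreates test_module in a temporary directory and runs the interpreter in a subprocess; the helper run is invented for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run(args, cwd):
    """Run the current interpreter with args; return (exit code, stdout lines)."""
    result = subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )
    return result.returncode, result.stdout.splitlines()


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "test_module"
    pkg.mkdir()
    (pkg / "__init__.py").write_text('print("Running: __init__.py")\n')

    # No __main__.py yet: running the directory has no entry point and fails.
    code_without_main, _ = run(["test_module"], tmp)

    (pkg / "__main__.py").write_text('print("Running: __main__.py")\n')
    _, directory_run = run(["test_module"], tmp)            # python test_module
    _, import_run = run(["-c", "import test_module"], tmp)  # import test_module

print(code_without_main, directory_run, import_run)
```

The exit code of the first run should be non-zero, the directory run should print only the __main__.py line, and the import should print only the __init__.py line.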

python module and imports

If this is my directory tree
temp
├── __init__.py
└── __main__.py
0 directories, 2 files
And I have the following code in __init__.py and in __main__.py
__init__.py
"""Initializes the module"""
CONSTANT = 1
sys.exit("what is happening here")
__main__.py
# from . import CONSTANT
# from temp import CONSTANT
if __name__ == "__main__":
    print "This should never run"
I am getting two problems here that I am trying to figure out
On running python . in the temp directory I get the output
This should never run, shouldn't the module be initialized first with the __init__.py file resulting in the abort?
Second how do I go about doing imports in python modules? Neither of the two options I have mentioned above works. I can neither do from . import CONSTANT nor from temp import CONSTANT in the code above. What is the right way to do relative imports?
I am running this on Python 2.7.5, apologies if this has already been asked before.
You should be running it from out of the temp directory. If someDir contains your temp directory, then:
someDir $ python -m temp #someDir/temp/__init__.py is your file.
On running python . in the temp directory I get the output This should never run, shouldn't the module be initialized first with the __init__.py file resulting in the abort?
If you run it from outside, __init__.py will be called. And sys.exit will be called too.
Second how do I go about doing imports in python modules? Neither of the two options I have mentioned above works. I can neither do from . import CONSTANT nor from temp import CONSTANT in the code above. What is the right way to do relative imports?
You are doing it just fine. Just import sys in your __init__.py file. And fix the spelling of CONSTANT.
Also why do I need the -m flag? Isn't it ok to just do python temp from the parent directory of temp?
You need the -m flag to tell Python that you are running a package. If you don't use it, you won't be able to do relative imports.
When you tell Python to run a directory, Python does not treat the directory as a package. Instead, Python adds that directory to sys.path and runs its __main__.py. __init__.py is not executed, and relative imports will not view the directory as a package.
If you want to run a package's __main__.py and treat it as part of the package, with __init__.py executed and all, go to the directory containing the package and run
python -m packagename
You are running inside temp; this is not considered a package and __init__.py is not loaded. Only if the parent of the current directory is on the module loading path and you explicitly load temp as a module, is __init__.py loaded.
Because temp is not a package, you can't use relative imports here. Instead, every Python file inside the directory is considered a top-level module by itself.
You'd have to move to the parent of the temp directory, then run:
python -m temp
for Python to import temp as a package and then run the __main__ module in that package.
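The difference for relative imports can be checked with a short experiment. This is a sketch only, assuming Python 3 (the question uses 2.7.5, where the error differs); temp_pkg is a made-up stand-in for the temp directory, and sys.exit is left out of __init__.py so that the import itself can be observed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "temp_pkg"  # hypothetical stand-in for the temp directory
    pkg.mkdir()
    (pkg / "__init__.py").write_text("CONSTANT = 1\n")
    (pkg / "__main__.py").write_text(
        "from . import CONSTANT\nprint('CONSTANT =', CONSTANT)\n"
    )

    # python temp_pkg: the directory is run as a script, not as a package.
    as_directory = subprocess.run(
        [sys.executable, "temp_pkg"], cwd=tmp, capture_output=True, text=True
    )
    # python -m temp_pkg: the package is imported, then its __main__ is run.
    as_package = subprocess.run(
        [sys.executable, "-m", "temp_pkg"], cwd=tmp, capture_output=True, text=True
    )

directory_failed = as_directory.returncode != 0
package_output = as_package.stdout.strip()
print(directory_failed, package_output)
```

Run as a directory, the relative import should fail; run with -m from the parent directory, it should resolve and print the constant.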

What is the difference between __init__.py and __main__.py?

I know of these two questions about __init__.py and __main__.py:
What is __init__.py for?
What is __main__.py?
But I don't really understand the difference between them. Or I could say I don't understand how they interact together.
__init__.py is run when you import a package into a running Python program. For instance, import idlelib within a program runs idlelib/__init__.py, which does not do anything, as its only purpose is to mark the idlelib directory as a package. On the other hand, tkinter/__init__.py contains most of the tkinter code and defines all the widget classes.
__main__.py is run as '__main__' when you run a package as the main program. For instance, python -m idlelib at a command line runs idlelib/__main__.py, which starts Idle. Similarly, python -m tkinter runs tkinter/__main__.py, which has this line:
from . import _test as main
In this context, . is tkinter, so importing . imports tkinter, which runs tkinter/__init__.py. _test is a function defined within that file. So calling main() (next line) has the same effect as running python -m tkinter.__init__ at the command line.
__init__.py, among other things, marks a directory as a Python package and lets you set variables at the package level.
__main__.py, among other things, is run if you try to run a compressed group of python files. __main__.py allows you to execute packages.
Both of these answers were obtained from the answers you linked. Is there something else you didn't understand about these things?

What is __main__.py?

What is the __main__.py file for, what sort of code should I put into it, and when should I have one?
Often, a Python program is run by naming a .py file on the command line:
$ python my_program.py
You can also create a directory or zipfile full of code, and include a __main__.py. Then you can simply name the directory or zipfile on the command line, and it executes the __main__.py automatically:
$ python my_program_dir
$ python my_program.zip
# Or, if the program is accessible as a module
$ python -m my_program
You'll have to decide for yourself whether your application could benefit from being executed like this.
Note that a __main__ module usually doesn't come from a __main__.py file. It can, but it usually doesn't. When you run a script like python my_program.py, the script will run as the __main__ module instead of the my_program module. This also happens for modules run as python -m my_module, or in several other ways.
If you saw the name __main__ in an error message, that doesn't necessarily mean you should be looking for a __main__.py file.
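A quick way to see that the same file can be __main__ in one situation and an ordinary named module in another is sketched below. The file name my_program_demo.py and the variable SEEN_NAME are invented for the example, and runpy.run_path is used here only as an in-process approximation of running the file from the command line.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "my_program_demo.py"  # hypothetical file name
    script.write_text("SEEN_NAME = __name__\n")

    # Executed roughly the way `python my_program_demo.py` would run it.
    as_script = runpy.run_path(str(script), run_name="__main__")["SEEN_NAME"]

    # Imported like any other module.
    sys.path.insert(0, tmp)
    try:
        as_import = importlib.import_module("my_program_demo").SEEN_NAME
    finally:
        sys.path.remove(tmp)

print(as_script, as_import)
```

No __main__.py file is involved here, yet the first value should be __main__, while the imported copy should see its own file-based name.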
What is the __main__.py file for?
When creating a Python module, it is common to make the module execute some functionality (usually contained in a main function) when run as the entry point of the program. This is typically done with the following common idiom placed at the bottom of most Python files:
if __name__ == '__main__':
    # execute only if run as the entry point into the program
    main()
You can get the same semantics for a Python package with __main__.py, which might have the following structure:
.
└── demo
├── __init__.py
└── __main__.py
To see this, paste the below into a Python 3 shell:
from pathlib import Path
demo = Path.cwd() / 'demo'
demo.mkdir()
(demo / '__init__.py').write_text("""
print('demo/__init__.py executed')
def main():
    print('main() executed')
""")
(demo / '__main__.py').write_text("""
print('demo/__main__.py executed')
from demo import main
main()
""")
We can treat demo as a package and actually import it, which executes the top-level code in the __init__.py (but not the main function):
>>> import demo
demo/__init__.py executed
When we use the package as the entry point to the program, Python runs the code in __main__.py, after first importing the package (which executes __init__.py):
$ python -m demo
demo/__init__.py executed
demo/__main__.py executed
main() executed
You can derive this from the documentation. The documentation says:
__main__ — Top-level script environment
'__main__' is the name of the scope in which top-level code executes.
A module’s __name__ is set equal to '__main__' when read from standard
input, a script, or from an interactive prompt.
A module can discover whether or not it is running in the main scope
by checking its own __name__, which allows a common idiom for
conditionally executing code in a module when it is run as a script or
with python -m but not when it is imported:
    if __name__ == '__main__':
        # execute only if run as a script
        main()
For a package, the same effect can be achieved by including a
__main__.py module, the contents of which will be executed when the module is run with -m.
Zipped
You can also zip up this directory, including the __main__.py, into a single file and run it from the command line like this - but note that zipped packages can't execute sub-packages or submodules as the entry point:
from pathlib import Path
demo = Path.cwd() / 'demo2'
demo.mkdir()
(demo / '__init__.py').write_text("""
print('demo2/__init__.py executed')
def main():
    print('main() executed')
""")
(demo / '__main__.py').write_text("""
print('demo2/__main__.py executed')
from __init__ import main
main()
""")
Note the subtle change - we are importing main from __init__ instead of demo2 - this zipped directory is not being treated as a package, but as a directory of scripts. So it must be used without the -m flag.
Particularly relevant to the question - zipapp causes the zipped directory to execute the __main__.py by default - and it is executed first, before __init__.py:
$ python -m zipapp demo2 -o demo2zip
$ python demo2zip
demo2/__main__.py executed
demo2/__init__.py executed
main() executed
Note again, this zipped directory is not a package - you cannot import it either.
Some of the answers here imply that given a "package" directory (with or without an explicit __init__.py file), containing a __main__.py file, there is no difference between running that directory with the -m switch or without.
The big difference is that without the -m switch, the "package" directory is first added to the path (i.e. sys.path), and then the files are run normally, without package semantics.
Whereas with the -m switch, package semantics (including relative imports) are honoured, and the package directory itself is never added to the system path.
This is a very important distinction, both in terms of whether relative imports will work or not, but more importantly in terms of dictating what will be imported in the case of unintended shadowing of system modules.
Example:
Consider a directory called PkgTest with the following structure
:~/PkgTest$ tree
.
├── pkgname
│   ├── __main__.py
│   ├── secondtest.py
│   └── testmodule.py
└── testmodule.py
where the __main__.py file has the following contents:
:~/PkgTest$ cat pkgname/__main__.py
import os
print( "Hello from pkgname.__main__.py. I am the file", os.path.abspath( __file__ ) )
print( "I am being accessed from", os.path.abspath( os.curdir ) )
from testmodule import main as firstmain; firstmain()
from .secondtest import main as secondmain; secondmain()
(with the other files defined similarly with similar printouts).
If you run this without the -m switch, this is what you'll get. Note that the relative import fails, but more importantly note that the wrong testmodule has been chosen (the one inside pkgname, rather than the one relative to the working directory):
:~/PkgTest$ python3 pkgname
Hello from pkgname.__main__.py. I am the file ~/PkgTest/pkgname/__main__.py
I am being accessed from ~/PkgTest
Hello from testmodule.py. I am the file ~/PkgTest/pkgname/testmodule.py
I am being accessed from ~/PkgTest
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "pkgname/__main__.py", line 10, in <module>
    from .secondtest import main as secondmain
ImportError: attempted relative import with no known parent package
Whereas with the -m switch, you get what you (hopefully) expected:
:~/PkgTest$ python3 -m pkgname
Hello from pkgname.__main__.py. I am the file ~/PkgTest/pkgname/__main__.py
I am being accessed from ~/PkgTest
Hello from testmodule.py. I am the file ~/PkgTest/testmodule.py
I am being accessed from ~/PkgTest
Hello from secondtest.py. I am the file ~/PkgTest/pkgname/secondtest.py
I am being accessed from ~/PkgTest
Note: In my honest opinion, running without -m should be avoided. In fact I would go further and say that I would create any executable packages in such a way that they would fail unless run via the -m switch.
In other words, I would only import from 'in-package' modules explicitly via 'relative imports', assuming that all other imports represent system modules. If someone attempts to run your package without the -m switch, the relative import statements will throw an error, instead of silently running the wrong module.
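The sys.path difference behind this shadowing can be inspected with a sketch like the following. It assumes Python 3 with default path handling (no safe-path option in effect); the package name path_demo and the helper first_path_entry are made up for illustration. The __main__.py only reports the first entry of sys.path.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)
    pkg = Path(root) / "path_demo"  # hypothetical package name
    pkg.mkdir()
    (pkg / "__main__.py").write_text(
        "import os, sys\nprint(os.path.realpath(sys.path[0]))\n"
    )

    def first_path_entry(args):
        """Run the interpreter from root and return the sys.path[0] it reports."""
        out = subprocess.run(
            [sys.executable, *args], cwd=root, capture_output=True, text=True
        ).stdout
        return out.strip()

    without_m = first_path_entry(["path_demo"])     # python path_demo
    with_m = first_path_entry(["-m", "path_demo"])  # python -m path_demo
    package_dir = str(pkg)

print(without_m == package_dir, with_m == root)
```

Without -m, the first entry should be the package directory itself, which is why a plain import testmodule picks up the copy inside it. With -m, the first entry should be the working directory instead.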
You create __main__.py in yourpackage to make it executable as:
$ python -m yourpackage
__main__.py is used for Python programs in zip files. The __main__.py file will be executed when the zip file is run. For example, if the zip file was as such:
test.zip
    __main__.py
and the contents of __main__.py was
import sys
print("hello %s" % sys.argv[1])
Then if we were to run python test.zip world we would get hello world out.
So the __main__.py file runs when python is called on a zip file.
If your script is a directory or ZIP file rather than a single python file, __main__.py will be executed when the "script" is passed as an argument to the python interpreter.
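The zip case can be tried without creating any files by hand. A minimal sketch, assuming Python 3: it writes a one-file archive named test.zip into a temporary directory with the standard zipfile module and passes it to the interpreter together with one argument.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "test.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        # __main__.py must sit at the top level of the archive.
        zf.writestr(
            "__main__.py",
            "import sys\nprint('hello %s' % sys.argv[1])\n",
        )

    # Equivalent to: python test.zip world
    result = subprocess.run(
        [sys.executable, str(archive), "world"],
        capture_output=True, text=True,
    )
    zip_output = result.stdout.strip()

print(zip_output)
```

The interpreter should locate __main__.py inside the archive and run it with world as sys.argv[1].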
