What is the difference between __init__.py and __main__.py? - python

I know of these two questions about __init__.py and __main__.py:
What is __init__.py for?
What is __main__.py?
But I don't really understand the difference between them. Or I could say I don't understand how they interact together.

__init__.py is run when you import a package into a running Python program. For instance, import idlelib within a program runs idlelib/__init__.py, which does not do anything, as its only purpose is to mark the idlelib directory as a package. On the other hand, tkinter/__init__.py contains most of the tkinter code and defines all the widget classes.
__main__.py is run as '__main__' when you run a package as the main program. For instance, python -m idlelib at a command line runs idlelib/__main__.py, which starts Idle. Similarly, python -m tkinter runs tkinter/__main__.py, which has this line:
from . import _test as main
In this context, . is tkinter, so importing . imports tkinter, which runs tkinter/__init__.py. _test is a function defined within that file. So calling main() (next line) has the same effect as running python -m tkinter.__init__ at the command line.
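To watch the two files interact, here is a minimal sketch that builds a throwaway package and runs it both ways in a fresh interpreter. The package name demo_pkg and the helper functions are made up for illustration; only the import and -m behaviour described above is being exercised.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create demo_pkg/ (hypothetical name) with __init__.py and __main__.py."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    # Each file reports the value of __name__ it was executed under.
    (pkg / "__init__.py").write_text("print('init as', __name__)\n")
    (pkg / "__main__.py").write_text("print('main as', __name__)\n")


def run_python(args, cwd):
    """Run a fresh interpreter in cwd and return its stdout as a list of lines."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def demo():
    """Return (output of 'import demo_pkg', output of 'python -m demo_pkg')."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(tmp)
        imported = run_python(["-c", "import demo_pkg"], tmp)
        executed = run_python(["-m", "demo_pkg"], tmp)
    return imported, executed


if __name__ == "__main__":
    imported, executed = demo()
    print("import demo_pkg    ->", imported)
    print("python -m demo_pkg ->", executed)
```

Importing the package only triggers __init__.py, while python -m demo_pkg triggers __init__.py first (as the package) and then __main__.py (as '__main__').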

__init__.py, among other things, marks a directory as a Python package and lets you set variables at the package level.
__main__.py, among other things, is run when you execute a zip archive of Python files. It is also what lets you execute a package with python -m.
Both of these answers were obtained from the answers you linked. Is there something else you didn't understand about these things?

Related

(Python Unittest) Module being tested cannot import its dependencies: ModuleNotFoundError

I am working on developing unittests for a project that has been already completed, however I am having a hard time running my unittests without modifying the original code. The module I am trying to test has other dependencies in the same folder that will not import when the unittests are run. Here is what my directory looks like:
root
|--main_folder
|  |--module1.py
|  |--module2.py
|--tests
   |--test_module1.py
The original code in module1.py successfully imports module2.py on its own like this: from module2 import Practices where Practices is a function from module2.
The issue I am running into is that in order to run test_module1.py (which I am doing by calling python3 -m unittest from the root directory), I have to modify module1.py itself such that it says: from main_folder.module2 import Practices.
If I run the test file without modifying module1.py, I get the error ModuleNotFoundError: No module named 'module2'.
Ideally I cannot modify the code in this way, and I am trying to find a way to make my tests work without touching the application itself. How should I go about this? module1.py runs normally when I run the application without modifying the file, however modifying it so that the tests work breaks the main application. What can I do to make my tests independent of the code for the main app?
(For some more background, the test_module1.py file works by calling from main_folder.module1 import fun1 where fun1 is the function I am trying to test)
Try running your tests using one of the following commands (replacing the paths with your actual ones):
If your tests import the modules with "from main_folder import ...":
env PYTHONPATH=/root python3 -m unittest
or if your tests import directly "import module1":
env PYTHONPATH=/root/main_folder python3 -m unittest
As a side note, depending on the Python version you're using, you might need an existing
main_folder/__init__.py
file for main_folder to be recognized as a package. If you don't currently have such a file, try creating it (it can be empty) and check whether the issue persists.
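The effect of PYTHONPATH can be checked with a small sketch that recreates the layout from the question in a temporary directory and runs python -m unittest with and without the variable. The bodies of fun1 and Practices are invented for the demonstration, and both directories get an empty __init__.py as suggested in the side note above.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; only the import statements mirror the question.
FILES = {
    "main_folder/__init__.py": "",
    "main_folder/module2.py": "def Practices():\n    return 41\n",
    "main_folder/module1.py": (
        "from module2 import Practices\n\n"
        "def fun1():\n"
        "    return Practices() + 1\n"
    ),
    "tests/__init__.py": "",
    "tests/test_module1.py": (
        "import unittest\n"
        "from main_folder.module1 import fun1\n\n"
        "class Fun1Test(unittest.TestCase):\n"
        "    def test_fun1(self):\n"
        "        self.assertEqual(fun1(), 42)\n"
    ),
}


def build_layout(root):
    """Write the files above under root, creating directories as needed."""
    for relative, content in FILES.items():
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)


def run_unittest(root, pythonpath=None):
    """Run 'python -m unittest' from root; return the process exit code."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean slate
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    result = subprocess.run(
        [sys.executable, "-m", "unittest"],
        cwd=root,
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode


def demo():
    """Return (exit code without PYTHONPATH, exit code with main_folder on it)."""
    with tempfile.TemporaryDirectory() as root:
        build_layout(root)
        without = run_unittest(root)
        with_path = run_unittest(root, os.path.join(root, "main_folder"))
    return without, with_path
```

Without PYTHONPATH the run fails because module1's "from module2 import Practices" cannot be resolved; with main_folder on PYTHONPATH the unmodified module1.py imports cleanly and the test passes.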

__init__.py code called twice and its significance with package import

I have a simple Python project for learning with two files, __init__.py and __main__.py.
When I executed python -m pkg_name
it runs both __init__.py and __main__.py
When I execute python -m pkg_name.__init__.py
it invokes __init__.py twice.
I want to know why __init__.py is called twice when I call __init__.py directly.
Is it like a static block in Java, where the code in the static block is triggered automatically when the class is loaded?
What is the relevance of __init__.py in Python, and what is the benefit of it being executed when the package is imported, loaded, or called for processing?
Please help me understand the concepts better.
"""Run a sequence of programs, testing python code __main__ variable
Each program (except the first) receives standard output of the
previous program on its standard input, by default. There are several
alternate ways of passing data between programs.
"""
def _launch():
    print('Pipeline Launched')

if __name__ == '__main__':
    print('This module is running as the main module!')
    _launch()
> __init__.py
"""This is the __init__.py file of pipleline package
Used for testing the import statements.
"""
print(__name__)
print('This is the __init__.py file of pipleline package')
print('Exiting __init__ of pipeline package after all initialization')
The following command is used to execute a Python module or package:
python -m module
Where module is the name of the module/package without .py extension.
If the name matches a script, it is byte-compiled and executed.
If the name matches a directory with an __init__.py file and a __main__.py file, the directory is treated as a Python package and is loaded first. Then the __main__.py script is executed.
If the name contains dots, e.g. "mylib.mypkg.mymodule", Python walks through each package and sub-package matching the dotted name and loads it. Then it executes the last module (or the last package, which must contain a __main__.py file).
A short description is given in the official documentation: 29.4. __main__: Top-level script environment.
Your problem
If you run this command:
python -m pkg_name
It loads (and runs) __init__.py and then __main__.py: this is the normal behavior.
If you run this command:
python -m pkg_name.__init__.py
It should fail if you leave the ".py" extension in place.
If it runs, the command loads the pkg_name package first, which executes __init__.py. Then it runs __init__.py again as the requested module.
__init__.py is used to define a folder as a package, which contains the required modules and resources.
You can use it as an empty file, add documentation about the package to it, or set up initial conditions for the package.
Please check out the Python documentation.
Also, as mentioned by Natecat, __init__.py gets executed whenever you load a package. That's why when you explicitly call __init__.py, it loads the package (1st load) then executes __init__.py (2nd load).
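The double execution can be reproduced with a short sketch. It assumes a throwaway package named pkg_name whose __init__.py does nothing but print __name__, and it drops the ".py" suffix from the command, as noted above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_init_as_module():
    """Run 'python -m pkg_name.__init__' and return the printed __name__ values."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "pkg_name"
        pkg.mkdir()
        # The file only reports the name it is being executed under.
        (pkg / "__init__.py").write_text("print(__name__)\n")
        result = subprocess.run(
            [sys.executable, "-m", "pkg_name.__init__"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
        return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_init_as_module():
        print("executed with __name__ =", line)
```

The first execution happens while the package is being imported, so __name__ is the package name; the second happens when the same file is then run as the main module.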

Running a script from a package

I'm new to Python, coming from Java. I created a folder called 'Project'. In 'Project' I created several packages (with __init__.py files), such as 'test1' and 'test2'. 'test1' contains a Python script that imports a module from 'test2'. I want to run a script x.py in 'test1' from the command line. How can I do that?
Edit: if you have better recommendations on how I can better organize my files I would be thankful. (notice my java mentality)
Edit: I need to run the script from a bash script, so I need to provide full paths.
There are probably several ways to achieve what you want.
One thing I do when I need to make sure the module paths are correct in an executable script is to get the parent directory and insert it into the module search path (sys.path):
import sys, os
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
import test1 # next imports go here...
from test2 import something
# any import that works from the parent dir will work here
This way you are safe to run your scripts without worrying how the script is called.
Python code is organized into modules and packages. A module is just a .py file that can contain class definitions, function definitions and variables. A package is a directory with a __init__.py file.
A standard Python project might look something like this:
thingsproject/
    README
    setup.py
    doc/
        ...
    things/
        __init__.py
        animals.py
        vegetables.py
        minerals.py
    test/
        test_animals.py
        test_vegetables.py
        test_minerals.py
The setup.py file describes the metadata about your project. See Writing the Setup Script and particularly the section on installing scripts.
Entry points exist to help distribute command line tools in Python. An entry point is defined in setup.py like this:
setup(
    name='thingsproject',
    ....
    entry_points = {
        'console_scripts': ['dog = things.animals:dog_main_function']
    },
    ...
)
The effect is that when the package is installed using python setup.py install a script is automatically created in some reasonable place according to your OS, such as /usr/local/bin. The script then calls the dog_main_function in the animals module of the things package.
Yet another Python convention to consider is to have a __main__.py file. This signifies the "main" script within a directory or zip file full of Python code. It is a good place to define a command line interface to your code, using the argparse parser for command line arguments.
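As a rough sketch of what such a file could look like, here is a hypothetical things/__main__.py. The argument names, the greeting function, and the program name are all invented for illustration; only argparse itself comes from the standard library.

```python
"""Sketch of a possible things/__main__.py (all names are hypothetical)."""
import argparse


def greeting(name, loud):
    """Build the message printed by the command line interface."""
    message = f"Hello, {name}!"
    return message.upper() if loud else message


def build_parser():
    """Describe the command line: one positional argument and one flag."""
    parser = argparse.ArgumentParser(prog="things", description="Greet a thing.")
    parser.add_argument("name", help="name of the thing to greet")
    parser.add_argument("--loud", action="store_true", help="print in upper case")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and print the greeting."""
    args = build_parser().parse_args(argv)
    print(greeting(args.name, args.loud))
    return 0


# In a real __main__.py the file would typically end with:
#     raise SystemExit(main())
```

With that last line in place, python -m things Rex --loud would go through main(). Keeping the logic in main(argv) also means the same function could be named in a console_scripts entry point.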
Good and up-to-date information on the somewhat muddled world of Python packaging can be found in the Python Packaging User Guide.

Using code in init.py when running python from command line

I'm setting up some code for unittesting. My directory currently looks like this:
project/
    src/
        __init__.py
        sources.py
    test/
        __init__.py
        sources_test.py
In __init__.py for the test directory, I have these two lines:
import sys
sys.path.insert(0, '../')
In the test files, I have the line import src.sources.
When I use nose to run these tests from the project directory, everything works just fine. If I try to run the tests individually it gives me this error:
ImportError: No module named src.sources
I assume that this is because when I run the test from the command line it isn't using __init__.py. Is there a way I can make sure that it will use those lines even when I try to run the tests individually?
I could take the lines out of __init__.py and put them into my test files, but I'm trying to avoid doing that.
To run the tests individually I am running python sources_test.py
You're really trying to abuse packages here, and that isn't a good idea.
The simple solution is to not run the tests from within the test directory. Just cd up a level, then do python test/sources_test.py.
Of course that in itself isn't going to import test/__init__.py. For that, you really need to import the package. So python -m test.sources_test is probably a better idea… except, of course, that if your package is made to be run as a script but not to be imported, that won't work.
Alternatively, you could (on POSIX platforms, at least) do PYTHONPATH=.. python sources_test.py from within test. This is a bit hacky, but it should work.
Or, better, combine the above, and, from outside of test, do PYTHONPATH=. python test/sources_test.py.
A really hacky workaround is to explicitly import __init__. This should basically work for your simple use case, but everything ends up wrong: you end up with a module named __init__ instead of one named test, your main module isn't named test.sources_test, and in fact there is no test package at all. And if you accidentally re-import anything after modifying sys.path, you may get duplicates of the modules.
If you write
import src.sources
the Python interpreter looks in the src directory for an __init__.py file. If it exists, you can use the directory name as a package name. If you are not in your project directory, which is the case when you are in the src directory, then Python searches the directories listed in the PYTHONPATH environment variable (the same variable is used on Linux and Windows) for a directory named src with an __init__.py file in it.
Did you set your $PYTHONPATH?

how to organize a python module that scripts can be run as __main__ program?

I'm starting a project in python, the code structure now as below:
project/
    __init__.py
    a.py
    b.py
    mainA.py
    utilities/
        __init__.py
        mainB.py
        c.py
The __init__ files are all blank.
I want to run utilities/mainB.py as a program (using something like python main.py), and mainB needs to import a.py and b.py. So I tried from .. import a and some other approaches, but the import failed. The error message is:
ValueError: Attempted relative import in non-package
So here are my questions:
How can I fix mainB.py so it can be run as a main program?
mainA.py can already be run as a main program; it also imports a.py and b.py (using import a and import b).
I think the code structure may become more complex. Say mainA.py has to import a module from project/some/directory: how can I do that?
See this previous question. You have two options. One is to use the __package__ attribute as described in PEP 366 to set the relative name of your modules. The other is to execute your scripts as modules (using the -m flag to the interpreter) instead of running them directly as scripts.
You could use Python's built-in module-running functionality (python -m <module>).
python -m project.utilities.mainB
This allows you to write mainB normally as part of the package, so relative and absolute imports will both work correctly.
For an in-depth discussion of this functionality, see PEP-338.
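The difference between the two ways of launching mainB can be sketched by recreating a trimmed version of the layout in a temporary directory. The contents of a.py and mainB.py are invented for illustration; the layout and the two commands follow the question and the answer above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents for a trimmed copy of the question's layout.
FILES = {
    "project/__init__.py": "",
    "project/a.py": "VALUE = 'value from a'\n",
    "project/utilities/__init__.py": "",
    "project/utilities/mainB.py": "from .. import a\nprint(a.VALUE)\n",
}


def build_layout(root):
    """Write the files above under root, creating directories as needed."""
    for relative, content in FILES.items():
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)


def run_python(args, cwd):
    """Run a fresh interpreter in cwd and return the completed process."""
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )


def demo():
    """Return (result of running the file directly, result of running with -m)."""
    with tempfile.TemporaryDirectory() as root:
        build_layout(root)
        as_script = run_python(["project/utilities/mainB.py"], root)
        as_module = run_python(["-m", "project.utilities.mainB"], root)
    return as_script, as_module


if __name__ == "__main__":
    as_script, as_module = demo()
    print("direct run exit code:", as_script.returncode)
    print("-m run output:", as_module.stdout.strip())
```

Run directly, the file has no parent package, so the relative import fails. Run with -m from the directory that contains project, it is loaded as project.utilities.mainB and "from .. import a" resolves.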
You should add the directory that contains 'project' to PYTHONPATH and then, in mainB.py:
from project import a
