making a CLI entry point for a library

making a CLI entry point for a library - python

I have written a library with the following structure:
libProject/
setup.py
libname/
__init__.py
script1.py
script2.py
...
I use it primarily by importing it
from libname.script1 import Fun1
...
I would now like to make a command line interface to it, but I'm confused where do I put my argpase code within this project so that I can use
$ python libname <somecommand>

I found a solution. Would still keep it open for better answers.
In the libname directory, touch a new file called __main__.py
libname.__main__ will be executed when libname directory is called as script.
Now put your command line code inside this file
#__main__.py
import sys
from .script1 import Fun1
print Fun1(sys.arg[-1])

Related

init.py code called twice and its significance with package import

I have a simple python project fro learning with two files __init__.py and __main__.py
When I executed python -m pkg_name
it runs both __init__.py and __main__.py
When I execute python -m pkg_name.__init__.py
it invokes __init__.py twice.
I want to know why __init__.py is called twice when i call __init__.py
Is it like the static code in java where when we call the class all the data
in static code is automatically triggered.
What is the relevance of __init__.py in python and benefits of it getting executed when package is imported/loaded or called for processing.
Please help me understand the concepts better.
"""Run a sequence of programs, testing python code __main__ variable
Each program (except the first) receives standard output of the
previous program on its standard input, by default. There are several
alternate ways of passing data between programs.
"""
def _launch():
print('Pipeline Launched')
if __name__ == '__main__':
print('This module is running as the main module!')
_launch()
> __init__.py
"""This is the __init__.py file of pipleline package
Used for testing the import statements.
"""
print(__name__)
print('This is the __init__.py file of pipleline package')
print('Exiting __init__ of pipeline package after all initialization')

The following command is used to execute a Python module or package:
python -m module
Where module is the name of the module/package without .py extension.
if the name matches a script, it is byte-compiled and executed,
if the name matches a directory with a __init__.py file and a __main__.py file, the directory is considered as being a Python package and is loaded first. Then the __main__.py script is executed.
if the name contains dots, e.g.: "mylib.mypkg.mymodule", Python browse each package and sub-package matching the dotted name and load it. Then it execute the last module (or last package which must contain a __main__.py file).
A (short) description is done in the official documentation: 29.4. main — Top-level script environment.
Your problem
If you run this command:
python -m pkg_name
It loads (and run) the __init__.py and __main__.py: this is the normal behavior.
If you run this command:
python -m pkg_name.__init__.py
It should fail if you leave the ".py" extension.
If it runs, the command loads the pkg_name package first: it execute the __init__.py first. Then it runs it again.

It is used to define a folder as a package, which contains required modules and resources.
You can use is as an empty file or add docs about the package or setup initial conditions for the module.
Please checkout the python documentation.
Also, as mentioned by Natecat, __init__.py gets executed whenever you load a package. That's why when you explicitly call __init__.py, it loads the package (1st load) then executes __init__.py (2nd load).

python Import module from a different directory structure

common/src/validation/file1.py
In the common/src/validation/ and common/src/ folder "_init_" is defined.
common/test/validation/file2.py
common/test/validation/case/file3.py
In file2.py and file3.py, I want to import class from file1.py.
Im giving the following line in file2.py and file3.py.:
from file1 import class1
I currently get error:
#ImportError: No module named file1
what should be the sys.path.append ?

If you want to import a file from a folder you should have a file named
__init__.py
in that folder (it could also be a blank file).

You can basically just change PYTHONPATH to common and run from there. Then reference all import paths in relation to that.
like so: (assuming you are in common and your main is in file3.py)
PYTHONPATH=. python test/validation/case/file3.py
Make sure to have __init__.py files in any directory you import (the content of this file doesn't matter)
I usually like putting in the root of every project a run.sh file with some version of this line (pointing to wherever my main func is) in it.
It will usually contain a simple something like that:
#!/bin/bash
source some_env/bin/activate # start the relevant environment if you have one
PYTHONPATH=. python src/main.py # set PYTHONPATH and run
deactivate # exit the relevant environment if started one
Another option is to do this but is't not as elegant.

What is the difference between init.py and main.py?

I know of these two questions about __init__.py and __main__.py:
What is __init__.py for?
What is __main__.py?
But I don't really understand the difference between them. Or I could say I don't understand how they interact together.

__init__.py is run when you import a package into a running python program. For instance, import idlelib within a program, runs idlelib/__init__.py, which does not do anything as its only purpose is to mark the idlelib directory as a package. On the otherhand, tkinter/__init__.py contains most of the tkinter code and defines all the widget classes.
__main__.py is run as '__main__' when you run a package as the main program. For instance, python -m idlelib at a command line runs idlelib/__main__.py, which starts Idle. Similarly, python -m tkinter runs tkinter/__main__.py, which has this line:
from . import _test as main
In this context, . is tkinter, so importing . imports tkinter, which runs tkinter/__init__.py. _test is a function defined within that file. So calling main() (next line) has the same effect as running python -m tkinter.__init__ at the command line.

__init__.py, among other things, labels a directory as a python directory and lets you set variables on a package wide level.
__main__.py, among other things, is run if you try to run a compressed group of python files. __main__.py allows you to execute packages.
Both of these answers were obtained from the answers you linked. Is there something else you didn't understand about these things?

Running a script from a package

I'm new to python coming from java. I created a folder called: 'Project'. In 'Project' I created many packages (with __init__.py files) like 'test1' and 'tests2'. 'test1' contains a python script file .py that uses scripts from 'test2' (import a module from test2). I want to run a script x.py in 'test1' from command line. How can I do that?
Edit: if you have better recommendations on how I can better organize my files I would be thankful. (notice my java mentality)
Edit: I need to run the script from a bash script, so I need to provide full paths.

There are probably several ways to achieve what you want.
One thing that I do when I need to make sure the module paths are correct in an executable scripts is to get the parent directory and insert in the module search path (sys.path):
import sys, os
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
import test1 # next imports go here...
from test2 import something
# any import what works from the parent dir will work here
This way you are safe to run your scripts without worrying how the script is called.

Python code is organized into modules and packages. A module is just a .py file that can contain class definitions, function definitions and variables. A package is a directory with a __init__.py file.
A standard Python project might look something like this:
thingsproject/
README
setup.py
doc/
...
things/
__init__.py
animals.py
vegetables.py
minerals.py
test/
test_animals.py
test_vegetables.py
test_minerals.py
The setup.py file describes the metadata about your project. See Writing the Setup Script and particularly the section on installing scripts.
Entry points exist to help distribute command line tools in Python. An entry point is defined in setup.py like this:
setup(
name='thingsproject',
....
entry_points = {
'console_scripts': ['dog = things.animals:dog_main_function']
},
...
)
The effect is that when the package is installed using python setup.py install a script is automatically created in some reasonable place according to your OS, such as /usr/local/bin. The script then calls the dog_main_function in the animals module of the things package.
Yet another Python convention to consider is have a __main__.py file. This signifies the "main" script within a directory or zip file full of python code. This is a good place to define a command line interface to your code using the argparse parser for command line arguments.
Good and up-to-date information on the somewhat muddled world of Python packaging can be found in the Python Packaging User Guide.

What is main.py?

What is the __main__.py file for, what sort of code should I put into it, and when should I have one?

Often, a Python program is run by naming a .py file on the command line:
$ python my_program.py
You can also create a directory or zipfile full of code, and include a __main__.py. Then you can simply name the directory or zipfile on the command line, and it executes the __main__.py automatically:
$ python my_program_dir
$ python my_program.zip
# Or, if the program is accessible as a module
$ python -m my_program
You'll have to decide for yourself whether your application could benefit from being executed like this.
Note that a __main__ module usually doesn't come from a __main__.py file. It can, but it usually doesn't. When you run a script like python my_program.py, the script will run as the __main__ module instead of the my_program module. This also happens for modules run as python -m my_module, or in several other ways.
If you saw the name __main__ in an error message, that doesn't necessarily mean you should be looking for a __main__.py file.

What is the __main__.py file for?
When creating a Python module, it is common to make the module execute some functionality (usually contained in a main function) when run as the entry point of the program. This is typically done with the following common idiom placed at the bottom of most Python files:
if __name__ == '__main__':
# execute only if run as the entry point into the program
main()
You can get the same semantics for a Python package with __main__.py, which might have the following structure:
.
└── demo
├── __init__.py
└── __main__.py
To see this, paste the below into a Python 3 shell:
from pathlib import Path
demo = Path.cwd() / 'demo'
demo.mkdir()
(demo / '__init__.py').write_text("""
print('demo/__init__.py executed')
def main():
print('main() executed')
""")
(demo / '__main__.py').write_text("""
print('demo/__main__.py executed')
from demo import main
main()
""")
We can treat demo as a package and actually import it, which executes the top-level code in the __init__.py (but not the main function):
>>> import demo
demo/__init__.py executed
When we use the package as the entry point to the program, we perform the code in the __main__.py, which imports the __init__.py first:
$ python -m demo
demo/__init__.py executed
demo/__main__.py executed
main() executed
You can derive this from the documentation. The documentation says:
__main__ — Top-level script environment
'__main__' is the name of the scope in which top-level code executes.
A module’s __name__ is set equal to '__main__' when read from standard
input, a script, or from an interactive prompt.
A module can discover whether or not it is running in the main scope
by checking its own __name__, which allows a common idiom for
conditionally executing code in a module when it is run as a script or
with python -m but not when it is imported:
if __name__ == '__main__':
# execute only if run as a script
main()
For a package, the same effect can be achieved by including a
__main__.py module, the contents of which will be executed when the module is run with -m.
Zipped
You can also zip up this directory, including the __main__.py, into a single file and run it from the command line like this - but note that zipped packages can't execute sub-packages or submodules as the entry point:
from pathlib import Path
demo = Path.cwd() / 'demo2'
demo.mkdir()
(demo / '__init__.py').write_text("""
print('demo2/__init__.py executed')
def main():
print('main() executed')
""")
(demo / '__main__.py').write_text("""
print('demo2/__main__.py executed')
from __init__ import main
main()
""")
Note the subtle change - we are importing main from __init__ instead of demo2 - this zipped directory is not being treated as a package, but as a directory of scripts. So it must be used without the -m flag.
Particularly relevant to the question - zipapp causes the zipped directory to execute the __main__.py by default - and it is executed first, before __init__.py:
$ python -m zipapp demo2 -o demo2zip
$ python demo2zip
demo2/__main__.py executed
demo2/__init__.py executed
main() executed
Note again, this zipped directory is not a package - you cannot import it either.

Some of the answers here imply that given a "package" directory (with or without an explicit __init__.py file), containing a __main__.py file, there is no difference between running that directory with the -m switch or without.
The big difference is that without the -m switch, the "package" directory is first added to the path (i.e. sys.path), and then the files are run normally, without package semantics.
Whereas with the -m switch, package semantics (including relative imports) are honoured, and the package directory itself is never added to the system path.
This is a very important distinction, both in terms of whether relative imports will work or not, but more importantly in terms of dictating what will be imported in the case of unintended shadowing of system modules.
Example:
Consider a directory called PkgTest with the following structure
:~/PkgTest$ tree
.
├── pkgname
│   ├── __main__.py
│   ├── secondtest.py
│   └── testmodule.py
└── testmodule.py
where the __main__.py file has the following contents:
:~/PkgTest$ cat pkgname/__main__.py
import os
print( "Hello from pkgname.__main__.py. I am the file", os.path.abspath( __file__ ) )
print( "I am being accessed from", os.path.abspath( os.curdir ) )
from testmodule import main as firstmain; firstmain()
from .secondtest import main as secondmain; secondmain()
(with the other files defined similarly with similar printouts).
If you run this without the -m switch, this is what you'll get. Note that the relative import fails, but more importantly note that the wrong testmodule has been chosen (i.e. relative to the working directory):
:~/PkgTest$ python3 pkgname
Hello from pkgname.__main__.py. I am the file ~/PkgTest/pkgname/__main__.py
I am being accessed from ~/PkgTest
Hello from testmodule.py. I am the file ~/PkgTest/pkgname/testmodule.py
I am being accessed from ~/PkgTest
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "pkgname/__main__.py", line 10, in <module>
from .secondtest import main as secondmain
ImportError: attempted relative import with no known parent package
Whereas with the -m switch, you get what you (hopefully) expected:
:~/PkgTest$ python3 -m pkgname
Hello from pkgname.__main__.py. I am the file ~/PkgTest/pkgname/__main__.py
I am being accessed from ~/PkgTest
Hello from testmodule.py. I am the file ~/PkgTest/testmodule.py
I am being accessed from ~/PkgTest
Hello from secondtest.py. I am the file ~/PkgTest/pkgname/secondtest.py
I am being accessed from ~/PkgTest
Note: In my honest opinion, running without -m should be avoided. In fact I would go further and say that I would create any executable packages in such a way that they would fail unless run via the -m switch.
In other words, I would only import from 'in-package' modules explicitly via 'relative imports', assuming that all other imports represent system modules. If someone attempts to run your package without the -m switch, the relative import statements will throw an error, instead of silently running the wrong module.

You create __main__.py in yourpackage to make it executable as:
$ python -m yourpackage

__main__.py is used for python programs in zip files. The __main__.py file will be executed when the zip file in run. For example, if the zip file was as such:
test.zip
__main__.py
and the contents of __main__.py was
import sys
print "hello %s" % sys.argv[1]
Then if we were to run python test.zip world we would get hello world out.
So the __main__.py file run when python is called on a zip file.

If your script is a directory or ZIP file rather than a single python file, __main__.py will be executed when the "script" is passed as an argument to the python interpreter.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

making a CLI entry point for a library - python

Related

init.py code called twice and its significance with package import

python Import module from a different directory structure

What is the difference between init.py and main.py?

Running a script from a package

What is main.py?

Categories

Resources

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

making a CLI entry point for a library - python

Related

__init__.py code called twice and its significance with package import

python Import module from a different directory structure

What is the difference between __init__.py and __main__.py?

Running a script from a package

What is __main__.py?

Categories

Resources

init.py code called twice and its significance with package import

What is the difference between init.py and main.py?

What is main.py?