Idiom for modules in packages that are not directly executed - python

Is there an idiom or suggested style for modules that are never useful when directly executed, but simply serve as components of a larger package — e.g. those containing definitions, etc.?
For example, is it customary to omit #!/usr/bin/env python; add a comment; report a message to the user or execute other code (e.g. by checking whether __name__ is '__main__'); or simply do nothing special?

Most of the Python code I write is modules that generally don't get called directly as scripts. Sometimes, when I'm working on a small project and don't want to set up a more complex testing system, I'll call my tests for the module below if __name__ == '__main__':, so that I can quickly test my module just by calling python modulename.py (this sometimes does not play nicely with relative imports and such, but it works well enough for small projects). Whether or not I do this, I drop the shebang and don't give execute permission to the file, because I don't like to make modules executable unless they're meant to be run as scripts.

Related

How to test whether the dependencies are installed?

I've been learning PyTorch for deep learning recently.
Using anaconda I found some problems when I ran the program.
For example, I encountered the following import error
"no module named kiwisolver"
when my program imported matplotlib. It is fixed now, but such an error is very frustrating because the program runs for a long time.
Is there any way to check whether all the required dependencies are installed?
Depending on how your program is structured...
Many Python programs use the if __name__ == "__main__": idiom so that they don't immediately execute code. This lets you import the code without it immediately running.
For example, if you have my_py_torch.py, then if you run python to launch the Python interpreter in interactive mode, you can import your code:
import my_py_torch
Importing your code will process any imports, execute any top-level code, and define any functions and classes, but, as long as you use the if __name__ == "__main__": idiom, it won't actually run the (long-running) code. That's typically enough to let you know if you have major issues like syntax errors, bad imports, or missing dependencies.
Code can still circumvent this: you may have functions or methods that only import modules locally (when they're actually run), or code may wrap imports in try / except blocks to handle missing dependencies then later throw an error if the dependency is used. So it's not foolproof, but it can be a useful test.
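If you'd rather script that check than type imports into the interpreter, something along these lines might do. It is only a sketch: find_import_problems is a name I made up, and the module names passed to it are placeholders for your own code and its dependencies.

```python
import importlib


def find_import_problems(module_names):
    """Try to import each module; return {name: error message} for failures."""
    problems = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            problems[name] = str(exc)
    return problems


# "json" ships with Python; the second name is assumed not to exist.
print(find_import_problems(["json", "no_such_dependency_xyz"]))
```

The same caveats apply: this only catches imports that run at import time, and a syntax error in your own module would surface as a SyntaxError rather than being collected here.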

document scripts not being part of modules in sphinx

I looked at several packages (sphinx-gallery, autoprogram, ...), but found nothing on how to easily use a docstring from a Python script for documentation. So I somehow want to run autodoc on a specific file.
Is anybody aware of a way to automatically generate Sphinx documentation from Python scripts?
For example, I have a docstring at the beginning of the script and maybe some functions with docstrings in there, and I just want to autogenerate some documentation like I can with the .. automodule:: directive, but unfortunately that won't work for relative paths / scripts.
EDIT:
The scripts I want to extract docstrings from are not CLI scripts; they are just Python scripts that generally get called by a cron job. So unfortunately autoprogram won't help here, as far as I can see.
EDIT2:
Okay, so I got a little clearer on this after re-reading the documentation and experimenting. What I wanted to do is automatically take the docstring of a Python file and put it into the documentation without executing the whole file (because, for various reasons, I can't or don't want to hide everything behind a main routine). I got autodoc to document a specific file (a misconfiguration was the reason it didn't work before), but as stated in its documentation, it executes the file. That's my real problem here. I'd be happy if someone has a solution for this, but would totally understand if it is not possible without a big effort.
What I wanted to do is automatically take the docstring of a Python file and put it into the documentation without executing the whole file.
With autodoc, you cannot do that. To get the docstring from a module, autodoc needs to import it, and importing the module executes the whole file.
If you don't want code to be executed upon a "simple import", you can use an if __name__ == "__main__": block or setuptools entry_points based automatic script generation.
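As a side note, and not something the answer above proposes: the import requirement comes from how autodoc works. The standard library's ast module can parse a file and hand back its docstrings without running it. The sketch below is not wired into Sphinx in any way; SCRIPT_SOURCE and read_docstrings are made-up names, and the script body is invented to show that top-level code is never executed.

```python
import ast

# Hypothetical cron script; its last line would fail if it were ever executed.
SCRIPT_SOURCE = '''"""Nightly cleanup job run from cron."""

def purge(path):
    """Delete stale files under path."""

raise RuntimeError("top-level code ran")
'''


def read_docstrings(source):
    """Return (module docstring, {function name: docstring}) without executing source."""
    tree = ast.parse(source)
    functions = {
        node.name: ast.get_docstring(node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return ast.get_docstring(tree), functions


module_doc, function_docs = read_docstrings(SCRIPT_SOURCE)
print(module_doc)
print(function_docs)
```

For a real file you would read its text first (for example with pathlib.Path.read_text) and pass that in. Turning the result into Sphinx pages would still be up to you.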

restrict importing certain modules in python

I am setting sys.modules['os']=None to restrict the os module in my Python notebook. But I want to restrict it by default. Is there any file in /bin where I can add this line?
If not, is it possible in RestrictedPython?
I don't think you can do that, but you could create a virtualenv and delete those modules there.
First, there is no true sandboxing in Python (you can also try PyPy; they claim that this is achievable all the way down to syscalls via rather nontrivial hooking inside their VM). But what you can try right now is the runpy module from the stdlib. It provides a way to run your module inside a restricted environment (though not a sandbox) by providing this environment explicitly as a dict. Since the import statement calls the __import__ function under the hood, this function can be overridden to reject certain module names. I am not sure, though, how to force Jupyter (or whatever you are using) to run in this mode.
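A rough sketch of the __import__ idea, using plain exec with an explicit globals dict rather than runpy to keep it short. BLOCKED, guarded_import and run_restricted are names invented for this example, and the deny-list is arbitrary.

```python
import builtins

BLOCKED = {"os", "subprocess"}  # hypothetical deny-list


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Refuse blocked top-level modules, otherwise defer to the real __import__."""
    if name.split(".")[0] in BLOCKED:
        raise ImportError(f"import of {name!r} is blocked")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_restricted(source):
    """Execute source in a globals dict whose __import__ is the guarded one."""
    env_builtins = dict(vars(builtins))
    env_builtins["__import__"] = guarded_import
    env = {"__builtins__": env_builtins, "__name__": "__restricted__"}
    exec(source, env)
    return env


env = run_restricted("import math\nroot = math.sqrt(16)")
print(env["root"])
```

Running run_restricted("import os") raises ImportError. As said above, this is not a sandbox: code can still reach blocked modules in other ways, for example through importlib or through modules that have already imported them.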

Python spyder debug freezes with circular importing

I have a problem with the debugger when some modules in my code call each other.
Practical example:
A file dog.py contains the following code:
import cat
print("Dog")
The file cat.py is the following:
import dog
print("Cat")
When I run dog.py (or cat.py) I don't have any problem and the program runs smoothly.
However, when I try to debug it, the whole spyder freezes and I have to kill the program.
Do you know how I can fix this? I would like to use this circular importing, as the modules use functions that are in the other modules.
Thank you!
When I run dog.py (or cat.py) I don't have any problem and the program runs smoothly.
AFAICT that's mostly because a script is imported under the special name ("__main__"), while a module is imported under its own name (here "dog" or "cat"). NB: the only difference between a script and a module is how it is actually loaded: passed as an argument to the Python runtime (python dog.py), or imported from a script or any module with an import statement.
(Actually circular imports issues are a bit more complicated than what I describe above, but I'll leave this to someone more knowledgeable.)
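You can see the "__main__" versus "dog" effect with a small experiment. The sketch below writes the two files from the question into a temporary directory and runs dog.py as a script in a child interpreter; run_circular_demo is just a helper name made up for this.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_circular_demo():
    """Write dog.py and cat.py to a temp dir, run dog.py, return the printed words."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "dog.py").write_text('import cat\nprint("Dog")\n')
        Path(tmp, "cat.py").write_text('import dog\nprint("Cat")\n')
        result = subprocess.run(
            [sys.executable, str(Path(tmp, "dog.py"))],
            capture_output=True, text=True, check=True, cwd=tmp,
        )
    return result.stdout.split()


print(run_circular_demo())
```

On CPython 3 this gives ['Dog', 'Cat', 'Dog']: dog.py runs once as "__main__" and a second time as the module "dog", because cat's import dog does not find a module of that name already loaded.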
To make a long story short: except for this particular use case (which is actually more of a side effect), Python does not support circular imports. If you have functions (classes, whatever) shared by other scripts or modules, put these functions in a different module. Or, if you find out that two modules really depend on each other, you may just want to regroup them into a single module (or regroup the parts that depend on each other in the same module and everything else in one or more other modules).
Also: unless it's a trivial one-shot util or something that only depends on the stdlib, your script's content is often better reduced to a main function parsing command-line arguments / reading config files / whatever, importing the required modules and starting the effective process.
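For illustration, a script reduced to that shape could look roughly like this. Everything here is hypothetical: the --animal option is invented, and speak stands in for logic that would really live in an imported module.

```python
import argparse


def build_parser():
    """Command-line interface for a hypothetical pet script."""
    parser = argparse.ArgumentParser(description="Print the sound an animal makes")
    parser.add_argument("--animal", choices=["dog", "cat"], default="dog")
    return parser


def speak(animal):
    """Stand-in for logic that would normally be imported from a module."""
    return {"dog": "Woof", "cat": "Meow"}[animal]


def main(argv=None):
    """Parse arguments, start the actual work, return an exit status."""
    args = build_parser().parse_args(argv)
    print(speak(args.animal))
    return 0


if __name__ == "__main__":
    main()
```

Because main takes argv as a parameter, it can be called from tests or another module, e.g. main(["--animal", "cat"]), without touching sys.argv.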

Python File Structure on GitHub

I've been looking around at some open source projects on Python, and I'm seeing a lot of files and patterns that I'm not familiar with.
First of all, a lot of projects just have a file called setup.py, which usually contains one function:
setup(blah, blah, blah)
Second, a lot contain a file that is simply called __init__.py and contains next to no information.
Third, some .py files contain a statement similar to this:
if __name__ == "__main__":
Finally, I'm wondering if there are any "best practices" for dividing Python files up in a git repository. With Java, the idea of file division comes pretty naturally because of the class structure. With Python, many scripts have no classes at all, and sometimes a program will have OOP aspects, but a class by class division does not make that much sense. Is it just "whatever makes the code the most readable," or are there some guidelines somewhere about this?
The setup.py is part of Python’s module distribution using the distribution utilities (distutils). It allows for easy installation of the Python module and is useful when, well, you want to distribute your project as a whole Python module.
The __init__.py is used for Python’s package system. An empty file is usually enough to make Python recognize the directory it is in as a package, but you can also define different things in it.
Finally, the __name__ == '__main__' check ensures that the guarded code runs only when the current script is run directly (e.g. from the command line) and not when it is just imported into some other script. During the execution of a Python program, only a single module’s __name__ attribute will be equal to '__main__'. See also my answer here or the more general question on that topic.
The setup.py is part of the distutils setup process. You'll want to have one of those if you're distributing a module instead of just a basic script (and even then it's a good idea to have one, so you can easily expand into a module later).
The __init__.py is part of the Python module import process:
Files named __init__.py are used to mark directories on disk as Python package directories. If you have the files
mydir/spam/__init__.py
mydir/spam/module.py
and mydir is on your path, you can import the code in module.py as:
import spam.module
or
from spam import module
If you remove the __init__.py file, Python will no longer look for submodules inside that directory, so attempts to import the module will fail.
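If you want to try the layout from that quote without touching your own project, here is a small sketch that builds it in a temporary directory and imports from it. The helper name import_from_temp_package and the GREETING constant are invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_temp_package():
    """Create mydir/spam/{__init__,module}.py in a temp dir and import spam.module."""
    with tempfile.TemporaryDirectory() as mydir:
        package_dir = Path(mydir) / "spam"
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("")  # an empty file is enough
        (package_dir / "module.py").write_text("GREETING = 'hello from spam.module'\n")
        sys.path.insert(0, mydir)  # put mydir "on your path"
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("spam.module")
            return module.GREETING
        finally:
            # Undo the changes so the demo leaves no trace in this process.
            sys.path.remove(mydir)
            sys.modules.pop("spam.module", None)
            sys.modules.pop("spam", None)


print(import_from_temp_package())
```

In a real project mydir would simply be your repository root (or wherever the package gets installed), and you would write import spam.module directly.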
if __name__ == "__main__": is a way to mark code that is executed only if the file is run directly instead of imported.
As for how to lay out your code, the distutils documentation has a good guide on this.
In addition to @poke's answer, see this related question on what the directory structure of a Python project should be. Here is another useful tutorial on how to make your project easily runnable.
