Requirement for __init__.py just to satisfy pylint and mypy - python

I have a project with the following (partial) directory structure:
.
├── mypy.ini
├── src
│ ├── preppy
│ │ ├── cli.py
│ │ ├── __main__.py
│ │ ├── model.py
│ │ └── tools.py
├── pyproject.toml
└── tests
In cli.py, I have the following code (lines 13 and 14 in the file):
from .model import Problem
from .tools import get_abs_path, transcode
I also have similarly styled relative imports in model.py and __main__.py.
All similar imports throw errors in both pylint (2.5.3) and mypy (0.761) when the tools are automatically run in my IDE (Code - OSS), e.g.:
Attempted relative import beyond top-level package pylint(relative-beyond-top-level) [13,1]
Cannot find implementation or library stub for module named '.model' mypy(error) [13,1]
Attempted relative import beyond top-level package pylint(relative-beyond-top-level) [14,1]
Cannot find implementation or library stub for module named '.tools' mypy(error) [14,1]
See https://mypy.readthedocs.io/en/latest/running_mypy.html#missing-imports mypy(note) [13,1]
When I add a blank __init__.py file to the folder, the errors disappear.
I don't need this __init__.py file for the package to work.
I thought that post-PEP 420, it shouldn't be required, especially if it's just there to satisfy linters.
Is there something else I'm doing wrong, or should I just add the __init__.py and get over it :) ?
Config for pylint is in pyproject.toml:
[tool.pylint.'MESSAGES CONTROL']
# Pylint and black disagree on hanging indentation.
disable = "C0330"
[tool.pylint.MISCELLANEOUS]
# Note: By default, "TODO" is flagged, this is disabled by omitting it
# from the list below.
notes = "FIXME,XXX"
Config for mypy is in mypy.ini:
[mypy]
disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_decorators = True
mypy_path = src
namespace_packages = True
[mypy-openpyxl]
ignore_missing_imports = True
[mypy-pulp]
ignore_missing_imports = True
[mypy-pytest]
ignore_missing_imports = True
I'm running python 3.8.0.

PEP 420 does not let you "create a package by omitting __init__.py"; it specifies that you "create a namespace package by omitting __init__.py". This means:
If you want a package, add __init__.py.
If you want a namespace package, omit __init__.py.
While using a namespace package like a regular package usually works, it may unexpectedly fail when package names clash. In most cases, a namespace package is not desirable.
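The difference can be observed at runtime. The following is a minimal sketch, not part of the original answer: it builds two throwaway packages in a temporary directory (the names demo_regular_pkg and demo_namespace_pkg are hypothetical) and compares what the import system reports for each.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package names, chosen only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # A regular package: the directory contains __init__.py.
    (root / "demo_regular_pkg").mkdir()
    (root / "demo_regular_pkg" / "__init__.py").write_text("")

    # A namespace package: a directory with modules but no __init__.py.
    (root / "demo_namespace_pkg").mkdir()
    (root / "demo_namespace_pkg" / "tools.py").write_text("VALUE = 1\n")

    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        regular = importlib.import_module("demo_regular_pkg")
        namespace = importlib.import_module("demo_namespace_pkg")
        tools = importlib.import_module("demo_namespace_pkg.tools")

        # A regular package is backed by its __init__.py file;
        # a namespace package has no file of its own.
        regular_file = getattr(regular, "__file__", None)
        namespace_file = getattr(namespace, "__file__", None)
        print("regular   __file__:", regular_file)
        print("namespace __file__:", namespace_file)
        print("namespace __path__:", list(namespace.__path__))
    finally:
        sys.path.remove(str(root))
```

Both imports succeed, which is why the code in the question runs without __init__.py. Tools that resolve packages by looking for __init__.py on disk may still treat the two layouts differently.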

For mypy, an alternative to the accepted answer is to pass the --namespace-packages option so that mypy takes namespace packages into account.

Related

Correct way to enable import of all submodule in python package

I'm making a Python package using setuptools, and I'm having trouble making all nested folders in my source code available for import after installing the package. The directory I'm working in has a structure like the one illustrated below.
├── setup.py
└── src
    └── foo
        ├── a
        │   ├── aa
        │   │   └── aafile.py
        │   └── afile.py
        ├── b
        │   └── bfile.py
        └── __init__.py
Currently, I can't import submodules, such as from foo.a import aa or from foo.a.aa import some_method, unless I explicitly pass the names of the submodules to setuptools. That is, setup.py needs to contain something like
from setuptools import setup
setup(
    version="0.0.1",
    name="foo",
    py_modules=["foo", "foo.a", "foo.a.aa", "foo.b"],
    package_dir={"": "src"},
    packages=["foo", "foo.a", "foo.a.aa", "foo.b"],
    include_package_data=True,
    # more arguments go here
)
This makes organizing the code pretty cumbersome. Is there a simple way to let users of the package import any submodule contained in src/foo?
You'll want setuptools.find_packages() – though all in all, you might want to consider tossing setup.py altogether in favor of a PEP 517 style build with no arbitrary Python but just pyproject.toml (and possibly setup.cfg).
from setuptools import setup, find_packages
setup(
    version="0.0.1",
    name="foo",
    package_dir={"": "src"},
    packages=find_packages(where='src'),
    include_package_data=True,
)
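Every package and subpackage directory must contain an __init__.py file (it may be empty) to be treated as a package; find_packages() skips directories that lack one. In the layout above, that means adding __init__.py to a, aa, and b.
If you want the whole tree of packages and subpackages to be imported with a single import foo, fill your __init__.py files with imports of the corresponding subpackages.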
# src/foo/__init__.py
import foo.a
import foo.b
# src/foo/a/__init__.py
import foo.a.aa
# src/foo/b/__init__.py
import foo.b.bb
Otherwise, leave the __init__.py files empty, and users will need to import the subpackage or submodule they want explicitly.
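To see why the layout in the question exposes only foo, here is a rough standard-library imitation of the discovery rule described above. It is a sketch under stated assumptions, not setuptools' actual code: discover_packages and make_tree are hypothetical helpers written for this illustration.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def discover_packages(where: str) -> List[str]:
    """Return dotted names of directories under `where` that contain __init__.py.

    Rough imitation of the discovery rule only; not setuptools' implementation.
    """
    found = []
    for current, dirs, _files in os.walk(where):
        kept = []
        for name in sorted(dirs):
            full = os.path.join(current, name)
            if not os.path.isfile(os.path.join(full, "__init__.py")):
                continue  # not a regular package: skip it and everything below
            found.append(os.path.relpath(full, where).replace(os.sep, "."))
            kept.append(name)
        dirs[:] = kept  # only descend into directories that are packages
    return found


def make_tree(root: Path, init_everywhere: bool) -> None:
    """Recreate the question's src/ layout, with or without __init__.py files."""
    for rel in ("foo", "foo/a", "foo/a/aa", "foo/b"):
        (root / rel).mkdir(parents=True)
        if init_everywhere or rel == "foo":
            (root / rel / "__init__.py").write_text("")


with tempfile.TemporaryDirectory() as tmp:
    make_tree(Path(tmp), init_everywhere=False)
    as_in_question = discover_packages(tmp)

with tempfile.TemporaryDirectory() as tmp:
    make_tree(Path(tmp), init_everywhere=True)
    with_init_files = discover_packages(tmp)

print(as_in_question)
print(sorted(with_init_files))
```

With only foo/__init__.py present, the walk stops at foo. Once every directory has an __init__.py, all four packages are reported.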

What is the cause of this Sphinx autodoc MockFinder error?

I am creating documentation with Sphinx. My folder structure looks as follows:
MyProject
├── mypackage
│   ├── __init__.py
│   ├── mycode.py
│   └── etc.
└── docs
    ├── build
    ├── make.bat
    ├── Makefile
    └── source
        ├── conf.py
        ├── index.rst
        ├── _static
        └── _templates
I begin by running make clean and make html in the docs directory. Next, to populate the documentation, I run sphinx-apidoc -o ./source ../mypackage, and all corresponding .rst files are generated as expected. Finally, I run make clean and make html once more to ensure a clean build, as suggested in the Sphinx-RTD-Tutorial. However, on this final build, I get the following output:
Running Sphinx v4.0.2
making output directory... done
[autosummary] generating autosummary for: index.rst, mypackage.rst, mypackage.mycode.rst
Extension error (sphinx.ext.autosummary):
Handler <function process_generate_options at 0x10678dee0> for event 'builder-inited' threw an exception (exception: list.remove(x): x not in list)
make: *** [html] Error 2
Removing the autosummary extension and just running autodoc with the same sequence of commands leads to a similar error:
Exception occurred:
File "/Users/myname/opt/anaconda3/envs/myenv/lib/python3.9/site-packages/sphinx/ext/autodoc/mock.py", line 151, in mock
sys.meta_path.remove(finder)
ValueError: list.remove(x): x not in list
Here is the source code method that the error comes from:
@contextlib.contextmanager
def mock(modnames: List[str]) -> Generator[None, None, None]:
    """Insert mock modules during context::

        with mock(['target.module.name']):
            # mock modules are enabled here
            ...
    """
    try:
        finder = MockFinder(modnames)
        sys.meta_path.insert(0, finder)
        yield
    finally:
        sys.meta_path.remove(finder)
        finder.invalidate_caches()
Does anyone know what might be raising this error or have an idea as to what is happening in this method? Could it have to do with my specification of sys.path in my conf.py file?
[conf.py]
sys.path.insert(0, os.path.abspath('../../mypackage'))
I was able to resolve this error using the autodoc_mock_imports config:
autodoc_mock_imports
This value contains a list of modules to be
mocked up. This is useful when some external dependencies are not met
at build time and break the building process. You may only specify the
root package of the dependencies themselves and omit the sub-modules:
autodoc_mock_imports = ["django"]
Will mock all imports under the django package.
New in version 1.3.
Changed in version 1.6: This config value only requires to declare the
top-level modules that should be mocked.
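For reference, a conf.py fragment using this setting might look like the sketch below. It rests on assumptions not confirmed in the thread: that the generated .rst files refer to mypackage.mycode (so the directory containing mypackage is what needs to be on sys.path), and that "django" stands in for whichever dependency is missing at build time.

```python
# docs/source/conf.py (fragment)
import os
import sys

# Assumption: the generated .rst files reference "mypackage.mycode", so the
# directory that CONTAINS mypackage (two levels up from docs/source) is added,
# rather than the mypackage directory itself.
sys.path.insert(0, os.path.abspath("../.."))

extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
]

# Hypothetical list: replace with the top-level names of the third-party
# packages that mypackage imports but that are unavailable at build time.
autodoc_mock_imports = ["django"]
```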

Import diagram/structure inside a python folder (clean-up code)

I just finished a medium-sized Python (3.6) project and I need to clean it up a bit.
I am not a software engineer, so I was not very careful about structuring the project during development. I now have several modules that are no longer imported by any other module, and modules that are imported by other .py files but are not actually needed.
So for example, I have
Project/
├── __init__.py
├── main.py
├── foo.py
│
├── tools/
│   ├── __init__.py
│   ├── tool1.py
│   ├── tool2.py
│   └── tool3.py
│
├── math/
│   ├── __init__.py
│   ├── math1.py
│   └── math2.py
└── graph/
    ├── __init__.py
    ├── graph1.py
    └── graph2.py
and inside
main.py
from math import math1
from tools import tool2
graph1.py
from math import math1
from tools import tool1, tool2
foo.py
from tools import tool3
If I could see at a glance that no module imports graph2 or math2, I could delete them, or at least mark them as candidates for deletion (and restructure the project in a better way).
Or I may think to delete tool3 because I know I don't need foo anymore.
Is there an easy way to visualize all the "connections" (which module imports which) in a diagram or some other kind of structured data/visualization manner?
You can use Python to do the work for you:
Place a Python file with the following code into the same directory as your Project directory.
from pathlib import Path
# list all the modules you want to check:
modules = ["tool1", "tool2", "tool3", "math1", "math2", "graph1", "graph2"]
# find all the .py files within your Project directory (also searches subdirectories):
p = Path('./Project')
file_list = list(p.glob('**/*.py'))
# check which modules are mentioned in each .py file:
for file in file_list:
    with open(file, "r") as f:
        print('*'*10, file, ':')
        file_as_string = f.read()
        for module in modules:
            if module in file_as_string:
                print(module)
Running this will give you an output looking something like this:
********** Project\main.py :
tool1
tool2
graph1
********** Project\foo.py :
tool2
********** Project\math\math1.py :
tool2
math2
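The script above matches module names anywhere in the file text, so a name that appears in a comment or a string also counts as a hit. A possible variation, sketched here with the standard-library ast module, looks only at real import statements. The helper names imported_names, import_graph, and never_imported are hypothetical, and the "unused" check is a simple heuristic based on the last part of each imported name.

```python
import ast
from pathlib import Path
from typing import Dict, Set


def imported_names(source: str) -> Set[str]:
    """Collect the names that the import statements in `source` refer to."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import a.b  ->  "a.b"
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # from tools import tool1  ->  "tools" and "tools.tool1"
            prefix = node.module or ""
            if prefix:
                names.add(prefix)
            for alias in node.names:
                names.add(f"{prefix}.{alias.name}" if prefix else alias.name)
    return names


def import_graph(project_dir: str) -> Dict[str, Set[str]]:
    """Map each .py file (path relative to project_dir) to the names it imports."""
    root = Path(project_dir)
    return {
        path.relative_to(root).as_posix(): imported_names(
            path.read_text(encoding="utf-8")
        )
        for path in sorted(root.rglob("*.py"))
    }


def never_imported(graph: Dict[str, Set[str]]) -> Set[str]:
    """Return files whose module name is not the last part of any imported name."""
    used = {name.rsplit(".", 1)[-1] for names in graph.values() for name in names}
    return {
        file
        for file in graph
        if Path(file).stem not in used and Path(file).stem != "__init__"
    }


example = "from math import math1\nfrom tools import tool1, tool2\n"
print(sorted(imported_names(example)))
```

Calling never_imported(import_graph("./Project")) would then list the deletion candidates. Entry points such as main.py show up in that list too, since nothing imports them, so the result still needs a manual look.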
If you're on a Unix-like platform (such as macOS), you can use grep to find all files containing specific text. For example, to search your Project directory for files containing "import math1", run grep -rnw '/path/to/Project/' -e 'import math1'. If there are no results, no file imports the module that way and it is a candidate for removal. This whole process can easily be automated with a Python or shell script!
Maybe this project can help you with visualizing your dependency graph. After a quick google search, it looks like you're not the first person to try to do this.

Failed to import python module from different directory

I have this code structure in python3:
- datalake
    __init__.py
    utils
        __init__.py
        utils.py
    lambdas
        __init__.py
        my-lambdas.py
- tests
    __init__.py
    demo.py
All __init__.py files are empty.
My problem: how can I import the datalake module from tests/demo.py?
I tried from datalake.utils import utils in demo.py but when I run python tests/demo.py from command line, I get this error ModuleNotFoundError: No module named 'datalake'.
If I use this code:
from ..datalake.utils import utils
I will get error ValueError: attempted relative import beyond top-level package.
I also tried to import the utils module from my-lambdas.py, which also failed. The code in my-lambdas.py is from datalake.utils import utils, but I get a ModuleNotFoundError: No module named 'datalake' error when I run python datalake/lambdas/my-lambdas.py from the command line.
How can I import the module?
When you run a command like python tests/demo.py, the folder you are in does not get added to sys.path; the script's folder (tests/) does. So a top-level import like import datalake will fail. To get around this, you can run your tests as a module:
Python 2:
python -m tests/demo
Python 3:
python -m tests.demo
and any datalake imports in demo.py will work.
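The difference between the two invocations can be checked with a throwaway project. This is only a sketch: it recreates the datalake and tests folders from the question in a temporary directory and runs demo.py both ways with the current interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    for package in ("datalake", "tests"):
        (project / package).mkdir()
        (project / package / "__init__.py").write_text("")
    (project / "tests" / "demo.py").write_text(
        "import sys\n"
        "import datalake\n"
        "print('imported datalake; sys.path[0] =', repr(sys.path[0]))\n"
    )

    # Run as a script: sys.path[0] is the tests/ folder, so datalake is not found.
    as_script = subprocess.run(
        [sys.executable, "tests/demo.py"],
        cwd=project, capture_output=True, text=True,
    )
    # Run as a module: the current directory is put on sys.path instead.
    as_module = subprocess.run(
        [sys.executable, "-m", "tests.demo"],
        cwd=project, capture_output=True, text=True,
    )

print("script exit code:", as_script.returncode)
print(as_script.stderr.strip())
print("module exit code:", as_module.returncode)
print(as_module.stdout.strip())
```

The script run should end with ModuleNotFoundError, while the module run imports datalake and prints the project folder as sys.path[0].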
It sounds like what you really want is a folder of tests kept separate from your main application, and a way to run them. For this I recommend pytest; for your case, you can read Tests Outside Application Code for how to do it. TL;DR: run your tests from your top-level project folder with python -m pytest and it will work.
First of all, my-lambdas.py is not importable with the import statement, as hyphens are not valid in Python identifiers. Try to follow PEP 8's naming conventions and use a name such as mylambdas.py.
Otherwise the package structure looks good, and it should be importable as long as you are at the level above datalake/, e.g., if you were in the directory myproject/ below:
myproject
├── datalake
│   ├── __init__.py
│   ├── utils
│   │   ├── __init__.py
│   │   └── utils.py
│   └── lambdas
│       ├── __init__.py
│       └── mylambdas.py
└── tests
    ├── __init__.py
    └── demo.py
Then this should work:
~/myproject$ python -c 'from datalake import utils'
Otherwise, setting the environment variable PYTHONPATH to the path above datalake/ or modifying sys.path are both ways of changing where Python can import from. See the official tutorial on modules for more information.
Also some general advice: I've found it useful to stick with simple modules rather than packages (directories) until there is a need to expand. Then you can change foo.py into a foo/ directory with an __init__.py file and import foo will work as before, although you may need to add some imports to the __init__.py to maintain API compatibility. This would leave you with a simpler structure:
myproject
├── datalake
│   ├── __init__.py
│   ├── utils.py
│   └── lambdas.py
└── tests
    ├── __init__.py
    └── demo.py
You can add the module directory into your sys.path:
import sys
sys.path.append("your/own/modules/folder") # like sys.path.append("../tests")
However, this only applies to the current process: the added path is not permanent and is gone once the program finishes executing.
Another option is to import the file directly with a plain import, such as import util, instead of using from.
You can also try running:
python -m datalake.lambdas.my-lambdas
See: https://docs.python.org/3.7/using/cmdline.html#cmdoption-m

How to use a packed python package without installing it

I have a python 3 package with the following structure:
.
├── package
│ └── bin
└── main_module
│ └── lib
│ ├── __init__.py
│ ├── module1.py
│ ├── module2.py
│ └── module3.py
│ └── test
│ ├── test1.py
│ ├── test2.py
│ └── test3.py
│ └── setup.py
Usually, one runs $ python3 setup.py install and all is good. However, I want to use this package on a cluster server, where I don't have write permissions for /usr/lib/. The following solutions came to my mind.
1. Somehow install the package locally in my user folder.
2. Modify the package such that it runs without installation.
3. Ask the IT guys to install the package for me.
I want to avoid 3., so my question is whether 1. is possible and if not, how I have to modify the code (particularly the imports) in order to be able to use the package without installation. I have been reading about relative imports in python all morning and I am now even more confused than before. I added __init__.py to package and bin and from what I read I assumed it has to be from package.lib import module1, but I always get ImportError: No module named lib.
For Python to be able to find your modules, you need to add the path of your package to the sys.path list. In general, you can use the following snippet:
from sys import path as syspath
from os import path as ospath
syspath.append(ospath.join(ospath.expanduser("~"), 'package_path_from_home'))
os.path.expanduser("~") will give you the path of the home directory and you can join it with the path of your package using os.path.join and then append the final path to sys.path.
If package is in your home directory, you can just add the following at the top of the Python code that is supposed to use this package:
syspath.append(ospath.join(ospath.expanduser("~"), 'package'))
Also make sure that you have an __init__.py in all your package directories.
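If the snippet ends up in several scripts, a small helper can avoid adding the same entry twice. This is a minimal sketch: add_import_root is a hypothetical name, and "~/package" is just the example location used above.

```python
import os
import sys


def add_import_root(directory: str) -> str:
    """Append `directory` to sys.path once and return its absolute path."""
    root = os.path.abspath(os.path.expanduser(directory))
    if root not in sys.path:
        sys.path.append(root)
    return root


# "~/package" is the hypothetical location used in the answer above.
package_root = add_import_root("~/package")
add_import_root("~/package")  # a second call changes nothing
print(sys.path.count(package_root))
```

Appending keeps the standard library and installed packages ahead of the new directory. Using sys.path.insert(0, root) instead would give the local copy priority.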
I had the same problem. I used the first approach:
install the package locally in my user folder by running
python setup.py install --user
This will install your module under ~/.local/lib/pythonX.Y/site-packages/ (the per-user site-packages directory on Linux).
Just add the path of your 'package' to the PYTHONPATH environment variable. This will get rid of the error you are getting.
OR
add the path of the package programmatically with sys.path.append().
You can add this to the "main file" of the package:
import sys, os
sys.path.append(os.path.dirname(__file__) + "/..")
You can find the "main file" by looking for this pattern:
if __name__ == "__main__":
    some_function()
