Python - converting imports to module format

Python - converting imports to module format - python

I have a Python package that is structured using imports as if a script is run from the root of the package. So the structure looks something like this:
foo/
script.py
bar/
import_a.py
bar_two/
import_b.py
And the imports within the lower level packages, for example in import_b.py are structured like
from bar import import_a
Obviously this breaks if you want to re-structure this as a package instead of running scripts from the root directory. I need to re-do all the import statements, and add __init__.py files to each subdirectory (this is in python 2.7 as I have to use this with a pre-existing ros kinetic repo, whereas the initial repo was in python 3).
My question is twofold: is there an easy method / script to convert the imports, and is there some way of auto-generating the corresponding __init__.py files?

From the perspective of Python 2.7 this is not even a package since it misses all the __init__.py files (and namespace packages didn't exist back then). So there's not way around creating these __init__.py files. But that should be relatively easy. You can use the following Python snippet from the root of the (to-be-)package:
import os
for item in os.walk('.'):
path = item[0]
with open(os.path.join(path, '__init__.py'), 'w') as fh:
pass
For the modification of import statements there's a way around it. When importing the package you can add the file path of the root __init__.py script to sys.path. That is you can add the following code to foo/__init__.py at the very top:
import os
import sys
sys.path.append(os.path.split(__file__)[0])
That way all imports can be resolved by searching that directory.

Related

Struggling with python's import mechanism

I am an experienced java enterprise developer but very new to python enterprise development shop. I am currently, struggling to understand why some imports work while others don't.
Some background: Our dev team recently upgraded python from 3.6 to 3.10.5 and following is our package structure
src/
bunch of files (dockerfile, Pipfile, requrirements.txt, shell scripts, etc)
package/
__init__.py
moduleA.py
subpackage1/
__init__.py
moduleX.py
moduleY.py
subpackage2/
__init__.py
moduleZ.py
tests/
__init__.py
test1.py
Now, inside the moduleA.py, I am trying to import subpackage2/moduleZ.py like so
from .subpackage2 import moduleZ
But, I get the error saying
ImportError: attempted relative import with no known parent package
The funny thing is that if I move moduleA.py out of package/ and into src/ then it is able to find everything. I am not sure why is this the case.
I run the moduleA.py by executiong python package/moduleA.py.
Now, I read that maybe there is a problem becasue you have you give a -m parameter if running a module as a script (or something on those lines). But, if I do that, I get the following error:
ModuleNotFoundError: No module names 'package/moduleA.py'
I even try running package1/moduleA and remove the .py, but that does not work either. I can understand why as I technically never installed it ?
All of this happened because apparently, the tests broke and to make it work they added relative imports. They changed the import from "from subpackage2 import moduleZ" to "from .subpackage2 import moduleZ" and the tests started working, but the app started failing.
Any understanding I can get would be much appreciated.

The -m parameter is used with the import name, not the path. So you'd use python3 -m package.moduleA (with . instead of /, and no .py), not python3 -m package/moduleA.py.
That said, it only works if package.moduleA is locatable from one of the roots in sys.path. Shy of installing the package, the simplest way to make it work is to ensure your working directory is src (so package exists in the working directory):
$ cd path/to/src
$ python3 -m package.moduleA
and, with your existing setup, if moduleA.py includes a from .subpackage2 import moduleZ, the import should work; Python knows package.moduleA is a module within package, so it can use a relative import to look for a sibling package to moduleA named subpackage2, and then inside it it can find moduleZ.
Obviously, this is brittle (it only works if you cd to the src root directory before running Python, or hack the path to src in PYTHONPATH, which is terrible hack if the code ever has to be run by anyone else); ideally you make this an installable package, install it (in global site-packages, user site-packages, or within a virtual environment created with the built-in venv module or the third-party virtualenv module), and then your working directory no longer matters (since the site-packages will be part of your sys.path automatically). For simple testing, as long as the working directory is correct (not sure what it was for you), and you use -m correctly (you were using it incorrectly), relative imports will work, but it's not the long term solution.

So first of all - the root importing directory is the directory from which you're running the main script.
This directory by default is the root for all imports from all scripts.
So if you're executing script from directory src you can do such imports:
from package.moduleA import *
from package.subpackage1.moduleX import *
But now in files moduleA and moduleX you need to make imports based on root folder. If you want to import something from module moduleY inside moduleX you need to do:
# this is inside moduleX
from package.subpackage1.moduleY import *
This is because python is looking for modules in specific locations.
First location is your root directory - directory from which you execute your main script.
Second location is directory with modules installed by PIP.
You can check all directories using following:
import sys
for p in sys.path:
print(p)
Now to solve your problem there are couple solutions.
The fast one but IMHO not the best one is to add all paths with submodules to sys.path - list variable with all directories where python is looking for modules.
new_path = "/path/to/application/app/folder/src/package/subpackage1"
if new_path not in sys.path:
sys.path.append(new_path)
Another solution is to use full path for imports in all package modules:
from package.subpackage1.moduleX import *
I think in your case it will be the correct solution.
You can also combine 2 solutions.
First add folders with subpackages to sys.path and use subpackage folders as a root folders for imports. But it's good solution only if you have complex submodule structure. And it's not the best solution if in future you will need to deploy your package as a wheel or share between multiple projects.

Use init.py to modify sys path is a good idea?

I want to ask you something that came to my mind doing some stuff.
I have the following structure:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
I class2.py I want to import class1 to use it. Obviously, I cannot use
from src.class1 import Class1
cause it will produce an error. A workaround that works to me is to define the following in the __init__.py inside folder2:
import sys
sys.path.append('src')
My question is if this option is valid and a good idea to use or maybe there are better solutions.
Another question. Imagine that the project structure is:
src
- __init__.py
- class1.py
+ folder2
- __init__.py
- class2.py
+ errorsFolder
- __init__.py
- errors.py
In class1:
from errorsFolder.errors import Errors
this works fine. But if I try to do in class2 which is at the same level than errorsFolder:
from src.errorsFolder.errors import Errors
It fails (ImportError: No module named src.errorsFolder.errors)
Thank you in advance!

Despite the fact that it's slightly shocking to have to import a "parent" module in your package, your workaround depends on the current directory you're running your application, which is bad.
import sys
sys.path.append('src')
should be
import sys,os
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)),os.pardir))
to add the parent directory of the directory of your current module, regardless of the current directory you're running your application from (your module may be imported by several applications, which don't all require to be run in the same directory)

One correct way to solve this is to set the environment variable PYTHONPATH to the path which contains src. Then import src.class1 will always work, regardless of which directory you start in.

from ..class1 import Class1 should work (at least it does here in a similar layout, using python 2.7.x).
As a general rule: messing with sys.path is usually a very bad idea, specially if this allows a same module to be imported from two different paths (which would be the case with your files layout).
Also, you may want to think twice about your current layout. Python is not Java and doesn't require (nor even encourage) a "one class per module" approach. If both classes need to work together, they might be better in a same module, or at least in modules at the same level in the package tree (note that you can use the top-level package's __init__ as a facade to provide direct access to objects defined in submodules / subpackages). NB : I'm not saying that your current layout is necessarily wrong, just that it might not be the simplest one.

There is also a solution that does not involve the use of the environment variable PYTHONPATH.
In src/ create setup.py with these contents:
from setuptools import setup
setup()
Now you can do pip install -e /path/to/src and import to your hearts content.
-e, --editable <path/url> Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.

No, Its not good. Python takes modules in two ways:
Python looks for its modules and packages in $PYTHONPATH.
Refer: https://docs.python.org/2/using/cmdline.html#envvar-PYTHONPATH
To find out what is included in $PYTHONPATH, run the following code in python (3):
import sys
print(sys.path)
all folder containing init.py are marked as python package.(if they are subdirectories under PYTHONPATH)
By help of these two ways you can full-fill the need to create a python project.

Correct Package Importing in a Pythonic Library Design?

I have a python project structured like this:
repo_dir/
----project_package/
--------__init__.py
--------process.py
--------config.py
----tests/
--------test_process.py
__init__.py is empty
config.py looks like this:
name = 'brian'
USAGE
I use the library by running python process.py from the project/project/ directory, or by specifying the python file path absolutely. I'm running Python 2.7 on Amazon EC2 Linux.
When process.py looks like below, everything works fine and process.py prints brian.
import config
print config.name
When process.py looks like below, I get the error ImportError: No module named project.config.
import project.config
print config.name
When process.py looks like below, I get the error ImportError: No module named project. This makes sense as the same behavior from the previous example should be expected.
from project import config
print config.name
If I add these lines to process.py to include the library root in sys.path, all configurations above, work fine.
import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
MY CONFUSION
Many resources suggest setting up python libraries to import modules using project.module_name, but it doesn't seem like sys.path appending is standard, and seems weird that I need it. I can see that the sys.path append added my library root as a path in sys, but I thought that's what the __init__.py in my library root was supposed to do. What gives? What am I missing? I know Python importing creates lots of headaches so I've tried to simplify this as much as possible to wrap my head around it. I'm going crazy and it's Friday before a holiday. I'm bummed. Please help!!
QUESTIONS
How should I set up my libraries? How should I import packages? Where should I have __init__.py files? Do I need to append my library root to sys.path in every project? Why is this so confusing?

Your project setup is alright. I renamed the directories just for clarity
in this example, but the structure is the same as yours:
repo_dir/
project_package/
__init__.py
process.py
config.py
# Declare your project in a setup.py file, so that
# it will be installable, both by users and by you.
setup.py
When you have a module that wants to import from another module in
the same project, the best approach is to use relative imports. For example:
# In process.py
from .config import name
...
While working on the code on your dev box, do your work in a Python virtualenv,
and pip install your project in "editable" mode.
# From the root of your repo:
pip install -e .
With that approach, you'll never need to muck around with sys.path -- which
is almost always the wrong approach.

I think the problem is how you're running your script. If you want the script to be living in a package (the inner project folder), you should run it with python -m project.process, rather than by filename. Then you can make absolute or explicit relative imports to get config from process.
An absolute import would be from project import config or import project.config.
An explicit relative import would be from . import config.
Python 2 also allows implicit relative imports, but they're a really bad misfeature that you should never use. With implicit relative imports, internal package modules can shadow top-level modules. For instance, a project/json.py file would hide the standard library's json module from all the other modules in the package. You can tell Python you want to forbid implicit relative imports by putting from __future__ import absolute_import at the top of the file. It's the standard behavior in Python 3.

python: import problems with directory structure that keeps scripts and modules separate

I have the following directory structure:
app/
bin/
script1.py
script2.py
lib/
module1/
__init__.py
module1a.py
module1b.py
__init__.py
module2.py
Dockerfile
My problem is that I want to execute script1.py and script2.py, but inside those scripts, I want to import the modules in lib/.
I run my scripts from the root app/ directory (i.e. adjacent to Dockerfile) by simply executing python bin/script1.py. When I import modules into my scripts using from lib.module1 import module1a, I get ImportError: No module named lib.module1. When I try to import using relative imports, such as from ..lib.module1 import module1a, I get ValueError: Attempted relative import in non-package.
When I simply fire up the interpreter and run import lib.module1 or something, I have no issues.
How can I get this to work?

In general, you need __init__.py under app and bin, then you can do a relative import, but that expects a package
If you would structure your python code as python package (egg/wheel) then you could also define an entry point, that would become your /bin/ file post install.
here is an example of a package - https://python-packaging.readthedocs.io/en/latest/minimal.html
and this blog explains entry points quite well - https://chriswarrick.com/blog/2014/09/15/python-apps-the-right-way-entry_points-and-scripts/
if so, that way you could just do python setup.py install on your package and then have those entry points available within your PATH, as part of that you would start to structure your code in a way that would not create import issues.

You can add to the Python path at runtime in script1.py:
import sys
sys.path.insert(0, '/path/to/your/app/')
import lib.module1.module1a

you have to add current dir to python path.
use export in terminal
or sys.path.insert in your python script both are ok

Better approach to use script inside nested directory PYTHONPATH

Sorry for asking my own question 2nd time, but i am totally stuck in import file in python.
I have a directory structure below:
|--test/foo.py
|--library #This is my PYTHONPATH
|--|--script1.py
|--|--library_1
|--|--|--script2.py
|--|--library_2
|--|--library_3
I am accessing library/library_1/script2.py from test/foo.py.
Here i am confused about what is the better approach. Generally all library folders or utility functions should be added to pythonpath.
This is a folder structure i am maintaining to differentiate utility functions and test scripts.
I tried putting __init__.py in library and library1 & then imported like from library1 import script2, but getting error as No module named script.
I have tried appending that path to system path as well.
Working: if i add another pythonpath like path/to/library/libray_1/. So should i do this for all folders which are inside library folder to make it work ?

Here's what you need to do:
|--test/foo.py
|--library #This is my PYTHONPATH
|--__init__.py
|--|--script1.py
|--|--library_1
|--|--|--__init__.py
|--|--|--script2.py
|--|--library_2
|--|--|--__init__.py
|--|--library_3
|--|--|--__init__.py
And inside the first __init__.py below library you need to do:
import library1
import library2
import script
Then, if library is your python path, you can do this within test/foo.py with no errors:
import library
library.library1.bar()
library.script.foo()

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python - converting imports to module format - python

Related

Struggling with python's import mechanism

Use init.py to modify sys path is a good idea?

Correct Package Importing in a Pythonic Library Design?

python: import problems with directory structure that keeps scripts and modules separate

Better approach to use script inside nested directory PYTHONPATH

Categories

Resources

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python - converting imports to module format - python

Related

Struggling with python's import mechanism

Use __init__.py to modify sys path is a good idea?

Correct Package Importing in a Pythonic Library Design?

python: import problems with directory structure that keeps scripts and modules separate

Better approach to use script inside nested directory PYTHONPATH

Categories

Resources

Use init.py to modify sys path is a good idea?