Git submodule's local import error - Python

I'm working on a Python project (Project A) that uses another project from GitHub (Project B). I'm not a Git expert, so after some quick research I found out that I should use Project B as a git submodule.
So I cd'd into the Project A root and ran the following:
git submodule add project_B
git submodule init
git submodule update
Now my project structure looks like this:
project_A_root/
    main.py
    ProjectB/
        do_something.py
        util.py
In main.py, I've imported a function from do_something.py.
main.py
from ProjectB.do_something import foo
However, do_something.py itself imports a function from util.py, and that's where the problem occurs.
do_something.py
from util import bar
Project B assumes that its own directory is the project root, so do_something.py imports from util.py without specifying the package, and I'm getting an error:
ImportError: cannot import name 'bar' from 'util'
Instead, it should be imported like this:
from ProjectB.util import bar
I'm not sure what is the best way to handle this.
I've fixed the imports in the submodule manually, but I can't push those changes because that's not how submodules work, so anyone who clones Project A would have to fix the imports manually too.
Any help is welcome.

Try this at the top of main.py:
import sys

sys.path.append("ProjectB")

# ... your old code ...

Git is just a version control system; it can't solve this for you.
One possible workaround is to patch the sys.path variable by adding the ProjectB directory, but that is a hack.
The best you can do is to use the Python packaging system: turn ProjectB into a pip package and install it with pip like any other dependency (a minimal sketch follows the links below).
Useful links:
https://packaging.python.org/
https://python-packaging.readthedocs.io/en/latest/index.html
https://docs.python.org/3/distutils/setupscript.html
https://python-poetry.org/
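As a rough sketch of that approach (assuming Project B's repository root contains do_something.py and util.py directly, so they can ship as top-level modules; the name and version below are hypothetical), a minimal setup.py inside Project B could look like this:
# setup.py at the root of Project B (hypothetical metadata)
from setuptools import setup

setup(
    name="project-b",                     # distribution name used by pip
    version="0.1.0",
    py_modules=["do_something", "util"],  # the modules Project B ships
)
After installing it (for example with pip install ./ProjectB from the Project A root, or pip install git+<Project B's URL>), do_something.py can keep its from util import bar line, and Project A would import from do_something import foo without touching sys.path.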

Related

Struggling with python's import mechanism

I am an experienced Java enterprise developer but very new to a Python enterprise development shop. I am currently struggling to understand why some imports work while others don't.
Some background: our dev team recently upgraded Python from 3.6 to 3.10.5, and the following is our package structure:
src/
    bunch of files (Dockerfile, Pipfile, requirements.txt, shell scripts, etc.)
    package/
        __init__.py
        moduleA.py
        subpackage1/
            __init__.py
            moduleX.py
            moduleY.py
        subpackage2/
            __init__.py
            moduleZ.py
    tests/
        __init__.py
        test1.py
Now, inside moduleA.py, I am trying to import subpackage2/moduleZ.py like so:
from .subpackage2 import moduleZ
But I get the error saying:
ImportError: attempted relative import with no known parent package
The funny thing is that if I move moduleA.py out of package/ and into src/, then it is able to find everything. I am not sure why this is the case.
I run moduleA.py by executing python package/moduleA.py.
Now, I read that maybe there is a problem because you have to give a -m parameter when running a module as a script (or something along those lines). But if I do that, I get the following error:
ModuleNotFoundError: No module named 'package/moduleA.py'
I even tried running package/moduleA without the .py, but that does not work either. I can understand why, as I technically never installed it?
All of this happened because, apparently, the tests broke, and to make them pass someone added relative imports: they changed "from subpackage2 import moduleZ" to "from .subpackage2 import moduleZ", and the tests started working but the app started failing.
Any understanding I can get would be much appreciated.
The -m parameter is used with the import name, not the path. So you'd use python3 -m package.moduleA (with . instead of /, and no .py), not python3 -m package/moduleA.py.
That said, it only works if package.moduleA is locatable from one of the roots in sys.path. Shy of installing the package, the simplest way to make it work is to ensure your working directory is src (so package exists in the working directory):
$ cd path/to/src
$ python3 -m package.moduleA
and, with your existing setup, if moduleA.py includes a from .subpackage2 import moduleZ, the import should work; Python knows package.moduleA is a module within package, so it can use a relative import to look for a sibling package of moduleA named subpackage2, and then inside it find moduleZ.
Obviously, this is brittle (it only works if you cd to the src root directory before running Python, or hack the path to src into PYTHONPATH, which is a terrible hack if the code ever has to be run by anyone else); ideally you make this an installable package, install it (in the global site-packages, the user site-packages, or within a virtual environment created with the built-in venv module or the third-party virtualenv module), and then your working directory no longer matters (since site-packages is part of your sys.path automatically). For simple testing, as long as the working directory is correct (not sure what it was for you) and you use -m correctly (you were using it incorrectly), relative imports will work, but it's not the long-term solution.
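If you want to see the difference between the two invocations for yourself, a small throwaway diagnostic at the top of moduleA.py (purely illustrative; remove it afterwards) shows what each one sets up:
# temporary diagnostic at the top of moduleA.py
import sys

print("__name__    =", __name__)      # '__main__' in both cases
print("__package__ =", __package__)   # None/'' when run by file path, 'package' when run with -m
print("sys.path[0] =", sys.path[0])   # the script's directory (src/package) when run by path,
                                      # the current working directory (src) when run with -m
Relative imports in moduleA.py only have a chance of working in the second case, because Python then knows which package the module belongs to.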
So, first of all: the root directory for imports is the directory containing the main script you execute.
By default, this directory is the root for all imports from all scripts.
So if the script you execute lives in the src directory, you can do imports such as:
from package.moduleA import *
from package.subpackage1.moduleX import *
But now, inside moduleA and moduleX, you need to write imports relative to that root folder. If you want to import something from moduleY inside moduleX, you need to do:
# this is inside moduleX
from package.subpackage1.moduleY import *
This is because Python looks for modules in specific locations.
The first location is your root directory, i.e. the directory of your main script.
The second location is the directory with modules installed by pip.
You can check all of these directories with the following:
import sys

for p in sys.path:
    print(p)
Now, to solve your problem, there are a couple of solutions.
The quick one, but IMHO not the best one, is to add all the subpackage paths to sys.path (the list of directories where Python looks for modules):
import sys

new_path = "/path/to/application/app/folder/src/package/subpackage1"
if new_path not in sys.path:
    sys.path.append(new_path)
Another solution is to use full paths for imports in all package modules:
from package.subpackage1.moduleX import *
I think in your case this is the correct solution.
You can also combine the two solutions: first add the subpackage folders to sys.path, then use those folders as roots for imports (see the sketch below). But this is only a good option if you have a complex subpackage structure, and it's not the best choice if you will later need to deploy your package as a wheel or share it between multiple projects.
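A small sketch of that combined approach (assuming it sits in a script at the src root; the folder names follow the structure above, everything else is illustrative):
import os
import sys

# add the subpackage folders to sys.path so their modules become importable as top-level names
SRC_ROOT = os.path.dirname(os.path.abspath(__file__))
for sub in ("package/subpackage1", "package/subpackage2"):
    path = os.path.join(SRC_ROOT, sub)
    if path not in sys.path:
        sys.path.append(path)

import moduleX  # resolved via package/subpackage1
import moduleZ  # resolved via package/subpackage2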

Cannot import from a git submodule

I'm having difficulty making a git submodule work.
I have a project, ProjectA, that is basically a mainA.py file and a subfolder with library files.
mainA.py contains a MainClass that is what should be called, and Libraries just contains scripts and classes for computations:
ProjectA/
    Libraries/
        __init__.py
        library1.py
        library2.py
    __init__.py
    mainA.py
In mainA.py I just do something like:
# content of mainA.py
from Libraries.library1 import ClassA, ClassB

class MainClass:
    # do stuff
    pass

if __name__ == '__main__':
    MainClass()
This works just fine, but I now have a ProjectB that needs to use the MainClass from ProjectA, so I decided to add ProjectA as a git submodule of ProjectB:
git submodule add ProjectA_git_url
ProjectB/
    ProjectA/
    mainB.py
    .gitmodules
However, now in mainB.py I'm trying to import MainClass from ProjectA:
# content of mainB.py
from ProjectA.mainA import MainClass
and I get:
ModuleNotFoundError: No module named 'Libraries'
I think this happens because Libraries no longer hangs off the root directory but sits inside the submodule ProjectA, so when mainA.py does:
from Libraries.library1 import ClassA, ClassB
the interpreter cannot find Libraries.
If I change mainA.py to do:
from ProjectA.Libraries.library1 import ClassA, ClassB
Then it works, but of course I don't want to change anything inside ProjectA; it is just a project that should work either standalone or as a submodule of another project.
What am I doing wrong? Is there a way to import MainClass from mainA.py when ProjectA is a submodule?
git is a development tool; you use it during development but not deployment. pip is a deployment tool; during development you use it to install necessary libraries; during deployment your users use it to install your package with dependencies.
Use submodules when you need something from a remote repository in your development environment. For example, if said remote repository contains Makefile(s) or other non-python files that you need and that usually aren't installed with pip.
That is, in your case you shouldn't make ProjectA a submodule of ProjectB, you should make ProjectA a Python dependency. Create packages for ProjectA and ProjectB and install them separately but allow ProjectB to import from ProjectA.
Dependencies are declared in setup.py or requirements.txt.
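For example, if ProjectA lives in its own repository, a setup.py for ProjectB can declare it as a dependency with a direct URL reference (everything below is a sketch; the names and the git URL are placeholders):
# sketch of ProjectB's setup.py; names and the git URL are placeholders
from setuptools import setup

setup(
    name="project-b",
    version="0.1.0",
    py_modules=["mainB"],
    install_requires=[
        # pip will fetch and install ProjectA when ProjectB is installed
        "ProjectA @ git+https://example.com/you/ProjectA.git",
    ],
)
A plain requirements.txt line with the same "name @ git+URL" form works as well if you don't want to package ProjectB itself.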
That said, if you insist on using submodules: either you have to manipulate sys.path yourself or you use a relative import in mainA.py:
from .Libraries.library1 import ClassA, ClassB
Add the submodule path to the system path in mainB.py.
Say your submodule path is "../ProjectB/ProjectA"; then
import sys

sys.path.append("../ProjectB/ProjectA")
resolves the issue.
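A slightly more robust variant (a sketch, assuming mainB.py sits in the ProjectB root next to the ProjectA submodule) derives the path from __file__, so it doesn't depend on the current working directory:
import os
import sys

# directory of mainB.py, then the ProjectA submodule inside it
SUBMODULE_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "ProjectA")
if SUBMODULE_DIR not in sys.path:
    sys.path.append(SUBMODULE_DIR)

from ProjectA.mainA import MainClass  # Libraries inside ProjectA is now importable too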

Python folder structure for project directory and easy import

My team has a folder of several small projects in Python 3. Amongst them we have a utility folder with several utility functions that are used throughout the projects, but the way we import it is very awkward. This is the structure we use:
temp_projects/
    util/
        storage.py
        geometry.py
    project1/
        project1.py
    project2/
        project2.py
The problem is that the import in the projects looks terrible:
import os
import sys

sys.path.insert(1, os.path.join(sys.path[0], '..'))
import util.geometry

util.geometry.rotate_coordinates(....)
Also, PyCharm and other tools have trouble understanding it and supplying completion.
Is there some neater way to do that?
Edit:
All of the projects and utils are very much work-in-progress and are modified often, so I'm looking for something as flexible and convenient as possible.
The PYTHONPATH environment variable might be the way to go. Just set it to the projects folder location:
PYTHONPATH=/somepath/temp_projects
and you'll be able to use util as follows:
import util.geometry
util.geometry.rotate_coordinates(....)
Also this will be recognized by PyCharm automatically.
I believe the correct route is completely different from what you are doing right now. Each project should be stored in a separate Git repository, and shared modules should be added as git submodules. Once those projects get larger and more complex (and they probably will), it will be easier to manage them separately.
In short
The project structure should be:
Project_1/
|- utils/ <submodule>
|  |- storage.py
|  |- geometry.py
|- main.py
Project_2/
|- utils/ <submodule>
|  |- storage.py
|  |- geometry.py
|- main.py
Working with submodules
### Adding a submodule to an existing git directory
git submodule add <git#github ...> <optional path/to/submodule>
### Pulling latest version from master branch for all submodules
git submodule update --recursive --remote
### Removing submodule from project
# Remove the submodule entry from .git/config
git submodule deinit -f path/to/submodule
# Remove the submodule directory from the project's .git/modules directory
rm -rf .git/modules/path/to/submodule
# Remove the entry in .gitmodules and remove the submodule directory located at path/to/submodule
git rm -f path/to/submodule
Further reading https://git-scm.com/book/en/v2/Git-Tools-Submodules
According to Importing files from different folder, adding an __init__.py in the util folder will cause Python to treat it as a package. Another thing you can do is use import util.geometry as geometry and then call geometry.rotate_coordinates(...), which also improves readability.
If you create a setup.py file for your util module, you can install it with pip, which will handle everything for you. After installation you can import it anywhere on the system:
import util
To install it with pip:
# setup.py is in the current folder
sudo -H pip3 install .
Or, if the util module itself is still under development, install it with the -e (editable) option; then the installation automatically picks up changes you make to the code:
sudo -H pip3 install -e .
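A minimal setup.py for this might look like the sketch below (placed in temp_projects next to the util folder, and assuming util also gets an __init__.py; the name and version are illustrative):
# temp_projects/setup.py (sketch)
from setuptools import setup

setup(
    name="util",
    version="0.1.0",
    packages=["util"],   # picks up util/__init__.py, storage.py and geometry.py
)
With the editable install above, import util.geometry then works from any of the projects without sys.path tricks, and PyCharm's completion works too.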
For project administration I recommend using git, as @Michael liv. suggests, especially if you work in a team.
Use importlib.
import importlib.util

def module_from_file(module_name, file_path):
    # load a module directly from a file path
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

geometry = module_from_file("geometry", "../temp_projects/util/geometry.py")
geometry.rotate_coordinates(...)
Another option (I used this approach in my project):
Let's assume that the projects are run from the project1.py and project2.py files respectively.
At the top of these files you can add the following imports and actions:
import sys
import os

# make the parent folder (temp_projects) importable
sys.path.append(os.path.join(os.getcwd(), os.pardir))

import your_other_modules
your_other_modules.py will then contain the following for importing util:
from util import storage
from util import geometry
# or from project2 import project2, etc.
Maybe it's not the best way, but it's one more option. I hope it will be helpful for someone.

Why doesn't my current directory show up in the path using pytest on Windows?

I have the following folder structure:
myapp\
    myapp\
        __init__.py
    tests\
        test_myapp.py
and my pwd is
C:\Users\wwerner\programming\myapp\
I have the following test setup:
import sys
import pprint

def test_cool():
    pprint.pprint(sys.path)
    assert False
That produces the following paths:
['C:\\Users\\wwerner\\programming\\myapp\\tests',
'C:\\Users\\wwerner\\programming\\envs\\myapp\\Scripts',
'C:\\Windows\\system32\\python34.zip',
'C:\\Python34\\DLLs',
'C:\\Python34\\lib',
'C:\\Python34',
'C:\\Users\\wwerner\\programming\\envs\\myapp',
'C:\\Users\\wwerner\\programming\\envs\\myapp\\lib\\site-packages']
And when I try to import myapp I get the following error:
ImportError: No module named 'myapp'
So it looks like it's not adding the current directory to my path.
By changing my import line to look like this:
import sys
sys.path.insert(0, '.')
import myapp
I am then able to import myapp with no problems.
Why does my current directory not show up in the path when running pytest? Is my only workaround to insert . into the sys.path? (I'm using Python 3.4 if it matters)
Aha!
After comparing the layout against my cookiecutter repo, it turns out to be far simpler (and better) than that.
tests/
    __init__.py
    test_myapp.py
Simply adding an __init__.py file to my tests dir allows me to run py.test from my main directory.
Using an installable package
If you have an installable package (setup.py or pyproject.toml file with a build-system defined) then you want to test against the installed package.
pip install --editable .
pytest
The simplest possible way to make the project shown in the question into an installable package would be by adding this setup.py:
from setuptools import setup

setup(
    name="myapp",
    version="0.1",
    packages=["myapp"],
)
This will put the myapp code at /path/to/myapp/.venv/lib/python3.XY/site-packages, which is in the sys.path of the virtual environment. Now myapp can be imported from the site-packages dir, just as it would be for a user installation. It is neither necessary nor desirable for the current working directory to be present on sys.path during test execution.
Not using an installable package
The project shown in the question does not have any installer, so it can't be installed. It can still be tested by making sure the project root (i.e. the directory which contains both myapp and tests as subdirectories) is present on sys.path.
The best way to do this is to use python -m pytest, rather than invoking the bare pytest command. When you use python -m pytest it adds the current working directory to the start of sys.path. That's the normal Python behavior when executing a package as __main__ (documented here) and it's also a documented usage for pytest - see Invoking pytest versus python -m pytest.
Why does adding an __init__.py to the tests subdirectory (not) work?
The directory structure shown in the question is the "Tests outside application code" pattern, documented here. This is also the directory structure I recommend, since it creates a clear distinction between library/application code and test code.
It's not recommended to add __init__.py files inside the test directories when using a "Tests outside application code" structure, since the test files aren't intended to be "packaged" (e.g. test files do not really need to import from other test files, and they do not need to be installed at all for end users of your package).
The reason adding an __init__.py to the tests directory actually allows myapp to be imported by pytest, as described in Wayne's answer, is an accident of the way test discovery modifies sys.path during the collection phase. This is described as "problematic" in the docs:
... this introduces a subtle problem: in order to load the test modules from the tests directory, pytest prepends the root of the repository to sys.path, which adds the side-effect that now mypkg is also importable
They go on to strongly recommend using the src-layout if you intend to have __init__.py files inside test directories, to avoid this confusion of the import system.
But perhaps the best reason not to rely on this side-effect is that pytest collection actually can work in multiple modes (see import modes), and Wayne's answer relies upon pytest using the default "prepend" mode. It is currently mentioned that a future version will switch to "importlib" mode as default:
We intend to make importlib the default in future releases.
The accepted answer does not work with pytest --import-mode=importlib and so will stop working altogether at some stage.
sys.path automatically has the script's directory in it, and not the current working directory.
I am guessing that your script is placed in the tests directory. Based on this assumption, your code should look like this:
import sys
import os
ROOT_DIR = os.path.dirname(os.path.dirname(__file__))
sys.path.append(ROOT_DIR)
import myapp # Should work now
Use the environment variable PYTHONPATH.
In Windows:
set PYTHONPATH=.
py.test
In Unix:
PYTHONPATH=. py.test

Can't import my own modules in Python

I'm having a hard time understanding how module importing works in Python (I've never done it in any other language before either).
Let's say I have:
myapp/__init__.py
myapp/myapp/myapp.py
myapp/myapp/SomeObject.py
myapp/tests/TestCase.py
Now I'm trying to get something like this:
myapp.py
===================
from myapp import SomeObject
# stuff ...
TestCase.py
===================
from myapp import SomeObject
# some tests on SomeObject
However, I'm definitely doing something wrong as Python can't see that myapp is a module:
ImportError: No module named myapp
In your particular case it looks like you're trying to import SomeObject from the myapp.py and TestCase.py scripts. From myapp.py, do
import SomeObject
since it is in the same folder. For TestCase.py, do
from ..myapp import SomeObject
However, this will work only if you are importing TestCase from the package. If you want to directly run python TestCase.py, you would have to mess with your path. This can be done within Python:
import sys
sys.path.append("..")
from myapp import SomeObject
though that is generally not recommended.
In general, if you want other people to use your Python package, you should use distutils to create a setup script. That way, anyone can install your package easily using a command like python setup.py install and it will be available everywhere on their machine. If you're serious about the package, you could even add it to the Python Package Index, PyPI.
The import statement looks for modules in the directories listed in your PYTHONPATH environment variable and in your local directory. So you can either put all your files in the same directory, or export the path by typing the following into a terminal:
export PYTHONPATH="$PYTHONPATH:/path_to_myapp/myapp/myapp/"
Exporting the path is a good way. Another way is to add a .pth file to your site-packages location.
On my Mac, Python keeps site-packages in /Library/Python, shown below:
/Library/Python/2.7/site-packages
I created a file called awesome.pth at /Library/Python/2.7/site-packages/awesome.pth and in the file put the following path, which references my awesome modules:
/opt/awesome/custom_python_modules
You can try
from myapp.myapp import SomeObject
because your project directory has the same name as myapp.py, which makes Python search the project directory first.
You need to have
__init__.py
in all the folders that have code you need to interact with.
You also need to specify the top folder name of your project in every import, even if the file you are trying to import is at the same level.
In your top-level myapp directory you can add a setup.py file with two lines of Python:
from setuptools import setup
setup(name='myapp')
Then, from that same directory on the command line, use pip install -e . to install the package.
pip install on Windows 10 defaults to installing in 'Program Files/PythonXX/Lib/site-packages', which is a directory that requires administrative privileges. So I fixed my issue by running pip install as Administrator (you have to open the command prompt as administrator even if you are logged in with an admin account). Also, it is safer to call pip from python.
e.g.
python -m pip install <package-name>
instead of
pip install <package-name>
In my case it was a Windows vs. Python surprise: although Windows filenames are not case-sensitive, Python imports are. So if you have a Stuff.py file, you need to import that name as-is.
Let's say I write a module:
import os

my_home_dir = os.environ['HOME']  # on Windows use 'HOMEPATH'
file_abs_path = os.path.join(my_home_dir, "my_module.py")

with open(file_abs_path, "w") as f:
    f.write("print('I am loaded successfully')")

import importlib.util
importlib.util.find_spec('my_module')  # ==> None, cannot find it
We have to tell Python where to look for the module by adding its directory to sys.path:
import sys

sys.path.append(my_home_dir)  # append the directory containing the module, not the file path
Now importlib.util.find_spec('my_module') returns:
ModuleSpec(name='my_module', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7fa40143e8e0>, origin='/Users/name/my_module.py')
We created our module and told Python where to find it, so now we should be able to import it:
import my_module
# prints: I am loaded successfully
This worked for me:
from .myapp import SomeObject
The leading . signifies a relative import: Python searches for the module within the parent package.
Short Answer:
python -m ParentPackage.Submodule
Executing the required file via the module flag worked for me. Let's say we have a typical directory structure as below:
my_project/
    Core/
        myScript.py
    Utils/
        helpers.py
    configs.py
Now if you want to run a file inside a directory that imports from other modules, all you need to do is the following:
python -m Core.myScript
PS: You have to use dot notation to refer to the submodules (the files/scripts you want to execute). Also, I used Python 3.9+, so I didn't need any __init__.py files or sys.path.append statements.
Hope that helps! Happy Coding!
If you use Anaconda you can do:
conda develop /Path/To/Your/Modules
from the shell, and it will write your path into a conda.pth file in the standard directory for third-party modules (site-packages in my case).
If you are using the IPython console, make sure your IDE (e.g., Spyder) is pointing to the right working directory (i.e., your project folder).
Besides the suggested solutions like the accepted answer, I had the same problem in PyCharm, and I didn't want to modify the imports to use the relative addressing suggested above.
I finally found out that if I mark my src/ directory (the root directory of my Python code) as a source root in the settings, the issue is resolved.
