We use git submodules to share common modules within our team.
Each module has its test cases in its source folder, run by nose.
I have a project with this structure:
/project_sample
|__/project_sample
| |__ __init__.py
| |__ moduleA.py
| |__/tests
| | |__ __init__.py
| | |__ moduleA_tests.py
| |__/subpackage
| | |__ __init__.py
| | |__ moduleB.py
| | |__/tests
| | | |__ __init__.py
| | | |__ moduleB_tests.py
|__setup.py
All of these __init__.py files are empty.
The subpackage is developed separately and added to the project as a git submodule. We want it to be self-contained so we can share it across different projects. Its test case looks like this:
moduleB_tests.py:
from subpackage import moduleB
def test_funcA():
    moduleB.funcA()
The test passes when I run nosetests from the subpackage's own repo folder.
But when I run nosetests from project_sample's root directory, I get "ImportError: No module named subpackage"; it seems nose finds an __init__.py file in the parent folder of subpackage (project_sample) and treats subpackage as part of that package. The test passes when I change the first line to:
from project_sample.subpackage import moduleB
But that way the subpackage is no longer self-contained.
I have tried some workarounds, like adding subpackage to sys.path or using nose's -w option, but I still get the same exception.
A teammate runs the subpackage's tests separately in PyCharm and they pass, so I think there should be a way to make them pass from the command line too.
Is there any way to resolve this problem, or any suggestion for the project structure?
This is my first question on SO; any suggestion is appreciated.
I know this question is a bit old and has already been answered, but we use a different strategy in our code base.
If for some reason you still want the package inside the project_sample module directory, you can structure it something like this:
/project_sample
|__/project_sample
| |__ __init__.py
| |__ moduleA.py
| |__/tests
| | |__ __init__.py
| | |__ moduleA_tests.py
| |__/subpackage_repo
| | |__/subpackage
| | | |__ __init__.py
| | | |__ moduleB.py
| | | |__/tests
| | | | |__ __init__.py
| | | | |__ moduleB_tests.py
| | |__setup.py
|__setup.py
Without an __init__.py in the subpackage_repo directory, it won't be a part of your project's package.
Then in the __init__.py of the main package, you can include
import os
import sys
# subpackage_repo sits next to this __init__.py in the layout above
sys.path.insert(
    0,
    os.path.join(os.path.dirname(__file__), 'subpackage_repo'),
)
import subpackage
which puts the subpackage on the path for the main project.
That then allows us, in moduleA.py, to do something like
from . import subpackage
since the package is imported at the __init__.py level.
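For illustration, moduleA.py might then look something like this (a sketch only; funcA is the function exercised in the question's tests, and the wrapper name is made up):
# moduleA.py
from . import subpackage        # bound by the import in __init__.py above
from subpackage import moduleB  # also works, since subpackage_repo is on sys.path

def call_shared_code():
    # delegate to the shared submodule
    return moduleB.funcA()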
Of course you can also move subpackage_repo up a level and just reflect that in the path you're adding, if you prefer. The only disadvantage of doing that is that when you run your tests with something like
python -m unittest discover
you'll discover the subpackage tests as well and run those. Ideally those tests are taken care of by whatever CI you have for the package, so they shouldn't need to be run here.
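If you do want to limit discovery to the main package's own tests, one possible invocation (a sketch, not from the original answer; the -p pattern is needed because the test modules end in _tests.py rather than starting with test) is:
python -m unittest discover -s project_sample/tests -t . -p '*_tests.py'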
If you want subpackage to be self-contained, you need to treat it as such. This means you need to put it at the top level, parallel to project_sample, instead of inside it.
Given the following structure,
my_project/
|__ __init__.py
|__ README.md
|__ src/
| |__ __init__.py
| |__ data_processing.py
| |__ utils.py
| |__ models.py
| |__ bootstrap.py
|__ python_scripts/
| |__ myscript.py
|__ data/
|__ output/
I want to import my modules data_processing, utils, models, and bootstrap in my script myscript.py, without appending to sys.path.
Currently, I have at the top of my myscript.py:
import os, sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
from src.utils import foo, bar
from src.data_processing import data_process_1, data_process_2
from src.models import MyModel
from src.bootstrap import bootstrap_eval
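A slightly more robust form of this workaround (sketched below) resolves the path from __file__ instead of the current working directory, so it also works when the script is launched from somewhere else, but it still modifies sys.path, which is what I want to avoid:
import os
import sys

# my_project/ is one level above python_scripts/; resolve it from this file's
# location instead of from the current working directory
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if project_root not in sys.path:
    sys.path.append(project_root)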
However, I am unsure what I need to add to my __init__.py file in src (and maybe other places) in order to import my modules without appending the module path to sys.path.
I.e., I would like to be able to do something like this:
from src.utils import foo, bar
from src.data_processing import data_process_1, data_process_2
from src.models import MyModel
from src.bootstrap import bootstrap_eval
Is it possible to do this, without building a python wheel and installing the modules as a package?
I have an issue with pytest where unit tests run fine from PyCharm but fail when I run them in the pipeline with "python -m pytest".
Below is my project structure:
Common
|_____configuration.py
|
Services
|
|-----ServiceA
|     |
|     |___src
|     |    |___utils
|     |         |__ __init__.py
|     |         |__ helper1.py
|     |         |__ helper2.py
|     |
|     |___Test
|          |___utils
|               |__ __init__.py
|               |__ test1.py
|
|-----ServiceB
In helper1 I have code like:
from Common import configuration
Tests run absolutely fine when I run them through PyCharm, because it resolves all the paths, but when I run them through the pipeline from the command line I get the error below:
ModuleNotFoundError: No module named 'Common'
Can anyone help me resolve this issue? TIA
In pytest >= 7.0.0 you can register extra paths using pytest's pythonpath config option. Each value is relative to the rootdir and has to point at the directory that contains the Common package (not at Common itself), so depending on where your rootdir is it might be ., .., or prefixed with more parent levels.
If you use pyproject.toml:
[tool.pytest.ini_options]
pythonpath = ["."]
If you use pytest.ini instead:
[pytest]
pythonpath = .
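For older pytest versions, a common alternative (a sketch, not part of the pythonpath option) is a conftest.py at the repository root, i.e. in the directory that contains Common, which puts that directory on sys.path before the tests are collected:
# conftest.py at the repository root (the directory containing Common)
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))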
Adding the path to Common in the test __init__.py resolved the issue for me:
import sys
sys.path.append(r'..\..\Common')
I'm having a similar but different issue from the other posts I've seen on here: I'm trying to import a module from a nested package, and even though the Python linter resolves everything fine, I can't execute the file because of a ModuleNotFoundError.
ParentFolder/
|__ ContainerFolder/
    |__ __init__.py
    |__ Camera/
    |   |__ __init__.py
    |   |__ CameraService.py
    |   |__ Data.py
    |__ Settings/
        |__ __init__.py
        |__ SettingService.py
        |__ Handler/
        |   |__ __init__.py
        |   |__ Handle.py
        |__ Models/
            |__ __init__.py
            |__ Setting.py
What I want is to import Data.py inside of Handle.py
I have tried:
from ContainerFolder.Camera.Data import DataClass
The linter says it's fine, and autocomplete in VS Code gives me type-ahead; however, at execution I get a ModuleNotFoundError for the ContainerFolder module. I have an __init__.py in all the directories, so what am I missing to make that a package I can import from?
Edit:
CameraService.py and SettingService.py are both Tornado APIs. Since they both execute as __main__, how would one be able to share modules between the two? I.e. Data.py with modules under the Settings directory, or Setting.py with modules under the Camera directory?
Try converting the file into a package, as described here:
https://www.jetbrains.com/help/pycharm/refactoring-convert.html
or here:
Converting a python package into a single importable file
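As a side note (a sketch, not part of the answer above): the ModuleNotFoundError usually appears because the service script is launched from inside the package, so ParentFolder never ends up on sys.path. Launching each service as a module from ParentFolder keeps absolute imports such as from ContainerFolder.Camera.Data import DataClass working in both services:
cd ParentFolder
python -m ContainerFolder.Camera.CameraService
python -m ContainerFolder.Settings.SettingService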
I have a python 2.7 project which I have structured as below:
project
|
|____src
|    |
|    |__pkg
|       |
|       |__ __init__.py
|
|____test
     |
     |__test_pkg
     |  |
     |  |__ __init__.py
     |
     |__helpers
     |  |
     |  |__ __init__.py
     |
     |__ __init__.py
I am adding the src folder to PYTHONPATH, so importing works nicely for the packages inside src. I am using Eclipse, pylint inside Eclipse, and nosetests in Eclipse as well as via bash and in a makefile (for the project). So I have to satisfy, let's say, every stakeholder!
The problem is importing some code from the helpers package in test. Weirdly enough, my test directory is also a Python package, with an __init__.py containing top-level setUp and tearDown methods for all tests. So when I try this:
import helpers
from helpers.blaaa import Blaaa
in some module inside test_pkg, none of my stakeholders are satisfied. I get ImportError: No module named ..., and pylint also complains about not finding it. I can live with pylint complaining in the test folders, but nosetests also dies if I run it in the project directory or the test directory. I would prefer not to do relative imports with a dot (.).
The problem is that you cannot escape the top-level package by importing from ..helpers.
But if you start your test code inside the test directory with
python3 -m test_pkg.foo
the current directory will be the test directory, and importing helpers will work. On the minus side, that means you have to import from . inside test_pkg.
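A minimal sketch of what that looks like (foo.py and the sibling module name are hypothetical; Blaaa comes from the question), started from the test directory with python3 -m test_pkg.foo:
# test/test_pkg/foo.py
import helpers                      # works: the test directory is sys.path[0]
from helpers.blaaa import Blaaa

from . import another_test_module  # siblings inside test_pkg now need relative imports (hypothetical name)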
I use setuptools to distribute my Python package. Now I need to distribute additional data files.
From what I've gathered from the setuptools documentation, I need to have my data files inside the package directory. However, I would rather have my data files inside a subdirectory of the root directory.
What I would like to avoid:
/ #root
|- src/
| |- mypackage/
| | |- data/
| | | |- resource1
| | | |- [...]
| | |- __init__.py
| | |- [...]
|- setup.py
What I would like to have instead:
/ #root
|- data/
| |- resource1
| |- [...]
|- src/
| |- mypackage/
| | |- __init__.py
| | |- [...]
|- setup.py
I just don't feel comfortable with having so many subdirectories if it's not essential. I fail to find a reason why I *have* to put the files inside the package directory. It is also cumbersome to work with so many nested subdirectories, IMHO. Or is there any good reason that would justify this restriction?
Option 1: Install as package data
The main advantage of placing data files inside the root of your Python package
is that it lets you avoid worrying about where the files will live on a user's
system, which may be Windows, Mac, Linux, some mobile platform, or inside an Egg. You can
always find the directory data relative to your Python package root, no matter where or how it is installed.
For example, if I have a project layout like so:
project/
    foo/
        __init__.py
        data/
            resource1/
                foo.txt
You can add a function to __init__.py to locate an absolute path to a data
file:
import os

_ROOT = os.path.abspath(os.path.dirname(__file__))

def get_data(path):
    return os.path.join(_ROOT, 'data', path)

print get_data('resource1/foo.txt')
Outputs:
/Users/pat/project/foo/data/resource1/foo.txt
After the project is installed as an Egg the path to data will change, but the code doesn't need to change:
/Users/pat/virtenv/foo/lib/python2.6/site-packages/foo-0.0.0-py2.6.egg/foo/data/resource1/foo.txt
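For the data to actually ship with the package, it also has to be declared in setup.py. A minimal sketch (name, version and packages are placeholders; the package_data entry is the one shown in the grep output at the end of this answer):
from setuptools import setup

setup(
    name='foo',
    version='0.0.0',
    packages=['foo'],
    # include the data file inside the foo package on install
    package_data={'foo': ['data/resource1/foo.txt']},
)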
Option 2: Install to fixed location
The alternative would be to place your data outside the Python package and then
either:
Have the location of the data passed in via a configuration file or command-line arguments, or
Embed the location into your Python code.
This is far less desirable if you plan to distribute your project. If you really want to do this, you can install your data wherever you like on the target system, specifying the destination for each group of files by passing in a list of tuples:
from setuptools import setup

setup(
    ...
    data_files=[
        ('/var/data1', ['data/foo.txt']),
        ('/var/data2', ['data/bar.txt']),
    ],
)
Updated: Example of a shell function to recursively grep Python files:
atlas% function grep_py { find . -name '*.py' -exec grep -Hn $* {} \; }
atlas% grep_py ": \["
./setup.py:9: package_data={'foo': ['data/resource1/foo.txt']}
I think I found a good compromise which will allow you to maintain the following structure:
/ #root
|- data/
| |- resource1
| |- [...]
|- src/
| |- mypackage/
| | |- __init__.py
| | |- [...]
|- setup.py
You should install the data as package_data, to avoid the problems described in samplebias' answer, but in order to maintain the file structure you should add this to your setup.py:
import os
from setuptools import setup

try:
    # expose the top-level data/ directory inside the package just for the build
    os.symlink('../../data', 'src/mypackage/data')
    setup(
        ...
        package_data={'mypackage': ['data/*']},
        ...
    )
finally:
    os.unlink('src/mypackage/data')
This way we create the appropriate structure "just in time" and keep our source tree organized.
To access such data files within your code, you 'simply' use:
from pkg_resources import resource_filename, Requirement

data = resource_filename(Requirement.parse("main_package"), 'mypackage/data')
I still don't like having to specify 'mypackage' in the code, as the data might not necessarily have anything to do with this module, but I guess it's a good compromise.
You could use importlib_resources or importlib.resources (depending on your Python version).
https://importlib-resources.readthedocs.io/en/latest/using.html
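A minimal sketch of that approach (assuming the data directory is shipped inside the mypackage package, e.g. via package_data as in the answers above; resource1 is the file name from the question's layout):
from importlib.resources import files  # Python 3.9+; older versions can use the importlib_resources backport

# locate and read the bundled resource1 file from inside mypackage
resource = files('mypackage') / 'data' / 'resource1'
content = resource.read_bytes()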
I think that you can basically pass any path as the *data_files* argument to setup().