I'm using pytest to write some unit tests, and some of the tests can only run in the cloud under a special runtime (a Databricks cluster).
I want to automatically skip these tests when I run the suite locally. I already know how to detect programmatically whether I'm running locally.
This is my project structure:
.
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── README.md
└── src
    ├── pkg1
    │   ├── __init__.py
    │   ├── conftest.py
    │   ├── module1.py
    │   ├── module2.py
    │   ├── test_module1.py
    │   ├── test_module2.py
    │   └── utils
    │       ├── aws.py
    │       └── common.py
    └── pkg2
        ├── __init__.py
        ├── ...
test_module1.py:
from pkg1 import module1
from common import skip_if_running_locally

def test_everywhere(module1_instance):
    pass  # do test..

@skip_if_running_locally
def test_only_in_cloud(module1_instance):
    pass  # do test..
common.py:
import pytest
from pyspark.sql import SparkSession

my_spark = SparkSession.getActiveSession()
running_locally = my_spark is None or \
    my_spark.conf.get('spark.app.name') != 'Databricks Shell'

skip_if_running_locally = pytest.mark.skipif(running_locally, reason='running locally')
And I do the same in test_module2.py to mark tests that should be skipped locally.
I don't really like putting this in common.py, because that file contains common application code (not test code).
I thought about putting it in a base class, but then it would have to be a class attribute (not a self. instance attribute).
If I put it in a test_common.py, it'll be picked up by pytest as a file containing test cases.
If I put it in conftest.py, how do I import it? from conftest import skip_... ?
What is the right way to do this? Where do I store common code/annotations dedicated to testing, and how do I use them?
Generally, conftest.py is the place to put common test logic. There is nothing wrong with using util/common modules, but conftest.py has two advantages:
It is executed automatically by pytest.
It is the standard place, so developers will often look there.
With that said, I believe you can use the approach mentioned here to have custom markers enabled/disabled according to the environment.
Your tests would look like this (note that there is no import, just the locally and cloud markers):
import pytest

@pytest.mark.locally
def test_only_runs_locally():
    pass

@pytest.mark.cloud
def test_only_runs_on_the_cloud():
    pass

def test_runs_everywhere():
    pass
Then inside the conftest.py you enable/disable the proper tests:
from pyspark.sql import SparkSession
import pytest

ALL = set("locally cloud".split())

my_spark = SparkSession.getActiveSession()
running_on = "locally" if (
    my_spark is None
    or my_spark.conf.get('spark.app.name') != 'Databricks Shell'
) else "cloud"

# runs before every test
def pytest_runtest_setup(item):
    # look for all the relevant markers of the test
    supported_platforms = ALL.intersection(mark.name for mark in item.iter_markers())
    if supported_platforms and running_on not in supported_platforms:
        pytest.skip(
            f"We're running on {running_on}, cannot run {supported_platforms} tests")
I have a project organized like so:
application
├── app
│   └── package
│       ├── __init__.py
│       └── functions.py
└── app2
    └── some_folder
        └── file_2.py
My "functions.py" contains a basic function:
#functions.py
def add(x,y):
return x+y
The file "_init_.py" is empty
I want to use the "add" function in my "file_2.py" file, so I write:
#file_2.py
from application.app.package.functions import add
print(add(2,3))
But it returns an error message:
ModuleNotFoundError: No module named 'application'
It is the same if I try any of these:
from app.package.functions import add
from package.functions import add
from functions import add
Does anyone know where the problem comes from? I'm doing it exactly like in this tutorial, so I don't understand what's wrong:
tutorial's link
Thank you for your help
One way to import functions.add is to import sys and use sys.path.insert(); after that you can import add from functions:
import sys
sys.path.insert(1, 'the/local/path/to/package')

from functions import add
print(add(1, 2))
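An alternative that avoids touching sys.path (my suggestion, not part of the original answer): run file_2.py as a module from the directory that contains application, so the absolute import in the question resolves by itself. A minimal sketch, assuming Python 3 (where the folders do not even need __init__.py files, thanks to namespace packages):
# run from the directory that CONTAINS the "application" folder
python -m application.app2.some_folder.file_2
With -m, the current working directory is placed on sys.path, so from application.app.package.functions import add works unchanged.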
My code directory looks like below. I need to generate documentation for all the modules, like sub1, sub2, submoduleA1, submoduleB1, and so on.
Also, as shown for submoduleB2.py, all the modules import from other modules/submodules.
<workspace>
└── toolbox (main folder)
    ├── __init__.py
    │
    ├── sub
    │   ├── __init__.py
    │   ├── sub1.py
    │   └── sub2.py
    │
    ├── subpackageA
    │   ├── __init__.py
    │   ├── submoduleA1.py
    │   └── submoduleA2.py
    │
    └── subpackageB
        ├── __init__.py
        ├── submoduleB1.py
        └── submoduleB2.py
Code structure for submoduleB2.py, which imports from other modules/submodules (from sub import sub1, from subpackageA import submoduleA2, and so on):
from __future__ import absolute_import, division
import copy
import logging
import numpy as np
import pandas as pd
from dc.dc import DataCleaning
from sub.sub1 import ToolboxLogger
from subpackageA import pan
LOGGER = ToolboxLogger(
    "MATH_FUNCTIONS", enableconsolelog=True, enablefilelog=False, loglevel=logging.DEBUG
).logger

"""
Calculations also take into account units of the tags that are passed in
"""
def spread(tag_list):
    """
    Returns the spread of a set of actual tag values

    :param tag_list: List of tag objects
    :type tag_list: list
    :return: Pandas Series of spreads
    :rtype: Pandas Series
    :example:

    >>> tag_list = [tp.RH1_ogt_1,
                    tp.RH1_ogt_2,
                    tp.RH1_ogt_3,
                    tp.RH1_ogt_4,
                    tp.RH1_ogt_5,
                    tp.RH1_ogt_6]
    >>> spread = pan.spread(tag_list)
    """
    # use the same units for everything
    units_to_use = tag_list[0].units
    idxs = tag_list[0].actuals.index
    spread_df = pd.DataFrame(index=idxs)
    spread_series = spread_df.max(axis=1).copy()
    return Q_(spread_series, units_to_use)
I tried to run the pdoc command using the Anaconda Prompt, navigating to the toolbox folder and executing the command below:
pdoc --html --external-links --all-submodules preprocess/toolbox/subpackageA
After executing this command, a "subpackageA" folder was created under toolbox with an index.html file, but it was all blank.
Then I tried to generate documentation by providing a specific module name:
pdoc --html --external-links --all-submodules preprocess/toolbox/submoduleB2.py
but received the error below:
File "C:\Users\preprocess/toolbox/submoduleB2.py", line 16, in
from sub import sub1
ImportError: No module named sub.sub1
Can you please tell me how to generate the documentation using pdoc for the complete directory?
Or is there any other package which will auto-generate the documentation?
I even tried Sphinx, but faced issues adding the module/submodule paths in the config file.
It appears that pdoc3 throws that kind of error for a module when it cannot resolve that module's imports on the Python path. One solution is to put
import os, sys
# make this directory importable before pdoc tries to resolve imports
syspath = os.path.dirname(os.path.abspath(__file__))
sys.path.append(syspath)
into the __init__.py files in each of the subdirectories.
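Another angle that often helps (my suggestion, not from the original answer): make the imports resolvable by running pdoc from the directory that contains toolbox, with both that directory and toolbox itself on PYTHONPATH, and document the whole package at once. A sketch, assuming pdoc3 and a Unix shell; the docs output folder name is arbitrary:
# run from <workspace>, the directory that contains "toolbox"
export PYTHONPATH="$PWD:$PWD/toolbox"
pdoc --html --output-dir docs toolbox
The first path entry lets pdoc import toolbox as a package; the second lets intra-package imports written as top-level ones (like from sub import sub1) resolve as well.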
I have the following directory structure for a number of unit tests (directories advanced and basic contain multiple files with test cases implemented in them):
Tests
├── advanced
│   ├── __init__.py
│   └── advanced_test.py
├── basic
│   ├── __init__.py
│   └── basic_test.py
└── helpers.py
with the following file contents.
# helpers.py
import unittest

def determine_option():
    # Some logic that returns the option
    return 1

class CustomTestCase(unittest.TestCase):
    def __init__(self, methodName: str = ...) -> None:
        super().__init__(methodName)
        self._option = determine_option()

    def get_option(self):
        # Some custom method used in advanced test cases
        return self._option

    def customAssert(self, first, second):
        # Some custom assertion code used in advanced and basic test cases
        self.assertEqual(first, second)
# basic_test.py
import sys
import os
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from helpers import CustomTestCase

class BasicTest(CustomTestCase):
    # Includes test cases that use custom assertion method
    def test_pass(self) -> None:
        self.customAssert(1, 1)
# advanced_test.py
import sys
import os
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from helpers import CustomTestCase

class AdvancedTest(CustomTestCase):
    # Includes test cases that use custom assertion method and some further functionality (e.g. get_option())
    def test_pass(self) -> None:
        self.customAssert(self.get_option(), 1)
The outlined structure above allows me to use the test discovery functionality of python unittest.
> python -m unittest discover -p *_test.py -v
test_pass (advanced.advanced_test.AdvancedTest) ... ok
test_pass (basic.basic_test.BasicTest) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.001s
OK
To be able to import CustomTestCase from helpers.py, I had to resort to the ugly and probably ill-advised hack of adding the parent directory to sys.path. Attempting to import via from ..helpers import CustomTestCase does not play nicely with unittest's test discovery (ImportError: attempted relative import beyond top-level package).
How could the Tests directory be structured so that such a CustomTestCase can be used to implement the test cases in the subdirectories without resorting to the sys.path.insert hack?
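One common fix, sketched here as an assumption rather than a verified answer from this thread: make Tests itself a package and point discovery's top-level directory at its parent, so helpers.py becomes importable through the package:
Tests
├── __init__.py        (new, makes Tests a package)
├── helpers.py
├── advanced
│   ├── __init__.py
│   └── advanced_test.py
└── basic
    ├── __init__.py
    └── basic_test.py
Then, from the directory that contains Tests, run:
python -m unittest discover -s Tests -t . -p "*_test.py"
Because the test modules are now imported as Tests.basic.basic_test and Tests.advanced.advanced_test, both from Tests.helpers import CustomTestCase and the relative from ..helpers import CustomTestCase resolve without any sys.path editing.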
In a project, one module moduleA requires access to module moduleB within the same sub-package packageA (which is in package project).
This access fails when the __init__.py of sub-package packageA is filled with an import .. as .. statement, while the __init__.py of package project is empty.
Why does a filled __init__.py (seemingly) block access from this same-package module, while PyCharm still accepts it from an autocomplete and highlighting perspective?
The AttributeError that is thrown suggests that the import .. as .. statement makes the interpreter believe that the sub-package is an attribute, not a package, despite an existing __init__.py.
File structure
├── ProjectA
│   ├── src
│   │   ├── project
│   │   │   ├── __init__.py
│   │   │   ├── packageA
│   │   │   │   ├── __init__.py
│   │   │   │   ├── moduleA.py
│   │   │   │   ├── moduleB.py
Code sample 1
# ProjectA / src / project / __init__.py
(empty)

# packageA / __init__.py
(empty)

# packageA / moduleA.py
import project.packageA.moduleB as dummy

class A:
    pass

class B:
    pass

# packageA / moduleB.py
def method():
    pass
Code execution 1
# jupyter started in 'C:\\Users\\username\\Desktop\\devenv\\'
# notebook located in 'C:\\Users\\username\\Desktop\\devenv\\dev\\'
import sys
sys.path
# output:
# ['C:\\src\\ProjectA',
# 'C:\\src\\ProjectA\\src',
# 'C:\\Users\\username\\Desktop\\devenv\\dev',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\python36.zip',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\DLLs',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\lib',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv',
# '',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\lib\\site-packages',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\lib\\site-packages\\win32',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\lib\\site-packages\\win32\\lib',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\lib\\site-packages\\Pythonwin',
# 'C:\\ProgramData\\Anaconda3\\envs\\myenv\\lib\\site-packages\\IPython\\extensions',
# 'C:\\Users\\username\\.ipython']
from project.packageA.moduleA import A, B
# no error(s)
Code sample 2
First alternative filling of packageA / __init__.py
# packageA / __init__.py
from .moduleA import A, B
import .moduleB as dummy
Second alternative filling of packageA / __init__.py
# packageA / __init__.py
from project.packageA.moduleA import A, B
import project.packageA.moduleB as dummy
Code execution 2
from project.packageA.moduleA import A, B
AttributeError Traceback (most recent call last)
<ipython-input-1-61a791f79421> in <module>
----> 1 import project.packageA.moduleA.moduleB
C:\src\ProjectA\src\project\packageA\__init__.py in <module>
----> 1 from .moduleA import A, B
2 from .moduleB import *
C:\src\ProjectA\src\project\packageA\moduleA.py in <module>
---> 1 import project.packageA.moduleB as dummy
2
3 class A:
AttributeError: module 'project' has no attribute 'packageA'
Solution
I've found the solution in Stack Overflow: Imports in __init__.py and import as statement
Changing the import in packageA / __init__.py from import .. as to from xx import .. as did the trick:
# packageA / __init__.py
from project.packageA.moduleA import A, B
from project.packageA import moduleB as dummy
Can anyone help me understand why import xx as and from xx import xx as work differently when it comes to sub-packages, specifically in this situation where the package's __init__.py is empty but the sub-package's __init__.py is filled?
This behavior cannot really be explained by any intended back-end mechanism: it doesn't match the documentation, e.g. PEP 221. Thus the import .. as .. statement is nonfunctional here.
This bug seems to have been fixed in Python 3.7 (I've been running 3.6.9).
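A rough sketch of the difference (my illustration, not from the thread), assuming Python <= 3.6 and the circular situation above, where packageA's __init__.py is still executing:
# "import project.packageA.moduleB as dummy" roughly expands to:
import project.packageA.moduleB     # loads the module into sys.modules
dummy = project.packageA.moduleB    # attribute lookup on 'project'; fails,
                                    # because 'packageA' is only bound on
                                    # 'project' after its __init__.py finishes

# "from project.packageA import moduleB as dummy" roughly expands to:
import sys
import project.packageA.moduleB
dummy = sys.modules['project.packageA.moduleB']  # sys.modules lookup; the
                                                 # module object already
                                                 # exists, so this succeeds
Python 3.7 (bpo-30024) gave the import .. as .. form the same sys.modules fallback, which is why the bug disappears there.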
I am running multiple tests in a tests package, and I want to print each module name in the package, without duplicating code.
So, I wanted to insert some code into __init__.py or conftest.py that will give me the executing module name.
Let's say my test modules are called: checker1, checker2, etc...
My directory structure is like this:
tests_dir/
├── __init__.py
├── conftest.py
├── checker1
├── checker2
└── checker3
So, inside __init__.py I tried inserting:
import os

def module_name():
    return os.path.splitext(__file__)[0]
But it still gives me __init__.py from each file when I call it.
I also tried using a fixture inside conftest.py, like:
@pytest.fixture(scope='module')
def module_name(request):
    return request.node.name
But it seems as if I still need to define a function inside each module to get module_name as a parameter.
What is the best method of getting this to work?
Edit:
In the end, what I did is explained here:
conftest.py
import pytest

@pytest.fixture(scope='module', autouse=True)
def module_name(request):
    return request.node.name
An example of a test file with a test function; the same needs to be added to each file and every function:
checker1.py
from conftest import *

def test_columns(expected_res, actual_res, module_name):
    expected_cols = expected_res.columns
    actual_cols = actual_res.columns
    val = expected_cols.difference(actual_cols)  # verify all expected cols are in actual_cols
    if not val.empty:
        log.error('[{}]: Expected columns are missing: {}'.format(module_name, val.values))
    assert val.empty
Notice the module_name fixture added to the function's parameters.
expected_res and actual_res are pandas DataFrames read from an Excel file.
log is a Logger object from the logging package.
In each module (checker1, checker2, checker3, conftest.py), in the main function, execute
print(__name__)
When the __init__.py file imports those packages, it should print the module name along with it.
Based on your comment, you can perhaps modify the behaviour in the __init__.py file for local imports.
__init__.py
import sys, os
import importlib

sys.path.append(os.path.split(__file__)[0])

def my_import(module):
    print("Module name is {}".format(module))
    # load the module by name; its top-level code (the prints) runs on import
    return importlib.import_module(module)
testerfn.py
print(__name__)
print("Test")
Directory structure
tests_dir/
├── __init__.py
└── testerfn.py
Command to test
import tests_dir
tests_dir.my_import("testerfn")