Python - How to unmock/reset mock during testing?

I'm using nosetests, and in two separate files I have two tests. Both run fine when run individually, but when run together, the mock from the first test messes up the results in the second test. How do I ensure that all mocks/patches are reset after a test function finishes, so that I get a clean test on every run?
If possible, explaining through my tests would be particularly appreciated. My first test looks like:
def test_list_all_channel(self):
    from mock import Mock, MagicMock
    from notification.models import Channel, list_all_channel_names
    channel1 = Mock()
    channel2 = Mock()
    channel3 = Mock()
    channel1.name = "ch1"
    channel2.name = "ch2"
    channel3.name = "ch3"
    channel_list = [channel1, channel2, channel3]
    Channel.all = MagicMock()
    Channel.all.return_value = channel_list
    print(Channel)
    channel_name_list = list_all_channel_names()
    self.assertEqual("ch1", channel_name_list[0])
    self.assertEqual("ch2", channel_name_list[1])
    self.assertEqual("ch3", channel_name_list[2])
And my second test is:
def test_can_list_all_channels(self):
    add_channel_with_name("channel1")
    namelist = list_all_channel_names()
    self.assertEqual("channel1", namelist[0])
But the return value from Channel.all() is still set to the list from the first function, so I get "ch1" is not equal to "channel1". Any suggestions? Thank you much!

Look at unittest.mock.patch: https://docs.python.org/3/library/unittest.mock.html#patch
At the start of your test, create the patcher and start it (note the target has to be the full import path of the attribute):
p = patch("notification.models.Channel.all", new=MagicMock(return_value=channel_list))
p.start()
At the end:
p.stop()
This will ensure that your mocks are isolated to the test.
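If you'd rather not manage start/stop by hand, patch also works as a context manager (or decorator) and then undoes itself automatically. A minimal sketch, assuming the same notification.models module from the question:

import unittest
from unittest.mock import MagicMock, patch

class ChannelTests(unittest.TestCase):
    def test_list_all_channel(self):
        channel = MagicMock()
        channel.name = "ch1"
        # The patch is reverted automatically when the with-block exits,
        # so it cannot leak into the next test.
        with patch("notification.models.Channel.all",
                   new=MagicMock(return_value=[channel])):
            from notification.models import list_all_channel_names
            self.assertEqual("ch1", list_all_channel_names()[0])

In a setUp method you can instead call p = patch(...); p.start() and register self.addCleanup(p.stop), which stops the patch even when the test fails.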

Get value from threading function

I have two functions that I want to run concurrently to check performance; nowadays I run one after the other and it takes quite some time.
Here is how I'm running them:
import pandas as pd
import threading

df = pd.read_csv('data/Detalhado_full.csv', sep=',', dtype={'maquina': str})

def gerar_graph_36():
    df_ordered = df.query('maquina=="3.6"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    return oee

def gerar_graph_31():
    df_ordered = df.query('maquina=="3.1"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    return oee

oee_36 = gerar_graph_36()
oee_31 = gerar_graph_31()
print(oee_36, oee_31)
I tried to apply threading using the statements below, but the return values are not assigned to the variables; instead None is printed:
print(oee_31, oee_36) -> expecting: 106.3 99.7 // returning: None None
oee_31 = threading.Thread(target=gerar_graph_31, args=()).start()
oee_36 = threading.Thread(target=gerar_graph_36, args=()).start()
print(oee_31, oee_36)
As a check: if I use the command below, it returns 3 as expected.
print(threading.active_count())
I need the oee value returned from each function, something like 103.8.
Thanks in advance!!
Creating and starting a new thread is not like calling a function that returns a value: the Thread.start() call just starts the code of the other thread and returns immediately, which is why your variables end up as None.
To collect results from the other threads, you have to communicate the computed results back to the main thread through some data structure. An ordinary list or dictionary can do, or you could use a queue.Queue.
If you want something more like a function call, and to be able to leave the gerar_graph() functions unmodified, you can use the concurrent.futures module instead of threading: it is higher-level code that wraps your calls in a "future" object, and you can check when each future is done and fetch the value returned by the function.
Otherwise, simply have a top-level variable containing a list, wait for your threads to finish running (they stop when the function passed as "target" returns), and collect the results:
import pandas as pd
import threading

df = pd.read_csv('data/Detalhado_full.csv', sep=',', dtype={'maquina': str})
results = []

def gerar_graph_36():
    df_ordered = df.query('maquina=="3.6"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    results.append(oee)

def gerar_graph_31():
    df_ordered = df.query('maquina=="3.1"')[['data', 'dia_semana', 'oee', 'ptg_ruins', 'prod_real_kg', 'prod_teorica_kg']].sort_values(by='data')
    oee = df_ordered['oee'].iloc[-1:].iloc[0]
    results.append(oee)

# We need to keep a reference to the threads themselves
# so that we can call both ".start()" (which always returns None)
# and ".join()" on them.
oee_31 = threading.Thread(target=gerar_graph_31); oee_31.start()
oee_36 = threading.Thread(target=gerar_graph_36); oee_36.start()

oee_31.join()  # blocks and returns only when the task is done; oee_36 keeps running concurrently
oee_36.join()

print(results)
If you need more than 2 threads (like all 36...), I strongly suggest using concurrent.futures: you can limit the number of workers to a number comparable to the logical CPUs you have. And, of course, manage your tasks and calls in a list or dictionary, instead of having a separate variable name for each.
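For completeness, here is a minimal sketch of that concurrent.futures approach, assuming the original gerar_graph_31/gerar_graph_36 functions from the question (the versions that return oee rather than appending to a list):

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() schedules the call and immediately returns a Future
    future_31 = executor.submit(gerar_graph_31)
    future_36 = executor.submit(gerar_graph_36)
    # result() blocks until the function has finished, then hands back its return value
    oee_31 = future_31.result()
    oee_36 = future_36.result()

print(oee_31, oee_36)

This keeps the function-call feel: each Future carries the value returned by the target function, so nothing has to be appended to a shared list.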

Pytest, change counter that is passed to markers dynamically

I am reading my test data from a Python file as follows.
# testdata.py -- it's a list of tuples
TEST_DATA = [
    (
        {"test_scenario": "1"}, {"test_case_id": 1}
    ),
    (
        {"test_scenario": "2"}, {"test_case_id": 2}
    )
]
Now I use this test data as part of a pytest test file.
# test.py
import testdata

test_data = testdata.TEST_DATA
start = 0

class TestOne():
    @pytest.mark.parametrize("test_scenario,testcase_id", test_data)
    @testcaseid.marktc(test_data[start][1]["test_case_id"])
    def testfunction(self, test_scenario, testcase_id):
        global start
        start = start + 1
        # Doing test here.
Now when I print start, its value changes continuously. But when I try to retrieve the pytest results, I still keep getting start = 0, due to which my test case ID isn't being recorded properly.
Can I either:
pass the marker from within the function, or
change the count of start dynamically in this example?
P.S. This is the best way I am currently able to store my test data.
Here's how I have testcaseid.marktc defined:
# testrailthingy.py
import pytest

class testcaseid(object):
    @staticmethod
    def marktc(*ids):
        return pytest.mark.testrail(ids=ids)
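One way to avoid the mutable counter entirely (a sketch only, assuming the testdata and testrailthingy modules above; I haven't run it against a real TestRail setup) is to attach the marker to each parametrized case with pytest.param, so every case carries its own test case ID:

import pytest
import testdata
from testrailthingy import testcaseid

# Build one pytest.param per case; the TestRail marker is baked into
# the case itself, so no global counter is needed.
CASES = [
    pytest.param(scenario, case, marks=testcaseid.marktc(case["test_case_id"]))
    for scenario, case in testdata.TEST_DATA
]

class TestOne():
    @pytest.mark.parametrize("test_scenario,testcase_id", CASES)
    def testfunction(self, test_scenario, testcase_id):
        pass  # test body goes here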

Dask: how to parallelize and serialize methods?

I am trying to parallelize methods from a class using Dask on a PBS cluster.
My greatest challenge is that this method should parallelize some computations, then run further parallel computations on the result. Of course, this should be distributed on the cluster to run similar computations on other data...
The cluster is created:
cluster = PBSCluster(cores=4,
                     memory="10GB",
                     interface="ib0",
                     queue=queue,
                     processes=1,
                     nanny=False,
                     walltime="02:00:00",
                     shebang="#!/bin/bash",
                     env_extra=env_extra,
                     python=python_bin
                     )
cluster.scale(8)
client = Client(cluster)
The class I need to distribute has 2 separate steps which have to be run separately, since step1 writes a file that is then read at the beginning of the second step.
I have tried the following, putting both steps one after the other in a method:
def computations(params):
    my_class(**params).run_step1(run_path)
    my_class(**params).run_step2()

chain = []
for p in params_compute:
    y = dask.delayed(computations)(p)
    chain.append(y)

dask.compute(*chain)
But it does not work, because the second step immediately tries to read the file.
So I need to find a way to stop the execution after step1.
I have tried to force the execution of the first step by adding a compute():
def computations(params):
    my_class(**params).run_step1(run_path).compute()
    my_class(**params).run_step2()
But it may not be a good idea, because when running dask.compute(*chain) I'd ultimately be doing compute(compute())... which might explain why the second step is not executed?
What would the best approach be?
Should I include a persist() somewhere at the end of step1?
For reference, step1 and step2 are below:
def run_step1(self, path_step):
    preprocess_result = dask.delayed(self.run_preprocess)(path_step)
    gpu_result = dask.delayed(self.run_gpu)(preprocess_result)
    post_gpu = dask.delayed(self.run_postgpu)(gpu_result)  # writes a result file post_gpu.tif
    return post_gpu

def run_step2(self):
    data_file = rio.open(self.outputdir + "/post_gpu.tif").read()  # opens the file written at the end of step1
    temp_result1 = self.process(data_file)
    final_merge = dask.delayed(self.merging)(temp_result1)
    write = dask.delayed(self.write_final)(final_merge)
    return write
This is only a rough suggestion, as I don't have a reproducible example as a starting point, but the key idea is to pass a delayed object to run_step2 to link it explicitly to run_step1. Note I'm not sure how essential it is for you to use a class in this case; for me it's easier to pass the params explicitly as a dict.
def run_step1(params):
    # params is assumed to be a dict
    # unpack params here if needed (path_step was not explicitly in the
    # `for p in params_compute:` loop, so I assume it can be stored in params)
    preprocess_result = run_preprocess(path_step, params)
    gpu_result = run_gpu(preprocess_result, params)
    post_gpu = run_postgpu(gpu_result, params)  # writes a result file post_gpu.tif
    return post_gpu

def run_step2(post_gpu, params):
    # unpack params here if needed
    data_file = rio.open(outputdir + "/post_gpu.tif").read()  # opens the file written at the end of step1
    temp_result1 = process(data_file, params)
    final_merge = merging(temp_result1, params)
    write = write_final(final_merge, params)
    return write

chain = []
for p in params_compute:
    y = dask.delayed(run_step1)(p)
    # passing y into run_step2 makes dask schedule step2 only after step1 is done
    z = dask.delayed(run_step2)(y, p)
    chain.append(z)

dask.compute(*chain)
Sultan's answer almost works, but fails due to an internal issue in the library I was provided.
I have used the following workaround, which works for now (I'll use your solution later). I simply create 2 successive chains and compute them one after the other. Not really elegant, but it works fine...
chain1 = []
for p in params_compute:
    y = run_step1(p)
    chain1.append(y)
dask.compute(chain1)

chain2 = []
for p in params_compute:
    y = run_step2(p)
    chain2.append(y)
dask.compute(chain2)

PyTest skipping test based on target code version

What I am trying to do is skip tests that are not supported by the code I am testing. My PyTest suite runs tests against an embedded system that could have different versions of code running. I want to mark my tests such that they only run if they are supported by the target.
I have added a pytest_addoption method:
def pytest_addoption(parser):
    parser.addoption(
        '--target-version',
        action='store', default='28',
        help='Version of firmware running in target')
I created a fixture to decide whether the test should be run:
@pytest.fixture(autouse=True)
def version_check(request, min_version: int = 0, max_version: int = 10000000):
    version_option = int(request.config.getoption('--target-version'))
    if min_version and version_option < min_version:
        pytest.skip('Version number is lower than the versions required to run this test '
                    f'({min_version} vs {version_option})')
    if max_version and version_option > max_version:
        pytest.skip('Version number is higher than the versions required to run this test '
                    f'({max_version} vs {version_option})')
Marking the tests like this:
@pytest.mark.version_check(min_version=24)
def test_this_with_v24_or_greater():
    print('Test passed')

@pytest.mark.version_check(max_version=27)
def test_not_supported_after_v27():
    print('Test passed')

@pytest.mark.version_check(min_version=13, max_version=25)
def test_works_for_range_of_versions():
    print('Test passed')
In the arguments for running the tests I just want to add --target-version 22 and have only the right tests run. What I haven't been able to figure out is how to pass the arguments from @pytest.mark.version_check(max_version=27) to version_check.
Is there a way to do this, or am I completely off track and should be looking at something else to accomplish this?
You are not far from a solution, but you're mixing up markers with fixtures; they are not the same, even if you give them the same name. You can, however, read the markers of each test function in your version_check fixture and skip the test depending on what the version_check marker provides, if it is set. Example:
@pytest.fixture(autouse=True)
def version_check(request):
    version_option = int(request.config.getoption('--target-version'))
    # request.node is the current test item;
    # query the marker "version_check" of the current test item
    version_marker = request.node.get_closest_marker('version_check')
    # if the test item was not marked, there's no version restriction
    if version_marker is None:
        return
    # arguments of @pytest.mark.version_check(min_version=10) are in marker.kwargs;
    # arguments of @pytest.mark.version_check(0, 1, 2) would be in marker.args
    min_version = version_marker.kwargs.get('min_version', 0)
    max_version = version_marker.kwargs.get('max_version', 10000000)
    # the rest is your logic unchanged
    if version_option < min_version:
        pytest.skip('Version number is lower than the versions required to run this test '
                    f'({min_version} vs {version_option})')
    if version_option > max_version:
        pytest.skip('Version number is higher than the versions required to run this test '
                    f'({max_version} vs {version_option})')
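One extra detail worth noting (an assumption about your setup, since it isn't shown in the question): recent pytest versions warn about unknown marks, so you would register the custom version_check marker, for example in conftest.py:

# conftest.py
def pytest_configure(config):
    # register the custom marker so pytest does not warn about it
    config.addinivalue_line(
        'markers',
        'version_check(min_version, max_version): '
        'skip the test unless the target firmware version is in range')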

test getting skipped in pytest

I am trying to use parametrize, and I want to feed it test cases that I get from a different function, using pytest.
I have tried this:
test_input = []
rarp_input1 = ""
rarp_output1 = ""
count = 1

def test_first_rarp():
    global test_input
    config = ConfigParser.ConfigParser()
    config.read(sys.argv[2])
    global rarp_input1
    global rarp_output1
    rarp_input1 = config.get('rarp', 'rarp_input1')
    rarp_input1 = dpkt.ethernet.Ethernet(rarp_input1)
    rarp_input2 = config.get('rarp', 'rarp_input2')
    rarp_output1 = config.getint('rarp', 'rarp_output1')
    rarp_output2 = config.get('rarp', 'rarp_output2')
    dict_input = []
    dict_input.append(rarp_input1)
    dict_output = []
    dict_output.append(rarp_output1)
    global count
    test_input.append((dict_input[0], count, dict_output[0]))
    # assert test_input == [something something, someInt]

@pytest.mark.parametrize("test_input1,test_input2,expected1", test_input)
def test_mod_rarp(test_input1, test_input2, expected1):
    global test_input
    assert mod_rarp(test_input1, test_input2) == expected1
But the second test case is getting skipped. It says
test_mod_rarp1.py::test_mod_rarp[test_input10-test_input20-expected10]
Why is the test case getting skipped? I have checked that neither the function nor the input is wrong, because the following code works fine:
@pytest.mark.parametrize("test_input1,test_input2,expected1", [(something something, someInt, someInt)])
def test_mod_rarp(test_input1, test_input2, expected1):
    assert mod_rarp(test_input1, test_input2) == expected1
I have not put the actual inputs here; they are correct anyway. I also have a config file from which I take the inputs using ConfigParser, and test_mod_rarp1.py is the Python file where I am doing this. I basically want to know whether we can access variables (test_input in my example) from other functions to use in parametrize, if that is what causes the problem here. If we can't, how do I change the scope of the variable?
Parametrization happens at collection time, so if you try to parametrize with data that is only generated at run time, the test is collected with an empty parameter list and gets skipped.
The ideal way to achieve what you are trying to do is to use fixture parametrization.
The example below should clear things up, and you can then apply the same logic in your case:
import pytest

input = []

def generate_input():
    global input
    input = [10, 20, 30]

@pytest.mark.parametrize("a", input)
def test_1(a):
    assert a < 25

def generate_input2():
    return [10, 20, 30]

@pytest.fixture(params=generate_input2())
def a(request):
    return request.param

def test_2(a):
    assert a < 25
Output:
<SKIPPED:>pytest_suites/test_sample.py::test_1[a0]
********** test_2[10] **********
<EXECUTING:>pytest_suites/test_sample.py::test_2[10]
Collected Tests
TEST::pytest_suites/test_sample.py::test_1[a0]
TEST::pytest_suites/test_sample.py::test_2[10]
TEST::pytest_suites/test_sample.py::test_2[20]
TEST::pytest_suites/test_sample.py::test_2[30]
See how test_1 was skipped: parametrization happened before generate_input() was ever executed, while test_2 gets parametrized as required.
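If the data really must come from a config file at run time, another option (a sketch under assumptions: build_test_input is a hypothetical stand-in for the ConfigParser logic in test_first_rarp above) is the pytest_generate_tests hook, which runs at collection time for each test and can build the parameter list then:

# conftest.py
def build_test_input():
    # read rarp_input1 / rarp_output1 etc. from the config file here
    # and return a list of (test_input1, test_input2, expected1) tuples
    return [("rarp_input", 1, "rarp_output")]

def pytest_generate_tests(metafunc):
    # only parametrize tests that actually ask for these arguments
    if {"test_input1", "test_input2", "expected1"} <= set(metafunc.fixturenames):
        metafunc.parametrize("test_input1,test_input2,expected1", build_test_input())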
