For unit testing, I want to mock a variable inside a function, such as:
def function_to_test(self):
    foo = get_complex_data_structure()  # Do not test this
    do_work(foo)                        # Test this
In my unit test, I don't want to depend on what get_complex_data_structure() would return, and therefore want to set the value of foo manually.
How do I accomplish this? Is this the place for patch.object?
Just use patch() to mock out get_complex_data_structure():
@patch('module_under_test.get_complex_data_structure')
def test_function_to_test(self, mocked_function):
    foo_mock = mocked_function.return_value
When the function under test then calls get_complex_data_structure(), a mock object is returned and stored in the local name foo. That is the very same object that mocked_function.return_value references in the test above, so you can use that value to check, for example, that do_work() was passed the right object.
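Putting it all together, a minimal sketch of such a test might look like this (module_under_test is a hypothetical module name, and function_to_test is treated as a module-level function here for simplicity):

from unittest import TestCase
from unittest.mock import patch

class FunctionToTestCase(TestCase):
    # Decorators apply bottom-up, so the mock for
    # get_complex_data_structure is the first mock argument.
    @patch('module_under_test.do_work')
    @patch('module_under_test.get_complex_data_structure')
    def test_function_to_test(self, mocked_function, mocked_do_work):
        from module_under_test import function_to_test

        function_to_test()

        # do_work() must have received exactly the object that the
        # mocked get_complex_data_structure() returned.
        mocked_do_work.assert_called_once_with(mocked_function.return_value)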
Assuming that get_complex_data_structure is a function¹, you can just patch it using any of the various mock.patch utilities:
with mock.patch.object(the_module, 'get_complex_data_structure', return_value=something):
    val = function_to_test()
    ...
They can be used as decorators, as context managers, or explicitly started and stopped using the start and stop methods.²
¹ If it's not a function, you can always factor that code out into a small utility function that returns the complex data structure.
² There are a million ways to use mocks; it pays to read the docs to figure out all the ways you can set the return value, etc.
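For example, the start/stop style mentioned above might look roughly like this, reusing the the_module and something placeholders from this answer:

patcher = mock.patch.object(the_module, 'get_complex_data_structure',
                            return_value=something)
mocked_function = patcher.start()
try:
    val = function_to_test()
finally:
    patcher.stop()  # always undo the patch, even if the test fails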
I have a simple module (no classes, just utility functions) where a function foo() calls a number of functions from the same module, like this:
def get_config(args):
    ...
    return config_dictionary

def get_objects(args):
    ...
    return list_of_objects

def foo(no_run=False):
    config = get_config(...)
    if no_run:
        return XYZ
    objs = get_objects(config)
    for obj in objs:
        obj.work()
    ...  # a number of other functions from the same module are called
Is it possible to use Python Mockito to verify that get_config() was the last function called from my module in foo(), for certain arguments?
Currently this is verified in this way:
spy2(mymodule.get_config)
spy2(mymodule.get_objects)
assert foo(no_run=True) == XYZ
verify(mymodule).get_config(...)
# Assumes that get_objects() is the first function to be called
# in foo() after the configuration is retrieved.
verify(mymodule, times=0).get_objects(...)
Perhaps something like generating the spy() and verify() calls dynamically? Or rewriting the module into a class and stubbing the whole class?
Basically, I do not like the assumption the test makes: the code in foo() could be reordered and the test would still pass.
That's not your real code, so it probably does not describe the real problem you have here. If, for example, you don't expect a function to be called at all, like get_objects in your case, then why begin with spy2 in the first place? expect(<module>, times=0).<fn>(...) reads better in that case, and a subsequent verify is not needed.
There are also verifyNoMoreInteractions(<module>) and inorder.verify testing. But all of this is guessing, since you don't say how XYZ is computed. (Basically: why spy2(get_config) and not a when call here? That is, why call the original implementation instead of mocking the answer?)
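A rough sketch along those lines, assuming the mymodule and XYZ names from the question (the exact mix of verification helpers may need adjusting for your mockito version):

from mockito import expect, verifyNoMoreInteractions, unstub
import mymodule

def test_no_run_skips_object_retrieval():
    expect(mymodule).get_config(...)            # must be called
    expect(mymodule, times=0).get_objects(...)  # must never be called

    assert mymodule.foo(no_run=True) == mymodule.XYZ

    verifyNoMoreInteractions(mymodule)
    unstub()  # restore the module's original functions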
I have a couple of fixtures that do some initialization that is rather expensive. Some of those fixtures can take parameters, altering their behaviour slightly.
Because these are so expensive, I wanted to initialise them once per test class. However, pytest does not destroy and reinitialise the fixtures for the next permutation of parameters.
See this example: https://gist.github.com/vhdirk/3d7bd632c8433eaaa481555a149168c2
I would expect that StuffStub would be a different instance when DBStub is recreated for parameters 'foo' and 'bar'.
Did I misunderstand something? Is this a bug?
I've recently encountered the same problem and wanted to share another solution. In my case the graph of fixtures that required regenerating for each parameter set was very deep and it's not so easy to control. An alternative is to bypass the pytest parametrization system and programmatically generate the test classes like so:
import pytest
import random

def make_test_class(name):
    class TestFoo:
        @pytest.fixture(scope="class")
        def random_int(self):
            return random.randint(1, 100)

        def test_something(self, random_int):
            assert random_int and name == "foo"

    return TestFoo

TestFooGood = make_test_class("foo")
TestFooBad = make_test_class("bar")
TestFooBad2 = make_test_class("wibble")
You can see from this that three tests are run: one passes (where "foo" == "foo") and the other two fail, but the class-scoped fixtures have been recreated for each generated class.
This is not a bug. There is no relation between the fixtures, so one of them is not going to be called again just because the other one was, due to having multiple params.
In your case, db is called twice because db_factory, which it uses, has 2 params. The stuff fixture, on the other hand, is called only once because stuff_factory has only one item in params.
You should get what you expect if stuff included db_factory as well, without actually using its output (db_factory would not be called more than twice):
@pytest.fixture(scope="class")
def stuff(stuff_factory, db_factory):
    return stuff_factory()
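As a self-contained illustration, with simplified stand-ins for the fixtures in the gist, the dependency trick behaves like this:

import pytest

@pytest.fixture(scope="class", params=["foo", "bar"])
def db_factory(request):
    return lambda: {"db": request.param}

@pytest.fixture(scope="class")
def stuff_factory():
    return lambda: object()

@pytest.fixture(scope="class")
def stuff(stuff_factory, db_factory):  # db_factory forces re-creation
    return stuff_factory()

class TestStuff:
    seen = []

    def test_fresh_stuff_per_db_param(self, stuff):
        # This test runs once per db_factory param, and each run
        # receives a freshly built `stuff` object.
        assert all(stuff is not previous for previous in TestStuff.seen)
        TestStuff.seen.append(stuff)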
scuevals_api/resources/students.py:
def year_in_range(year):
    return datetime.now().year <= year <= datetime.now().year + 10

class StudentsResource(Resource):
    args = {
        'graduation_year': fields.Int(required=True, validate=year_in_range),
    }
    ...
I'm trying to mock year_in_range (to always return True); however, all my attempts have failed so far.
I'm using the decorator approach with mock.patch and have tried a ton of different targets, but the one I believe should be correct is:
@mock.patch('scuevals_api.resources.students.year_in_range', return_value=True)
The mock function never gets called, as in, it's not mocking correctly. I'm not getting any errors either.
My only remaining suspicion is that it has something to do with the function being passed into fields.Int as a parameter (hence the question title), but in my head, that shouldn't affect anything.
I'm clueless as to where this function should be mocked.
By the time mock has patched year_in_range, it is too late. mock.patch imports the module specified by the string you provide and patches the indicated name within that module so that it refers to a mock object; it does not alter the original function object itself. On import of scuevals_api.resources.students, the body of the StudentsResource class is executed and a reference to the original year_in_range is saved within the StudentsResource.args['graduation_year'] object. As a result, making the name year_in_range refer to a mock object has no impact.
In this particular case you have a few options:
Assuming you're trying to test some functionality, instead of trying to mock year_in_range you can seed the database (?) with data that exercises the condition
You can patch datetime.now, which will be called by year_in_range (see the sketch after this list)
You can patch the member of StudentsResource.args['graduation_year'] where the function passed to validate has been saved.
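Here is a sketch of the second option, assuming students.py does from datetime import datetime, so the name datetime can be patched inside that module:

from datetime import datetime
from unittest import mock

with mock.patch('scuevals_api.resources.students.datetime') as mock_dt:
    # Pin "now" so that any graduation year under test falls in range.
    mock_dt.now.return_value = datetime(2000, 1, 1)
    ...  # exercise the endpoint; year_in_range(2005) now returns True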
Thanks to the explanation by Chris Hunt, I came up with an alternative solution. It modifies the application code rather than the testing code, but if that is acceptable (and in this day and age it probably should be, since testable code is a high priority), it is a really simple fix:
It's not possible to mock year_in_range directly, since a reference to the original function is saved before the mocking is done. Therefore, "wrap" the function you want to mock in another function and pass the wrapper instead. Wrapping can be done in a nice and tidy way with a lambda:
def year_in_range(year):
    return datetime.now().year <= year <= datetime.now().year + 10

class StudentsResource(Resource):
    args = {
        'graduation_year': fields.Int(required=True, validate=lambda y: year_in_range(y)),
    }
    ...
Now, when I mock year_in_range as stated in the question, it works. The reason is that a reference is now saved to the lambda instead of to the original year_in_range, which won't be looked up until the lambda runs (during the test).
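With the wrapper in place, the patch target from the question takes effect. A minimal sketch (the test name and body are illustrative):

@mock.patch('scuevals_api.resources.students.year_in_range',
            return_value=True)
def test_any_graduation_year_accepted(self, mocked_validator):
    ...  # hit the endpoint; the lambda now calls the mock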
I'm using the mock library written by Michael Foord to help with testing a Django application.
I'd like to test that I'm setting up my query properly, but I don't think I need to actually hit the database, so I'm trying to mock out the query.
I can mock out the first part of the query just fine, but I am not getting the results I'd like when I chain additional things on.
The function:
@staticmethod
def get_policies(policy_holder, current_user):
    if current_user.agency:
        return Policy.objects.filter(policy_holder=policy_holder, version__agency=current_user.agency).distinct()
    else:
        return Policy.objects.filter(policy_holder=policy_holder)
and my test (the first assertion passes, the second one fails):
def should_get_policies_for_agent__user(self):
    with mock.patch.object(policy_models.Policy, "objects") as query_mock:
        user_mock = mock.Mock()
        user_mock.agency = "1234"
        policy_models.Policy.get_policies("policy_holder", user_mock)
        self.assertEqual(query_mock.method_calls, [("filter", (), {
            'policy_holder': "policy_holder",
            'version__agency': user_mock.agency,
        })])
        self.assertTrue(query_mock.distinct.called)
I'm pretty sure the issue is that the initial query_mock returns a new mock when .filter() is called, but I don't know how to capture that new mock and make sure .distinct() was called on it.
Is there a better way to be testing what I am trying to get at? I'm trying to make sure that the proper query is being called.
Each mock object holds onto the mock object it returned when it was called. You can get hold of it using the mock's return_value attribute.
For your example,
self.assertTrue(query_mock.distinct.called)
distinct wasn't called on your mock; it was called on the return value of your mock's filter method, so you can assert that distinct was called like this:
self.assertTrue(query_mock.filter.return_value.distinct.called)
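Putting it together, the test from the question could be written like this (a sketch reusing the question's names, with assert_called_once_with replacing the method_calls comparison):

def should_get_policies_for_agent__user(self):
    with mock.patch.object(policy_models.Policy, "objects") as query_mock:
        user_mock = mock.Mock()
        user_mock.agency = "1234"

        policy_models.Policy.get_policies("policy_holder", user_mock)

        # filter() received the expected keyword arguments...
        query_mock.filter.assert_called_once_with(
            policy_holder="policy_holder",
            version__agency=user_mock.agency,
        )
        # ...and distinct() was called on whatever filter() returned.
        self.assertTrue(query_mock.filter.return_value.distinct.called)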
I'm writing a program that uses genetic techniques to evolve equations.
I want to be able to submit the function 'mainfunc' to the Parallel Python 'submit' function.
The function 'mainfunc' calls two or three methods defined in the Utility class.
They instantiate other classes and call various methods.
I think what I want is all of it in one NAMESPACE.
So I've instantiated some (maybe it should be all) of the classes inside the function 'mainfunc'.
I call the Utility method 'generate()'. If we were to follow its chain of execution, it would involve all of the classes and methods in the code.
Now, the equations are stored in a tree. Each time a tree is generated, mutated or crossbred, the nodes need to be given a new key so they can be accessed from a dictionary attribute of the tree. The class 'KeySeq' generates these keys.
In Parallel Python, I'm going to send multiple instances of 'mainfunc' to the 'submit' function of PP. Each has to be able to access 'KeySeq'. It would be nice if they all accessed the same instance of KeySeq so that none of the nodes on the returned trees had the same key, but I could get around that if necessary.
So: my question is about stuffing EVERYTHING into mainfunc.
Thanks
(Edit) If I don't include everything in mainfunc, I have to try to tell PP about dependent functions, etc. by passing various arguments in various places. I'm trying to avoid that.
(Late edit) If ks.next() is called inside the generate() function, it raises 'NameError: global name 'ks' is not defined'.
class KeySeq:
    "Iterator to produce sequential integers for keys in dict"
    def __init__(self, data=0):
        self.data = data
    def __iter__(self):
        return self
    def next(self):
        self.data = self.data + 1
        return self.data

class One:
    'some code'

class Two:
    'some code'

class Three:
    'some code'

class Utilities:
    def generate(self, x):
        '___________'
    def obfiscate(self, y):
        '___________'
    def ruminate(self, z):
        '__________'

def mainfunc(z):
    ks = KeySeq()
    one = One()
    two = Two()
    three = Three()
    utilities = Utilities()
    list_of_interest = utilities.generate(5)
    return list_of_interest

result = mainfunc(params)
It's fine to structure your program that way. A lot of command line utilities follow the same pattern:
# imports, utilities, other functions

def main(arg):
    ...

if __name__ == '__main__':
    import sys
    main(sys.argv[1])
That way you can call the main function from another module by importing it, or you can run it from the command line.
If you want all of the instances of mainfunc to use the same KeySeq object, you can use the default parameter value trick:
def mainfunc(ks=KeySeq()):
    key = ks.next()
As long as you don't actually pass in a value of ks, all calls to mainfunc will use the instance of KeySeq that was created when the function was defined.
Here's why, in case you don't know: a function is an object, and it has attributes. One of them is named func_defaults (renamed __defaults__ in Python 3); it's a tuple containing the default values of all of the arguments in its signature that have defaults. When you call a function and don't provide a value for an argument that has a default, the function retrieves the value from func_defaults. So when you call mainfunc without providing a value for ks, it gets the KeySeq() instance out of the func_defaults tuple, which, for that instance of mainfunc, is always the same KeySeq instance.
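A quick demonstration of that sharing, reusing KeySeq from the question:

def mainfunc(ks=KeySeq()):
    return ks.next()

assert mainfunc() == 1
assert mainfunc() == 2  # the same KeySeq instance, fetched from func_defaults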
Now, you say that you're going to send "multiple instances of mainfunc to the submit function of PP." Do you really mean multiple instances? If so, the mechanism I'm describing won't work.
But it's tricky to create multiple instances of a function (and the code you've posted doesn't do that). For example, this function does return a new instance of g every time it's called:
>>> def f():
...     def g(x=[]):
...         return x
...     return g
...
>>> g1 = f()
>>> g2 = f()
>>> g1().append('a')
>>> g2().append('b')
>>> g1()
['a']
>>> g2()
['b']
If I call g() with no argument, it returns the default value (initially an empty list) from its func_defaults tuple. Since g1 and g2 are different instances of the g function, their default value for the x argument is also a different instance, which the above demonstrates.
If you'd like to make this more explicit than using a tricky side-effect of default values, here's another way to do it:
def mainfunc():
    if not hasattr(mainfunc, "ks"):
        setattr(mainfunc, "ks", KeySeq())
    key = mainfunc.ks.next()
Finally, a super important point that the code you've posted overlooks: If you're going to be doing parallel processing on shared data, the code that touches that data needs to implement locking. Look at the callback.py example in the Parallel Python documentation and see how locking is used in the Sum class, and why.
Your concept of classes in Python is not sound, I think. Perhaps it would be a good idea to review the basics. This link will help:
Python Basics - Classes