I am trying to mock the infamous datetime.now() function so that my unit tests run against a frozen date. I usually do this with the unittest.mock package, patching the datetime object in the module under test.
The problem is that the datetime.now() I am trying to patch is coming from a default function parameter, as follows (in views.py):
def user_as_guest(user_id,
                  date=datetime.now()):
So this simply doesn't work (test.py):
with patch('myapp.views.datetime') as mock_date:
    frozen_date = datetime.now(pytz.utc)
    mock_date.now.return_value = frozen_date
    Events.user_as_guest(user_id_1)
Even though I can see while debugging that datetime.now() is correctly patched, I imagine the callable used to assign the default parameter has a different scope from the module, but I can't get any further. Thanks for your help!
Side Note
I realized the code I wrote was not doing exactly what I wanted. I thought default function parameters were late bound, that is, evaluated on every call. That is not actually the case: they are evaluated once, when the function is defined.
Indeed, my date object would contain a date in the past (the moment the function was defined), not the time of the function call. This also explains why I couldn't patch it with my approach.
I fixed everything by assigning the date in the function body, so the old-but-gold mocking approach works. But still, the question is valid...
Problem
I have a function make_pipeline that accepts an arbitrary number of functions, which it then calls to perform sequential data transformations on a pandas.DataFrame. Some, but not all, of the functions it calls need to operate on a sub-array of the DataFrame. I have written multiple selector functions. However, at present each member function of the chain has to be explicitly given the user-selected selector/filter function. This is VERY error-prone, and accessibility is very important because the end code is aimed at non-specialists (possibly with no Python/programming knowledge), so it must be "batteries-included". This entire project is written in a functional style (that's what's always worked for me).
Sample Code
filter_func = simple_filter()
# The API looks like this
make_pipeline(
    load_data("somepath", header = [1,0]),
    transform1(arg1,arg2),
    transform2(arg1,arg2, data_filter = filter_func),  # This function needs access to user-defined filter function
    transform3(arg1,arg2, data_filter = filter_func),  # This function needs access to user-defined filter function
    transform4(arg1,arg2),
)
Expected API
filter_func = simple_filter()
# The API looks like this
make_pipeline(
    load_data("somepath", header = [1,0]),
    transform1(arg1,arg2),
    transform2(arg1,arg2),
    transform3(arg1,arg2),
    transform4(arg1,arg2),
)
Attempted
I thought that if the data_filter alias is available in the caller's namespace, it also becomes available (something similar to a closure) to all functions it calls. This seems to happen with some toy examples but won't work in this case (UnboundError).
What's a good way to make a function defined in one place available to certain interested functions in the call chain? I'm trying to avoid global.
Notes/Clarification
I've had problems with OOP and mutable states in the past, and functional programming has worked quite well. Hence I've set a goal for myself to NOT use classes (to the extent that Python enables me to anyways). So no classes.
I should probably have clarified this initially: in the pipeline, the output of all functions is a DataFrame and the input of all functions (except load data, obviously) is a DataFrame. The functions are decorated with a wrapper that calls functools.partial because we want the user to supply the args to each function but not execute it. The actual execution is done by a for loop in make_pipeline.
Each function accepts df:pandas.DataFrame plus all arguments that are specific to that function. The statement seen above, transform1(arg1,arg2,...), actually calls the decorated transform1, which returns functools.partial(transform, arg1,arg2,...), which now has a signature like transform(df:pandas.DataFrame).
load_dataframe is just a convenience function to load the initial dataframe so that all other functions can begin operating on it. It just felt more intuitive to users to have it as part of the chain rather than as a separate call.
The problem is this: I need a way for a filter function to be initialized (called) in only one place, such that every function in the call chain that needs access to the filter function gets it without it being explicitly passed as an argument to said function. If you're wondering why this is the case, it's because I feel that end users will find it unintuitive and arbitrary. Some functions need it, some don't. I'm also pretty certain that they will make all kinds of errors, like passing different filters, forgetting it sometimes, etc.
(Update) I've also tried inspect.signature() in make_pipeline to check whether each function accepts a data_filter argument and pass it on. However, this raises an incorrect function signature error for some unclear reason (likely because of the decorators/partial calls). If signature could return the non-partial function signature, this would solve the issue, but I couldn't find much info in the docs.
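For what it's worth, inspect.signature() does understand functools.partial objects: it reports the parameters that are still open, and any keyword already bound becomes a keyword-only parameter with that value as its default. A small sketch, using a made-up transform2 that takes a plain list instead of a DataFrame:

```python
import functools
import inspect


def transform2(df, scale, data_filter=None):
    # Hypothetical stand-in for a pipeline step; `df` is a plain list here.
    keep = data_filter if data_filter is not None else (lambda value: True)
    return [value * scale if keep(value) else value for value in df]


# Bind the user's arguments now, run later.
step = functools.partial(transform2, scale=2)

open_params = inspect.signature(step).parameters
accepts_filter = "data_filter" in open_params
param_names = list(open_params)
```

Here the check is made against the .parameters mapping, not against the Signature object itself. A Signature is not a container, so testing membership on it directly fails.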
Turns out it was pretty easy. The solution is inspect.signature.
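import inspect
from typing import Any, Callable, Optional

def make_pipeline(*args, data_filter: Optional[Callable[..., Any]] = None):
    d = args[0]
    for arg in args[1:]:
        # Pass the filter only to steps whose signature accepts it.
        if "data_filter" in inspect.signature(arg).parameters:
            d = arg(d, data_filter=data_filter)
        else:
            d = arg(d)
    return d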
Leaving this here mostly for reference because I think this is a mini design pattern. I've also seen a function's __closure__ attribute mentioned on an unrelated subject. That may also work, but it will likely be more complicated.
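For reference, an end-to-end sketch of this pattern that runs without pandas. Everything here is an assumption for illustration: the deferred decorator, the add and scale_selected steps, and the use of plain lists in place of DataFrames. Arguments are bound by keyword so that the first positional slot stays free for the data.

```python
import functools
import inspect


def deferred(func):
    """Hypothetical decorator: store the user's keyword arguments, run later."""
    @functools.wraps(func)
    def factory(**kwargs):
        return functools.partial(func, **kwargs)
    return factory


@deferred
def add(data, amount):
    return [value + amount for value in data]


@deferred
def scale_selected(data, factor, data_filter=None):
    keep = data_filter if data_filter is not None else (lambda value: True)
    return [value * factor if keep(value) else value for value in data]


def make_pipeline(*steps, data_filter=None):
    data = steps[0]
    for step in steps[1:]:
        # Hand the filter only to steps that declare a data_filter parameter.
        if "data_filter" in inspect.signature(step).parameters:
            data = step(data, data_filter=data_filter)
        else:
            data = step(data)
    return data


result = make_pipeline(
    [1, 2, 3, 4],
    add(amount=1),
    scale_selected(factor=10),
    data_filter=lambda value: value % 2 == 0,
)
```

The user states the filter once, in the make_pipeline call, and only the steps that ask for it receive it.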
I have some code I'd like to quickly test. This code includes a line that needs to query a server and obtain a True/False answer:
result = server.queryServer(data)
Is there a way to override this function call so that it just calls a local function that always returns True (so that I can debug without running the server)?
Mock is your friend. It allows you to mock entire classes or individual functions on them.
What you want is called mocking, replacing existing objects with temporary objects that act differently just for a test.
Use the unittest.mock library to do this. It will create the temporary objects for you, and give you the tools to replace the object, and restore the old situation, for the duration of a test.
The module provides patchers to do the replacement. For example, you could use a context manager:
from unittest import mock

with mock.patch('server.queryServer') as mocked_queryServer:
    mocked_queryServer.return_value = True  # always return True when called
    # test your code
    # ...
    # afterwards, check if the mock has been called at least once.
    mocked_queryServer.assert_called()
I added an assertion at the end there; mock objects not only let you replace functions transparently, they then also record what happens to them, letting you check if your code worked correctly by calling the mock.
I have been writing unit tests for over a year now, and have always used patch.object for pretty much everything (modules, classes, etc).
My coworker says that patch.object should never be used to patch an object in a module (i.e. patch.object(socket, 'socket')); instead you should always use patch('socket.socket').
I much prefer the patch.object method, as it allows me to import modules and is more Pythonic in my opinion. Is my coworker right?
Note: I have looked through the patch documentation and can't find any warnings on this subject. Isn't everything an object in Python?
There is no such requirement, and yes, everything is an object in Python.
It is nothing more than a style choice; do you import the module yourself or does patch take care of this for you? Because that's the only difference between the two approaches; either patch() imports the module, or you do.
For the code-under-test, I prefer mock.patch() to take care of this, as this ensures that the import takes place as the test runs. This ensures I get a test error state (test failure), rather than problems while loading the test. All other modules are fair game.
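A quick sketch of the two spellings side by side. os.getcwd is used only because it is a convenient standard-library target, and the fake paths are arbitrary.

```python
import os
from unittest import mock

# String target: patch() imports `os` itself and looks up `getcwd` on it.
with mock.patch("os.getcwd", return_value="/fake/string-target"):
    via_string = os.getcwd()

# Object target: we imported `os` ourselves and hand it to patch.object().
with mock.patch.object(os, "getcwd", return_value="/fake/object-target"):
    via_object = os.getcwd()

restored = os.getcwd()  # the real function is back after both blocks
```

Both forms replace the same attribute on the same module object and restore it on exit.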
Looking at the mock source code, it really doesn't look like there is a difference.
To investigate, I first looked at def patch and saw that it does:
getter, attribute = _get_target(target)
return _patch(
    getter, attribute, new, spec, create,
    spec_set, autospec, new_callable, kwargs)
whereas patch.object does the same, except that getter = lambda: target
OK, so what does this _get_target do? It pretty much splits the string and calls _importer on the first part (making an object), then uses the remaining attribute name the same way patch.object does.
_importer is a pretty simple mechanism to import from a module (using getattr for every "component"), and pretty clearly just makes an object as well.
So fundamentally, at the source level, there is not really any difference.
Case Closed
I am writing an application which makes use of the current time (via datetime.datetime.now())
Is there a way, for debugging purposes, to override this call so that it returns a specific timestamp?
As a fallback, I could write a function that would be called instead of datetime.datetime.now() and return whatever is needed (the actual current time in production and the required test time when debugging), but is there a more Pythonic way to do this kind of thing?
Broadly, your options are:
Use the unittest.mock library, which can replace the function on the fly with a dummy function that always gives the same results (or use another mocking library that does the same thing). This means you don't have to modify your function; however, reasonable people can disagree on whether monkey patching with mock is good practice, even for debugging. I think this is the most widely used solution to this problem in Python.
Modify your function to do something different depending on its environment (the actual environment variables on your system, or global state, or something else). This is the easiest way, but also the crudest and most fragile way, so you'll have to be sure to change it back after your debugging is finished.
Modify your function to accept a function itself as a parameter, and pass in datetime.datetime.now as that function in normal operation, but pass in something different (for instance a stub) for testing.
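The last option can be sketched as follows. The build_greeting function and its now parameter are made up for illustration. The default is the function datetime.now itself (not its result), so it is called fresh on every invocation.

```python
from datetime import datetime
from typing import Callable


def build_greeting(now: Callable[[], datetime] = datetime.now) -> str:
    # `now` is a clock function; production code uses the real one by default.
    hour = now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


# In a test or debugging session, pass a stub clock instead.
morning = build_greeting(now=lambda: datetime(2024, 1, 1, 9, 0))
afternoon = build_greeting(now=lambda: datetime(2024, 1, 1, 15, 0))
```

No patching is involved here. The cost is an extra parameter on every function that needs the time.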
You can use the mock library to mock the datetime.datetime.now() usage:
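from unittest import mock

def my_test():
    # datetime.datetime is a C type, so its `now` attribute cannot be replaced
    # directly (mock raises TypeError). Patch the name `datetime` as seen by
    # `mymodule` instead; this assumes `mymodule` does `import datetime`.
    with mock.patch('mymodule.datetime') as mock_datetime:
        mock_datetime.datetime.now.return_value = your_desired_timestamp
        # Here, all calls to `datetime.datetime.now()` made through the
        # `datetime` name in `mymodule` return `your_desired_timestamp`.
        ...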
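A runnable sketch of freezing the clock for a single module. The in-memory mymodule and its timestamp_label function are made up for illustration. The patch replaces the name datetime inside that module, not the now attribute on the class, because the C-implemented datetime.datetime type does not allow its attributes to be replaced.

```python
import datetime
import sys
import types
from unittest import mock

# Hypothetical module under test, built in memory so the example is self-contained.
mymodule = types.ModuleType("mymodule")
exec(
    "import datetime\n"
    "def timestamp_label():\n"
    "    return datetime.datetime.now().strftime('%Y-%m-%d')\n",
    mymodule.__dict__,
)
sys.modules["mymodule"] = mymodule

fixed = datetime.datetime(2021, 3, 4, 5, 6, 7)

with mock.patch("mymodule.datetime") as mock_datetime:
    # Only `mymodule` sees the fake; the real datetime module is untouched.
    mock_datetime.datetime.now.return_value = fixed
    label = mymodule.timestamp_label()
```

Because the whole datetime name is mocked inside mymodule, any other use of that module there (for example datetime.timedelta) also becomes a mock for the duration of the block.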
I have two functions like the following:
def fitnesscompare(x, y):
    if x.fitness>y.fitness:
        return 1
    elif x.fitness==y.fitness:
        return 0
    else:  # x.fitness<y.fitness
        return -1
that are used with 'sort' to sort on different attributes of class instances.
These are used from within other functions and methods in the program.
Can I make them visible everywhere rather than having to pass them to each object in which they are used?
Thanks
The best approach (to get the visibility you ask about) is to put this def statement in a module (say fit.py), import fit from any other module that needs access to items defined in this one, and use fit.fitnesscompare in any of those modules as needed.
What you ask, and what you really need, may actually be different...:
as I explained in another post earlier today, custom comparison functions are not the best way to customize sorting in Python (which is why in Python 3 they're not even allowed any more): rather, a custom key-extraction function will serve you much better (future-proof, more general, faster). I.e., instead of calling, say
somelist.sort(cmp=fit.fitnesscompare)
call
somelist.sort(key=fit.fitnessextract)
where
def fitnessextract(x):
    return x.fitness
or, for really blazing speed,
import operator
somelist.sort(key=operator.attrgetter('fitness'))
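A small runnable sketch of the key-based approach, with a made-up Individual class standing in for the asker's instances. If an old-style comparison function has to be kept, functools.cmp_to_key adapts it for use as a key in Python 3.

```python
import functools
import operator
from dataclasses import dataclass


@dataclass
class Individual:
    # Hypothetical class; only the `fitness` attribute matters here.
    name: str
    fitness: float


def fitnesscompare(x, y):
    if x.fitness > y.fitness:
        return 1
    elif x.fitness == y.fitness:
        return 0
    return -1


population = [Individual("a", 0.7), Individual("b", 0.2), Individual("c", 0.9)]

# Preferred: extract the sort key directly.
by_key = sorted(population, key=operator.attrgetter("fitness"))

# Legacy comparator, adapted for Python 3.
by_cmp = sorted(population, key=functools.cmp_to_key(fitnesscompare))
```

Both calls give the same ordering. The attrgetter version avoids calling a Python-level comparison function for every pair of items.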
Defining a function with def makes that function available within whatever scope you've defined it in. At module level, using def will make that function available to any other function inside that module.
Can you perhaps post an example of what is not working for you? The code you've posted appears to be unrelated to your actual problem.