Mocking a function call within a function in Python

This is my first time building out unit tests, and I'm not quite sure how to proceed here. Here's the function I'd like to test; it's a method in a class that accepts one argument, url, and returns one string, task_id:
def url_request(self, url):
    conn = self.endpoint_request()
    authorization = conn.authorization
    response = requests.get(url, authorization)
    return response["task_id"]
The method starts out by calling another method within the same class to obtain a token to connect to an API endpoint. Should I be mocking the output of that call (self.endpoint_request())?
If I do have to mock it, and my test function looks like this, how do I pass a fake token/auth endpoint_request response?
@patch("common.DataGetter.endpoint_request")
def test_url_request(mock_endpoint_request):
    mock_endpoint_request.return_value = {"Auth": "123456"}
    # How do I pass the fake token/auth to this?
    task_id = DataGetter.url_request(url)

The code you have shown is strongly dominated by interactions, which means that there will most likely be no bugs to find with unit-testing. The potential bugs are on the interaction level: You access conn.authorization - but is this the proper member? And does it already have the proper representation for the way you use it further on? Is requests.get the right method for the job? Is the argument order as you expect it? Is the return value as you expect it? Is task_id spelled correctly?
These are (some of) the potential bugs in your code. But with unit-testing you will not be able to find them: When you replace the depended-on components with mocks (which you create or configure), your unit-tests will just succeed. Let's assume that you have a misconception about the return value of requests.get, namely that task_id is spelled wrongly and should rather be spelled taskId. If you mock requests.get, you would implement the mock based on your own misconception. That is, your mock would return a map with the (misspelled) key task_id. Then the unit-test would succeed despite the bug.
You will only find that bug with integration testing, where you bring your component and depended-on components together. Only then you can test the assumptions made in your component against the reality of the other components.
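That caveat aside, the mechanics the question asks about are straightforward: the fake token rides on the object that the mocked endpoint_request() returns. Here is a minimal, self-contained sketch; DataGetter is re-declared as a stand-in for the real class, and the URL and token values are illustrative:

```python
from unittest import mock

try:
    import requests
except ImportError:  # the real library is not needed once get is mocked
    requests = mock.Mock()


class DataGetter:
    """Minimal stand-in for the class in the question."""

    def endpoint_request(self):
        raise RuntimeError("should be mocked in tests")

    def url_request(self, url):
        conn = self.endpoint_request()
        authorization = conn.authorization
        response = requests.get(url, authorization)
        return response["task_id"]


@mock.patch.object(requests, "get")
@mock.patch.object(DataGetter, "endpoint_request")
def test_url_request(mock_endpoint_request, mock_get):
    # The fake token is attached to the object endpoint_request() returns;
    # url_request() picks it up via conn.authorization.
    mock_endpoint_request.return_value = mock.Mock(authorization={"Auth": "123456"})
    mock_get.return_value = {"task_id": "42"}

    task_id = DataGetter().url_request("http://example.com/api")

    assert task_id == "42"
    mock_get.assert_called_once_with("http://example.com/api", {"Auth": "123456"})


test_url_request()
```

Note the decorator order: patch decorators are applied bottom-up, so the bottom patch supplies the first mock argument.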

Related

Transparently passing through a function with a variable argument list

I am using Python RPyC to communicate between two machines. Since the link may be prone to errors I would like to have a generic wrapper function which takes a remote function name plus that function's parameters as its input, does some status checking, calls the function with the parameters, does a little more status checking and then returns the result of the function call. The wrapper should have no knowledge of the function, its parameters/parameter types or the number of them, or the return value for that matter, the user has to get that right; it should just pass them transparently through.
I get the getattr(conn.root, function)() pattern to call the function but my Python expertise runs out at populating the parameters. I have read various posts on the use of *args and **kwargs, in particular this one, which suggests that it is either difficult or impossible to do what I want to do. Is that correct and, if so, might there be a scheme which would work if I, say, ensured that all the function parameters were keyword parameters?
I do own both ends of this interface (the caller and the called) so I could arrange to dictionary-ise all the function parameters but I'd rather not make my API too peculiar if I could possibly avoid it.
Edit: the thing being called, at the remote end of the link, is a class with very ordinary methods, e.g.:
def exposed_a(self)
def exposed_b(self, thing1)
def exposed_c(self, thing1=None)
def exposed_d(self, thing1=DEFAULT_VALUE1, thing2=None)
def exposed_e(self, thing1, thing2, thing3=DEFAULT_VALUE1, thing4=None)
def exposed_f(self, thing1=None, thing2=None)
...where the types of each argument (and the return values) could be string, dict, number or list.
And it is indeed trivial; my Google-fu had simply failed me in finding the answer. In the hope of helping anyone else who is inexperienced in Python and is having a Google bad day:
One simply takes *args and **kwargs as parameters and passes them directly on, with the asterisks attached. So in my case, to do my RPyC pass-through, where conn is the RPyC connection:
def my_passthru(conn, function_name, *function_args, **function_kwargs):
    # Do a check of something or other here
    return_value = getattr(conn.root, function_name)(*function_args, **function_kwargs)
    # Do another check here
    return return_value
Then, for example, a call to my exposed_e() method above might be:
return_value = my_passthru(conn, "e", thing1, thing2, thing3)
(the exposed_ prefix being added automagically by RPyC in this case).
And of course one could put a try: / except ConnectionRefusedError: around the getattr() call in my_passthru() to generically catch the case where the connection has dropped underneath RPyC, which was my main purpose.
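The pattern can be exercised without a live RPyC link by standing in for conn and conn.root. Everything below (FakeRoot, FakeConn, the error message) is illustrative rather than RPyC API, and since a plain fake cannot reproduce RPyC's automatic exposed_ renaming, the full method name is passed here:

```python
def my_passthru(conn, function_name, *function_args, **function_kwargs):
    try:
        return getattr(conn.root, function_name)(*function_args, **function_kwargs)
    except ConnectionRefusedError:
        # The link dropped underneath RPyC; report it generically.
        raise RuntimeError("connection lost while calling %s" % function_name)


class FakeRoot:
    """Stand-in for conn.root with one very ordinary exposed method."""

    def exposed_e(self, thing1, thing2, thing3=0, thing4=None):
        return [thing1, thing2, thing3, thing4]


class FakeConn:
    root = FakeRoot()


# Over a real RPyC connection the exposed_ prefix is added automatically,
# so the call would pass "e"; the fake needs the full attribute name.
result = my_passthru(FakeConn(), "exposed_e", "a", "b", thing3=3)
print(result)  # ['a', 'b', 3, None]
```

The asterisked parameters pass positional and keyword arguments through untouched, so the wrapper stays ignorant of each method's signature, exactly as required.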

Is it better to assign a name to a throwaway object in Python, or not?

This is not a "why doesn't my code run" question. It is a "how / why does my code work" question. I am looking to generalize from this specific case to learn what broad rules apply to similar situations in the future.
I have done some searching (Google and StackOverflow) for this, but haven't seen anything that answers this question directly. Of course, I'm not entirely sure how best to ask this question, and may be using the wrong terms. I welcome suggested edits for the question title and labels.
I have the following function (which makes use of the requests module):
def make_session(username, password, login_url):
    # The purpose of this function is to create a requests.Session object,
    # update the state of the object to have all of the cookies and other
    # session data necessary to act as a logged in user at a website, and
    # return the session to the calling function.
    new_session = requests.Session()
    login_page = new_session.get(login_url)
    # The function get_login_submit_page_URL takes the previously
    # created login_page, extracts the target of the login form
    # submit, and returns it as a unicode string.
    submit_page_URL = get_login_submit_page_URL(login_page)
    payload = {u'session_name': username, u'session_password': password}
    new_session.post(submit_page_URL, data=payload, allow_redirects=True)
    return new_session
And what I really want to know is whether or not how I do this line matters:
new_session.post(submit_page_URL,data=payload,allow_redirects=True)
According to the requests documentation, the Session.post method returns a Response object.
However, this method also has side-effects which update the Session object. It is those side effects that I care about. I have no use for the Response object this method creates.
I have tested this code in practice, both assigning the Response to a label, and leaving it as presented above. Both options appear to work equally well for my purposes.
The actual question I am asking is: since, reasonably, whether I assign a label or not, the Requests object created by my call to Session.post falls out of scope as soon as the Session is returned to the calling function, does it matter whether I assign a label or not?
Rather, do I save any memory/processing time by not making the assignment? Do I create potential unforeseen problems for myself by not doing so?
If you are not using the return value of a call, there is little point in assigning it to a local name.
Without the assignment, the returned response object is not referenced anywhere and is freed two bytecode instructions earlier than if you had bound it to a name and then ignored that name until returning from the function.
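The bytecode difference is easy to see with the dis module: the unused assignment compiles to a STORE_FAST, while the bare call compiles to a POP_TOP that discards the result immediately:

```python
import dis


def with_name():
    response = object()   # result bound to a name, then ignored


def without_name():
    object()              # result discarded immediately


ops_with = [i.opname for i in dis.get_instructions(with_name)]
ops_without = [i.opname for i in dis.get_instructions(without_name)]

print("STORE_FAST" in ops_with)     # True: the object stays referenced
print("POP_TOP" in ops_without)     # True: the object is freed at once
```

In the first function the local name keeps the object alive until the function returns; in the second it is released as soon as the call completes. Either way the practical difference is negligible here.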

Creating an asynchronous method with Google App Engine's NDB

I want to make sure I understand how to create tasklets and asynchronous methods. What I have is a method that returns a list. I want it to be called from somewhere, and immediately allow other calls to be made. So I have this:
future_1 = get_updates_for_user(userKey, aDate)
future_2 = get_updates_for_user(anotherUserKey, aDate)
somelist.extend(future_1)
somelist.extend(future_2)
....
@ndb.tasklet
def get_updates_for_user(userKey, lastSyncDate):
    noteQuery = ndb.GqlQuery('SELECT * FROM Comments WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastSyncDate)
    note_list = list()
    qit = noteQuery.iter()
    while (yield qit.has_next_async()):
        note = qit.next()
        noteDic = note.to_dict()
        note_list.append(noteDic)
    raise ndb.Return(note_list)
Is this code doing what I'd expect it to do? Namely, will the two calls run asynchronously? Am I using futures correctly?
Edit: Well after testing, the code does produce the desired results. I'm a newbie to Python - what are some ways to test to see if the methods are running async?
It's pretty hard to verify for yourself that the methods are running concurrently -- you'd have to put copious logging in. Also in the dev appserver it'll be even harder as it doesn't really run RPCs in parallel.
Your code looks okay, it uses yield in the right place.
My only recommendation is to name your function get_updates_for_user_async() -- that matches the convention NDB itself uses and is a hint to the reader of your code that the function returns a Future and should be yielded to get the actual result.
An alternative way to do this is to use the map_async() method on the Query object; it would let you write a callback that just contains the to_dict() call:
@ndb.tasklet
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    note_list = yield noteQuery.map_async(lambda note: note.to_dict())
    raise ndb.Return(note_list)
Advanced tip: you can simplify this even more by dropping the @ndb.tasklet decorator and just returning the Future returned by map_async():
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    return noteQuery.map_async(lambda note: note.to_dict())
This is a general slight optimization for async functions that contain only one yield and immediately return the value yielded. (If you don't immediately get this you're in good company, and it runs the risk of being broken by a future maintainer who doesn't either. :-)

Unit Testing a Function That Calls a url with Mock

Question
I have a function that calls a url and then modifies that url's response. How can I write a unit test for the portion of that function's code that modifies the response, without having to rely on that url?
Example
my_module.py
import requests

def get_some_resource():
    url = 'http://httpbin.org/get'
    r = requests.get(url)
    # Special manipulation of returned text (what I want to test); simple example used
    output = r.text.upper()
    return output
What I've tried so far
Using mock's MagicMock() (you can't use it to override a function's variables, as far as I can tell)
I've considered breaking apart the two sections of that function (the retrieval of the url and the modify of the response), however, I'm not clear if that's necessary
So much googling my hands hurt
If this is a unit test, you would probably just want to mock out the requests library. You can either patch the whole thing or just get; it doesn't really matter.
It'd look like:
get = mock.Mock()
text = get.return_value.text = "hey I got this"
with mock.patch("my_module.requests.get", get):
    resource = get_some_resource()
self.assertEqual(resource, text.upper())
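Pulled together into a self-contained, runnable sketch (the module is inlined here rather than imported, so the patch is applied to the requests attribute directly instead of via the "my_module.requests.get" string):

```python
from unittest import mock

try:
    import requests
except ImportError:  # the real library isn't needed once get is mocked
    requests = mock.Mock()


def get_some_resource():
    r = requests.get('http://httpbin.org/get')
    return r.text.upper()


fake_get = mock.Mock()
fake_get.return_value.text = "hey I got this"

with mock.patch.object(requests, "get", fake_get):
    resource = get_some_resource()

print(resource)  # HEY I GOT THIS
fake_get.assert_called_once_with('http://httpbin.org/get')
```

No network call is made: the manipulation of the response text is exercised in isolation, which is exactly the part the question wanted to test.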
Cheers.

How to mock chained function calls in python?

I'm using the mock library written by Michael Foord to help with my testing on a django application.
I'd like to test that I'm setting up my query properly, but I don't think I need to actually hit the database, so I'm trying to mock out the query.
I can mock out the first part of the query just fine, but I am not getting the results I'd like when I chain additional things on.
The function:
@staticmethod
def get_policies(policy_holder, current_user):
    if current_user.agency:
        return Policy.objects.filter(policy_holder=policy_holder, version__agency=current_user.agency).distinct()
    else:
        return Policy.objects.filter(policy_holder=policy_holder)
and my test: The first assertion passes, the second one fails.
def should_get_policies_for_agent__user(self):
    with mock.patch.object(policy_models.Policy, "objects") as query_mock:
        user_mock = mock.Mock()
        user_mock.agency = "1234"
        policy_models.Policy.get_policies("policy_holder", user_mock)
        self.assertEqual(query_mock.method_calls, [("filter", (), {
            'policy_holder': "policy_holder",
            'version__agency': user_mock.agency,
        })])
        self.assertTrue(query_mock.distinct.called)
I'm pretty sure the issue is that the initial query_mock is returning a new mock after the .filter() is called, but I don't know how to capture that new mock and make sure .distinct() was called on it.
Is there a better way to be testing what I am trying to get at? I'm trying to make sure that the proper query is being called.
Each mock object holds onto the mock object that it returned when it is called. You can get a hold of it using your mock object's return_value property.
For your example,
self.assertTrue(query_mock.distinct.called)
distinct wasn't called on your mock itself; it was called on the return value of your mock's filter method. So you can assert that distinct was called by doing this:
self.assertTrue(query_mock.filter.return_value.distinct.called)
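The two levels are easy to see with a bare Mock standing in for the patched objects manager (the argument values here are illustrative):

```python
from unittest import mock

query_mock = mock.Mock()

# Simulate what get_policies() does with the patched manager:
query_mock.filter(policy_holder="policy_holder", version__agency="1234").distinct()

# distinct() was not called on query_mock itself ...
print(query_mock.distinct.called)                      # False
# ... but on the mock that filter() returned:
print(query_mock.filter.return_value.distinct.called)  # True
```

Calling a Mock always hands back its return_value (itself a Mock), so each link of a chained call is recorded one level further down, reachable through repeated .return_value attributes.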
