Unit Testing a Function That Calls a URL with Mock - Python

Question
I have a function that calls a url and then modifies that url's response. How can I write a unit test for the portion of that function's code that modifies the response, without having to rely on that url?
Example
my_module.py
import requests
def get_some_resource():
    url = 'http://httpbin.org/get'
    r = requests.get(url)
    # Special manipulation of the returned text (what I want to test); a simple example is used here
    output = r.text.upper()
    return output
What I've tried so far
Using mock's MagicMock() (you can't use it to override a function's variables, as far as I can tell)
I've considered breaking apart the two sections of that function (the retrieval of the URL and the modification of the response); however, I'm not clear whether that's necessary
So much googling my hands hurt

If this is a unit test, you probably just want to mock out the requests library. You can either patch the whole thing or just get; it doesn't really matter.
It'd look like:
get = mock.Mock()
text = get.return_value.text = "hey I got this"
with mock.patch("my_module.requests.get", get):
    resource = get_some_resource()
self.assertEqual(resource, text.upper())
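For reference, here is a fuller, self-contained version of the same idea, written as a unittest.TestCase. The module name my_module comes from the question; the rest is just one plausible way to wrap it up, so adjust the names to your project.

import unittest
from unittest import mock

from my_module import get_some_resource


class GetSomeResourceTest(unittest.TestCase):
    def test_uppercases_the_response_text(self):
        fake_get = mock.Mock()
        fake_get.return_value.text = "hey I got this"

        # Replace requests.get as my_module sees it, so no real HTTP call is made.
        with mock.patch("my_module.requests.get", fake_get):
            resource = get_some_resource()

        self.assertEqual(resource, "HEY I GOT THIS")
        fake_get.assert_called_once_with('http://httpbin.org/get')


if __name__ == "__main__":
    unittest.main()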
Cheers.

Related

Python - Call Pre-defined Function after Decorator

I am building a straightforward Flask API. After each decorator for an API endpoint, I have to define a function that simply calls another function I have in a separate file. This works fine, but seems redundant. I would rather just call that pre-defined function directly, instead of having to wrap it in another function right after the decorator. Is this possible?
What I have currently:
import routes.Locations as Locations

# GET: /api/v1/locations
@app.route('/locations', methods=['GET'])
def LocationsRead():
    return Locations.read()
The Locations.read() function looks like this:
def read():
    return {
        'id': 1,
        'name': 'READ'
    }
What I am hoping to do:
import routes.Locations as Locations

# GET: /api/v1/locations
@app.route('/locations', methods=['GET'])
Locations.read()
The @ syntax of decorators is just syntactic sugar for:
def LocationsRead():
    return Locations.read()

LocationsRead = app.route('/locations', methods=['GET'])(LocationsRead)
So you could do something like:
LocationsRead = app.route('/locations', methods=['GET'])(Locations.read)
Arguably, that takes a bit longer to understand the intention of, and it's not that much more terse than your original code.
You also lose one level of the stack trace when an exception is logged. That will make it hard to identify where and how Locations.read is being added as a route in Flask: the stack trace will jump straight from the Flask library to routes.Locations:read. If you want to know how the route was configured (e.g. what the URL is parameterised with, or which methods it accepts), then you'll have to already know which file the "decoration" took place in. If you use normal decoration, you'll get a line pointing at the file containing @app.route('/locations', methods=['GET']).
That is, you get a debatable benefit and the potential to make debugging harder. Stick with the @ decorator syntax.
Thanks to the answers from @Dunes and @RodrigoRodrigues, I played around with it more and found that the following works both for endpoints with and without arguments to pass, such as an ID. See the code below.
# GET: /api/v1/locations
app.route(basepath + '/locations', methods=['GET'])(Locations.read)
# GET: /api/v1/locations/{id}
app.route(basepath + '/locations/<int:id>', methods=['GET'])(Locations.read)
# POST: /api/v1/locations
app.route(basepath + '/locations', methods=['POST'])(Locations.create)
# PUT: /api/v1/locations/{id}
app.route(basepath + '/locations/<int:id>', methods=['PUT'])(Locations.update)
# DELETE: /api/v1/locations/{id}
app.route(basepath + '/locations/<int:id>', methods=['DELETE'])(Locations.delete)
Now, I doubt this is standard practice, but if someone is looking to reduce the amount of code in their route declarations, this is one way to do it.
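For what it's worth, Flask also exposes this pattern directly through add_url_rule(), which is what app.route() delegates to under the hood. A hedged sketch, reusing the paths and view functions from above (the endpoint names are just illustrative):

# Equivalent registration via add_url_rule(); app.route() is a thin wrapper around this.
app.add_url_rule(basepath + '/locations', 'locations_read', Locations.read, methods=['GET'])
app.add_url_rule(basepath + '/locations/<int:id>', 'locations_read_one', Locations.read, methods=['GET'])
app.add_url_rule(basepath + '/locations', 'locations_create', Locations.create, methods=['POST'])

Whether this reads better than the direct app.route(...)(...) calls above is a matter of taste, but it is the documented extension point for registering views without the decorator.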

Mocking a function call within a function in Python

This is my first time building out unit tests, and I'm not quite sure how to proceed here. Here's the function I'd like to test; it's a method in a class that accepts one argument, url, and returns one string, task_id:
def url_request(self, url):
    conn = self.endpoint_request()
    authorization = conn.authorization
    response = requests.get(url, authorization)
    return response["task_id"]
The method starts out by calling another method within the same class to obtain a token to connect to an API endpoint. Should I be mocking the output of that call (self.endpoint_request())?
If I do have to mock it, and my test function looks like this, how do I pass a fake token/auth from the endpoint_request response into the call?
#patch("common.DataGetter.endpoint_request")
def test_url_request(mock_endpoint_request):
mock_endpoint_request.return_value = {"Auth": "123456"}
# How do I pass the fake token/auth to this?
task_id = DataGetter.url_request(url)
The code you have shown is strongly dominated by interactions, which means that there will most likely be no bugs to find with unit testing: the potential bugs are on the interaction level. You access conn.authorization, but is this the proper member? And does it already have the proper representation for the way you need it further on? Is requests.get the right method for the job? Is the argument order as you expect it? Is the return value as you expect it? Is task_id spelled correctly?
These are (some of) the potential bugs in your code. But with unit testing you will not be able to find them: when you replace the depended-on components with mocks (which you create or configure yourself), your unit tests will just succeed. Let's assume you have a misconception about the return value of requests.get, namely that task_id is spelled wrongly and should really be taskId. If you mock requests.get, you would implement the mock based on your own misconception; that is, your mock would return a map with the (misspelled) key task_id. The unit test would then succeed despite the bug.
You will only find that bug with integration testing, where you bring your component and the depended-on components together. Only then can you test the assumptions made in your component against the reality of the other components.
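If you still want the fully mocked test the question asks for, a hedged sketch is below. The module path common and the class name DataGetter are taken from the question, while the constructor call and the canned values are assumptions of mine. Note that it illustrates the point above rather than contradicting it: the test passes even though passing the authorization as the second positional argument (which requests.get treats as params) and subscripting the response with response["task_id"] are exactly the kind of interaction assumptions it cannot check.

from unittest import TestCase
from unittest.mock import patch, Mock

from common import DataGetter  # module path taken from the question


class UrlRequestTest(TestCase):
    # Patch both collaborators; the decorator closest to the function supplies the first mock argument.
    @patch("common.requests.get")             # assumes common.py does `import requests`
    @patch("common.DataGetter.endpoint_request")
    def test_url_request(self, mock_endpoint_request, mock_get):
        # Fake connection object; url_request() only reads .authorization from it.
        mock_endpoint_request.return_value = Mock(authorization={"Auth": "123456"})
        # Fake "response"; a plain dict happily supports the ["task_id"] lookup.
        mock_get.return_value = {"task_id": "abc-123"}

        getter = DataGetter()  # assumes no required constructor arguments
        task_id = getter.url_request("http://example.com/some/endpoint")

        self.assertEqual(task_id, "abc-123")
        mock_get.assert_called_once_with("http://example.com/some/endpoint", {"Auth": "123456"})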

How do we check if mock request is actually the correct one?

Since we return mock objects for the requests made by the code, the test will always pass no matter what the input to the code under test is, as long as the response is handled correctly. However, we don't know whether the code made the right request(s) to my site in the first place. For example, if makeRequest() for some reason made a request somewhere other than www.my-site.com/foobar, say to www.google.com, we would get a false positive: the test would still pass, since the mock response is still what we expect, but it should really fail.
Probably a silly question, but is there a way in unittest.mock to check and make sure the request made is what we expect as well?
def makeRequest(session):
    resp = session.get("www.my-site.com/foobar")
    return resp

@patch.object(requests.Session, 'get')
def test_makeRequest(self, mock_get):
    def mockResp():
        r = requests.Response()
        r.status_code = 200
        return r
    mock_get.return_value = mockResp()
    mock_get_response = makeRequest()
You can check the request parameters by using assertions on the mock.
# setting up the canned response on the mock
mock_get.return_value = mockResp()
# actually calls the real code under test, i.e. calls makeRequest
mock_get_response = makeRequest()
# make an assertion about what the code *within* makeRequest did
mock_get.assert_called_once_with("www.my-site.com/foobar")
# maybe make an assertion about the `mock_get_response` here, too
Note that, as written, this test will fail. You need to pass a session into makeRequest, since it takes one required argument. Rather than setting up your mock on requests.Session, it will be easier to just pass your mock in as the session argument during the test.
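A hedged sketch of that simpler approach; the module name my_client is made up, so adjust the import to wherever makeRequest actually lives:

from unittest import TestCase
from unittest.mock import Mock

import requests

from my_client import makeRequest  # hypothetical module name


class MakeRequestTest(TestCase):
    def test_makeRequest_hits_the_expected_url(self):
        canned = requests.Response()
        canned.status_code = 200

        # Stand-in for requests.Session; spec keeps the attribute surface honest
        # (session.get exists, misspelled attributes raise AttributeError).
        session = Mock(spec=requests.Session)
        session.get.return_value = canned

        resp = makeRequest(session)

        # The mock records its calls, so we can assert on the exact URL used.
        session.get.assert_called_once_with("www.my-site.com/foobar")
        self.assertEqual(resp.status_code, 200)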

Is it better to assign a name to a throwaway object in Python, or not?

This is not a "why doesn't my code run" question. It is a "how / why does my code work" question. I am looking to generalize from this specific case to learn what broad rules apply to similar situations in the future.
I have done some searching (Google and StackOverflow) for this, but haven't seen anything that answers this question directly. Of course, I'm not entirely sure how best to ask this question, and may be using the wrong terms. I welcome suggested edits for the question title and labels.
I have the following function (which makes use of the requests module):
def make_session(username, password, login_url):
    # The purpose of this function is to create a requests.Session object,
    # update the state of the object to have all of the cookies and other
    # session data necessary to act as a logged-in user at a website, and
    # return the session to the calling function.
    new_session = requests.Session()
    login_page = new_session.get(login_url)

    # The function get_login_submit_page_URL takes the previously
    # created login_page, extracts the target of the login form
    # submit, and returns it as a unicode string.
    submit_page_URL = get_login_submit_page_URL(login_page)

    payload = {u'session_name': username, u'session_password': password}
    new_session.post(submit_page_URL, data=payload, allow_redirects=True)

    return new_session
And what I really want to know is whether it matters how I write this line:
new_session.post(submit_page_URL, data=payload, allow_redirects=True)
According to the requests documentation, the Session.post method returns a Response object.
However, this method also has side-effects which update the Session object. It is those side effects that I care about. I have no use for the Response object this method creates.
I have tested this code in practice, both assigning the Response to a label, and leaving it as presented above. Both options appear to work equally well for my purposes.
The actual question I am asking is: since, whether I assign a label or not, the Response object created by my call to Session.post falls out of scope as soon as the Session is returned to the calling function, does it matter whether I assign a label or not?
Rather, do I save any memory/processing time by not making the assignment? Do I create potential unforeseen problems for myself by not doing so?
If you are not using the return value of a call, there is little point in assigning it to a local name.
The returned Response object will then not be referenced anywhere, and it will be freed two bytecodes earlier than if you had assigned it to a name and then ignored that name before returning from the function.
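If you want to see concretely what that difference amounts to, the standard-library dis module will show it. A quick sketch (the two helper functions are made up for illustration, and the exact opcodes vary between Python versions):

import dis


def post_and_ignore(session, url, payload):
    session.post(url, data=payload)             # call result discarded immediately
    return session


def post_and_assign(session, url, payload):
    response = session.post(url, data=payload)  # call result kept alive until the function returns
    return session


# In CPython the first variant pops the call result straight away (POP_TOP),
# while the second stores it in a local that only goes away when the frame exits.
dis.dis(post_and_ignore)
dis.dis(post_and_assign)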

Creating an asynchronous method with Google App Engine's NDB

I want to make sure I understand how to create tasklets and asynchronous methods. What I have is a method that returns a list. I want it to be called from somewhere, and immediately allow other calls to be made. So I have this:
future_1 = get_updates_for_user(userKey, aDate)
future_2 = get_updates_for_user(anotherUserKey, aDate)
somelist.extend(future_1.get_result())
somelist.extend(future_2.get_result())
....
@ndb.tasklet
def get_updates_for_user(userKey, lastSyncDate):
    noteQuery = ndb.GqlQuery('SELECT * FROM Comments WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastSyncDate)
    note_list = list()
    qit = noteQuery.iter()
    while (yield qit.has_next_async()):
        note = qit.next()
        noteDic = note.to_dict()
        note_list.append(noteDic)
    raise ndb.Return(note_list)
Is this code doing what I'd expect it to do? Namely, will the two calls run asynchronously? Am I using futures correctly?
Edit: Well after testing, the code does produce the desired results. I'm a newbie to Python - what are some ways to test to see if the methods are running async?
It's pretty hard to verify for yourself that the methods are running concurrently -- you'd have to put copious logging in. Also in the dev appserver it'll be even harder as it doesn't really run RPCs in parallel.
Your code looks okay, it uses yield in the right place.
My only recommendation is to name your function get_updates_for_user_async() -- that matches the convention NDB itself uses and is a hint to the reader of your code that the function returns a Future and should be yielded to get the actual result.
An alternative way to do this is to use the map_async() method on the Query object; it would let you write a callback that just contains the to_dict() call:
@ndb.tasklet
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    note_list = yield noteQuery.map_async(lambda note: note.to_dict())
    raise ndb.Return(note_list)
Advanced tip: you can simplify this even more by dropping the @ndb.tasklet decorator and just returning the Future returned by map_async():
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    return noteQuery.map_async(lambda note: note.to_dict())
This is a slight general optimization for async functions that contain only one yield and immediately return the value yielded. (If you don't immediately get this, you're in good company, and it runs the risk of being broken by a future maintainer who doesn't either. :-)
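As a usage note (my own addition, not from the answer above): inside another tasklet you can yield both futures at once, which is what actually lets the two queries overlap. A hedged sketch, assuming the _async naming suggested earlier:

from google.appengine.ext import ndb


@ndb.tasklet
def get_updates_for_users_async(userKey, anotherUserKey, aDate):
    future_1 = get_updates_for_user_async(userKey, aDate)
    future_2 = get_updates_for_user_async(anotherUserKey, aDate)
    # Yielding a tuple of futures waits for both; NDB overlaps the underlying RPCs.
    list_1, list_2 = yield future_1, future_2
    raise ndb.Return(list_1 + list_2)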
