I'm currently using the Python requests library to interact with an external API that returns JSON.
Each endpoint is wrapped in a method of the API class, and every such method calls collect_data.
However, I want the scraper to keep running whenever it encounters an HTTP error (and ideally write the error to a log).
What's the best way to do this? At the moment the script just stops when response.raise_for_status() raises.
It seems I should be using try/except in some way, but I'm not sure how best to do that here.
def scrape_full_address(self, house_no, postcode):
    address_path = '/api/addresses'
    address_url = self.api_source + address_path
    payload = {
        'houseNo': house_no,
        'postcode': postcode,
    }
    return self.collect_data(url=address_url, method='get', payload=payload)

def collect_data(self, url, method, payload=None):
    if method == 'get':
        data = None
        params = payload
    elif method == 'post':
        params = None
        data = payload
    response = getattr(requests, method)(url=url, params=params, json=data, headers=self.headers)
    if response.status_code == 200:
        return response.json()
    else:
        return response.raise_for_status()
Wherever you call scrape_full_address() elsewhere in your code, wrap that call in a try statement.
For more info see: https://wiki.python.org/moin/HandlingExceptions
from requests.exceptions import HTTPError

try:
    scrape_full_address(659, 31052)
except HTTPError:
    print("Oops! That caused an error. Try again...")
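Building on the try/except suggestion, here is a minimal sketch of a loop that logs each HTTP error and carries on with the next item. The names scrape_all, fetch and jobs are made up for illustration; fetch stands in for a method such as scrape_full_address, and the logger name is an assumption.

```python
import logging

from requests.exceptions import HTTPError

logger = logging.getLogger("scraper")  # hypothetical logger name


def scrape_all(fetch, jobs):
    """Call fetch(*job) for every job; log HTTP errors and keep going.

    Returns a (results, failed_jobs) pair.
    """
    results, failed = [], []
    for job in jobs:
        try:
            results.append(fetch(*job))
        except HTTPError as err:
            # err.response is the requests.Response that failed (it may be None).
            status = getattr(err.response, "status_code", None)
            logger.error("HTTP %s for %r: %s", status, job, err)
            failed.append(job)
    return results, failed
```

Only HTTPError is caught here, so connection failures and timeouts would still stop the run; catching requests.exceptions.RequestException instead would cover those as well.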
I have a function like this:
def get_some_data(api_url, **kwargs):
    # some logic on generating headers
    # some more logic
    response = requests.get(api_url, headers=headers, params=params)
    return response
I need to create a fake/mock "api_url" which, when a request is made to it, returns a valid response.
I understand how to mock the response:
def mock_response(data):
    response = requests.Response()
    response.status_code = 200
    response._content = json.dumps(data).encode('utf-8')  # _content must be bytes
    return response
But I need to make the test call like this:
def test_get_some_data(api_url: some_magic_url_path_that_will_return_mock_response):
Any ideas on how to create a URL path that returns a response within the scope of the test (using only standard Django, Python, pytest, unittest) would be very much appreciated.
The documentation is well written and clear on how to mock whatever you want. But let's say you have a service that makes the third-party API call:
def foo(url, params):
    # some logic on generating headers
    # some more logic
    response = requests.get(url, headers=headers, params=params)
    return response
In your test you want to mock the return value of this service.
@patch("path_to_service.foo")
def test_api_call_response(self, mock_response):
    mock_response.return_value = ...  # whatever return value you want it to be
    # Here you call the service as usual
    response = foo(..., ...)
    # Assert your response
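A variation on the same idea is to patch requests.get itself, so that the function under test still runs and only the network call is replaced. The sketch below uses only unittest.mock and requests; get_some_data is a simplified stand-in for the real function, and make_response and the fake URL are hypothetical names.

```python
import json
from unittest.mock import patch

import requests


def get_some_data(api_url, **kwargs):
    # Stand-in for the function under test; the header logic is omitted.
    return requests.get(api_url, params=kwargs)


def make_response(data, status_code=200):
    """Build a real requests.Response without touching the network."""
    response = requests.Response()
    response.status_code = status_code
    response._content = json.dumps(data).encode("utf-8")  # private attribute
    return response


def test_get_some_data():
    with patch("requests.get", return_value=make_response({"ok": True})) as fake_get:
        response = get_some_data("http://fake.invalid/api", page=1)
    fake_get.assert_called_once_with("http://fake.invalid/api", params={"page": 1})
    assert response.status_code == 200
    assert response.json() == {"ok": True}
```

Because the patch replaces the attribute on the requests module, the URL never has to resolve; any string works as api_url.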
I am trying to adapt this script so that, even if multiple messages are sent at the same time, it keeps retrying until they are let through.
The original script is: https://github.com/4rqm/dhooks
My version is:
def post(self):
    """
    Send the JSON-formatted object to the specified `self.url`.
    """
    headers = {'Content-Type': 'application/json'}
    result = requests.post(self.url, data=self.json, headers=headers)
    if result.status_code == 400 or result.status_code == 429:
        print(result.status_code)
        # It's the line below that does not work.
        post(self)
    else:
        print("Payload delivered successfully")
        print("Code : " + str(result.status_code))
        time.sleep(2)
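The call post(self) most likely fails with a NameError, because inside a method the name post is not looked up on the class; self.post() would be the direct fix, although it recurses without limit. A bounded loop is one alternative. The sketch below is not taken from dhooks: post_with_retry and its parameters are hypothetical, it assumes the server signals rate limiting with 429 and may send a numeric Retry-After header, and it does not retry 400, since a malformed request will usually keep failing.

```python
import time

import requests


def post_with_retry(url, payload, max_attempts=5, default_wait=2.0, sleep=time.sleep):
    """POST `payload` as JSON, retrying while the server answers 429."""
    for attempt in range(1, max_attempts + 1):
        result = requests.post(url, json=payload)
        if result.status_code != 429:
            return result
        # Prefer the server's Retry-After header (seconds) when it is numeric.
        try:
            wait = float(result.headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        if attempt < max_attempts:
            sleep(wait)
    return result  # still rate limited after max_attempts tries
```

The sleep parameter exists only so that a test can replace time.sleep; in normal use it can be left at its default.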
I have the following code for interacting with pull requests on the GitHub API.
def merge(pull):
    url = "https://api.github.com/repos/{}/{}/pulls/{}/merge".format(os.environ.get("GITHUB_USERNAME"), os.environ.get("GITHUB_REPO"), pull['number'])
    response = requests.put(url, auth=get_auth(), data={})
    if response.status_code == 200:
        # Merge was successful
        return True
    else:
        # Something went wrong. Oh well.
        return response.status_code

def close(pull):
    url = "https://api.github.com/repos/{}/{}/pulls/{}".format(os.environ.get("GITHUB_USERNAME"), os.environ.get("GITHUB_REPO"), pull['number'])
    payload = {"state": "closed"}
    response = requests.put(url, auth=get_auth(), data=payload)
    if response.status_code == 200:
        # Close was successful
        return True
    else:
        # Something went wrong. Oh well.
        return response.status_code
Now merge works just fine: when I run it with a pull request, the pull request is merged and it feels good.
But close gives me a 404. This is strange, since merge can clearly find the pull request, which also shows that my permissions are set up properly, so I should be able to close the request.
I have also confirmed that I can close the request manually by logging in on GitHub and pressing the 'close pull request' button.
Why does GitHub give me a 404 for the close function but not the merge function? What is different between these two functions?
The answer is that the 'update a pull request' API call should be a POST request, not a PUT request. (GitHub documents this endpoint as PATCH; its v3 API also accepted POST as a stand-in for clients without PATCH support.)
Changing
response = requests.put(url, auth=get_auth(), data=payload)
to
response = requests.post(url, auth=get_auth(), data=payload)
fixed the issue.
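For comparison, here is a sketch of the same call using the PATCH verb that GitHub documents for this endpoint, with the payload sent as JSON. close_pull and its owner, repo, number and session parameters are hypothetical names, not part of the original code; the session argument only exists so the function can be exercised without network access.

```python
import requests


def close_pull(owner, repo, number, auth=None, session=None):
    """Close a pull request; return True on success, else the status code."""
    http = session or requests
    url = "https://api.github.com/repos/{}/{}/pulls/{}".format(owner, repo, number)
    # json= serialises the dict and sets Content-Type: application/json.
    response = http.patch(url, auth=auth, json={"state": "closed"})
    if response.status_code == 200:
        return True
    return response.status_code
```

It keeps the original return convention (True or a status code), so it could replace close(pull) without changing the callers.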
I have been using this function to handle http requests with no problems:
def do_request(self, method, url, **kwargs):
    params = kwargs.get('params', None)
    headers = kwargs.get('headers', None)
    payload = kwargs.get('data', None)
    request_method = {'GET': requests.get, 'POST': requests.post, 'PUT': requests.put, 'DELETE': requests.delete}
    request_url = url
    req = request_method[method]
    try:
        res = req(request_url, headers=headers, params=params, data=json.dumps(payload))
    except (requests.exceptions.ConnectionError, requests.exceptions.RequestException) as e:
        data = {'has_error': True, 'error_message': e.message}
        return data
    try:
        data = res.json()
        data.update({'has_error': False, 'error_message': ''})
    except ValueError as e:
        msg = "Cannot read response, %s" % (e.message)
        data = {'has_error': True, 'error_message': msg}
    if not res.ok:
        msg = "Response not ok"
        data.update({'has_error': True, 'error_message': msg})
    if res.status_code >= 400:
        msg = 'Error code: ' + str(res.status_code) + '\n' + data['errorCode']
        data.update({'has_error': True, 'error_message': msg})
    return data
When I send a DELETE request without a body I have no problems, but when I add one (as the server requires) I get an error from the server saying that the body cannot be null, as if no body had been sent. Any ideas why this might be happening? I'm using the requests module and Python 2.7.12. As far as I know, data can be sent in a DELETE request. Thanks!
Some clients and some servers have problems when a DELETE includes an entity body; see, for example, the question "Is an entity body allowed for an HTTP DELETE request?" and many other search results.
Some servers (apparently) convert the DELETE into a POST; others simply perform the DELETE but drop the body. In your case you've found that the server does indeed drop the body of the DELETE, and it has been suggested that you change the DELETE to a POST.
Hmm... I can send a DELETE with a body using Postman and it works OK, but I can't get the same result with Requests 2.17.3.
This is an issue related to Requests.
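One way to narrow this down is to build the request without sending it and inspect what requests would put on the wire. The sketch below is a debugging aid rather than an explanation of the 2.17.3 behaviour; prepare_delete, the URL and the payload are made-up examples.

```python
import json

import requests


def prepare_delete(url, payload):
    """Build (but do not send) a DELETE request carrying a JSON body."""
    request = requests.Request("DELETE", url, json=payload)
    return request.prepare()


prepared = prepare_delete("http://fake.invalid/items/1", {"reason": "duplicate"})
print(prepared.method)                   # DELETE
print(prepared.headers["Content-Type"])  # application/json
print(json.loads(prepared.body))         # the payload as it would be sent
```

If the prepared body looks right here, the body is being lost after requests has built the request, for example in a proxy or on the server.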
I'm trying to build a simple proxy using Flask and requests. The code is as follows:
@app.route('/es/<string:index>/<string:type>/<string:id>',
           methods=['GET', 'POST', 'PUT'])
def es(index, type, id):
    elasticsearch = find_out_where_elasticsearch_lives()
    # also handle some authentication
    url = '%s%s%s%s' % (elasticsearch, index, type, id)
    esreq = requests.Request(method=request.method, url=url,
                             headers=request.headers, data=request.data)
    resp = requests.Session().send(esreq.prepare())
    return resp.text
This works, except that it loses the status code from Elasticsearch. I tried returning resp (a requests.models.Response) directly, but this fails with
TypeError: 'Response' object is not callable
Is there another, simple, way to return a requests.models.Response from Flask?
Ok, found it:
If a tuple is returned the items in the tuple can provide extra information. Such tuples have to be in the form (response, status, headers). The status value will override the status code and headers can be a list or dictionary of additional header values.
(Flask docs.)
So
return (resp.text, resp.status_code, resp.headers.items())
seems to do the trick.
Using the text or content property of the Response object will not work if the server returns encoded data (such as content-encoding: gzip) and you return the headers unchanged. This happens because text and content have already been decoded, so the encoding reported in the headers no longer matches the actual encoding of the body.
According to the documentation:
In the rare case that you’d like to get the raw socket response from the server, you can access r.raw. If you want to do this, make sure you set stream=True in your initial request.
and
Response.raw is a raw stream of bytes – it does not transform the response content.
So, the following works for gzipped data too:
esreq = requests.Request(method=request.method, url=url,
                         headers=request.headers, data=request.data)
resp = requests.Session().send(esreq.prepare(), stream=True)
return resp.raw.read(), resp.status_code, resp.headers.items()
If you use a shortcut method such as get, it's just:
resp = requests.get(url, stream=True)
return resp.raw.read(), resp.status_code, resp.headers.items()
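An alternative to reading resp.raw is to keep the decoded body from resp.content and drop the headers that no longer describe it. The helper below is only a sketch: to_flask_tuple is a hypothetical name, and the set of excluded headers is an assumption that you may need to extend for your own proxy.

```python
import requests

# Headers that describe the upstream transfer, not the body handed back.
# This list is an assumption; adjust it as needed.
EXCLUDED_HEADERS = {"content-encoding", "content-length", "transfer-encoding", "connection"}


def to_flask_tuple(resp):
    """Turn a requests.Response into a (body, status, headers) tuple."""
    headers = [
        (name, value)
        for name, value in resp.headers.items()
        if name.lower() not in EXCLUDED_HEADERS
    ]
    # resp.content has already been decoded (e.g. gunzipped) by requests.
    return resp.content, resp.status_code, headers
```

In a Flask view this would be used as return to_flask_tuple(resp), since Flask accepts a (response, status, headers) tuple.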
Flask can return an object of type flask.wrappers.Response.
You can create one of these from your requests.models.Response object r like this:
from flask import Response
return Response(
    response=r.reason,  # reason phrase only; pass r.content to forward the body
    status=r.status_code,
    headers=dict(r.headers)
)
I ran into the same scenario, except that in my case my requests.models.Response contained an attachment. This is how I got it to work:
return send_file(BytesIO(result.content), mimetype=result.headers['Content-Type'], as_attachment=True)
My use case is to call another API in my own Flask API. I'm just propagating unsuccessful requests.get calls through my Flask response. Here's my successful approach:
headers = {
    'Authorization': 'Bearer Muh Token'
}
try:
    response = requests.get(
        '{domain}/users/{id}'.format(domain=USERS_API_URL, id=hit['id']),
        headers=headers)
    response.raise_for_status()
except HTTPError as err:
    logging.error(err)
    flask.abort(flask.Response(response=response.content, status=response.status_code, headers=response.headers.items()))