Related
I am using this code to get data from Twitter API.
The code works, but I cannot understand how.
Specifically, I cannot understand how the auth=bearer_oauth argument works, since I am passing a function. And how the function works, since I am calling it without its argument.
Sorry if this is too basic, but I could not find an answer.
import requests
bearer_token = "AAA"
api_url = "https://api.twitter.com/2/tweets/search/recent"
def bearer_oauth(r):
r.headers["Authorization"] = f"Bearer {bearer_token}"
return r
def connect_to_endpoint(url, params):
response = requests.get(url, auth=bearer_oauth)
return response
query_params = {'query': 'test'}
json_response = connect_to_endpoint(api_url, query_params)
The bearer_oauth function is just setting the request's authorization header to the bearer token before the request is sent.
The code you provided essentially has the same functionality as this:
headers = {"Authorization": f"Bearer {bearer_token}"
requests.get(url, headers=headers)
After you send the request, Twitter's server parses the authorization header and checks that the bearer token you supplied is valid and has access to the requested resources.
As for why your specific code works, bearer_oauth is an authentication handler that gets attached to the request. The handler gets called when the request is constructed. You don't need to pass the request object because the handler is part of it already.
If you're curious about the implementation, I'd read the internal code here. It looks like the request object is passed to the handler, which then modifies the request (in this case, by setting the authorization header), and then returns the modified request object back to the internal function preparing the request. Then, all of the modified request object's attributes are copied:
# Allow auth to make its changes.
r = auth(self)
# Update self to reflect the auth changes.
self.__dict__.update(r.__dict__)
Since __dict__ is an internal dictionary that holds all the attributes of a single object, everything that was changed about the request object in the handler function will be copied and included in the request before it is sent.
Consider an http request using an OAuth token. The access token needs to be included in the header as bearer. However, if the token is expired, another request needs to be made to refresh the token and then try again. So the custom Retry object will look like:
s = requests.Session()
### token is added to the header here
s.headers.update(token_header)
retry = OAuthRetry(
total=2,
read=2,
connect=2,
backoff_factor=1,
status_forcelist=[401],
method_whitelist=frozenset(['GET', 'POST']),
session=s
)
adapter = HTTPAdapter(max_retries=retry)
s.mount('http://', adapter)
s.mount('https://', adapter)
r = s.post(url, data=data)
The Retry class:
class OAuthRetry(Retry):
def increment(self, method, url, *args, **kwargs):
# refresh the token here. This could be by getting a reference to the session or any other way.
return super(OAuthRetry, self).increment(method, url, *args, **kwargs)
The problem is that after the token is refreshed, HTTPConnectionPool is still using the same headers to make the request after calling increment. See: https://github.com/urllib3/urllib3/blob/master/src/urllib3/connectionpool.py#L787.
Although the instance of the pool is passed in increment, changing the headers there will not affect the call since it is using a local copy of the headers.
This seems like a use case that should come up frequently for the request parameters to change in between retries.
Is there a way to change the request headers in between two subsequent retries?
No, in current version of Requests(2.18.4) and urllib3(1.22).
Retrys is finally handled by openurl in urllib3. And by trace the code of the whole function, there is not a interface to change headers between retrys.
And dynamically changing headers should not be considered as a solution. From the doc:
headers – Dictionary of custom headers to send, such as User-Agent, If-None-Match, etc. If None, pool headers are used. If provided, these headers completely replace any pool-specific headers.
headers is a param passed to the function. And there is no guarantee that it will not be copy after passed. Although in current version of urllib3, openurl does not copy headers, any solution based on changing headers is considered hacky, since it's based on the implementation but not the documentation.
One work around
Interrupt a function and edit some verible it's using is very dangerous.
Instead of injecting something into urllib3, one simple solution is that check the response status and try again if needed.
r = s.post(url, data=data)
if r.status_code == 401:
# refresh the token here.
r = s.post(url, data=data)
Why does the original approach not work?
Requests copy the header in prepare_headers before sending it to urllib3. So urllib3 use the copy created before editing when retrying.
I am trying to delete a git branch from gitlab, using the gitlab API with a personal access token.
If I use curl like this:
curl --request DELETE --header "PRIVATE_TOKEN: somesecrettoken" "deleteurl"
then it works and the branch is deleted.
But if I use requests like this:
token_data = {'private_token': "somesecrettoken"}
requests.Request("DELETE", url, data= token_data)
it doesn't work; the branch is not deleted.
Your requests code is indeed not doing the same thing. You are setting data=token_data, which puts the token in the request body. The curl command-line uses a HTTP header instead, and leaves the body empty.
Do the same in Python:
token_data = {'Private-Token': "somesecrettoken"}
requests.Request("DELETE", url, headers=token_data)
You can also put the token in the URL parameters, via the params argument:
token_data = {'private_token': "somesecrettoken"}
requests.Request("DELETE", url, params=token_data)
This adds ?private_token=somesecrettoken to the URL sent to gitlab.
However, GitLab does accept the private_token value in the request body as well, either as form data or as JSON. Which means that you are using the requests API wrong.
A requests.Request() instance is not going to be sent without additional work. It is normally only needed if you want to access the prepared data before sending.
If you don't need to use this more advanced feature, use the requests.delete() method:
response = requests.delete(url, headers=token_data)
If you do need the feature, use a requests.Session() object, then first prepare the request object, then send it:
with requests.Session() as session:
request = requests.Request("DELETE", url, params=token_data)
prepped = request.prepare()
response = session.send(prepped)
Even without needing to use prepared requests, a session is very helpful when using an API. You can set the token once, on a session:
with requests.Session() as session:
session.headers['Private-Token'] = 'somesecrettoken'
# now all requests via the session will use this header
response = session.get(url1)
response = session.post(url2, json=....)
response = session.delete(url3)
# etc.
I want to make multiple internal REST API call from my Django TemplateView, using requests library. Now I want to pass the session too from template view to api call. What is the recommended way to do that, keeping performance in mind.
Right now, I'm extracting cookie from the current request object in template view, and passing that to requests.get() or requests.post() call. But problem with that is, I would have to pass request object to my API Client, which I don't want.
This the current wrapper I'm using to route my requests:
def wrap_internal_api_call(request, requests_api, uri, data=None, params=None, cookies=None, is_json=False, files=None):
headers = {'referer': request.META.get('HTTP_REFERER')}
logger.debug('Request API: %s calling URL: %s', requests_api, uri)
logger.debug('Referer header sent with requests: %s', headers['referer'])
if cookies:
csrf_token = cookies.get('csrftoken', None)
else:
csrf_token = request.COOKIES.get('csrftoken', None)
if csrf_token:
headers['X-CSRFToken'] = csrf_token
if data:
if is_json:
return requests_api(uri, json=data, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
elif not files:
return requests_api(uri, data=data, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
else:
return requests_api(uri, data=data, files=files, params=params, cookies=cookies if cookies else request.COOKIES,
headers=headers)
else:
return requests_api(uri, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
Basically I want to get rid of that request parameter (1st param), because then to call it I've to keep passing request object from TemplateViews to internal services. Also, how can I keep persistent connection across multiple calls?
REST vs Invoking the view directly
While it's possible for a web app to make a REST API call to itself. That's not what REST is designed for. Consider the following from: https://docs.djangoproject.com/ja/1.9/topics/http/middleware/
As you can see a django request/response cycle has quite a bit of overhead. Add to this the overhead of webserver and wsgi container. At the client side you have the overhead associated with the requests library, but hang on a sec, the client also happens to be the same web app so it become s part of the web app's overhead too. And there is the problem of peristence (which I will come to shortly).
Last but not least, if you have a DNS round robin setup your request may actually go out on the wire before coming back to the same server. There is a better way, to invoke the view directly.
To invoke another view without the rest API call is really easy
other_app.other_view(request, **kwargs)
This has been discussed a few times here at links such as Django Call Class based view from another class based view and Can I call a view from within another view? so I will not elaborate.
Persistent requests
Persistent http requests (talking about python requests rather than django.http.request.HttpRequest) are managed through session objects (again not to be confused with django sessions). Avoiding confusion is really difficult:
The Session object allows you to persist certain parameters across
requests. It also persists cookies across all requests made from the
Session instance, and will use urllib3's connection pooling. So if
you're making several requests to the same host, the underlying TCP
connection will be reused, which can result in a significant
performance increase
Different hits to your django view will probably be from different users so you don't want to same cookie reused for the internal REST call. The other problem is that the python session object cannot be persisted between two different hit to the django view. Sockets cannot generally be serialized, a requirement for chucking them into memcached or redis.
If you still want to persist with internal REST
I think #julian 's answer shows how to avoid passing the django request instance as a parameter.
If you want to avoid passing the request to wrap_internal_api_call, all you need to do is do a bit more work on the end of the TemplateView where you call the api wrapper. Note that your original wrapper is doing a lot of cookies if cookies else request.COOKIES. You can factor that out to the calling site. Rewrite your api wrapper as follows:
def wrap_internal_api_call(referer, requests_api, uri, data=None, params=None, cookies, is_json=False, files=None):
headers = {'referer': referer}
logger.debug('Request API: %s calling URL: %s', requests_api, uri)
logger.debug('Referer header sent with requests: %s', referer)
csrf_token = cookies.get('csrftoken', None)
if csrf_token:
headers['X-CSRFToken'] = csrf_token
if data:
if is_json:
return requests_api(uri, json=data, params=params, cookies=cookies, headers=headers)
elif not files:
return requests_api(uri, data=data, params=params, cookies=cookies, headers=headers)
else:
return requests_api(uri, data=data, files=files, params=params, cookies=cookies, headers=headers)
else:
return requests_api(uri, params=params, cookies=cookies, headers=headers)
Now, at the place of invocation, instead of
wrap_internal_api_call(request, requests_api, uri, data, params, cookies, is_json, files)
do:
cookies_param = cookies or request.COOKIES
referer_param = request.META.get['HTTP_REFERER']
wrap_internal_api_call(referer_param, requests_api, uri, data, params, cookies_param, is_json, files)
Now you are not passing the request object to the wrapper anymore. This saves a little bit of time because you don't test cookies over and over, but otherwise it doesn't make a difference for performance. In fact, you could achieve the same slight performance gain just by doing the cookies or request.COOKIES once inside the api wrapper.
Networking is always the tightest bottleneck in any application. So if these internal APIs are on the same machine as your TemplateView, your best bet for performance is to avoid doing an API call.
Basically I want to get rid of that request parameter (1st param), because then to call it I've to keep passing request object from TemplateViews to internal services.
To pass function args without explicitly passing them into function calls you can use decorators to wrap your functions and automatically inject your arguments. Using this with a global variable and some django middleware for registering the request before it gets to your view will solve your problem. See below for an abstracted and simplified version of what I mean.
request_decorators.py
REQUEST = None
def request_extractor(func):
def extractor(cls, request, *args, **kwargs):
global REQUEST
REQUEST = request # this part registers request arg to global
return func(cls, request, *args, **kwargs)
return extractor
def request_injector(func):
def injector(*args, **kwargs):
global REQUEST
request = REQUEST
if len(args) > 0 and callable(args[0]): # to make it work with class methods
return func(args[0], request, args[1:], **kwargs) # class method
return func(request, *args, **kwargs) # function
return injector
extract_request_middleware.py
See the django docs for info on setting up middleware
from request_decorators import request_extractor
class ExtractRequest:
#request_extractor
def process_request(self, request):
return None
internal_function.py
from request_decorators import request_injector
#request_injector
def internal_function(request):
return request
your_view.py
from internal_function import internal_function
def view_with_request(request):
return internal_function() # here we don't need to pass in the request arg.
def run_test():
request = "a request!"
ExtractRequest().process_request(request)
response = view_with_request(request)
return response
if __name__ == '__main__':
assert run_test() == "a request!"
I am sending post request in the body of some json data, to process on server and I want the results back to client(c++ app on phone) in the form of json data and hence parse on mobile.
I have the following code inside handler:
class ServerHandler(tornado.web.RequestHandler):
def post(self):
data = tornado.escape.json_decode(self.request.body)
id = data.get('id',None)
#process data from db (take a while) and pack in result which is dictinary
result = process_data(id)# returns dictionary from db= takes time
print 'END OF HANDLER'
print json.dumps(result)
#before this code below I have tried also
#return result
#return self.write(result)
#return self.write(json.dumps(result))
#return json.dumps(result)
self.set_header('Content-Type', 'application/json')
json_ = tornado.escape.json_encode(result)
self.write(json_)
self.finish()
#return json.dumps(result)
I always get printed 'END OF HANDLER' and valid dictinary/json below on console but when I read at client mobile I always get
<html><title>405: Method Not Allowed</title><body>405: Method Not Allowed</body></html>
Does anyone have any idea what is the bug ?
(I am using CIwGameHttpRequest for sending request and it works when file is static =>name.json but now same content is giving error in post request. )
The error (HTTP 405 Method Not Allowed) means that you have made a request to a valid URL, but you are using an HTTP verb (e.g. GET, POST, PUT, DELETE) that cannot be used with that URL.
Your web service code appears to handle the POST verb, as evidenced by the post method name, and also by the fact that incoming requests appear to have a request body. You haven't shown us your C++ client code, so all I can do is to speculate that it is making a GET request. Does your C++ code call Request->setPOST();? (I haven't worked with CIwGameHttpRequest before, but Googling for it I found this page from which I took that line of code.)
I've not worked with Tornado before, but I imagine that there is some mechanism somewhere that allows you to connect a URL to a RequestHandler. Given that you have a 405 Method Not Allowed error rather than 404 Not Found, it seems that however this is done you've done it correctly. You issue a GET request to Tornado for the URL, it determines that it should call your handler, and only when it tries to use your handler it realises that it can't handle GET requests, concludes that your handler (and hence its URL) doesn't support GETs and returns a 405 error.