What's the correct way of reusing Python Requests connections in Django across multiple HTTP requests? This is what I'm doing currently:
import requests
def do_request(data):
    return requests.get('http://foo.bar/', data=data, timeout=4)
def my_view1(request):
    req = do_request({'x': 1})
    ...

def my_view2(request):
    req = do_request({'y': 2})
    ...
So, I have one function that makes the request. This function is called in various Django views. The views get called in separate HTTP requests by users. My question is: Does Python Requests automatically reuse the same connections (via urllib3 connection pooling)?
Or do I have to first create a Requests session object to work with?
s = requests.Session()
def do_request(data):
    return s.get('http://foo.bar/', data=data, auth=('user', 'pass'), timeout=4).text
And if so, does the session object have to be created in global scope or should it be inside the function?
def do_request(data):
    s = requests.Session()
    return s.get('http://foo.bar/', data=data, auth=('user', 'pass'), timeout=4).text
I can have multiple HTTP requests at the same time, so the solution needs to be thread-safe. I'm a newbie to connection pooling, so I really am not sure, and the Requests docs aren't that extensive here.
Create a session and keep it alive, either by passing it through your functions and returning it, or by creating the session object at global or class level, so that the latest state is maintained whenever it is referenced. It will work like a charm.
For thread-safety, do not hold a global Session object. Sessions should be mostly safe to share, but there are unresolved discussions about it. See "Is the Session object from Python's Requests library thread safe?" and "Document threading contract for Session class #2766".
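If you want per-thread connection reuse without sharing a Session across threads, one option (a minimal sketch, not from the original answer) is to keep one Session per thread in thread-local storage:

import threading

import requests

_thread_local = threading.local()

def get_session():
    # Lazily create one Session per thread; each thread then reuses
    # its own pooled connections without sharing state across threads.
    if not hasattr(_thread_local, "session"):
        _thread_local.session = requests.Session()
    return _thread_local.session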
Yet, for the purpose of reusing connections, it should be safe to hold a global HTTPAdapter instance, the class that actually operates the underlying urllib3 PoolManager and ConnectionPool.
Mount the adapter in your sessions, and perhaps use the sessions from a service class for easier access and setup.
import atexit

import requests
import requests.adapters

adapter = requests.adapters.HTTPAdapter()
# Fix ResourceWarning, but at interpreter exit
# the connections are about to be closed anyway
atexit.register(adapter.close)

class FooService:
    URL = "http://foo.bar/"

    def __init__(self):
        self.session = requests.Session()
        self.session.mount("http://", adapter)  # <------ THIS

    def get_bar(self, data):
        return self.session.get(self.URL, data=data)

def my_view1(request):
    foo = FooService()
    foo.get_bar({"y": 1})
    # ...

def my_view2(request):
    foo = FooService()
    for y in range(1234):
        foo.get_bar({"y": y})
    # ...
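As a side note, the adapter's pool can also be sized explicitly when it is constructed; pool_connections and pool_maxsize are real HTTPAdapter parameters, though the numbers below are arbitrary:

# pool_connections: number of distinct host pools to cache
# pool_maxsize: maximum connections kept per host pool
adapter = requests.adapters.HTTPAdapter(pool_connections=10, pool_maxsize=50)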
Related
I have my own class derived from BaseHTTPRequestHandler, which implements my specific GET method. This works quite well:
from http.server import BaseHTTPRequestHandler, HTTPServer

class MyHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        """ my implementation of the GET method """

myServer = HTTPServer(("127.0.0.1", 8099), MyHTTPRequestHandler)
myServer.handle_request()
But why do I need to pass my class MyHTTPRequestHandler to the HTTPServer? I know that it is required by documentation:
class http.server.BaseHTTPRequestHandler(request, client_address, server)

This class is used to handle the HTTP requests that arrive at the server. By itself, it cannot respond to any actual HTTP requests; it must be subclassed to handle each request method (e.g. GET or POST). BaseHTTPRequestHandler provides a number of class and instance variables, and methods for use by subclasses.

The handler will parse the request and the headers, then call a method specific to the request type. The method name is constructed from the request. For example, for the request method SPAM, the do_SPAM() method will be called with no arguments. All of the relevant information is stored in instance variables of the handler. Subclasses should not need to override or extend the __init__() method.
But I want to pass an instantiated object of my subclass instead. I don't understand why it has been designed like that; it looks like a design flaw to me. The point of object-oriented programming with polymorphism is that I can subclass to implement specific behavior behind the same interface, so this seems like an unnecessary restriction.
That is what I want:
from http.server import BaseHTTPRequestHandler, HTTPServer

class MyHTTPRequestHandler(BaseHTTPRequestHandler):
    def __init__(self, myAdditionalArg):
        self.myArg = myAdditionalArg

    def do_GET(self):
        """ my implementation of the GET method """
        self.wfile.write(bytes(self.myArg, "utf-8"))
        # ...

myReqHandler = MyHTTPRequestHandler("mySpecificString")
myServer = HTTPServer(("127.0.0.1", 8099), myReqHandler)
myServer.handle_request()
But if I do that, I receive the expected error message:

TypeError: 'MyHTTPRequestHandler' object is not callable

How can I work around this so that I can still print a specific string?
There is also a second reason why I need this: I want MyHTTPRequestHandler to also provide more information about the client that uses the GET method to retrieve data from the server (I want to retrieve the HTTP headers of the client browser).
I just have one client which starts a single request to the server. If a solution works in a more general context, I'll be happy, but I won't need it for my current project.

Does anybody have an idea how to do that?
A server needs to create request handlers as needed to handle all the incoming requests. It would be bad design to have only one request handler: if you passed in an instance, the server could only ever handle one request at a time, and if there were any side effects it would go very badly. Any change of state is beyond the scope of what a request handler should do.
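For illustration, this is essentially what the stock server does internally (paraphrased from the standard library's socketserver.BaseServer): a fresh handler instance is constructed for every incoming request.

# Paraphrased from socketserver.BaseServer in the standard library:
class BaseServer:
    def finish_request(self, request, client_address):
        # a fresh handler instance is constructed per request;
        # its __init__ parses and dispatches the request immediately
        self.RequestHandlerClass(request, client_address, self)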
BaseHTTPRequestHandler has a method for message logging, and an attribute self.headers containing all the header information. It defaults to logging messages to sys.stderr, so you could run $ python my_server.py 2> log_file.txt to capture the log messages, or you could write to a file in your own handler.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class MyHTTPRequestHandler(BaseHTTPRequestHandler):
    log_file = os.path.join(os.path.abspath(''), 'log_file.txt')  # saves in directory where server is run
    myArg = 'some fancy thing'

    def do_GET(self):
        # do things
        self.wfile.write(bytes(self.myArg, 'utf-8'))
        # do more
        msg_format = 'start header:\n%s\nend header\n'
        self.log_message(msg_format, self.headers)

    # This is basically a copy of the original method, just changing the destination
    def log_message(self, format_str, *args):
        with open(self.log_file, 'a') as logger:
            logger.write("%s - - [%s] %s\n" %
                         (self.address_string(),
                          self.log_date_time_string(),
                          format_str % args))

handler = MyHTTPRequestHandler
myServer = HTTPServer(("127.0.0.1", 8099), handler)
It is possible to derive a specific HTTPServer class (MyHttpServer), which has the following attributes:

myArg: the specific "message text" which shall be printed by the HTTP request handler
headers: a dictionary storing the headers set by an HTTP request handler
The server class must be packed together with MyHTTPRequestHandler. Furthermore, the implementation only works properly under the following conditions:

only one HTTP request handler requests an answer from the server at a time (otherwise the data kept in the attributes gets corrupted)
MyHTTPRequestHandler is only used with MyHttpServer and vice versa (otherwise unknown side effects like exceptions or data corruption can occur)
That's why both classes must be packed and shipped together, like this:
from http.server import BaseHTTPRequestHandler, HTTPServer

class MyHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.wfile.write(bytes(self.server.myArg, 'utf-8'))
        # ...
        self.server.headers = self.headers

class MyHttpServer(HTTPServer):
    def __init__(self, server_address, myArg, handler_class=MyHTTPRequestHandler):
        super().__init__(server_address, handler_class)
        self.myArg = myArg
        self.headers = dict()
The usage of these classes could look like this, where only a single request from a client (i.e. a web browser) is answered by the server:
def get_header():
    httpd = MyHttpServer(("127.0.0.1", 8099), "MyFancyText", MyHTTPRequestHandler)
    httpd.handle_request()
    return httpd.headers
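Another option, sketched here as an alternative to subclassing the server (this is not from the original answers): since HTTPServer only needs a callable that builds a handler, functools.partial can bake extra constructor arguments into the handler class. Note that BaseHTTPRequestHandler.__init__ handles the request immediately, so extra attributes must be set before calling super().__init__.

import functools
from http.server import BaseHTTPRequestHandler, HTTPServer

class MyHTTPRequestHandler(BaseHTTPRequestHandler):
    def __init__(self, myArg, *args, **kwargs):
        # set custom state first; super().__init__ processes the request
        self.myArg = myArg
        super().__init__(*args, **kwargs)

    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(bytes(self.myArg, "utf-8"))

handler_factory = functools.partial(MyHTTPRequestHandler, "mySpecificString")
myServer = HTTPServer(("127.0.0.1", 8099), handler_factory)
myServer.handle_request()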
I have an API written in Flask. It uses SQLAlchemy to talk to a MySQL database. I don't use Flask-SQLAlchemy, because I don't like how that module forces you into a certain pattern for declaring the model.
I'm having a problem in which my database connections are not closing. The object representing the connection is going out of scope, so I assume it is being garbage collected. I also explicitly call close() on the session. Despite this, the connections stay open long after the API call has returned its response.
sqlsession.py: Here is the wrapper I am using for the session.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import NullPool

class SqlSession:
    def __init__(self, conn=Constants.Sql):
        self.db = SqlSession.createEngine(conn)
        Session = sessionmaker(bind=self.db)
        self.session = Session()

    @staticmethod
    def createEngine(conn):
        return create_engine(conn.URI.format(user=conn.USER, password=conn.PASS,
                                             host=conn.HOST, port=conn.PORT,
                                             database=conn.DATABASE,
                                             poolclass=NullPool))

    def close(self):
        self.session.close()
flaskroutes.py: Here is an example of the Flask app instantiating and using the wrapper object. Note that it instantiates the wrapper at the beginning, within the scope of the API call, then closes the session at the end; presumably the wrapper is garbage collected after the response is returned.
def commands(self, deviceId):
    sqlSession = SqlSession(self.sessionType)  # <---
    commandsQueued = getCommands()
    jsonCommands = []
    for command in commandsQueued:
        jsonCommand = command.returnJsonObject()
        jsonCommands.append(jsonCommand)
        sqlSession.session.delete(command)
    sqlSession.session.commit()
    resp = jsonify({'commands': jsonCommands})
    sqlSession.close()  # <---
    resp.status_code = 200
    return resp
I would expect the connections to be cleared as soon as the HTTP response is made, but instead the connections end up in the "SLEEP" state (when viewed via 'show processlist' in the MySQL command line interface).
I ended up using the advice from this SO post:

How to close sqlalchemy connection in MySQL

I strongly recommend reading that post to anyone having this problem. Basically, I added a dispose() call to the close method. Doing so destroys the underlying connections entirely, while close() alone simply returns connections to an available pool (but leaves them open).
def close(self):
    self.session.close()
    self.db.dispose()
This whole thing was a bit confusing to me, but at least now I understand more about the connection pool.
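An alternative worth noting (a sketch, assuming pooling is genuinely unwanted): pass poolclass=NullPool to create_engine itself, rather than into the URI's format() call, so that session.close() really closes the connection instead of returning it to a pool.

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# the URI below is a placeholder
engine = create_engine('mysql://user:pass@host:3306/db', poolclass=NullPool)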
I need some help setting up unit tests for Google Cloud Endpoints. Using WebTest, all requests answer with AppError: Bad response: 404 Not Found. I'm not really sure if endpoints is compatible with WebTest.
This is how the application is generated:
application = endpoints.api_server([TestEndpoint], restricted=False)
Then I use WebTest this way:
client = webtest.TestApp(application)
client.post('/_ah/api/test/v1/test', params)
Testing with curl works fine.
Should I write tests for endpoints differently? What is the suggestion from the GAE Endpoints team?
After much experimenting and looking at the SDK code I've come up with two ways to test endpoints within python:
1. Using webtest + testbed to test the SPI side
You are on the right track with webtest, but just need to make sure you correctly transform your requests for the SPI endpoint.
The Cloud Endpoints API front-end and the EndpointsDispatcher in dev_appserver transform calls to /_ah/api/* into corresponding "backend" calls to /_ah/spi/*. The transformation seems to be:
All calls are application/json HTTP POSTs (even if the REST endpoint is something else).
The request parameters (path, query and JSON body) are all merged together into a single JSON body message.
The "backend" endpoint uses the actual python class and method names in the URL, e.g. POST /_ah/spi/TestEndpoint.insert_message will call TestEndpoint.insert_message() in your code.
The JSON response is only reformatted before being returned to the original client.
This means you can test the endpoint with the following setup:
import endpoints
import webtest
from google.appengine.ext import testbed
# ...

def setUp(self):
    tb = testbed.Testbed()
    # needed because endpoints expects a . in this value
    tb.setup_env(current_version_id='testbed.version')
    tb.activate()
    tb.init_all_stubs()
    self.testbed = tb

def tearDown(self):
    self.testbed.deactivate()

def test_endpoint_insert(self):
    app = endpoints.api_server([TestEndpoint], restricted=False)
    testapp = webtest.TestApp(app)
    msg = {...}  # a dict representing the message object expected by insert
                 # (serialised to JSON by webtest)
    resp = testapp.post_json('/_ah/spi/TestEndpoint.insert', msg)
    self.assertEqual(resp.json, {'expected': 'json response msg as dict'})
The nice thing here is that you can easily set up appropriate fixtures in the datastore or other GAE services prior to calling the endpoint, so you can more fully assert the expected side effects of the call.
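For example, a fixture could be inserted straight into the stubbed datastore before the call (a sketch; the Greeting model and the list method are hypothetical, not from the original answer):

from google.appengine.ext import ndb

class Greeting(ndb.Model):  # hypothetical model, for illustration only
    content = ndb.StringProperty()

def test_endpoint_list(self):
    # the testbed datastore stub is active, so this put() stays isolated
    Greeting(content='hello').put()
    app = endpoints.api_server([TestEndpoint], restricted=False)
    testapp = webtest.TestApp(app)
    resp = testapp.post_json('/_ah/spi/TestEndpoint.list', {})
    # assert that the response reflects the fixture inserted above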
2. Starting the development server for a full integration test

You can start the dev server within the same Python environment using something like the following:
import sys
import os

import dev_appserver
sys.path[1:1] = dev_appserver._DEVAPPSERVER2_PATHS
from google.appengine.tools.devappserver2 import devappserver2
from google.appengine.tools.devappserver2 import python_runtime
# ...

def setUp(self):
    APP_CONFIGS = ['/path/to/app.yaml']
    python_runtime._RUNTIME_ARGS = [
        sys.executable,
        os.path.join(os.path.dirname(dev_appserver.__file__),
                     '_python_runtime.py')
    ]
    options = devappserver2.PARSER.parse_args([
        '--admin_port', '0',
        '--port', '8123',
        '--datastore_path', ':memory:',
        '--logs_path', ':memory:',
        '--skip_sdk_update_check',
        '--',
    ] + APP_CONFIGS)
    server = devappserver2.DevelopmentServer()
    server.start(options)
    self.server = server

def tearDown(self):
    self.server.stop()
Now you need to issue actual HTTP requests to localhost:8123 to run tests against the API, but you can again interact with the GAE APIs to set up fixtures, etc. This is obviously slow, as you're creating and destroying a new dev server for every test run.
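For instance, a raw request against the running server might look like this (a sketch; the path and payload are illustrative, not from the original answer):

import requests

resp = requests.post('http://localhost:8123/_ah/api/test/v1/test',
                     json={'message': 'hi'})  # illustrative endpoint and body
assert resp.status_code == 200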
At this point I use the Google API Python client to consume the API instead of building the HTTP requests myself:
import apiclient.discovery
# ...

def test_something(self):
    apiurl = 'http://%s/_ah/api/discovery/v1/apis/{api}/{apiVersion}/rest' \
        % self.server.module_to_address('default')
    service = apiclient.discovery.build('testendpoint', 'v1', apiurl)
    res = service.testresource().insert({... message ...}).execute()
    self.assertEquals(res, {... expected response as dict ...})
This is an improvement over testing with curl, as it gives you direct access to the GAE APIs to easily set up fixtures and inspect internal state. I suspect there is an even better way to do integration testing that bypasses HTTP, by stitching together the minimal components in the dev server that implement the endpoint dispatch mechanism, but that requires more research time than I have right now.
The webtest setup can be simplified to reduce naming bugs.

For the following TestApi:
import logging

import endpoints
from protorpc import messages, remote

class ResponseMessageClass(messages.Message):
    message = messages.StringField(1)

class RequestMessageClass(messages.Message):
    message = messages.StringField(1)

@endpoints.api(name='testApi', version='v1',
               description='Test API',
               allowed_client_ids=[endpoints.API_EXPLORER_CLIENT_ID])
class TestApi(remote.Service):

    @endpoints.method(RequestMessageClass,
                      ResponseMessageClass,
                      name='test',
                      path='test',
                      http_method='POST')
    def test(self, request):
        logging.info(request.message)
        return ResponseMessageClass(message="response message")
The tests.py should look like this:
import webtest
import logging
import unittest

import endpoints
from google.appengine.ext import testbed
from protorpc.remote import protojson

from api.test_api import TestApi, RequestMessageClass, ResponseMessageClass

class AppTest(unittest.TestCase):
    def setUp(self):
        logging.getLogger().setLevel(logging.DEBUG)
        tb = testbed.Testbed()
        tb.setup_env(current_version_id='testbed.version')
        tb.activate()
        tb.init_all_stubs()
        self.testbed = tb

    def tearDown(self):
        self.testbed.deactivate()

    def test_endpoint_testApi(self):
        application = endpoints.api_server([TestApi], restricted=False)
        testapp = webtest.TestApp(application)
        req = RequestMessageClass(message="request message")
        response = testapp.post('/_ah/spi/' + TestApi.__name__ + '.' + TestApi.test.__name__,
                                protojson.encode_message(req),
                                content_type='application/json')
        res = protojson.decode_message(ResponseMessageClass, response.body)
        self.assertEqual(res.message, 'response message')

if __name__ == '__main__':
    unittest.main()
I tried everything I could think of to allow these to be tested in the normal way. I tried hitting the /_ah/spi methods directly, and even tried creating a new protorpc app using service_mappings, to no avail. I'm not a Googler on the endpoints team, so maybe they have something clever to make this work, but it doesn't appear that simply using webtest will work (unless I missed something obvious).
In the meantime, you can write a test script that starts the App Engine test server with an isolated environment and just issues HTTP requests to it.

Example of running the server with an isolated environment (bash, but you can easily run this from Python):
DATA_PATH=/tmp/appengine_data
if [ ! -d "$DATA_PATH" ]; then
mkdir -p $DATA_PATH
fi
dev_appserver.py --storage_path=$DATA_PATH/storage --blobstore_path=$DATA_PATH/blobstore --datastore_path=$DATA_PATH/datastore --search_indexes_path=$DATA_PATH/searchindexes --show_mail_body=yes --clear_search_indexes --clear_datastore .
You can then just use requests to test, as you would with curl:
requests.get('http://localhost:8080/_ah/...')
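For example, exercising the TestApi defined above might look like this (a sketch; the URL assumes the default port and the api name/path declared earlier):

import requests

resp = requests.post('http://localhost:8080/_ah/api/testApi/v1/test',
                     json={'message': 'request message'})
assert resp.status_code == 200
assert resp.json()['message'] == 'response message'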
If you don't want to test the full HTTP stack as described by Ezequiel Muns, you can also just mock out endpoints.method and test your API definition directly:
import unittest

from google.appengine.api.users import User
import endpoints

def null_decorator(*args, **kwargs):
    def decorator(method):
        def wrapper(*args, **kwargs):
            return method(*args, **kwargs)
        return wrapper
    return decorator

# The decorator needs to be mocked out before you load your endpoint
# API definitions.
endpoints.method = null_decorator
from mymodule import api

class FooTest(unittest.TestCase):
    def setUp(self):
        self.api = api.FooService()

    def test_bar(self):
        # pass protorpc messages directly
        self.api.foo_bar(api.MyRequestMessage(some='field'))
My solution uses one dev_appserver instance for the entire test module, which is faster than restarting the dev_appserver for each test method.
By using Google's Python API client library, I also get the simplest and at the same time most powerful way of interacting with my API.
import unittest
import sys
import os

from apiclient.discovery import build

import dev_appserver
sys.path[1:1] = dev_appserver.EXTRA_PATHS
from google.appengine.tools.devappserver2 import devappserver2
from google.appengine.tools.devappserver2 import python_runtime

server = None

def setUpModule():
    # start a dev_appserver instance for testing
    path_to_app_yaml = os.path.normpath('path_to_app_yaml')
    app_configs = [path_to_app_yaml]
    python_runtime._RUNTIME_ARGS = [
        sys.executable,
        os.path.join(os.path.dirname(dev_appserver.__file__),
                     '_python_runtime.py')
    ]
    options = devappserver2.PARSER.parse_args(['--port', '8080',
                                               '--datastore_path', ':memory:',
                                               '--logs_path', ':memory:',
                                               '--skip_sdk_update_check',
                                               '--',
                                               ] + app_configs)
    global server
    server = devappserver2.DevelopmentServer()
    server.start(options)

def tearDownModule():
    # shut down the dev_appserver instance after testing
    server.stop()

class MyTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # build a service object for interacting with the api
        # dev_appserver must be running and listening on port 8080
        api_root = 'http://127.0.0.1:8080/_ah/api'
        api = 'my_api'
        version = 'v0.1'
        discovery_url = '%s/discovery/v1/apis/%s/%s/rest' % (api_root, api,
                                                             version)
        cls.service = build(api, version, discoveryServiceUrl=discovery_url)

    def setUp(self):
        # create a parent entity and store its key for each test run
        body = {'name': 'test parent'}
        response = self.service.parent().post(body=body).execute()
        self.parent_key = response['parent_key']

    def test_post(self):
        # test my post method
        # the tested method also requires a path argument "parent_key"
        # .../_ah/api/my_api/sub_api/post/{parent_key}
        body = {'SomeProjectEntity': {'SomeId': 'abcdefgh'}}
        parent_key = self.parent_key
        req = self.service.sub_api().post(body=body, parent_key=parent_key)
        response = req.execute()

    # etc.
After digging through the sources, I believe things have changed in endpoints since Ezequiel Muns's (excellent) answer in 2014. For method 1 you now need to request from /_ah/api/* directly and use the correct HTTP method instead of using the /_ah/spi/* transformation. This makes the test file look like this:
import endpoints
import webtest
from google.appengine.ext import testbed
# ...

def setUp(self):
    tb = testbed.Testbed()
    # Setting current_version_id doesn't seem necessary anymore
    tb.activate()
    tb.init_all_stubs()
    self.testbed = tb

def tearDown(self):
    self.testbed.deactivate()

def test_endpoint_insert(self):
    app = endpoints.api_server([TestEndpoint])  # restricted is no longer required
    testapp = webtest.TestApp(app)
    msg = {...}  # a dict representing the message object expected by insert
                 # (serialised to JSON by webtest)
    resp = testapp.post_json('/_ah/api/test/v1/insert', msg)
    self.assertEqual(resp.json, {'expected': 'json response msg as dict'})
For searching's sake: the symptom of using the old method is endpoints raising a ValueError with "Invalid request path: /_ah/spi/whatever". Hope that saves someone some time!
I am developing an API using Django-TastyPie.
What does the API do?

It checks whether two or more requests have arrived at the server; if so, it swaps the data of the two requests and returns a JSON response after a 7-second delay.

What I need to do is send multiple asynchronous requests to the server to test this API.

I am using the Django unit test framework along with TastyPie to test this functionality.

Problem

The Django development server is single-threaded, so it does not support asynchronous requests.
Solution tried:
I have tried to solve this by using multiprocessing:
import multiprocessing
import urllib2

class MatchResourceTest(ResourceTestCase):
    def setUp(self):
        super(MatchResourceTest, self).setUp()
        self.user = ""
        self.user_list = []
        self.thread_list = []

        # Create and get user
        self.assertHttpCreated(self.api_client.post('/api/v2/user/', format='json',
                                                    data={'username': '123456', 'device': 'abc'}))
        self.user_list.append(User.objects.get(username='123456'))

        # Create and get other_user
        self.assertHttpCreated(self.api_client.post('/api/v2/user/', format='json',
                                                    data={'username': '456789', 'device': 'xyz'}))
        self.user_list.append(User.objects.get(username='456789'))

    def get_credentials(self):
        return self.create_apikey(username=self.user.username, api_key=self.user.api_key.key)

    def get_url(self):
        resp = urllib2.urlopen(self.list_url).read()
        self.assertHttpOK(resp)

    def test_get_list_json(self):
        for user in self.user_list:
            self.user = user
            self.list_url = 'http://127.0.0.1:8000/api/v2/match/?name=hello'
            t = multiprocessing.Process(target=self.get_url)
            t.start()
            self.thread_list.append(t)

        for t in self.thread_list:
            t.join()

        print ContactCardShare.objects.all()
Please suggest any solution for testing this API by sending asynchronous requests, or any app, library, or anything else that allows the Django development server to handle multiple requests asynchronously.
As far as I know, Django's development server is multi-threaded.

I'm not sure this test is formatted correctly, though. The setUp shouldn't include tests itself; it should be foolproof data insertion by creating entries directly. The POST should have its own test.
See the tastypie docs for an example test case.
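A rough sketch of that restructuring (assuming the question's custom User model with a device field; the import path and field names are taken from the question, not verified):

from tastypie.test import ResourceTestCase

from myapp.models import User  # hypothetical path to the question's User model

class UserResourceTest(ResourceTestCase):
    def setUp(self):
        super(UserResourceTest, self).setUp()
        # plain data insertion, no assertions here
        self.user = User.objects.create(username='123456', device='abc')

    def test_post_user(self):
        # the POST gets its own dedicated test and assertion
        resp = self.api_client.post('/api/v2/user/', format='json',
                                    data={'username': '456789', 'device': 'xyz'})
        self.assertHttpCreated(resp)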
I am developing an application which uses a series of REST calls to retrieve data. I have the basic application logic complete, and the structure for data retrieval is roughly as follows:
1) The initial data call is completed.
2) For each response in the initial call, a subsequent data call is performed against a REST service requiring basic authentication.
Performing these calls in sequential order can add up to a long wait time for the end user, so I am trying to implement threading to speed up the process (being IO-bound makes this an ideal candidate for threading). The problem is that I am having trouble with the authentication on the threaded calls.
If I perform the calls sequentially then everything works fine, but if I set it up with the threaded approach I end up with 401 authentication errors or 500 internal server errors from the server.
I have talked to the REST service admins, and they know of nothing that would prevent concurrent connections from the same user on the server end, so I am wondering if this is an issue on the urllib2 side.
Does anyone have any experience with this?
EDIT:
While I am unable to post the exact code I will post a reasonable representation of what I am doing with very similar structure.
import threading
import urllib2

class UrlThread(threading.Thread):
    def __init__(self, data):
        threading.Thread.__init__(self)
        self.data = data

    def run(self):
        password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
        password_manager.add_password(None, 'https://url/to/Rest_Svc/', 'uid', 'passwd')
        auth_manager = urllib2.HTTPBasicAuthHandler(password_manager)
        opener = urllib2.build_opener(auth_manager)
        urllib2.install_opener(opener)
        option = self.data[0]
        urlToOpen = 'https://url/to/Rest_Svc/?option=' + option
        rawData = urllib2.urlopen(urlToOpen)
        wsData = rawData.readlines()
        if wsData:
            print('success')

# firstCallRows is a list of lists containing the data returned
# from the initial call I mentioned earlier.
thread_list = []

for row in firstCallRows:
    t = UrlThread(row)
    t.setDaemon(True)
    t.start()
    thread_list.append(t)

for thread in thread_list:
    thread.join()
With Requests you could do something like this:
from requests import session, async

auth = ('username', 'password')
url = 'http://example.com/api/'
options = ['foo1', 'foo2', 'foo3']

s = session(auth=auth)
rs = [async.get(url, params={'option': opt}, session=s) for opt in options]
responses = async.imap(rs)

for r in responses:
    print r.text
Relevant documentation:
Sessions
Asynchronous requests
Basic authentication
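For reference, the async module was later split out of Requests (it lives on as grequests). A hedged sketch of the same fan-out on current Requests versions, using a thread pool and a shared Session (generally considered safe for simple concurrent GETs, though see the thread-safety caveats discussed earlier):

from concurrent.futures import ThreadPoolExecutor

import requests

auth = ('username', 'password')
url = 'http://example.com/api/'
options = ['foo1', 'foo2', 'foo3']

session = requests.Session()
session.auth = auth

def fetch(option):
    # each worker reuses the shared session's pooled connections
    return session.get(url, params={'option': option})

with ThreadPoolExecutor(max_workers=3) as pool:
    for response in pool.map(fetch, options):
        print(response.text)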