Python Webserver: How to serve requests asynchronously

I need to create a python middleware that will do the following:
a) Accept http get/post requests from multiple clients.
b) Modify and Dispatch these requests to a backend remote application (via socket communication). I do not have any control over this remote application.
c) Receive processed results from backend application and return these results back to the requesting clients.
Now the clients are expecting a synchronous request/response scenario. But the backend application is not returning the results synchronously. That is, some requests take much longer to process than others. Hence,
Client 1 : send http request C1 --> get response R1
Client 2 : send http request C2 --> get response R2
Client 3 : send http request C3 --> get response R3
Python middleware receives them in some order: C2, C3, C1. Dispatches them in this order to backend (as non-http messages). Backend responds with results in mixed order R1, R3, R2. Python middleware should package these responses back into http response objects and send the response back to the relevant client.
Is there any sample code to program this sort of behavior? There seem to be something like 20 different web frameworks for Python, and I'm confused as to which one would be best for this scenario. I would prefer something as lightweight as possible (I would consider Django too heavy). I tried bottle, but I am not sure how to go about programming that for this scenario.
================================================
Update (based on discussions below): Requests have a request id. Responses have a response id (which should match the request id that they correspond to). There is only one socket connection between the middleware and the remote backend application. While we can maintain a {request_id : ip_address} dictionary, the issue is how to construct an HTTP response object and send it to the correct client. I assume threading might solve this problem, where each thread maintains its own response object.

Screw frameworks. This is exactly the kind of task for asyncore. This module allows event-based network programming: given a set of sockets, it calls back given handlers when data is ready on any of them. That way, threads are not necessary just to dumbly wait for data on one socket to arrive and painfully pass it to another thread. You would have to implement the HTTP handling yourself, but examples of that can be found. Alternatively, you could use the async feature of uwsgi, which would allow your application to be integrated with an existing webserver. That does not integrate with asyncore by default, though it wouldn't be hard to make it work. It depends on your specific needs.

Quoting your comment:
The middleware uses a single persistent socket connection to the backend. All requests from middleware are forwarded via this single socket. Clients do send a request id along with their requests. Response id should match the request id. So the question remains: How does the middleware (web server) keep track of which request id belonged to which client? I mean, is there any way for a cgi script in middleware to create a db of tuples like and once a response id matches, then send a http response to clientip:clienttcpport ?
Is there any special reason for doing all this processing in a middleware? You should be able to do all this in a decorator, or somewhere else, if more appropriate.
Anyway, you need to maintain a global concurrent dictionary (extend dict and protect it using threading.Lock). Upon a new request, store the given request-id as key, and associate it to the respective client (sender). Whenever your backend responds, retrieve the client from this dictionary, and remove the entry so it doesn't accumulate forever.
UPDATE: someone already extended the dictionary for you - check this answer.
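As a rough illustration of the lock-protected dictionary described above, here is a minimal sketch using only the standard library. The class name PendingRequests and its methods are made up for this example. It assumes each client request is handled in its own thread, and that one separate thread reads responses from the backend socket.

```python
import threading


class PendingRequests:
    """Hypothetical helper: maps request ids to waiting handler threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiting = {}  # request_id -> [Event, response]

    def register(self, request_id):
        """Call before forwarding the request to the backend."""
        with self._lock:
            self._waiting[request_id] = [threading.Event(), None]

    def resolve(self, response_id, response):
        """Call from the thread that reads the backend socket."""
        with self._lock:
            entry = self._waiting.get(response_id)
        if entry is None:
            return False  # unknown id, or the client already gave up
        entry[1] = response
        entry[0].set()  # wake the handler thread waiting on this id
        return True

    def wait(self, request_id, timeout=None):
        """Block the client's handler thread until its response arrives."""
        with self._lock:
            entry = self._waiting[request_id]
        arrived = entry[0].wait(timeout)
        with self._lock:
            # Remove the entry so the table does not accumulate forever.
            self._waiting.pop(request_id, None)
        return entry[1] if arrived else None
```

The handler thread would call register(), write the message to the backend socket, then call wait() and turn the returned value into the HTTP response. Because every handler thread still owns its client connection, there is no need to store the client's IP address and port.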

Ultimately you're going from the synchronous HTTP request-response protocol used by your clients to an asynchronous queuing/messaging protocol with your backend. So you have two choices: (1) make each request wait until the backend has no outstanding work, then process one at a time, or (2) write something that marries the backend responses with their associated requests (using a dictionary of requests or something similar).
One way might be to run your server in one thread while dealing with your backend in another (see... Run Python HTTPServer in Background and Continue Script Execution) or maybe look at aiohttp (https://docs.aiohttp.org/en/v0.12.0/web.html)
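The second choice (marrying responses to requests) could look roughly like the sketch below. It uses asyncio, the standard-library event loop that aiohttp is built on. The names BackendLink, on_backend_message and _fake_backend are invented for this example, and the backend is simulated with a random delay rather than a real socket.

```python
import asyncio
import random


class BackendLink:
    """Hypothetical stand-in for the single connection to the backend."""

    def __init__(self):
        self._pending = {}  # request_id -> Future awaiting the response
        self._tasks = set()  # keeps references to the simulated backend tasks

    async def request(self, request_id, payload):
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        # Real code would write the message to the backend socket here.
        task = asyncio.create_task(self._fake_backend(request_id, payload))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        try:
            return await future  # suspends only this client's coroutine
        finally:
            self._pending.pop(request_id, None)

    def on_backend_message(self, response_id, body):
        """Would be called by the socket reader for each backend message."""
        future = self._pending.get(response_id)
        if future is not None and not future.done():
            future.set_result(body)

    async def _fake_backend(self, request_id, payload):
        # Simulates a backend that finishes requests in arbitrary order.
        await asyncio.sleep(random.uniform(0.01, 0.05))
        self.on_backend_message(request_id, "result for " + payload)


async def main():
    link = BackendLink()
    ids = ["C1", "C2", "C3"]
    return await asyncio.gather(*(link.request(i, i) for i in ids))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Each HTTP handler would await link.request(...) and build its response from the result, so responses arriving in mixed order still reach the right client.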

Related

How to pass request over Django channels WebSocket and call Django view

I'm working on a single page application with Django, and would like to use WebSockets, and therefore Channels. To keep things simple, I think I want to handle all server communication over a WebSocket alone, rather than adding XHR (XML HTTP Request) into the mix. I'm using channels from the get-go since there will be a lot of data pushed from the server to the client asynchronously.
With regular Django, a conventional request made to https://example.com/login or https://example.com/logout or whatever and the Django URL router will decide what view to send it to. Instead, I would like to have the user perform their action in the client, handle it with Javascript, and use the WebSocket to send the request to the server. Since I'm using Django-allauth, I would like to use the provided Django views to handle things like authentication. The server would then update the client with the necessary state information from the view.
My question: how can I process the data received over the WebSocket and submit the HTTP request to the Django view? My channels consumer would then take the rendered HTML and send it back to the client to update the page or section.
I can picture what would happen using XHR, but I'm trying to avoid mixing the two, unless someone can point out the usefulness in using XHR plus WebSockets...? I suppose another option is to use XHR for authentication and other client initiated requests, and use the WebSocket for asynchronously updating the client. Does this make any sense at all?
Update: It occurs to me that I could use requests from PyPI, and make a sync_to_async call to localhost using credentials I received over the WebSocket. However, this would require me to then handle the session data and send it back to the client. This seems like a lot more work. That said, I could maintain the sessions themselves on the server and just associate them with the WebSocket connection itself. Since I'm using a secure WebSocket (wss://), is there any possibility of hijacking the WebSocket connection?

Check out this project that gives the ability to process a channels websocket request using Django Rest Framework views. You can try to adapt it to a normal Django view.
EDIT: I am quoting the following part of the DCRF docs in response to @hobs's comments:
Using your normal views over a websocket connection
from djangochannelsrestframework.consumers import view_as_consumer
application = ProtocolTypeRouter({
    "websocket": AuthMiddlewareStack(
        URLRouter([
            url(r"^front(end)/$", view_as_consumer(YourDjangoView)),
        ])
    ),
})
In this situation, if your view needs to read the GET query string values, you can provide these using the query option. And if the view method reads parameters from the URL, you can provide these with the parameters.
@hobs if you have a problem with the naming of the package or the functionality is not working as intended, please take it up with the developers on GitHub using their issue tracker.

Pending requests with Python's SimpleHTTPServer

I'm making an anonymous chat application, similar to Omegle. Instead of using sockets, my approach is to use a REST API, but with a bit of a twist. When a user makes a request, such as POST /search (find a match), the request is held by the server until another user sends a POST /search. Once two people have done this, both requests are responded to, which lets the client know to switch to a chat page. This is also done with a pending GET /events request, which is only responded to by the server if there are any new events to be sent.
This works very well in theory with the flow of this application; however, since I'm using SimpleHTTPServer, which is a very basic library, requests are not handled asynchronously. This means that if I block one request until its information requirements are fulfilled, no other requests can be accepted. For this kind of project I really don't want to take the time to learn an entirely new library/sub-language for asynchronous request handling, so I'm trying to figure out how I can do this.
import time

def waitForMatch(client):
    # if no other clients available to match:
    if not pendingQueue:
        # client added to pending queue
        pendingQueue.append(client)
        client["pending"] = True
        while client["pending"]:
            time.sleep(1)
        # break out and let the other client's waitForMatch call create the chatsession
        return
    # now there's a client available to match
    otherClient = pendingQueue.pop()
    otherClient["pending"] = False
    # chat session created
    createChatSession(otherClient, client)
This is the code I currently have, which won't work with non-async requests.
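One possible direction, sketched under assumptions rather than taken from the post: Python 3's standard library has http.server.ThreadingHTTPServer, which runs each request in its own thread, so one blocked request no longer stops the others. With threads in place, the sleep loop can be swapped for a threading.Event. The Matchmaker class below is a hypothetical helper and deliberately leaves out the HTTP handler and createChatSession.

```python
import threading


class Matchmaker:
    """Hypothetical helper: pairs callers, blocking each until matched."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiting = None  # (client, event, slot) of the caller in line

    def wait_for_match(self, client, timeout=None):
        with self._lock:
            if self._waiting is not None:
                # Someone is already waiting: pair up and wake them.
                other, event, slot = self._waiting
                self._waiting = None
                slot["partner"] = client
                event.set()
                return other
            # Nobody is waiting yet: become the pending caller.
            event, slot = threading.Event(), {}
            self._waiting = (client, event, slot)
        # Block outside the lock so other request threads can get in.
        if event.wait(timeout):
            return slot["partner"]
        with self._lock:
            if self._waiting is not None and self._waiting[1] is event:
                self._waiting = None  # timed out with no partner
                return None
        # A partner arrived just as the timeout fired.
        return slot.get("partner")
```

A request handler's do_POST would call wait_for_match() and send its response once the call returns, with None meaning the wait timed out.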

Flask request waiting for asynchronous background job

I have an HTTP API using Flask and in one particular operation clients use it to retrieve information obtained from a 3rd party API. The retrieval is done with a celery task. Usually, my approach would be to accept the client request for that information and return a 303 See Other response with an URI that can be polled for the response as the background job is finished.
However, some clients require the operation to be done in a single request. They don't want to poll or follow redirects, which means I have to run the background job synchronously, hold on to the connection until it's finished, and return the result in the same response. I'm aware of Flask streaming, but how can I do such long-polling with Flask?

Tornado would do the trick.
Flask is not designed for asynchronous request handling. A single-threaded Flask worker processes one request at a time. Therefore, while you hold the connection open, that worker will not proceed to the next request.

of tornado and blocking code

I am trying to move away from CherryPy for a web service that I am working on and one alternative that I am considering is Tornado. Now, most of my requests look on the backend something like:
get POST data
see if I have it in cache (database access)
if not make multiple HTTP requests to some other web service which can take even a good few seconds depending on the number of requests
I keep hearing that one should not block the Tornado main loop. If all of the above code is executed in the post() method of a RequestHandler, does this mean that I am blocking the main loop? And if so, what's the appropriate approach to using Tornado with the above requirements?

Tornado comes shipped with an asynchronous (actually two iirc) http client (AsyncHTTPClient). Use that one if you need to do additional http requests.
The database lookup should also be done using an asynchronous client in order to not block the Tornado ioloop/mainloop. I know there are a couple of database clients tailor-made for Tornado (e.g. for Redis and MongoDB) out there. The mysql lib is included in the tornado distribution.
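Where no asynchronous client exists for a given database or service, a common fallback is to push the blocking call onto a thread pool so the loop stays free. This is a different technique from the async clients recommended above. The sketch uses plain asyncio, which Tornado 5.0 and later run on. The function blocking_lookup is a hypothetical placeholder for a blocking driver call.

```python
import asyncio
import time


def blocking_lookup(key):
    """Hypothetical blocking call, e.g. a driver without async support."""
    time.sleep(0.2)
    return "value for " + key


async def handle_request(key):
    loop = asyncio.get_running_loop()
    # Runs in the default thread pool, so the event loop keeps serving
    # other requests while this lookup is in progress.
    return await loop.run_in_executor(None, blocking_lookup, key)


async def main():
    # Three lookups overlap instead of running one after another.
    return await asyncio.gather(*(handle_request(k) for k in ("a", "b", "c")))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Inside a Tornado RequestHandler, the same await could sit in an async def post() method. A true async client is still preferable, because a thread pool caps concurrency at its number of worker threads.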

Bind arbitrary Python objects to CherryPy sessions

I'm using CherryPy to make a web-based frontend for SymPy that uses an asynchronous process library on the server side to allow for processing multiple requests at once without waiting for each one to complete. To allow the frontend to function as expected, I am using one process for the entirety of each session. The client-side JavaScript sends the session id from the cookie to the server when the user submits a request. The server side currently uses a pair of lists, storing instances of a controller class in one and the corresponding session ids in the other, and creates a new interpreter proxy and sends the input if a nonexistent session id is submitted. The only problem with this is that the proxy classes are not deleted upon the expiration of their corresponding sessions. Also, I can't see any way to retrieve the session id for which the current request is being served.
My questions about all this are: is there any way to "connect" an arbitrary object to a CherryPy session so that it gets deleted upon session expiration, is there something I am overlooking here that would greatly simplify things, and does CherryPy's multi-threading negate the problem of synchronous reading of the stdout filehandle from the child process?

You can create your own session type, derived from CherryPy's base session. Use its clean_up method to do your cleanup.
Look at cherrypy/lib/sessions.py for details and sample session implementations.
