FastAPI with request queue

I have developed an application that takes an image and does some heavy processing on the GPU. The problem is that if one request is still being processed on the GPU when another image-processing request reaches the server, an error occurs related to how the GPU is used. I therefore want the server to process requests sequentially, that is, to queue them: a new request should not start executing until the previous one has completed. How can this be implemented?
I have read about Celery and message brokers like RabbitMQ, but I don't fully understand whether they should be used in my case.
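One possible way to get "one request at a time" inside a single server process is an asyncio.Lock around the GPU call. The following is only a minimal sketch under assumptions: run_on_gpu and GpuGate are hypothetical names, and the GPU work is replaced by a short sleep.

```python
import asyncio
import time


def run_on_gpu(image_id):
    """Hypothetical stand-in for the blocking GPU routine."""
    time.sleep(0.01)
    return f"processed-{image_id}"


class GpuGate:
    """Lets only one job at a time reach the GPU; the rest wait in order."""

    def __init__(self):
        self.lock = asyncio.Lock()
        self.active = 0   # jobs currently inside the GPU section
        self.peak = 0     # highest value `active` ever reached

    async def submit(self, image_id):
        async with self.lock:  # later requests are suspended here
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                loop = asyncio.get_running_loop()
                # Run the blocking call in a worker thread so the event
                # loop can keep accepting new requests while it runs.
                return await loop.run_in_executor(None, run_on_gpu, image_id)
            finally:
                self.active -= 1


async def main():
    gate = GpuGate()  # created inside the running event loop
    # Five "requests" arrive at once; the gate runs them one by one.
    results = await asyncio.gather(*(gate.submit(i) for i in range(5)))
    return results, gate.peak


results, peak = asyncio.run(main())
print(results, peak)
```

In a FastAPI app, an async endpoint could await gate.submit(...) in the same way. Note that a lock like this only serializes work within one server process; if the server runs several worker processes, an external queue such as Celery with a single worker would be needed instead.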

Related

How to handle high response time

There are two different services. One service, Django, receives the request from the front end and then calls an API in the other service, Flask.
But the response time of the Flask service is high, and if the user navigates to another page the request is canceled.
Should it be a background task or a pub/sub pattern? If so, how can I run it in the background and then tell the user, "here is your latest result"?

You have two main options:
1) Make an initial request to a "simple" Django view that loads a skeleton HTML page with a spinner, where some JS triggers an XHR request to a second Django view that makes the call to the other service (Flask). This way you can properly warn your user that loading takes time and handle leaving the page on the browser side (ask for confirmation before leaving, abort the request, and so on).
2) If possible, cache the result of the Flask service so you don't need to call it on each page load.
You can combine these two solutions by calling the service in an asynchronous request and caching its result (depending on context, you may need to vary the cache, for example by the logged-in user).
The first solution can also be implemented with pub/sub, WebSockets, or similar, but a classic XHR seems fine for your case.

On our project, we have a couple of time-expensive endpoints. Our solution was similar to the previous answer:
Once we receive a request, we call a Celery task that does the expensive work asynchronously. We do not wait for its result and return a quick response to the user. The Celery task sends its progress and results to the user via WebSockets, and the frontend handles these WS messages. The benefit of this approach is that the work does not consume the CPU of our backend; it uses the CPU of the Celery worker, which runs on another machine.

Multi-client web-cam access and stream processing from python server in real-time

I am building a web application that takes a webcam stream from the client via their browser and sends it in real time to a Python server (probably Flask), which processes the frames in real time and sends a response to the user. The backend has to be capable of handling streams from multiple clients simultaneously.
I am trying to grasp the framework for this entire application. What I have in mind is the following:
The user accesses the webcam via their browser (e.g. using WebcamJS), and the frames are sent from the frontend to the backend through a WebSocket. The task here is to establish a seamless handshake between the multiple clients and their processing requests.
Concurrency is needed if the processing is to be done in real time: multiple instances of the same image-processing algorithm need to run at once. My plan is to use multiple threads for this; is there a better way? Is this even a feasible approach, given that the image-processing algorithm (a trained model) takes some time to load, so it has to stay initialized on the backend rather than start from scratch on every request?
The response from the image-processing algorithm needs to get back to the frontend, and the process goes on.
What I really need help with is drawing out the complete framework of this implementation. Any suggestions on the modules/frameworks to use, with some example implementations, would be greatly appreciated.
Thank you.

You can use Flask for your web server and Keras to process the video frames.
The standard library multiprocessing module will also be helpful for handling multiple feeds at once.
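To illustrate the "load the model once, then reuse it for every frame" idea from the question, here is a rough sketch using a worker pool with an initializer. It assumes a hypothetical load_model function (a real one might load a Keras model) and uses threads for simplicity; concurrent.futures.ProcessPoolExecutor accepts the same initializer argument if separate processes are preferred.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_worker = threading.local()   # per-worker storage for the loaded model
_count_lock = threading.Lock()
load_count = 0                # how many times the model was loaded


def load_model():
    """Hypothetical stand-in for a slow model load."""
    global load_count
    with _count_lock:
        load_count += 1
    return lambda frame: frame.upper()   # placeholder "inference"


def init_worker():
    # Runs once in each worker when it starts, not once per frame.
    _worker.model = load_model()


def process_frame(frame):
    # Reuses the model that this worker loaded at start-up.
    return _worker.model(frame)


frames = ["client-a:frame-1", "client-b:frame-1", "client-a:frame-2"]
with ThreadPoolExecutor(max_workers=2, initializer=init_worker) as pool:
    outputs = list(pool.map(process_frame, frames))

print(outputs, load_count)
```

The model is loaded at most once per worker, however many frames arrive, which is the property the question asks for.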

Trouble with understanding how RabbitMQ can be used

I'm currently working on a Python web app that needs to use RabbitMQ.
The app is structured like this:
The client connects to an HTTP server.
The connection is sent to a message queue that is connected to the main service of my app.
The main service receives the message and gives the user their information.
I understand how to get RabbitMQ working using the documentation and tutorials on the website, but I have trouble seeing how it can work with real tasks like displaying a web page or printing a file. How will my service, connected to the message queue, read the received message and decide: "oh, I'm going to display this web page"?
Sorry if this is confusing; if you need further explanation of what I'm trying to achieve, just tell me.
Thanks for reading!

RabbitMQ is good for sending messages to a service that executes long-running processes, e.g. downloading a big file or generating a complex animation. A web server can't (or shouldn't) execute long-running processes itself.
The web page sends a message to RabbitMQ (e.g. with the parameters for the long-running process) and gets a unique number. When the service has a free worker, it checks whether there is a new message in the queue, takes it (with the unique number) and starts the worker. When the worker finishes the job, the service sends the result to RabbitMQ with the same unique number.
At the same time, the web page uses JavaScript to run a loop that periodically checks RabbitMQ for a result with this unique number. If there is no result yet, it may display a progress bar; if there is a result, it may display it.
Example: Celery - Distributed Task Queue.
Celery can use RabbitMQ to communicate with Django or Flask.
(but it can also use other brokers, e.g. Redis)
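The submit / work / poll cycle described above can be sketched with only the standard library. This is not RabbitMQ code: a queue.Queue stands in for the broker, a dict stands in for the stored results, and submit, worker and poll are hypothetical names chosen for illustration.

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue()            # stands in for the message queue
results = {}                    # finished results, keyed by unique id
results_lock = threading.Lock()


def submit(params):
    """Web side: enqueue the job and hand back its unique id."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, params))
    return job_id


def worker():
    """Service side: take jobs one by one and store each result."""
    while True:
        item = jobs.get()
        if item is None:         # sentinel: stop the worker
            break
        job_id, params = item
        outcome = sum(params)    # placeholder for the long-running job
        with results_lock:
            results[job_id] = outcome


def poll(job_id):
    """Web side: return the result, or None if it is not ready yet."""
    with results_lock:
        return results.get(job_id)


thread = threading.Thread(target=worker, daemon=True)
thread.start()

job_id = submit([1, 2, 3])
while poll(job_id) is None:      # the page would show a progress bar here
    time.sleep(0.01)
answer = poll(job_id)

jobs.put(None)                   # shut the worker down
thread.join()
print(answer)
```

With a real broker, the queue and the result store would live outside the web process, which is what lets the worker run on a different machine.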
Using Celery with Django.
Flask - Celery Background Tasks
From the Celery repo:
Celery is usually used with a message broker to send and receive messages.
The RabbitMQ, Redis transports are feature complete, but there's also
experimental support for a myriad of other solutions, including using
SQLite for local development.

Why do I need to use asynchronous tools for sending Django email

While using Django, I noticed that sending an email causes a delay, and to overcome this I had to use Celery. I want to know what is actually happening under the hood: why does Django/Python need an async tool to accomplish this task? I have learned about multithreading and multiprocessing in Python, but could somebody give me a brief idea of what exactly happens when Django sends an email without Celery?

Think of sending an email like sending a request. In a synchronous context the process goes as follows:
Send request
Wait for response..........
Receive response
The whole time you're waiting for the response, that thread cannot do anything else; it is tied up when it could be doing other work (such as serving other users' requests).
I'd like to make a distinction here between your usage of "asynchronous" and Celery.
Python's actual asynchronous implementation (asyncio) uses an "event loop" to dispatch and receive messages. The "waiting" is handed to the event loop, which asks the operating system to notify it when a response arrives and runs other tasks in the meantime. In this way the code that sent the request no longer blocks a thread while waiting; it is resumed by the event loop when the response is ready. This is a very rough description of how Python's async works. It won't necessarily make the whole process faster for the user unless a lot of emails are being sent.
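As a small illustration of the event loop overlapping several waits, here is a sketch in which send_email is a hypothetical coroutine and the network round trip is replaced by asyncio.sleep. The three waits run concurrently, so the total time is close to one wait rather than three.

```python
import asyncio
import time


async def send_email(recipient):
    # Stand-in for waiting on the mail server; while this coroutine is
    # suspended, the event loop is free to run the other ones.
    await asyncio.sleep(0.2)
    return f"sent to {recipient}"


async def main():
    recipients = ["a@example.com", "b@example.com", "c@example.com"]
    start = time.perf_counter()
    outcomes = await asyncio.gather(*(send_email(r) for r in recipients))
    elapsed = time.perf_counter() - start
    return outcomes, elapsed


outcomes, elapsed = asyncio.run(main())
print(outcomes, round(elapsed, 2))
```

For a single email the user still waits for the full round trip, which matches the point above that async alone does not necessarily make one request faster.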
Celery, on the other hand, is an asynchronous task queue, in which you have producers (your web application) sending messages, a broker (data store) which stores and distributes messages, and consumers (workers) which pull messages from the broker and process them. The consumers are a totally separate process (often on a totally separate server) from your web application, which frees your web application to focus on returning the response to the client as soon as possible. The process when sending emails through Celery looks like this:
The web application sends a message to the broker and returns the response to the user. Here's a JSON pseudo-message. (The broker actually stores the messages as either pickled objects or JSON.)
{
    "task": "my_app.send_email",
    "args": ["Subject Line", "Hello, World! This is your email contents", "to_email@example.com", "from_email@example.com"],
    "kwargs": {}
}
The Celery worker constantly checks the broker for new messages to process whenever it is not already processing one. Sometimes the Celery worker will pull in batches of messages so there is less overhead; this is configurable.
The Celery worker executes a function (defined by the "task" in the message), using the arguments and keyword arguments.
That is a very simple example of why you may want to use Celery to send emails: you can return the response to the user as fast as possible! It's also well suited to longer-running tasks, such as processing image thumbnails:
User uploads an image, which you store somewhere (Amazon S3 for example)
You send a message to the broker saying "execute my process_image_thumbnails task with the file's S3 URL as the argument"
You return the response to your user. It's nice and quick from the user's perspective.
A worker picks up the message, downloads the file from S3, and processes it into thumbnails of varying sizes.
As you use Celery for more use cases, you encounter new problems. For example, what do we do if someone requests the thumbnail while it's processing? I'll leave that to your imagination.
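To make the email flow above more concrete (the web app sends a message naming a task, and a worker looks the task up and calls it), here is a toy sketch using only the standard library. It is not Celery's API: the task decorator, enqueue and work_once are hypothetical helpers, and a queue.Queue plays the broker.

```python
import json
import queue

TASKS = {}               # task name -> function, filled in by @task
broker = queue.Queue()   # stands in for the message broker
sent = []                # records what the fake email task "sent"


def task(name):
    """Register a function under a name that messages can refer to."""
    def register(func):
        TASKS[name] = func
        return func
    return register


@task("my_app.send_email")
def send_email(subject, body, to_email, from_email):
    sent.append((to_email, subject))   # a real task would talk to SMTP
    return "ok"


def enqueue(name, *args, **kwargs):
    """Producer side: serialize the call as a JSON message."""
    message = {"task": name, "args": list(args), "kwargs": kwargs}
    broker.put(json.dumps(message))


def work_once():
    """Consumer side: pull one message and run the task it names."""
    message = json.loads(broker.get())
    func = TASKS[message["task"]]
    return func(*message["args"], **message["kwargs"])


enqueue("my_app.send_email", "Subject Line", "Hello, World!",
        "to_email@example.com", "from_email@example.com")
status = work_once()
print(status, sent)
```

In a real deployment enqueue would run in the web application and work_once in a separate worker process, with the broker in between.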

Long-running connection HTTP server (Python)

I am trying to design a web application that processes large quantities of large mixed-media files coming from asynchronous processes. Each process can take several minutes.
The files are either uploaded as a POST body or pulled by the web server according to a source URL provided. The files can be processed by a variety of external tools in a synchronous or asynchronous way.
I need to be able to load balance this application so I can process multiple large files simultaneously for as much as I can afford to scale.
I think Python is my best choice for this project, but besides that I am open to any solution. The app can either deliver the file back or rely on a messaging channel to notify the clients when processing is complete.
Some approaches I thought I might use:
1) Use a non-blocking web server such as Tornado that keeps the connection open until the file processing is done. The external processing command is launched and the web server waits until the file is ready and pipes the resulting IO stream directly back to the web app that returns it. Since the processes sending requests are asynchronous, they might afford to wait (unless memory or some other issues come up).
2) Use a regular web server like CherryPy (which I am more confident with) and have the web app use a messaging channel to report the processing progress. The web server returns an HTTP response as soon as it receives the file, validates it and sends it to a background process. At the same time it sends a message announcing that processing has started. The background process then takes care of delivering the file to an available location and sending another message to the channel with the location of the new file. This solution looks more flexible than 1), but requires writing a separate script to handle the messages outside the web application, as well as separate storage space for the temp files, which have to be cleaned up at some point.
3) Use some internal messaging capability of any of the web servers mentioned above, which I am not familiar with...
Edit: something like CherryPy's pub-sub engine (http://cherrypy.readthedocs.org/en/latest/extend.html?highlight=messaging#publish-subscribe-pattern) could be a good solution.
Any suggestions?
Thank you,
gm

I had a similar situation come up with a large-scale data processing engine that my team implemented. We wanted to build our API calls in Flask, some of which can take many hours to complete, but have a way to notify the user in real time about what is going on.
Basically, what I came up with was what you described as option 2. On the same machine where I serve the Flask app through Apache, I created a Tornado app that serves a WebSocket that reports progress to the end user. Once my main page is served, it establishes the WebSocket connection to the Tornado server, and the Flask app periodically sends updates to the Tornado app, which passes them down to the end user. Even if the browser is closed during the long-running operation, Apache keeps the request alive and processing, and upon logging back in, I can still see the current progress.
I wrote about this solution in some more detail here:
http://jonfeatherstone.com/2013/08/01/mongo-and-websockets-for-application-logging/
Good luck!
