My setup is Flask-SocketIO with a Flask-RESTful web server.
Eventlet is installed, so in production mode the eventlet web server is used.
I understand that Flask-SocketIO and the eventlet web server are themselves event-loop based.
Do Flask-SocketIO and the eventlet web server run on the same event loop (the same thread) or in two different threads?
I think you are confusing the terminology.
The event loop is the task scheduler. This is provided by eventlet, and a single event loop is used for the whole application, including the Flask and the Flask-SocketIO parts.
Each time a request arrives to the eventlet web server, it will allocate a new task for it. So basically each request (be it Flask or Flask-SocketIO, HTTP or WebSocket) will get its own task. Tasks are constantly being created and destroyed as requests are handled.
When you use eventlet, tasks are not threads, they are greenlets, that is why I avoided calling them threads above and used the more generic "task" term. They behave like threads in many ways, but they are not.
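As an illustration only (route and event names are made up, not part of the original question), here is a minimal sketch of a Flask + Flask-SocketIO app started with socketio.run(). With eventlet installed, a single process and a single event loop serve both kinds of requests, each in its own greenlet:

    from flask import Flask
    from flask_socketio import SocketIO, emit

    app = Flask(__name__)
    socketio = SocketIO(app)  # picks the eventlet async mode when eventlet is installed

    @app.route('/status')
    def status():
        # plain Flask HTTP request, handled in its own greenlet
        return 'OK'

    @socketio.on('ping')
    def on_ping():
        # Socket.IO event, also handled in a greenlet on the same event loop
        emit('pong')

    if __name__ == '__main__':
        # starts the eventlet web server: one process, one event loop
        socketio.run(app)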
Related
I'm working on a Flask app, trying to schedule a job that will be triggered 30 minutes after launch and will run only once.
I tried to work with threading.Timer, but since my job makes a REST request I get RuntimeError: 'Working outside of request context', which I just couldn't solve.
From this thread, I understand that it is not recommended using the threading module on a flask server:
How do you schedule timed events in Flask?
So I'm looking for a solution for a timed trigger job (which doesn’t work on intervals).
It looks like APScheduler must be interval based.
I would be grateful for any help.
The APScheduler add_job method can take a date trigger, which does exactly what you want.
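A minimal sketch, assuming APScheduler 3.x and an illustrative job function name; the date trigger fires the job exactly once at the given run_date:

    from datetime import datetime, timedelta
    from apscheduler.schedulers.background import BackgroundScheduler

    def call_rest_endpoint():
        # wrap the work in app.app_context() if it needs the Flask application context
        ...

    scheduler = BackgroundScheduler()
    scheduler.start()

    # run once, 30 minutes after launch
    scheduler.add_job(call_rest_endpoint, trigger='date',
                      run_date=datetime.now() + timedelta(minutes=30))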
Pro tips:
If you use APScheduler inside your Flask app process, when going into production with a WSGI server like gunicorn or uWSGI you will end up with your job being run multiple times (once per Flask worker).
When facing this issue the gunicorn --preload option didn't cut it for me.
So:
You can use Flask-APScheduler with its REST API approach if that suits you.
Or separate APScheduler into its own daemon process and either
use uWSGI mules,
or keep gunicorn running only the web app and use supervisor (or an equivalent) to start the scheduler daemon.
IMHO, separating gunicorn/Flask and APScheduler into two parts and using supervisor is the cleanest yet least complex solution.
I am writing a Gevent/Flask server in Python. Some of the requests my Flask app takes need to run in the background; there is an endpoint for the client to poll the server for the task's result.
If you search the wisdom of the Internet for the best way to do this, everybody seems to be in favor of setting up one or several worker processes such as Celery or RQ, with a message queue or store such as RabbitMQ or Redis.
My app is small and my deployment is modest. This seems like too much of a hassle for me. I already have cooperative multitasking with Gevent, so I thought I'd just create a greenlet to do the background work in-process, that is, within the Flask app process itself.
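Roughly what I have in mind (the endpoint names and the in-memory result store are just for illustration):

    import gevent
    from flask import Flask, jsonify

    app = Flask(__name__)
    results = {}  # in-memory result store, good enough for a single-process deployment

    def background_work(task_id):
        gevent.sleep(10)  # stands in for the real long-running work
        results[task_id] = 'done'

    @app.route('/tasks/<task_id>', methods=['POST'])
    def start_task(task_id):
        gevent.spawn(background_work, task_id)  # in-process greenlet
        return jsonify(status='started'), 202

    @app.route('/tasks/<task_id>')
    def poll_task(task_id):
        return jsonify(status=results.get(task_id, 'pending'))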
This is not the mainstream solution, so my question is: Am I missing something? What am I missing? Is there something in this solution that makes it particularly bad?
I'm currently researching websocket support in Python and am a bit confused with the offerings.
On one hand it's possible to use Flask + gevent. On the other hand, uWSGI has websocket support, and finally there is an extension that bundles both uWSGI and gevent.
What's the problem with implementing websockets with only one of these? What do I win by mixing them?
Changing the question
What does adding gevent do that threaded uwsgi won't?
In regular HTTP requests the connections between client and server are short-lived, a client connects to the server, sends a request, receives the response and then closes the connection. In this model the server can serve a large number of clients using a small number of workers. The concurrency model in this situation is typically based on threads, processes or a combination of both.
When you use websocket the problem is more complex, because a websocket connection is open for a long period of time, so the server cannot use a small pool of workers to serve a large number of clients, each client needs to get its own dedicated worker. If you use threads and/or processes then your app will not scale to support a large number of clients because you can't have large number of threads/processes.
This is where gevent enters the picture. Gevent has a concurrency model based on greenlets, which scale much better than threads/processes. So serving websocket connections with a gevent based server allows you support more clients, due to the lightweight nature of greenlets. With uWSGI you have a choice of concurrency models to use with web sockets, and that includes the greenlet based model from gevent. You can also use gevent's web server standalone if you want.
But note that gevent does not know anything about websockets; it is just a server. To accept websocket connections you have to add a websocket server implementation on top of it.
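As a rough sketch (one possible setup, not the only one), gevent's pywsgi server can be combined with gevent-websocket's handler, which exposes the upgraded connection to the WSGI app through environ['wsgi.websocket']:

    from gevent import pywsgi
    from geventwebsocket.handler import WebSocketHandler

    def application(environ, start_response):
        ws = environ.get('wsgi.websocket')  # present only for websocket requests
        if ws is not None:
            # echo messages back until the client disconnects
            while True:
                message = ws.receive()
                if message is None:
                    break
                ws.send(message)
            return []
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'regular HTTP response']

    server = pywsgi.WSGIServer(('0.0.0.0', 8000), application,
                               handler_class=WebSocketHandler)
    server.serve_forever()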
There are two extensions for Flask that simplify the use of websockets. The Flask-Sockets extension by Kenneth Reitz is a wrapper for gevent and gevent-websocket. The Flask-SocketIO extension (shameless plug, as I'm the author) is a wrapper for gevent and gevent-socketio on the server, plus Socket.IO on the client. Socket.IO is a higher-level socket protocol that uses WebSocket when available, but can fall back to other transport mechanisms on older browsers.
Is using Django with gunicorn considered a replacement for evented/async servers like Tornado, Node.js, and similar? Additionally, will it help with handling long-polling/comet services?
Finally, does Gunicorn only replace the memory-consuming Apache threads (in the case of Apache/mod_wsgi) with lightweight threads, or are there additional benefits?
Gunicorn by default will spawn regular synchronous WSGI processes. You can however tell it to spawn processes that use gevent, eventlet or tornado instead. I am only familiar with gevent which can certainly be used instead of Node.js for long polling.
The memory footprint per process is about the same for mod_wsgi and gunicorn (in my limited experience), but you get more bells-and-whistles with gunicorn. If you change the default worker class to gevent (or eventlet or tornado) you also get a LOT more performance out of each process.
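For example, the worker class can be set in a gunicorn configuration file, which is a plain Python file (the file name and values below are just an example):

    # gunicorn.conf.py
    bind = '0.0.0.0:8000'
    workers = 4
    worker_class = 'gevent'  # or 'eventlet' or 'tornado'; the default is 'sync'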
I've written a Python app that reads a database of tasks, and schedule.enter()s those tasks at various intervals. Each task reschedules itself as it executes.
I'd like to integrate this app with a WSGI framework, so that tasks can be added or deleted in response to HTTP requests. I assume I could use XML-RPC to communicate between the framework process and the task engine, but I'd like to know if there's a framework that has built-in event scheduling which can be modified via HTTP.
Sounds like what you really want is something like Celery. It's a Python-based distributed task queue which has various task behaviours including periodic and crontab.
Prior to version 2.0, it had a dependency on Django, but that has now been reduced to an integration plugin.
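As a sketch with a recent Celery version (the broker URL and task name below are placeholders), a periodic task can be declared with a crontab schedule:

    from celery import Celery
    from celery.schedules import crontab

    app = Celery('tasks', broker='redis://localhost:6379/0')  # placeholder broker URL

    @app.task
    def refresh_feeds():
        ...  # the work that was previously a schedule.enter() task

    app.conf.beat_schedule = {
        'refresh-every-morning': {
            'task': 'tasks.refresh_feeds',
            'schedule': crontab(hour=7, minute=30),
        },
    }

One-off and ad-hoc runs can still be triggered from HTTP request handlers by calling the task asynchronously, so the scheduling lives outside the WSGI process.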