Tornado ioloop + threading

Tornado ioloop + threading - python

i have been working on tornado web framework from sometime, but still i didnt understood the ioloop functionality clearly, especially how to use it in multithreading.
Is it possible to create separate instance of ioloop for multiple server ??

The vast majority of Tornado apps should have only one IOLoop, running in the main thread. You can run multiple HTTPServers (or other servers) on the same IOLoop.
It is possible to create multiple IOLoops and give each one its own thread, but this is rarely useful, because the GIL ensures that only one thread is running at a time. If you do use multiple IOLoops you must be careful to ensure that the different threads only communicate with each other through thread-safe methods (i.e. IOLoop.add_callback).

Related

How does Waitress handle concurrent tasks?

I'm trying to build a python webserver using Django and Waitress, but I'd like to know how Waitress handles concurrent requests, and when blocking may occur.
While the Waitress documentation mentions that multiple worker threads are available, it doesn't provide a lot of information on how they are implemented and how the python GIL affects them (emphasis my own):
When a channel determines the client has sent at least one full valid HTTP request, it schedules a "task" with a "thread dispatcher". The thread dispatcher maintains a fixed pool of worker threads available to do client work (by default, 4 threads). If a worker thread is available when a task is scheduled, the worker thread runs the task. The task has access to the channel, and can write back to the channel's output buffer. When all worker threads are in use, scheduled tasks will wait in a queue for a worker thread to become available.
There doesn't seem to be much information on Stackoverflow either. From the question "Is Gunicorn's gthread async worker analogous to Waitress?":
Waitress has a master async thread that buffers requests, and enqueues each request to one of its sync worker threads when the request I/O is finished.
These statements don't address the GIL (at least from my understanding) and it'd be great if someone could elaborate more on how worker threads work for Waitress. Thanks!

Here's how the event-driven asynchronous servers generally work:
Start a process and listen to incoming requests. Utilizing the event notification API of the operating system makes it very easy to serve thousands of clients from single thread/process.
Since there's only one process managing all the connections, you don't want to perform any slow (or blocking) tasks in this process. Because then it will block the program for every client.
To perform blocking tasks, the server delegates the tasks to "workers". Workers can be threads (running in the same process) or separate processes (or subprocesses). Now the main process can keep on serving clients while workers perform the blocking tasks.
How does Waitress handle concurrent tasks?
Pretty much the same way I just described above. And for workers it creates threads, not processes.
how the python GIL affects them
Waitress uses threads for workers. So, yes they are affected by GIL in that they aren't truly concurrent though they seem to be. "Asynchronous" is the correct term.
Threads in Python run inside a single process, on a single CPU core, and don't run in parallel. A thread acquires the GIL for a very small amount of time and executes its code and then the GIL is acquired by another thread.
But since the GIL is released on network I/O, the parent process will always acquire the GIL whenever there's a network event (such as an incoming request) and this way you can stay assured that the GIL will not affect the network bound operations (like receiving requests or sending response).
On the other hand, Python processes are actually concurrent: they can run in parallel on multiple cores. But Waitress doesn't use processes.
Should you be worried?
If you're just doing small blocking tasks like database read/writes and serving only a few hundred users per second, then using threads isn't really that bad.
For serving a large volume of users or doing long running blocking tasks, you can look into using external task queues like Celery. This will be much better than spawning and managing processes yourself.

Hint: Those were my comments to the accepted answer and the conversation below, moved to a separate answer for space reasons.
Wait.. The 5th request will stay in the queue until one of the 4 threads is done with their previous handling, and therefore gone back to the pool. One thread will only ever server one request at a time. "IO bound" tasks only help in that the threads waiting for IO will implicitly (e.g. by calling time.sleep) tell the scheduler (python's internal one) that it can pass the GIL along to another thread since there's currently nothing to do, so that the others will get more CPU time for their stuff. On thread level this is fully sequential, which is still concurrent and asynchronous on process level, just not parallel. Just to get some wording staight.
Also, Python threads are "standard" OS threads (like those in C). So they will use all CPU cores and make full use of them. The only thing restricting them is that they need to hold the GIL when calling Python C-API functions, because the whole API in general is not thread-safe. On the other hand, calls to non-Python functions, i.e. functions in C extensions like numpy for example, but also many database APIs, including anything loaded via ctypes, do not hold the GIL while running. Why should they, they are running external C binaries which don't know anything of the Python interpreter running in the parent process. Therefore, such tasks will run truely in parallel when called from a WSGI app hosted by waitress. And if you've got more cores available, turn the thread number up to that amount (threads=X kwarg on waitress.create_server).

Twisted threading and Thread Pool Difference

What is the difference between using
from twisted.internet import reactor, threads
and just using
import thread
using a thread pool?
What is the twisted thing actually doing? Also, is it safe to use twisted threads?

What is the difference
With twisted.internet.threads, Twisted will manage the thread and a thread pool for you. This puts less of a burden on devs and allows devs to focus more on the business logic instead of dealing with the idiosyncrasies of threaded code. If you import thread yourself, then you have to manage threads, get the results from threads, ensure results are synchronized, make sure too many threads don't start up at once, fire a callback once the threads are complete, etc.
What is the twisted thing actually doing?
It depends on what "thing" you're talking about. Can you be more specific? Twisted has various thread functions you can leverage and each may function slightly different from each other.
And is it safe to use twisted threads.
It's absolutely safe! I'd say it's more safe than managing threads yourself. Take a look at all the functionality that Twisted's thread provides, then think about if you had to write this code yourself. If you've ever worked with threads, you'll know what it starts off simple enough, but as your application grows and if you didn't make good decisions about threads, then your application can become very complicated and messy. In general, Twisted will handle the threads in a uniform way and in a way that devs would expect a well behaved threaded app to behave.
References
https://twistedmatrix.com/documents/current/core/howto/threading.html

Is Flask-SocketIO's emit function thread safe?

I have a Flask-SocketIO application. Can I safely call socketio.emit() from different threads? Is socketio.emit() atomic like the normal socket.send()?

The socketio.emit() function is thread safe, or I should say that it is intended to be thread-safe, as there is currently one open issue related to this. Note that 'thread' in this context means a supported threading model. Most people use Flask-SocketIO in conjunction with eventlet or gevent in production, so in those contexts thread means "green" thread.
The open issue is related to using a message queue, which is necessary when you have multiple servers. In that set up, the accesses to the queue are not thread safe at this time. This is a bug that needs to be fixed, but as a workaround, you can create a different socketio object per thread.
On second question regarding if socketio.emit() is atomic, the answer is no. This is not a simple socket write operation. The payload needs to be formatted in certain way to comply with the Socket.IO protocol, then depending on the selected transport (long-polling or websocket) the write happens in a completely different way.

Should I use epoll or just blocking recv in threads?

I'm trying to write a scalable custom web server.
Here's what I have so far:
The main loop and request interpreter are in Cython. The main loop accepts connections and assigns the sockets to one of the processes in the pool (has to be processes, threads won't get any benefit from multi-core hardware because of the GIL).
Each process has a thread pool. The process assigns the socket to a thread.
The thread calls recv (blocking) on the socket and waits for data. When some shows up, it gets piped into the request interpreter, and then sent via WSGI to the application running in that thread.
Now I've heard about epoll and am a little confused. Is there any benefit to using epoll to get socket data and then pass that directly to the processes? Or should I just go the usual route of having each thread wait on recv?
PS: What is epoll actually used for? It seems like multithreading and blocking fd calls would accomplish the same thing.

If you're already using multiple threads, epoll doesn't offer you much additional benefit.
The point of epoll is that a single thread can listen for activity on many file selectors simultaneously (and respond to events on each as they occur), and thus provide event-driven multitasking without requiring the spawning of additional threads. Threads are relatively cheap (compared to spawning processes), but each one does require some overhead (after all, they each have to maintain a call stack).
If you wanted to, you could rewrite your pool processes to be single-threaded using epoll, which would reduce your overall thread usage count, but of course you'd have to consider whether that's something you care about or not - in general, for low numbers of simultaneous requests on each worker, the overhead of spawning threads wouldn't matter, but if you want each worker to be able to handle 1000s of open connections, that overhead can become significant (and that's where epoll shines).
But...
What you're describing sounds suspiciously like you're basically reinventing the wheel - your:
main loop and request interpreter
pool of processes
sounds almost exactly like:
nginx (or any other load balancer/reverse proxy)
A pre-forking tornado app
Tornado is a single-threaded web server python module using epoll, and it has the capability built-in for pre-forking (meaning that it spawns multiple copies of itself as separate processes, effectively creating a process pool). Tornado is based on the tech created to power Friendfeed - they needed a way to handle huge numbers of open connections for long-polling clients looking for new real-time updates.
If you're doing this as a learning process, then by all means, reinvent away! It's a great way to learn. But if you're actually trying to build an application on top of these kinds of things, I'd highly recommend considering using the existing, stable, communally-developed projects - it'll save you a lot of time, false starts, and potential gotchas.
(P.S. I approve of your avatar. <3)

The epoll function (and the other functions in the same family poll and select) allow you to write single threading networking code that manage multiple networking connection. Since there is no threading, there is no need fot synchronisation as would be required in a multi-threaded program (this can be difficult to get right).
On the other hand, you'll need to have an explicit state machine for each connection. In a threaded program, this state machine is implicit.
Those function just offer another way to multiplex multiple connexion in a process. Sometimes it is easier not to use threads, other times you're already using threads, and thus it is easier just to use blocking sockets (which release the GIL in Python).

Instance methods called in a separate thread than the instantiation thread

I'm trying to wrap my head around what is happening in this recipe, because I'm planning on implementing a wx/twisted app similar to this (ie. wx and twisted running in separate threads). I understand that both twisted and wx event-loops need to be accessed in a thread-safe manner (ie. reactor.callFromThread, wx.PostEvent, etc). What I am questioning is the thread-safety of passing in instance methods of objects instantiated in one thread (in the case of this recipe, the GUI thread) as deferred callBack and errBack methods for a reactor running in a separate thread. Is that a good idea?
There is a wxreactor available in twisted, but googling reveals that there have been numerous problems with it since it was introduced to the library. Even the person who initially came up with the wxreactor technique, advocates running wx and twisted in separate threads.
I haven't been able to find any other examples of this technique, but I'd love to see some.

I wouldn't say that it's a "good idea". You should just run the reactor and the GUI in the same thread with wxreactor.
The timer-driven event-loop starving approach described by Mr. Schroeder is the worst possible fail-safe way to implement event-loop integration. If you use wxreactor (not wxsupport) Twisted now uses an approach where multiplexing is shunted off to a thread internally so that nothing needs to use a timer. Better yet would be for wxpython to expose wxSocket and have someone base a reactor on it.
However, if you're set on using a separate thread to communicate with Twisted, the one thing to keep in mind is that while you can use objects that originate from any thread you like as the value to pass to Deferred.callback, you must call Deferred.callback only in the reactor thread itself. Deferreds are not threadsafe; thanks to some debugging utilities, not even the Deferred class is threadsafe, so you need to be very careful when you are using them to never leave the Twisted main thread. i.e. when you have a result in the UI thread, use reactor.callFromThread(myDeferred.callback, myresult).

The sole act of passing instance methods between threads is safe as long as you properly synchronize eventual destruction of those instances (threads share memory so it really doesn't matter which one did the allocation/initialization of a bit of it).
The overall thread safety depends on what those methods actually do, so just make them "play nice" and you should be ok.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.