Python + WSGI + Eventlet Websockets (100% CPU)

Python + WSGI + Eventlet Websockets (100% CPU) - python

I am currently working with WSGI, Eventlet and REDIS to create a websocket server. We are seeing a very high CPU load for what I would expect with the number of connections.
We have around 2000 connections to the websocket server which push information once roughly every 1m30s. The info is captured and placed into the REDIS DB, if there any messages waiting for the client then the messages are sent back in reply.
Python version 2.7
The wsgi setup looks like the below
wsgi.server(eventlet.listen(('127.0.0.1',8000),backlog=5000),hello_world,max_size=5000)
I have patched the eventlet lib at the script start using
import eventlet
eventlet.monkey_patch()
To hopefully get around any Redis related deadlocks causing high CPU.
The server is a EC2 C4 large running on Ubuntu 16.04. Amazon
Nothing else is running on the server other than this script to 100% CPU seems very high to me but perhaps my expectations are incorrect.
Can anyone help with perhaps some common gotchas?

Related

Keep Flask with RabbitMQ app running on IIS

When my Flask app is launched, it starts a RabbitMQ consumer on a new thread. The consumer should keep running regardless of requests/responses and wait for incoming messages. Everything works fine when running locally and also when I deploy the app on linux.
Unfortunatly, due to some third-party dependencies I must deploy the app on a Windows server. I am able to configure IIS to serve the app using wfastcgi, but it seems that after some time with no activity the app gets shutdown until the next HTTP request, when it is automatically initalized again. This behaviour would not bother me normally, but it in this case the RabbitMQ consumer is also closed every time and then the incoming messages from the queue are not recieved.
Is there a way to keep the app running infinitly?
I saw some settings in IIS like Activity Timeout, Idle Timeout, etc. but I don't know which one is relevant. Also, what if I want the app to keep running and consuming messages from RabbitMQ even if there are no requests for a very long time, e.g. months or years?

flask socket-io, sometimes client calls freeze the server

I occasionally have a problem with flask socket-io freezing, and I have no clue how to fix it.
My client connects to my socket-io server and performs some chat sessions. It works nicely. But for some reason, sometimes from the client side, there is some call that blocks the whole server (The server is stuck in the process, and all other calls are frozen). What is strange is that the server can be blocked as long as the client side app is not totally shutdown.This is an ios-app / web page, and I must totally close the app or the safari page. Closing the socket itself, and even deallocating it doesn't resolve the problem. When the app is in the background, the sockets are closed and deallocated but the problem persists.
This is a small server, and it deals with both html pages and the socket-server so I have no idea if it is the socket itself or the html that blocks the process. But each time the server was freezing, the log showed some socket calls.
Here is how I configured my server:
socketio = SocketIO(app, ping_timeout=5)
socketio.run(app, host='0.0.0.0', port=5001, debug=True, ssl_context=context)
So my question is:
What can freeze the server (this seems to happen when I leave the app or web-site open for a long time while doing nothing). If I use the services normally the server never freezes. And how I can prevent it from happening (Even if I don't know what causing this, is there a way to blindly stop my server from being stuck at a call?
Thanks you for the answers

According to your comment above, you are using the Flask development web server, without the help of an asynchronous framework such as eventlet or gevent. Besides this option being highly inefficient, you should know that this web server is not battle tested, it is meant for short lived tests during development. I'm not sure it is able to run for very long, specially under the unusual conditions Flask-SocketIO puts it through, which regular Flask apps do not exercise. I think it is quite possible that you are hitting some obscure bug in Werkzeug that causes it to hang.
My recommendation is that you install eventlet and try again. All you need to do is pip install eventlet, and assuming you are not passing an explicit async_mode argument, then just by installing this package Flask-SocketIO should configure itself to use it.
I would also remove the explicit timeout setting. In almost all cases, the defaults are sufficient to maintain a healthy connection.

Apache process idling and eating memory

I'm stuck trying to debug an Apache process that keeps growing in memory size. I'm running Apache 2.4.6 with MPM Prefork on virtual Ubuntu host with 4GB of RAM, serving a Django app with mod_wsgi. The app is heavy with AJAX calls and Apache is getting between 300-1000 requests per minute. Here's what I'm seeing:
As soon as I restart Apache, the first child process (with lowest PID) will keep growing its memory usage, reaching over a gig in 6 or 7 minutes. All the other Apache process will keep memory usage between 10MB-50MB per process.
CPU usage for the troublesome process will fluctuate, sometimes dipping down very low, other times hovering at 20% or sometimes spiking higher.
The troublesome process will run indefinitely until I restart Apache.
I can see in my Django logs that the troublesome process is serving some requests to multiple remote IPs (I'm seeing reports of caught exceptions for URLs my app doesn't like, primarily).
Apache error logs will often (but not always) show "IOError: failed to write data" for the PID, sometimes across multiple IPs.
Apache access logs do not show any requests completed associated with this PID.
Running strace on the PID gets no results other than 'restart_syscall(<... resuming interrupted call ...>' even when I can see that PID mentioned in my app logs at a time when strace was running.
I've tried setting low values of MaxRequestsPerChild and MaxMemFree and neither has seemed to have any effect.
What could this be or how could I debug further? The fact that I see no output of strace makes me that my application has an infinite loop. If that were the case, how could I go about tracing the PID back to the code path it executed or the request that started the trouble?

Instead of restarting Apache, stop and start Apache. There is a known no-fix memory leak issue with Apache.
Also, consider using nginx and gunicorn—this setup is a lightweight, faster, and often recommended alternative to serving your django app, and static files.
References:
Performance
Memory Usage
Apache/Nginx Comparison

How can I serve a wsgi app on demand?

I have a small server on which I host the wsgi applications I write. This server does not have a lot of ram, and the applications are not frequently used and rarely more than one at once.
Is there a way to configure the server so that the applications are only started when they are needed (when I try to connect on the socket they're served on), somewhat like inetd does ?

depends on the server software you use.
if you use nginx + uwsgi for example, you can configure the uwsgi workers to only be created on requests and get destroyed after a certain amount of inactivity.
http://projects.unbit.it/uwsgi/wiki/Doc
look for "idle" "cheap" "cheaper"

any python socket server framework?

I'm looking for a python socket server framework - not to handle http, but to handle tcp sockets. I've done it myself, but adding all the features is tedious. This framework would handle thread pooling, socket setup, signal handling, etc.
A big feature is code-reloading. If I use apache/mod_python, or django, or whatever, I don't have to restart the server to get it to use new/changed code. Anybody know how that's done?
thanks!
Colin

Twisted is the usual suspect. Reloading in the case of mod_wsgi is easy since only the WSGI server needs to be restarted, not the whole web server (not that restarting the web server is all that hard, mind you...).

Use Apache, mod_wsgi in daemon mode and follow these guidelines.
Update: I mentioned Apache because you did in your question - I assumed you were talking about a web application that also acted as a socket server.
The Python library has socket servers (see the documentation). AFAIK you can't do hot code reloading in Python without potentially losing packets, for that you would need something specifically designed for hot code reloading, such as Erlang, or else just have a dumb socket receiver which receives and queues packets, and a smarter backend process which does code reloading and packet handling. In that case, your receiver would be acting as a proxy.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.