I'm using Python's concurrent.futures module (version 2.1.3, on Python 2.7.3). I have nginx running with 4 worker processes and 4 uWSGI worker processes running (on Ubuntu Precise) as an upstart daemon, with the following uWSGI config (note that enable-threads is true, so the GIL is available, and lazy is true):
virtualenv=[ path to venv ]
chdir=[ path to python project ]
enable-threads=true
lazy=true
buffer-size=32768
post-buffering=true
processes=4
master=true
module=[ my app ].wsgi
callable=wsgi_app
logto=/var/log/uwsgi.log
pidfile=[ replaced ]
plugins=python27
socket=[ replaced, but works fine ]
The entire app works fine, but it seems that some context is missing from the futures pool: when I call somefunc() directly, all is well, but when I run somefunc() through a future, the HTTP request (I'm using Flask) hangs for quite some time before failing.
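Roughly, the pattern looks like this (a minimal sketch on my part; somefunc, the routes, and the executor setup are placeholders since the original code isn't shown, and the wsgi_app name is only borrowed from the callable in the config above):
# hypothetical reconstruction -- names and routes are placeholders
from concurrent.futures import ThreadPoolExecutor
from flask import Flask

wsgi_app = Flask(__name__)
executor = ThreadPoolExecutor(max_workers=4)  # module-level pool

def somefunc():
    return "done"  # stand-in for the real work

@wsgi_app.route("/direct")
def direct():
    return somefunc()  # calling directly works fine

@wsgi_app.route("/via-future")
def via_future():
    # submitting to the pool is the case that hangs under uWSGI
    return executor.submit(somefunc).result(timeout=30)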
The only entries to the log file are related to HTTP requests, and general wsgi startup stuff like:
WSGI application 0 (mountpoint='') ready on interpreter 0x11820a0 pid: 26980 (default app)
How can I get some visibility into the futures execution, or figure out what context might not be available to the futures pool?
Does that make sense?
Thanks in advance.
If you are using ProcessPoolExecutor instead of threads, be sure to add close-on-exec to your uWSGI options; otherwise the connection socket to the client/webserver will be inherited after fork().
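In the ini style shown at the top of the question, that is one extra line (assuming the rest of the config stays as-is):
close-on-exec=true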
Related
I'm trying to use Django in combination with Celery.
In doing so I came across autodiscover_tasks(), and I'm not fully sure how to use it. The Celery workers get tasks added by other applications (in this case a Node backend).
So far I used this to start the worker:
celery worker -Q extraction --hostname=extraction_worker
which works fine.
Now I'm not sure what the general idea of the Django-Celery integration is. Should workers still be started externally (e.g. with the command above), or should they be managed and started from the Django application?
My celery.py looks like:
import os
from celery import Celery
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'main.settings')
app = Celery('app')
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Then I have 2 apps containing a tasks.py file with:
from celery import shared_task

@shared_task
def extraction(total):
    return 'Task executed'
How do I now get Django to register those tasks with the worker?
You just start the worker process as documented; you don't need to register anything else:
In a production environment you’ll want to run the worker in the background as a daemon - see Daemonization - but for testing and development it is useful to be able to start a worker instance by using the celery worker manage command, much as you’d use Django’s manage.py runserver:
celery -A proj worker -l info
For a complete listing of the command-line options available, use the help command:
celery help
The celery worker collects/registers tasks when it runs, and it also consumes the tasks it has discovered.
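For the setup in the question, a plausible invocation (assuming the project package is main, as suggested by DJANGO_SETTINGS_MODULE, and keeping the queue and hostname from the original command) would be:
celery -A main worker -Q extraction --hostname=extraction_worker -l info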
In development, flask-socketio (4.1.0) with uwsgi is working nicely with just 1 worker and standard initialization.
Now I'm preparing for production and want to make it work with multiple workers.
I've done the following:
Added redis message_queue in init_app:
socketio = SocketIO()
socketio.init_app(app,async_mode='gevent_uwsgi', message_queue=app.config['SOCKETIO_MESSAGE_QUEUE'])
(Sidenote: we are using redis in the app itself as well)
Added gevent monkey patching at the top of the file that we run with uWSGI:
from gevent import monkey
monkey.patch_all()
Run uWSGI with:
uwsgi --http 0.0.0.0:63000 --gevent 1000 --http-websockets --master --wsgi-file rest.py --callable application --py-autoreload 1 --gevent-monkey-patch --workers 4 --threads 1
This doesn't seem to work: the connection rapidly alternates between connecting and 400 Bad Request responses. I suspect these correspond to the 'Invalid session ...' errors I see when I enable SocketIO logging.
Initially it was not using Redis at all:
redis-cli > PUBSUB CHANNELS *
resulted in an empty result, even with workers=1. The following (taken from another SO answer) seemed to fix that:
# https://stackoverflow.com/a/19117266/492148
import gevent
import redis.connection
redis.connection.socket = gevent.socket
After doing so I got a "flask-socketio" pubsub channel with updating data, but after returning to multiple workers the issue came back. Given that changing the Redis socket did seem to move things in the right direction, I suspect the monkey patching isn't working properly yet, but the code I use matches every example I can find and sits at the very top of the file that uWSGI loads.
You can run as many workers as you like, but only if you run each worker as a standalone single-worker uwsgi process. Once you have all those workers running, each on its own port, you can put nginx in front to load-balance with sticky sessions. And of course you also need the message queue for the workers to use when coordinating broadcasts.
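A rough sketch of that layout (ports are illustrative, not taken from the question; the other flags are carried over from the original command):
uwsgi --http 127.0.0.1:63001 --gevent 1000 --http-websockets --master --wsgi-file rest.py --callable application --workers 1
uwsgi --http 127.0.0.1:63002 --gevent 1000 --http-websockets --master --wsgi-file rest.py --callable application --workers 1
# nginx then load-balances across 63001/63002 with sticky sessions (e.g. an
# upstream block using ip_hash), and SOCKETIO_MESSAGE_QUEUE points at Redis.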
Eventually I found https://github.com/miguelgrinberg/Flask-SocketIO/issues/535, so it seems you can't have multiple workers with uWSGI either, as it needs sticky sessions. The documentation mentions that for gunicorn, but I did not interpret it as extending to uWSGI.
I have a flask server running on top of uWSGI, with the following configuration:
[uwsgi]
http-socket = :9000
plugin = python
wsgi-file = /.../whatever.py
enable-threads = true
The flask server has a background thread which makes periodic calls to another server, using the following command:
r = requests.get(...)
I've added logging before and after this command; it seems the command never returns and the thread just stops there.
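For context, here is a minimal sketch of the kind of background poller described (URL, interval, and logging details are placeholders; the original code isn't shown):
import logging
import threading
import time

import requests

def poll_other_server():
    while True:
        logging.info("calling other server")
        r = requests.get("http://other-server/endpoint")  # the call that never returns
        logging.info("got status %s", r.status_code)
        time.sleep(60)

t = threading.Thread(target=poll_other_server)
t.daemon = True
t.start()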
Any idea why the background thread is hanging? Note that I've added enable-threads = true to the configuration.
Updates
I've added a timeout parameter to requests.get(). Now the behaviour is inconsistent: the background thread works on one server but fails on another.
Killing all the uWSGI instances and restarting them with sudo service uwsgi restart solved the problem. It seems that sudo service uwsgi stop does not actually stop all the uWSGI instances.
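A generic way to check for and clean up leftover instances (standard commands, not from the original answer):
ps aux | grep uwsgi        # list any uWSGI processes that survived the stop
sudo pkill -9 -f uwsgi     # force-kill the leftovers
sudo service uwsgi start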
I found this zero-dependency Python websocket server on SO: https://gist.github.com/jkp/3136208
I am using gunicorn for my Flask app and I wanted to run this websocket server under gunicorn as well. The last few lines of the code run the server with:
if __name__ == "__main__":
    server = SocketServer.TCPServer(
        ("localhost", 9999), WebSocketsHandler)
    server.serve_forever()
I cannot figure out how to get this websocketserver.py running under gunicorn, because presumably gunicorn itself would need to run serve_forever() as well as set up the SocketServer.TCPServer(...).
Is this possible?
Gunicorn expects a WSGI application (PEP 333), not just a function. Your app has to accept an environ dict and a start_response callback and return an iterable of data (roughly speaking). All the machinery encapsulated by SocketServer.StreamRequestHandler is on gunicorn's side. I imagine it would be a lot of work to turn this gist into a WSGI application (but that'll be fun!).
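To illustrate the shape gunicorn expects, here is a generic hello-world WSGI callable per PEP 333 (not a port of the gist):
# app.py -- run with: gunicorn app:application
def application(environ, start_response):
    # environ is a dict of CGI-style request variables;
    # start_response takes a status line and a list of header tuples
    start_response("200 OK", [("Content-Type", "text/plain")])
    # the return value must be an iterable of bytestrings
    return [b"Hello, WSGI\n"]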
OR, maybe this library will get the job done for you: https://github.com/CMGS/gunicorn-websocket
If you use the Flask-Sockets extension, you get a websocket implementation for gunicorn directly in the extension, which makes it possible to start with the following command line:
gunicorn -k flask_sockets.worker app:app
Though I don't know if that's what you want to do.
Is there a way to have mod_wsgi reload all modules (maybe in a particular directory) on each load?
While working on the code, it's very annoying to restart Apache every time something is changed. The only option I've found so far is to put modname = reload(modname) below every import... but that's also really annoying, since it means I'm going to have to go through and remove them all at a later date.
The link:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
should be emphasised. It should also be emphasised that on UNIX systems the daemon mode of mod_wsgi must be used, and you must implement the code monitor described in the documentation. The whole-process reload option will not work for the embedded mode of mod_wsgi on UNIX systems. Even though on Windows systems the only option is embedded mode, it is possible, through a bit of trickery, to do the same thing by triggering an internal restart of Apache from the code-monitoring script. This is also described in the documentation.
The following solution is aimed at Linux users only, and has been tested to work under Ubuntu Server 12.04.1
To run WSGI under daemon mode, you need to specify WSGIProcessGroup and WSGIDaemonProcess directives in your Apache configuration file, for example
WSGIProcessGroup my_wsgi_process
WSGIDaemonProcess my_wsgi_process threads=15
More details are available in http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives
An added bonus is extra stability if you are running multiple WSGI sites under the same server, potentially with VirtualHost directives. Without daemon processes, I found two Django sites conflicting with each other and alternately turning up 500 Internal Server Errors.
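For reference, a per-site layout along those lines might look like this (server names and paths are illustrative, not from the original answer):
<VirtualHost *:80>
    ServerName site1.example.com
    WSGIDaemonProcess site1 threads=15
    WSGIProcessGroup site1
    WSGIScriptAlias / /var/www/site1/site1/wsgi.py
</VirtualHost>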
At this point, your server is in fact already monitoring your WSGI site for changes, though it only watches the file you specified using WSGIScriptAlias, like
WSGIScriptAlias / /var/www/my_django_site/my_django_site/wsgi.py
This means that you can force the WSGI daemon process to reload by changing the WSGI script. Of course, you don't have to change its contents, but rather,
$ touch /var/www/my_django_site/my_django_site/wsgi.py
would do the trick.
By using the method above, you can automatically reload a WSGI site in a production environment without restarting/reloading the entire Apache server, and without modifying your WSGI script to do production-unsafe code-change monitoring.
This is particularly useful when you have automated deploy scripts, and don't want to restart the Apache server on deployment.
During development, you may use a filesystem-change watcher to invoke touch wsgi.py every time a module under your site changes, for example pywatch.
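If pywatch isn't to hand, an inotifywait loop does the same job (an alternative sketch, not part of the original answer; the paths reuse the example above):
# --exclude stops the touch itself from re-triggering the loop
while inotifywait -r -e close_write --exclude 'wsgi\.py$' /var/www/my_django_site; do
    touch /var/www/my_django_site/my_django_site/wsgi.py
done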
The mod_wsgi documentation on code reloading is your best bet for an answer.
I know it's an old thread but this might help someone. To kill your process when any file in a certain directory is written to, you can use something like this:
monitor.py
import os, sys, time, signal, threading, atexit
import inotify.adapters

def _monitor(path):
    i = inotify.adapters.InotifyTree(path)
    print "monitoring", path
    while 1:
        for event in i.event_gen():
            if event is not None:
                (header, type_names, watch_path, filename) = event
                if 'IN_CLOSE_WRITE' in type_names:
                    prefix = 'monitor (pid=%d):' % os.getpid()
                    print "%s %s/%s changed," % (prefix, path, filename), 'restarting!'
                    os.kill(os.getpid(), signal.SIGKILL)

def start(path):
    t = threading.Thread(target = _monitor, args = (path,))
    t.setDaemon(True)
    t.start()
    print 'Started change monitor. (pid=%d)' % os.getpid()
In your server startup, call it like:
server.py
import monitor
monitor.start(<directory which contains your wsgi files>)
if your main server file is in the directory which contains all your files, you can go like:
monitor.start(os.path.dirname(__file__))
Adding other folders is left as an exercise...
You'll need to 'pip install inotify'
This was cribbed from the code here: https://code.google.com/archive/p/modwsgi/wikis/ReloadingSourceCode.wiki#Restarting_Daemon_Processes
This is an answer to my duplicate question here: WSGI process reload modules