In the Apache server, can we have process-level stickiness without having to use daemon mode?
With the mod_proxy_balancer module we can have stickiness at the server level, but I want all my requests to go to exactly the same process on that server. Is it possible? Or what could be an alternative?
You can control this with mod_wsgi and a few parameters: it can load a single process per interpreter, you can choose how many threads each process runs, and you can give a single virtual host the resources it needs. See the directive:
WSGIDaemonProcess
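For example, a minimal sketch (the group name and counts are placeholders, not required values):

WSGIDaemonProcess mygroup processes=1 threads=15
WSGIProcessGroup mygroup

With processes=1 there is only one daemon process in the group, so every request for this virtual host is handled by exactly that process.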
Up to now I followed this advice to reload the code:
https://code.google.com/archive/p/modwsgi/wikis/ReloadingSourceCode.wiki
This has the drawback that code changes get detected only every N seconds. I could use N=0.1, but this results in useless disk I/O.
AFAIK the inotify facility of the Linux kernel is available from Python.
Is there a faster way to detect code changes and restart the wsgi handler?
We use daemon mode on Linux.
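For illustration, here is a minimal sketch of what I have in mind, using the third-party watchdog library (which uses inotify on Linux); the paths are placeholders:

import os
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WSGI_SCRIPT = '/srv/app/wsgi.py'  # placeholder: the WSGI script file
WATCH_DIR = '/srv/app/src'        # placeholder: the source tree to watch

class Toucher(FileSystemEventHandler):
    def on_modified(self, event):
        # react only to Python source changes
        if event.src_path.endswith('.py'):
            # bumping the mtime makes mod_wsgi daemon mode restart
            # the processes on the next request
            os.utime(WSGI_SCRIPT, None)

observer = Observer()
observer.schedule(Toucher(), WATCH_DIR, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()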
Why code reload for mod_wsgi at all
There is interest in why I want this at all. Here is my setup:
Most people use "manage.py runserver" for development and some other WSGI deployment for production.
In my context we have automated the creation of new systems, and production and development systems are mostly identical.
One operating system (Linux) can host N systems (virtual environments).
Developers can use runserver or mod_wsgi. Using runserver has the benefit that it's easy for debugging; mod_wsgi has the benefit that you don't need to start the server first.
mod_wsgi has the benefit that you know the URL: https://dev-server/system-name/myurl/
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
A dirty hack we used in the past to get code reload for mod_wsgi: maximum-requests=1, but this is slow.
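That hack is just an option on the daemon process line; a sketch (the group name and thread count are placeholders):

WSGIDaemonProcess dev processes=1 threads=4 maximum-requests=1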
Preliminaries.
Developers can use runserver or mod_wsgi. Using runserver has the benefit that it's easy for debugging; mod_wsgi has the benefit that you don't need to start the server first.
But you do: the server needs to be set up first, and that takes a lot of effort. The server needs to be started here as well, though you can configure it to start automatically at boot.
If you are running on port 80 or 443, which is usually the case, the server can be started only by root. If it needs to be restarted you will have to ask for the superuser's help again. So ./manage.py runserver scores heavily here.
mod_wsgi has the benefit that you know the URL:
https://dev-server/system-name/myurl/
Which is no different from the dev server. By default it starts on port 8000, so you can access it as http://dev-server:8000/system-name/myurl/. If you want to use SSL with the development server you can use a package such as django-sslserver, or you can put nginx in front of the Django development server.
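For example, with django-sslserver installed, its runsslserver command takes the same address argument as runserver (the port here is just an example):

./manage.py runsslserver 0.0.0.0:8443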
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
With runserver, the port is well defined as mentioned above. And you can make it listen on a different port, for example with:
./manage.py runserver 0.0.0.0:9090
Note that if you put the development server behind Apache (as a reverse proxy) or nginx, the restarting problems etc. that I mentioned above do not apply here.
So in short, for development work, whatever you do with mod_wsgi can be done with the Django development server (aka ./manage.py runserver).
Inotify
Here we are getting to the main topic at last. Assuming you have installed inotify-tools, you could type this into your shell; you don't need to write a script.
while inotifywait -r -e modify .; do sudo kill -2 yourpid ; done
This will result in the code being reloaded when ...
... using daemon mode with a single process you can send a SIGINT
signal to the daemon process using the ‘kill’ command, or have the
application send the signal to itself when a specific URL is
triggered.
ref: http://modwsgi.readthedocs.io/en/develop/user-guides/frequently-asked-questions.html#application-reloading
Alternatively:
while inotifywait -r -e modify .; do touch wsgi.py ; done
when
... using daemon mode, with any number of processes, and the process
reload mechanism of mod_wsgi 2.0 has been enabled, then all you need
to do is touch the WSGI script file, thereby updating its modification
time, and the daemon processes will automatically shutdown and restart
the next time they receive a request.
In both situations we are using the -r flag to tell inotifywait to monitor subdirectories. That means each time you save a .css or .js file, Apache will reload. But without the -r flag, changes to Python code in subfolders would go undetected. To have the best of both worlds, filter out css, js, images etc. with the --exclude option.
What about when your IDE saves an auto-backup file, or vim saves a .swp file? That too will cause a code reload, so you would have to exclude those file types as well.
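Putting both together, a sketch that watches recursively but skips the noisy file types (the exclude regex is illustrative; adjust it to your tree):

while inotifywait -r -e modify --exclude '\.(css|js|png|swp|pyc)$' .; do touch wsgi.py ; done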
So in short, it's a lot of hard work to reproduce what the django development server does free of charge.
You can use inotify-hookable to run any command you want in response to an inotify event (here's my source link: http://terokarvinen.com/2016/when-files-change-take-action-inotify-hookable).
Once it detects a change, you can just reload the code in Apache.
For your specific problem, it should be something like:
inotify-hookable --watch-directories sources/ --recursive --on-modify-command './code_reload.sh'
In the previous link, the command to execute was just a simple touch flask/init.wsgi
So the whole command (adding ignored files) was:
inotify-hookable --watch-directories flask/ --recursive --ignore-paths='flask/init.wsgi' --on-modify-command 'touch flask/init.wsgi'
As stated here: Flask + mod_wsgi automatic reload on source code change, if you have enabled WSGIScriptReloading you can just touch that file. It will cause the entire code to reload (not just the config file). But if you prefer, you can set up any other script to reload the code.
After googling a bit, it seems to be a pretty standard solution for that problem and I think that you can use it for your application.
I have a small server on which I host the WSGI applications I write. The server does not have a lot of RAM, the applications are not frequently used, and rarely is more than one used at once.
Is there a way to configure the server so that the applications are only started when they are needed (when I try to connect to the socket they're served on), somewhat like inetd does?
It depends on the server software you use.
If you use nginx + uWSGI, for example, you can configure the uWSGI workers to be created only on request and destroyed after a certain amount of inactivity.
http://projects.unbit.it/uwsgi/wiki/Doc
Look for "idle", "cheap" and "cheaper".
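For instance, a minimal sketch of such a configuration (paths and numbers are placeholders):

[uwsgi]
socket = /tmp/app.sock
wsgi-file = app.py
processes = 4
# cheap: start with no workers, spawn them on the first request
cheap = true
# idle: destroy workers after 60 seconds of inactivity
idle = 60
# cheaper: scale down to a single worker when traffic is low
cheaper = 1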
I am running Django in a mod_wsgi environment on a shared host. I want to restrict the resources a request can use and ideally raise an Exception if it exceeds that amount. The WSGI options are as follows:
WSGIRestrictSignal off
WSGIRestrictStdout off
The VirtualHost has the following:
WSGIDaemonProcess django processes=10 threads=1 display-name=django-web
WSGIProcessGroup django
In the request I do the following:
import resource
import signal

# setrlimit lives in the resource module, not signal:
# soft CPU limit of 10 seconds, hard limit unlimited
resource.setrlimit(resource.RLIMIT_CPU, (10, -1))
signal.signal(signal.SIGXCPU, cpu_signal)      # fires when the CPU soft limit is exceeded
signal.signal(signal.SIGALRM, timeout_signal)  # wall-clock timeout handler
signal.alarm(1)  # deliver SIGALRM after 1 second
#Do some stuff
signal.alarm(0)  # cancel the pending alarm
However when I run the request I get the error "signal only works in main thread", even though when I print out the number of active threads and the current thread name I see there is one thread and it is named MainThread. So I don't understand why, when Python sets the signal handler, it doesn't believe it's running in the main thread.
I am running Python 2.7.2, Django 1.3.1, Apache 2.2.21 and mod_wsgi 3.3.
Send an email to the mod_wsgi mailing list and we will discuss there what is possible. In the mod_wsgi 4.0 dev version there are some experimental mechanisms for killing a process when CPU usage exceeds some value, but it is process-wide. Doing it on a per-request basis is hard because multithreading may be in use. Here on Stack Overflow is not the place to be holding a discussion about it.
Is there a canonical code deployment strategy for Tornado-based web application deployment? Our current configuration is four Tornado processes running behind nginx. (Our specific use case is on EC2.)
We've currently got a solution that works well enough: we launch the four Tornado processes and save the PIDs to a file in /tmp/. Upon deploying new code, we run the following sequence via Fabric (a rough sketch follows the list):
1. Do a git pull from the prod branch.
2. Remove the machine from the load balancer.
3. Wait for all in-flight connections to finish with a sleep.
4. Kill all the tornadoes in the PID file and remove all *.pyc files.
5. Restart the tornadoes.
6. Attach the machine back to the load balancer.
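A rough sketch of that sequence as a Fabric 1.x task; the paths, PID file and the lb_*.sh / start_tornadoes.sh helper scripts are placeholders, not our actual names:

import time

from fabric.api import cd, run, task

@task
def deploy():
    with cd('/srv/app'):
        run('git pull origin prod')            # 1. update the code
        run('./lb_detach.sh')                  # 2. hypothetical: pull host out of the LB
        time.sleep(30)                         # 3. let in-flight connections drain
        run('kill $(cat /tmp/tornado.pids)')   # 4. stop the tornadoes...
        run("find . -name '*.pyc' -delete")    #    ...and remove stale bytecode
        run('./start_tornadoes.sh')            # 5. hypothetical: relaunch the processes
        run('./lb_attach.sh')                  # 6. put the host back in the LB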
We've taken some inspiration from this: http://agiletesting.blogspot.com/2009/12/deploying-tornado-in-production.html
Are there any other complete solutions out there?
We run Tornado+Nginx with supervisord as the supervisor.
Sample configuration (names changed):
[program:server]
process_name = server-%(process_num)s
command=/opt/current/vrun.sh /opt/current/app.py --port=%(process_num)s
stdout_logfile=/var/log/server/server.log
stderr_logfile=/var/log/server/server.err
numprocs = 6
numprocs_start = 7000
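With numprocs = 6 and numprocs_start = 7000 this spawns server-7000 through server-7005, each passing its process number as the port. Restarting them all should then be just:

supervisorctl restart server:*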
I've yet to find the "best" way to restart things. What I'll probably end up doing is have nginx maintain an "active" file which is updated to let HAProxy know that we're messing with the configuration, then wait a bit, swap things around, and re-enable everything.
We're using Capistrano (we've got a backlog task to move to Fabric), but instead of dealing with removing *.pyc files we symlink /opt/current to the release identifier.
I haven't deployed Tornado in production, but I've been playing with Gevent + Nginx and have been using Supervisord for process management - start/stop/restart, logging, monitoring - supervisorctl is very handy for this. Like I said, not a deployment solution, but maybe a tool worth using.
I have a variable in the init of a module which gets loaded from the database, and loading it takes about 15 seconds.
With the Django development server everything works fine, but with Apache 2 and mod_wsgi it looks like the module is loaded on every request (taking 15 seconds each time).
Any idea about this behavior?
Update: I have enabled daemon mode in mod_wsgi, and it looks like it's not reloading the modules now! It needs more testing and I will update.
You were likely ignoring the fact that in embedded mode of mod_wsgi, or with mod_python, the application is multi-process. Requests may thus go to different processes, and you will see a delay the first time a process which hasn't been hit before is encountered. In mod_wsgi daemon mode the default is a single process. That, or as someone else mentioned, you had MaxRequestsPerChild set to 1, which is a really bad idea.
I guess you had a value of 1 for MaxClients / MaxRequestsPerChild and/or ThreadsPerChild in your Apache settings, so Apache had to start up Django for every mod_python call. That's why it took so long. If you have a WSGI daemon, a restart only takes place when you "touch" the wsgi script.
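For reference, a minimal daemon-mode sketch of the kind that avoids the per-request reload (the group name and paths are placeholders):

WSGIDaemonProcess myapp processes=2 threads=15
WSGIProcessGroup myapp
WSGIScriptAlias / /srv/app/wsgi.py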