mod_wsgi: Reload Code via Inotify - not every N seconds - python

Up to now I followed this advice to reload the code:
https://code.google.com/archive/p/modwsgi/wikis/ReloadingSourceCode.wiki
This has the drawback that code changes get detected only every N seconds. I could use N=0.1, but that results in useless disk I/O.
AFAIK the inotify API of the Linux kernel is available from Python.
Is there a faster way to detect code changes and restart the wsgi handler?
We use daemon mode on linux.
Why code reload for mod_wsgi at all
There is interest in why I want this at all. Here is my setup:
Most people use "manage.py runserver" for development and some other wsgi deployment for production.
In my context we have automated the creation of new systems, and production and development systems are mostly identical.
One operating system (Linux) can host N systems (virtual environments).
Developers can use runserver or mod_wsgi. Using runserver has the benefit that it's easy for debugging; mod_wsgi has the benefit that you don't need to start the server first.
mod_wsgi also has the benefit that you know the URL: https://dev-server/system-name/myurl/
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
A dirty hack we used in the past to get code reload for mod_wsgi: maximum-requests=1, but this is slow.

Preliminaries.
Developers can use runserver or mod_wsgi. Using runserver has the benefit that it's easy for debugging, mod_wsgi has the benefit that you don't need to start the server first.
But you do: the server needs to be set up first, and that takes a lot of effort. The server needs to be started here as well, though you can configure it to start automatically at boot.
If you are running on port 80 or 443, which is usually the case, the server can be started only by root. If it needs to be restarted, you will have to ask for the superuser's help again. So ./manage.py runserver scores heavily here.
mod_wsgi has the benefit that you know the URL:
https://dev-server/system-name/myurl/
Which is no different from the dev server. By default it starts on port 8000, so you can access it as http://dev-server:8000/system-name/myurl/. If you want to use SSL with the development server, you can use a package such as django-sslserver, or you can put nginx in front of the Django development server.
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
With runserver, the port is well defined, as mentioned above. And you can make it listen on a different port, for example with:
./manage.py runserver 0.0.0.0:9090
Note that if you put the development server behind Apache (as a reverse proxy) or nginx, the restarting problems etc. that I mentioned above do not apply.
So in short, for development work, whatever you do with mod_wsgi can be done with the Django development server (aka ./manage.py runserver).
Inotify
Here we are getting to the main topic at last. Assuming you have installed inotify-tools, you can type this into your shell; you don't need to write a script.
while inotifywait -r -e modify .; do sudo kill -2 yourpid ; done
This will result in the code being reloaded when ...
... using daemon mode with a single process you can send a SIGINT
signal to the daemon process using the ‘kill’ command, or have the
application send the signal to itself when a specific URL is
triggered.
ref: http://modwsgi.readthedocs.io/en/develop/user-guides/frequently-asked-questions.html#application-reloading
alternatively
while inotifywait -r -e modify .; do touch wsgi.py ; done
when
... using daemon mode, with any number of processes, and the process
reload mechanism of mod_wsgi 2.0 has been enabled, then all you need
to do is touch the WSGI script file, thereby updating its modification
time, and the daemon processes will automatically shutdown and restart
the next time they receive a request.
In both situations we are using the -r flag to tell inotifywait to monitor subdirectories. That means each time you save a .css or .js file, Apache will reload. But without the -r flag, changes to Python code in subfolders will go undetected. To get the best of both worlds, filter out CSS, JS, images, etc. with the --exclude option.
What about when your IDE saves an auto-backup file, or vim saves a .swp file? That too will cause a code reload, so you would have to exclude those file types as well.
So in short, it's a lot of hard work to reproduce what the django development server does free of charge.
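If you would rather drive this from Python than from a shell loop, the same touch-the-WSGI-file trick can be implemented with the third-party pyinotify package. A minimal sketch, assuming pyinotify is installed, daemon-mode process reloading is enabled, and the paths are placeholders:

import os
import pyinotify

WSGI_SCRIPT = '/srv/app/wsgi.py'  # touching this triggers the reload
WATCHED = '/srv/app/src'          # source tree to monitor

class PyChanged(pyinotify.ProcessEvent):
    def process_default(self, event):
        # React only to Python sources; this sidesteps the
        # .css/.js/.swp noise discussed above.
        if event.pathname.endswith('.py'):
            os.utime(WSGI_SCRIPT, None)  # same effect as `touch wsgi.py`

wm = pyinotify.WatchManager()
mask = pyinotify.IN_MODIFY | pyinotify.IN_CREATE | pyinotify.IN_MOVED_TO
notifier = pyinotify.Notifier(wm, PyChanged())
wm.add_watch(WATCHED, mask, rec=True, auto_add=True)  # rec=True mirrors -r
notifier.loop()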

You can use inotify-hookable to run any command you want in response to an inotify event (here's my source link: http://terokarvinen.com/2016/when-files-change-take-action-inotify-hookable).
With that in place you can simply reload the code running under Apache.
For your specific problem, it should be something like:
inotify-hookable --watch-directories sources/ --recursive --on-modify-command './code_reload.sh'
In the previous link, the command to execute was just a simple touch flask/init.wsgi
So the whole command (with ignored files added) was:
inotify-hookable --watch-directories flask/ --recursive --ignore-paths='flask/init.wsgi' --on-modify-command 'touch flask/init.wsgi'
As stated here: Flask + mod_wsgi automatic reload on source code change, if you have WSGIScriptReloading enabled, you can just touch that file. It will cause the entire application to reload (not just the config file). But, if you prefer, you can set any other script to reload the code.
After googling a bit, this seems to be a pretty standard solution to the problem, and I think you can use it for your application.

Related

Error with Python subprocess when running Flask app using nginx + WSGI

I have developed a Python web server using Flask, and some of the endpoints make use of the subprocess module to call different executables. On development, using the Flask debug server, everything works fine. However, when running the server along with nginx+WSGI (on the exact same machine), some subprocess calls fail.
For example, one of the tools I'm using is Microsoft's dotnet, which I installed from my user as sudo apt-get install -y aspnetcore-runtime-5.0 and is then called from Python with the subprocess module. When I run the server with python3 server.py, it works like a charm. However, when using nginx and WSGI, the subprocess call fails with an exception that says: /bin/sh: 1: dotnet: not found.
I suspect this is due to the command not being accessible to the user and group running the server. I have used this guide as a reference to deploy the app, and on the wsgi .ini file, I have set uid = javierd and gid = www-data, while on the systemd .service file I have User=javierd, Group=www-data.
I have tried to add the executables' paths to /etc/profile, but it didn't work, and I don't know any other way to fix it. I also find it very surprising that this happens to some executables but not all, and that it happens to dotnet, for example, which is located at /usr/bin/dotnet and should therefore be accessible to every user. Any idea how to solve this problem? Furthermore, if somebody could explain to me why this is happening, I would really appreciate it.
Thanks a lot!
OK, finally, after a big headache, I noticed the error, and it was really simple.
In the tutorial I linked, when creating the systemd service file, the following line was included: Environment="PATH=/home/myuser/myfolder/enviroment/bin".
Of course, as this was overriding the PATH, there was no way of executing the commands. Once I noticed it, I just removed that line, restarted the service, and it was fixed.
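For anyone hitting the same symptom, two things can help from the Python side. This is a hedged sketch (the executable name and fallback path come from the question; everything else is illustrative), assuming Python 3.7+:

import os
import shutil
import subprocess

# Diagnostic: log the PATH the WSGI worker actually inherits.
print("worker PATH:", os.environ.get("PATH"))

# Defensive call: resolve the binary to an absolute path so a
# stripped-down service PATH cannot break the subprocess call.
dotnet = shutil.which("dotnet") or "/usr/bin/dotnet"
result = subprocess.run([dotnet, "--info"], capture_output=True, text=True)
print(result.stdout)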

How to watch activity in production environment like ./manage.py runserver?

In a local testing environment, python manage.py runserver is like this wonderful all-seeing-eye for activity in your development and page requests.
But in the actual server environment it's a black box - you can't see much of anything or diagnose issues.
How do you get an interface such as in manage.py runserver in a live environment?
Server Logs
Server logs typically replace the stream of information you see in the console.
You can set up logging here: https://docs.djangoproject.com/en/1.8/topics/logging/
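A minimal sketch of such a config (the log path is a placeholder; newer Django versions also expose a django.server logger for per-request lines similar to runserver's output):

# settings.py
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'file': {
            'class': 'logging.FileHandler',
            'filename': '/var/log/django/app.log',
        },
    },
    'loggers': {
        # 4xx/5xx responses get logged here with full details.
        'django.request': {
            'handlers': ['file'],
            'level': 'WARNING',
        },
    },
}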
Django exception email
With DEBUG off and settings.ADMINS set, you will automatically receive a full-traceback email every time there's an exception.
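A minimal sketch of the relevant settings (the addresses and host are placeholders, and they assume a working mail setup):

# settings.py
DEBUG = False
ADMINS = [('Ops', 'ops@example.com')]
SERVER_EMAIL = 'django@example.com'  # From: address on error mails
EMAIL_HOST = 'localhost'             # assumes a local MTA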
Runserver
You can also still run runserver in a production environment, then hit the URL via curl or similar and, say, drop into pdb if it's an app issue.
Error monitoring tools
There are other tools, such as Sentry (https://github.com/getsentry/sentry), which is a joy to use for debugging issues. It captures exceptions (even frontend/JS ones) and sends them to a multi-platform exception-monitoring tool that surfaces lots of useful exception data.
New Relic is another application-monitoring tool that automatically tracks exceptions with full tracebacks.
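As an illustration, the era-appropriate way to wire Sentry into a Django project was the raven client; a hedged sketch, with a placeholder DSN:

# settings.py
INSTALLED_APPS += ['raven.contrib.django.raven_compat']
RAVEN_CONFIG = {
    'dsn': 'https://key@sentry.example.com/1',  # placeholder DSN
}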
Brute force
You can always write to files with only a few lines of python without relying on any major tools:
with open('some-file.txt', 'a') as f:
    f.write('foobar\n')

How can I change this Django Application

I was tasked with making some changes to a Django application. I've never worked with Django and I am having trouble figuring out how to get my changes to compile and be available online.
What I know so far is that the application is currently available online. netstat tells me that httpd is listening on port 80. My change was made in the myapp/views.py file.
I tried to restart httpd using service httpd restart, but my changes did not take effect. I've been looking into the issue a bit, and I believe that I need to run a command along the lines of:
I tried calling python manage.py runserver MY.IP.AD.DR:8000 and I get:
python manage.py runserver 129.64.101.14:8000
Validating models...
0 errors found
Django version 1.4.1, using settings 'cutsheets.settings'
Development server is running at http://MY.IP.AD.DR:8000/
Quit the server with CONTROL-C.
It's nice that no errors are found, but when I navigate to http://MY.IP.AD.DR:8000/ I just get an "Unable to connect" message from my browser. I tried port 81 too and had the same problem.
Without knowing exactly how your application is set up, I can't say precisely how to solve this problem.
I can tell you that it's quite common to use two web servers with Django - one handles the static content, and reverse proxies everything else to a different port where the Django app is listening. Restarting the normal HTTP daemon therefore wouldn't affect the Django app, so you need to restart the one handling the Django app. Until you restart it, the prior version of the code will be running.
I generally use Nginx as my static server and Gunicorn with the Django app, with Supervisor used to run Gunicorn, and this is a common setup. I recommend you take a look at the config for the main web server to see if it forwards anything to another port. If so, you need to see what server is running on that port and restart it.
Also, is there a Fabric configuration (fabfile.py)? A lot of people use Fabric to automate Django deployments, and if there is one then there may be a command already defined for deploying.
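If there is a fabfile, the deploy task often boils down to something like this hypothetical Fabric 1.x sketch (the host, path, and program name are all assumptions):

# fabfile.py
from fabric.api import cd, env, run, sudo

env.hosts = ['myserver.example.com']  # placeholder host

def deploy():
    with cd('/srv/myproject'):        # placeholder project path
        run('git pull')
    # Restart the app server (here Gunicorn under Supervisor), not httpd.
    sudo('supervisorctl restart myproject')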

AWS deployment goes down frequently

So, I've got a very basic deployment on an EC2 instance that largely works, except for a couple of large issues. Right now I'm just ssh'ing into the box and running
python -m SimpleHTTPServer 80
and I have the box on a security group that allows http requests in on Port 80.
This seems to work, but if I leave it alone for a while (usually 1-2 hours) my Elastic IP will start returning 404s. I really need this server to stay up for demos to third parties. Any ideas on how to make sure it stays up?
Additionally it goes down when I close the terminal that's ssh'd into my box, which is extremely non-ideal as I would like this demo to stay up even if my computer is off. That's a less urgent matter, but any advice on that would also be appreciated.
Use screen!
Here's a quick tutorial: http://www.nixtutor.com/linux/introduction-to-gnu-screen/
Essentially just ssh in, open a new window via screen, start the server via python -m SimpleHTTPServer 80, then detach from the window. After that, you should be able to close your terminal and the server will stay up.
I was able to solve this a little hackishly by putting together a cron job to run a bash script that spun up a server, but I'm not sure if this is the best way. It seems to have solved my problems in the short term though. For reference, this is the code I used:
import SimpleHTTPServer
import SocketServer
PORT = 80
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), Handler)
httpd.serve_forever()
Which I wrapped in a simple bash script:
#!/bin/bash
cd relevant/directory
sudo -u ubuntu python simple_server.py
I'm sure there was a better way to handle permissions, but after that I just ran
chmod -R 777 bash_script.sh
To make sure I wouldn't run into any issues on that front.
And then I set up a cron job to run it every minute (the more the merrier, right?):
crontab -e (Just to bring up the relevant file)
Added in this line:
*/1 * * * * path/to/bash_script.sh
And it seems to be working. I closed out my ssh'd terminal, everything still runs, and nothing has gone down yet. I will update if something does, but I'm generally happy with this solution (not that I will be in two weeks, once I learn more about the subject). It's very minimal and low-level, which means I at least understand what I just did.
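One detail worth spelling out: the every-minute cron job only stays sane because a second copy of the script fails to bind port 80 and dies. A sketch of the same server with that guard made explicit (Python 2, matching the snippet above):

import socket
import SimpleHTTPServer
import SocketServer

PORT = 80

# Exit quietly if another instance is already serving on the port,
# so the repeated cron invocations don't pile up error output.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
already_running = probe.connect_ex(("127.0.0.1", PORT)) == 0
probe.close()
if already_running:
    raise SystemExit(0)

Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), Handler)
httpd.serve_forever()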
SimpleHTTPServer just serves static pages (here on port 80) and is mainly meant for use during development.
For production usage (if you want to use EC2) I recommend you read up on Apache or nginx. Basically you want a web server that runs on Linux.
If you think your site will remain static files (HTML, CSS, JS) I recommend you host them on Amazon S3 instead. S3 is cheaper and way more reliable. Take a look at this answer for instructions: Static hosting on Amazon S3 - DNS Configuration
Enjoy!

Tornado code deployment

Is there a canonical code deployment strategy for Tornado-based web applications? Our current configuration is 4 Tornado processes running behind Nginx. (Our specific use case is on EC2.)
We've currently got a solution that works well enough, whereby we launch the four tornado processes and save the PIDs to a file in /tmp/. Upon deploying new code, we run the following sequence via fabric:
Do a git pull from the prod branch.
Remove the machine from the load balancer.
Wait for all in-flight connections to finish with a sleep.
Kill all the tornadoes in the pid file and remove all *.pyc files.
Restart the tornadoes.
Attach the machine back to the load balancer.
We've taken some inspiration from this: http://agiletesting.blogspot.com/2009/12/deploying-tornado-in-production.html
Are there any other complete solutions out there?
We run Tornado+Nginx with supervisord as the supervisor.
Sample configuration (names changed):
[program:server]
process_name = server-%(process_num)s
command=/opt/current/vrun.sh /opt/current/app.py --port=%(process_num)s
stdout_logfile=/var/log/server/server.log
stderr_logfile=/var/log/server/server.err
numprocs = 6
numprocs_start = 7000
With numprocs = 6 and numprocs_start = 7000, supervisord starts six processes whose %(process_num)s values run from 7000 to 7005, i.e. one Tornado per port 7000-7005 for Nginx to proxy to.
I've yet to find the "best" way to restart things. What I'll probably end up doing is having Nginx expose an "active" file which is updated to let HAProxy know that we're messing with the configuration; then wait a bit, swap things around, and re-enable everything.
We're using Capistrano (we've got a backlog task to move to Fabric), but instead of dealing with removing *.pyc files we symlink /opt/current to the release identifier.
I haven't deployed Tornado in production, but I've been playing with Gevent + Nginx and have been using Supervisord for process management - start/stop/restart, logging, monitoring - supervisorctl is very handy for this. Like I said, not a deployment solution, but maybe a tool worth using.
