flask deployment using internal werkzeug development server - python

Why is it not recommended to use the flask/werkzeug internal development webserver in production? What sort of issues can arise?
I'm asking because at work I'm being forced to do so, and to use a makeshift cron job to restart the service every day!

If you're having to use a cron job to kill & restart it on a daily basis, you've already found a major issue with using the Flask development server. The development server is not written for stability, longevity, configurability, security, speed or much of anything other than convenience during development.
A proper WSGI setup will be faster, handle multiple connections properly and, most importantly for you, periodically restart your app process to clean out any cruft that might build up.
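To make the periodic-restart point concrete, here is a minimal sketch of a configuration file for Gunicorn. Gunicorn is just one possible WSGI server and is an assumption on my part, since the answer above does not name one. The setting names are Gunicorn's own; the values are purely illustrative.

```python
# gunicorn.conf.py: illustrative values only, tune them for your own app
bind = "127.0.0.1:8000"    # listen locally; a reverse proxy would sit in front
workers = 2                # several worker processes instead of a single one
timeout = 30               # a worker silent for this many seconds is killed and restarted
max_requests = 1000        # recycle each worker after this many requests
max_requests_jitter = 50   # randomise recycling so workers don't all restart together
```

It would be started with something like `gunicorn -c gunicorn.conf.py myapp:app`, where `myapp:app` is a hypothetical module and application name.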

I had a network call inside the response handler that had no timeout (I was using the requests module). Something went wrong, the call kept waiting, and apparently it never recovered.
Since the Werkzeug server had only one thread, the whole development server became completely unavailable.
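The usual guard against this failure mode is to give every outbound call an explicit timeout. Below is a minimal sketch using only the standard library (urllib rather than requests, so that it runs without extra packages); requests accepts a `timeout` argument in the same spirit. The helper names and the deliberately silent local server are made up for the demonstration.

```python
import socket
import threading
import urllib.request


def fetch_with_timeout(url, seconds):
    """Return the response body, or None if the remote side fails or is too slow."""
    try:
        with urllib.request.urlopen(url, timeout=seconds) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both subclasses of OSError.
        return None


def start_silent_server():
    """Start a local server that accepts connections but never answers."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    held = []  # keep accepted sockets open so the client really has to wait

    def accept_forever():
        while True:
            try:
                conn, _ = listener.accept()
            except OSError:
                return
            held.append(conn)

    threading.Thread(target=accept_forever, daemon=True).start()
    return listener


listener = start_silent_server()
port = listener.getsockname()[1]
result = fetch_with_timeout(f"http://127.0.0.1:{port}/", seconds=0.5)
print(result)
```

Without the `timeout` argument the call would block for as long as the other side stays silent, which on a single-threaded server means no other request gets served.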

Related

Using Sanic's inbuilt webserver in Production

Django documentation states regarding their development server:
Don’t use this server in anything resembling a production environment.
It’s intended only for use while developing. (We’re in the business of
making Web frameworks, not Web servers.)
Sanic's deployment documentation does not say that we cannot use its built-in server in production. It states:
Deploying Sanic is very simple using one of three options: the inbuilt
webserver, an ASGI webserver, or gunicorn. It is also very common to
place Sanic behind a reverse proxy, like nginx.
For me it means freedom from Apache. It also means that Nginx, Gunicorn, Daphne, Uvicorn, Hypercorn etc. are optional.
However, I found some negative comments about its built-in server in "Sanic: python web server that's written to die fast". On the other hand, their GitHub repository seems very active. Did they address the issues mentioned in the Reddit post?
Am I missing something?
Issue 1 deals with request size and timeout settings that allow for DoS attacks by flooding the server with too much data. These settings can be adjusted by the admin, according to the server hardware and the requirements of the site being run. That being said, the defaults probably should be lower than they are, to make such attacks on unconfigured servers more difficult.
Issue 2 claims that there is no backpressure handling in streaming responses. The current version does have flow control and thus gets proper backpressure control, avoiding such issues. Since this was quite badly overlooked in Python's asyncio protocol design, a lot of applications had such problems in the past, presumably also including Sanic at the time the blog was written.
As it is now, the Sanic server can certainly run directly on the Internet, and that is in fact much safer against DoS than running Django behind nginx or Apache, where any long-lasting POST request blocks an entire Django worker.

Running Python through FastCGI with nginx on Ubuntu

I've already looked at other threads on this, but most don't go into enough setup detail which is where I need help.
I have an Ubuntu based VPS running with nginx, serving PHP sites through php-cgi on port 9000.
I'd like to start doing more with Python, so as my first Python script I've written a deployment script which I essentially want to use as a post-receive hook on my local GitLab server. I can run this script successfully by running python script.py on the command line, but in order to use it as a post-receive hook I need to be able to access it via HTTP.
I looked at this guide on the nginx wiki, but partway down it says to:
And start the django fastcgi process:
python ./manage.py runfcgi host=127.0.0.1 port=8080
Now, like I said, I am pretty new to Python, and I have never used the Django framework. Can anyone assist with how I am supposed to start the FastCGI server? Do I replace ./manage.py with the name of my script? Any help would be appreciated, as everything I've found online refers to working with Django.
Do I replace ./manage.py with the name of my script?
No. It's highly unlikely your script is a FastCGI server, or that it can accept HTTP requests of any kind since you mention running it over the command line. (From what little I know of FastCGI, an app supporting it has to be able to handle a stream of requests coming in over stdin in a specific format, so there's definitely some plumbing involved.)
I'd say the easiest approach would be to use some web framework just to act as HTTP/FastCGI middleware. For your use, a "microframework" like Flask (or even Paste, but I found the documentation inscrutable) sounds like it'd work fine. The idea would be to have two interfaces to your main code, one that handles command-line arguments and one that handles an HTTP request; ultimately both would just call one function that actually does the work. (That is, if you want to keep the command-line version of the app.)
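A minimal sketch of that two-interface idea, using the standard library's wsgiref in place of Flask so it runs without extra packages. The `deploy` function, the `--serve` flag and port 8080 are all hypothetical placeholders.

```python
import sys
from wsgiref.simple_server import make_server


def deploy(target):
    """Hypothetical stand-in for the real deployment work."""
    return f"deployed {target}"


def application(environ, start_response):
    """HTTP interface: a WSGI app that calls deploy() on POST requests."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b"POST only\n"]
    body = deploy("production").encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body + b"\n"]


def main(argv):
    """Command-line interface: python script.py [target] or python script.py --serve"""
    if "--serve" in argv:
        with make_server("127.0.0.1", 8080, application) as httpd:
            httpd.serve_forever()
    else:
        target = argv[0] if argv else "production"
        print(deploy(target))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Because `application` is a plain WSGI callable, the same object could later be handed to uWSGI or another WSGI server instead of wsgiref.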
The Flask documentation also mentions using uWSGI or standalone workers as deployment options. I'm not familiar with the former; the latter I wouldn't recommend for a simple, low-traffic app for the same reasons as the approach in the next paragraph.
Considering you use a VPS, you might even be able to just run the app as a standalone server process using the http.server module, but I'm not sure that's the better choice unless you absolutely want to avoid using any sort of framework. You'd have to make sure the app starts up if the server is rebooted or that it restarts when it crashes and it seems easier to just have nginx do the job of the supervisor.
UPDATE: Scratch that, it seems that nginx won't handle supervising a FastCGI worker process for you, which would've been the main advantage of the approach. In light of that it doesn't matter which of the three approaches you use since you'll have to set up a service supervisor one way or the other. I'd say go with uWSGI since flup (which is needed for Flask+FastCGI) seems abandoned since 2011, and the uWSGI protocol is apparently supported in nginx natively. Otherwise you'd need to use a different webserver than nginx, one that will manage a FastCGI worker for you. If this is an option, I'd consider Cherokee, which can be configured using a web GUI.
tl;dr: you need to write a (very simple) webapp. While it is feasible to do this without a web framework of any kind, in my opinion using one is easier, since you get some (nontrivial) plumbing for free and there's a lot of guidance available on how to deploy them.
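For comparison, the framework-free route mentioned earlier (the http.server module) might look roughly like this. It is only a sketch: `run_deploy` is a hypothetical placeholder, there is no authentication, and the server is started on a throwaway port and called once just to show the round trip.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_deploy():
    """Hypothetical placeholder for the real deployment script."""
    return "deploy finished"


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the request body so the connection is left in a clean state.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = run_deploy().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; real code would log requests


# Port 0 lets the OS choose a free port for this demonstration.
server = HTTPServer(("127.0.0.1", 0), HookHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/"
request = urllib.request.Request(url, data=b"push", method="POST")
with urllib.request.urlopen(request, timeout=5) as response:
    reply = response.read().decode("utf-8")
server.shutdown()
print(reply)
```

As the answer says, something still has to start this process at boot and restart it when it crashes, so a service supervisor is needed either way.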

Best practice for making a django webapp restart itself

We have certain sysadmin settings that we expose to superusers of our django webapp. Things like the domain name (uses contrib.sites) and single sign-on configuration. Some of these settings are cached by the system, sometimes because we want to avoid an extra DB hit in the middleware on every request if we can help it, sometimes because it's contrib.sites, which has its own caching. So when the settings get changed, the changes don't take effect until the app is reloaded.
We want the app to restart itself when these changes are made, so that our clients don't need to pester us to do the restart for them.
Our webapp is running on apache via mod_wsgi, so we should be able to do this just by touching the wsgi file for the app whenever one of these settings is changed, but it feels a little weird to do that, and I'm worried there's some more graceful convention we should be following.
Is there a right way to apply updates that are cached and require the app to reload? Invalidating the caches for these things will be pretty hairy, so I think I'd avoid that unless the app restart thing has serious drawbacks.
For mod_wsgi read:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
Some other WSGI servers have similar options, though usually more limited ones.
If you use WSGI and your process is being watched by a controller like supervisord, gunicorn, uwsgi or similar then you can simply send yourself a SIGINT or SIGQUIT (depending on controller). It should shut down the current process gracefully and the controller will restart it for you.
import os
import signal
# Ask this process to shut down gracefully; the controller then restarts it.
os.kill(os.getpid(), signal.SIGQUIT)
If you are running it on Apache with mod_wsgi, just update the timestamp of the wsgi file every time one of these settings changes. Apache automatically restarts the application if the wsgi file gets updated.
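From inside the application, "touching" the file amounts to updating its modification time. A minimal sketch follows; the file name is hypothetical and a temporary directory stands in for the real location. As far as I know, mod_wsgi only restarts the whole process on a touch when the app runs in daemon mode.

```python
import os
import tempfile
import time
from pathlib import Path


def touch(path):
    """Set the file's modification time to now, creating the file if needed."""
    Path(path).touch()


# Demonstration on a temporary file standing in for the real WSGI script.
with tempfile.TemporaryDirectory() as tmp:
    wsgi_file = Path(tmp) / "app.wsgi"   # hypothetical file name
    wsgi_file.write_text("# WSGI entry point\n")
    old = time.time() - 3600
    os.utime(wsgi_file, (old, old))      # pretend it was last changed an hour ago
    before = wsgi_file.stat().st_mtime
    touch(wsgi_file)
    after = wsgi_file.stat().st_mtime

print(after > before)
```

In the real app, `touch()` would be called from whatever code saves the changed settings, and the process serving the app needs write permission on that file.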
It depends on your setup:
If you are using mod_wsgi on a single server, you could touch the wsgi file to make Apache restart every instance of the app.
If you are using gunicorn, you probably use supervisord to control it. Then supervisorctl restart APPNAME would be the solution.
If you scale your app across multiple servers, you have to ensure that every server restarts its instances. There are several ways to achieve this:
share one filesystem between the servers; if you are using mod_wsgi, a single touch then counts for every server
log in to the other servers using ssh and make them restart your app
I am sure there are more ways to restart your app but it highly depends on your setup and whether or not you have to restart all instances or only one.

Is there a way to deploy new code with Tornado/Python without restarting the server?

I've recently started to experiment with Python and Tornado web server/framework for web development. Previously, I have used PHP with my own framework on a LAMP stack. With PHP, deploying updated code/new code is as easy as uploading it to the server because of the way mod_php and Apache interact.
When I add new code or update code in Python/Tornado, do I need to restart the Tornado server? I could see this being problematic if you have a number of active users.
(a) Do I have to restart the server, or is there another/better way?
(b) If so, how can I avoid users being disconnected/getting errors/etc. while it's restarting (which could take a few seconds)?
[One possible thought is to use the page flipping paradigm with Nginx pointing to a server, launch the new server instance with updated code, redirect Nginx there and take down the original server...?]
It appears the best method is to use Nginx with multiple Tornado instances, as I alluded to in my original question and as Cole mentions. Nginx can reload its configuration file on the fly. So the process looks like this:
Update Python/Tornado web application code
Start a new instance of the application on a different port
Update the configuration file of Nginx to point to the new instance (testing the syntax of the configuration file first)
Reload the Nginx configuration file with a kill -HUP command
Stop the old instance of Python/Tornado web server
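The configuration steps above could be scripted roughly as follows. This is a sketch under assumptions: the upstream is assumed to be written as `server 127.0.0.1:<port>;`, the upstream name and ports are invented, and `nginx -s reload` is used in place of a raw `kill -HUP` (it asks the master process to reload in the same way). Starting and stopping the Tornado instances is left out.

```python
import re
import subprocess


def point_upstream_at(conf_text, new_port):
    """Rewrite every 'server 127.0.0.1:<port>;' entry to use new_port."""
    return re.sub(r"(server\s+127\.0\.0\.1:)\d+;", rf"\g<1>{new_port};", conf_text)


def reload_nginx(run=subprocess.run):
    """Check the configuration, then ask the running Nginx to reload it."""
    run(["nginx", "-t"], check=True)            # raises if the syntax check fails
    run(["nginx", "-s", "reload"], check=True)  # signal the master to re-read its config


old_conf = "upstream tornado_app { server 127.0.0.1:8001; }"
new_conf = point_upstream_at(old_conf, 8002)
print(new_conf)
```

Only the string rewrite is exercised here; `reload_nginx()` needs Nginx installed and sufficient privileges, and the `run` parameter exists so the commands can be swapped out or inspected in a dry run.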
A couple of useful resources on Nginx regarding hot-swapping the configuration file:
https://calomel.org/nginx.html (in "Explaining the directives in nginx.conf" section)
http://wiki.nginx.org/CommandLine (in "Loading a New Configuration Using Signals" section)
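Use HAProxy or Nginx and proxy to multiple Tornado processes, which you can then restart one by one. The Tornado docs cover Nginx, but at the time of writing it doesn't support websockets, so if you're using them you'll need HAProxy. (Nginx has supported WebSocket proxying since version 1.3.13, so this limitation only applies to older versions.)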
You could use a debug=True switch with the tornado web instance.
import tornado.web
T_APP = tornado.web.Application(url_map, debug=True)  # url_map: your list of (pattern, handler) routes
This reflects the handler changes as and when they happen.
Is this what you are searching for?
A module to automatically restart the server when a module is modified.
http://www.tornadoweb.org/en/branch2.4/autoreload.html
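The general idea behind such an autoreload module can be sketched with the standard library alone: remember each source file's modification time and check periodically whether any has changed. This is not Tornado's implementation, only an illustration of the mechanism; the file name is hypothetical.

```python
import os
import tempfile
import time


def snapshot(paths):
    """Map each path to its current modification time."""
    return {path: os.stat(path).st_mtime for path in paths}


def changed_files(previous, paths):
    """Return the paths whose modification time differs from the snapshot."""
    current = snapshot(paths)
    return sorted(p for p in paths if current[p] != previous.get(p))


with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "handlers.py")   # hypothetical module
    with open(module_path, "w") as f:
        f.write("VERSION = 1\n")
    seen = snapshot([module_path])
    unchanged = changed_files(seen, [module_path])

    future = time.time() + 10
    os.utime(module_path, (future, future))          # simulate an edit
    changed = changed_files(seen, [module_path])

print(unchanged, changed == [module_path])
```

A real reloader would run this check on a timer and, when something changed, re-execute the process (for example with `os.execv(sys.executable, [sys.executable] + sys.argv)`). That is fine during development but drops in-flight requests, which is why it does not answer the production question.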
If you just want to deploy new code with tornado/python during development without restarting the server, you can use the realtimefunc decorator in this GitHub repository.

Python web programming

Good morning.
As the title indicates, I've got some questions about using python for web development.
What is the best setup for a development environment? More specifically, which webserver should I use, and how do I bind Python to it? Preferably, I'd like it to work in both *nix and Windows environments.
My major concern when I last tried apache + mod_python + CherryPy was having to reload webserver to see the changes. Is it considered normal? For some reason cherrypy's autoreload didn't work at all.
What is the best setup to deploy a working Python app to production and why? I'm now using lighttpd for my PHP web apps, but how would it do for python compared to nginx for example?
Is it worth diving straight with a framework or to roll something simple of my own? I see that Django has got quite a lot of fans, but I'm thinking it would be overkill for my needs, so I've started looking into CherryPy.
How exactly are Python apps served if I have to reload httpd to see the changes? Something like a permanent process spawning child processes, with all the major file includes happening on server start and then just lazy loading needed resources?
Python supports multithreading, do I need to look into using that for a benefit when developing web apps? What would be that benefit and in what situations?
Big thanks!
What is the best setup for a development environment?
Doesn't much matter. We use Django, which runs in Windows and Unix nicely. For production, we use Apache in Red Hat.
Is having to reload webserver to see the changes considered normal?
Yes. Not clear why you'd want anything different. Web application software shouldn't be dynamic. Content yes. Software no.
In Django, we develop without using a web server of any kind on our desktop. The Django "runserver" command reloads the application under most circumstances. For development, this works great. The times when it won't reload are when we've damaged things so badly that the app doesn't start properly.
What is the best setup to deploy a working Python app to production and why?
"Best" is undefined in this context. Therefore, please provide some qualification for "best" (e.g., "fastest", "cheapest", "bluest").
Is it worth diving straight with a framework or to roll something simple of my own?
Don't waste time rolling your own. We use Django because of the built-in admin page that we don't have to write or maintain. Saves mountains of work.
How exactly are Python apps served if I have to reload httpd to see the changes?
Two methods:
Daemon - mod_wsgi or mod_fastcgi have a Python daemon process to which they connect. Change your software. Restart the daemon.
Embedded - mod_wsgi or mod_python have an embedded mode in which the Python interpreter is inside the mod, inside Apache. You have to restart httpd to restart that embedded interpreter.
Do I need to look into using multi-threaded?
Yes and no. Yes you do need to be aware of this. No, you don't need to do very much. Apache and mod_wsgi and Django should handle this for you.
So here are my thoughts about it:
I use Python Paste (or any other Python web server) for developing my app and possibly also for running it. I usually don't use mod_python or mod_wsgi, as they make the development setup more complex.
I am using zc.buildout for managing my development environment and all dependencies together with virtualenv. This gives me an isolated sandbox which does not interfere with any Python modules installed system wide.
For deployment I am also using buildout/virtualenv, possibly with a different buildout.cfg. I am also using Paste Deploy and its configuration mechanism, where I have different config files for development and deployment.
As I am usually running paste/cherrypy etc. standalone I am using Apache, NGINX or maybe just a Varnish alone in front of it. It depends on what configuration options you need. E.g. if no virtual hosting, rewrite rules etc. are needed, then I don't need a full featured web server in front. When using a web server I usually use ProxyPass or some more complex rewriting using mod_rewrite.
The Python web framework I use at the moment is repoze.bfg, by the way.
As for your questions about reloading: I know about these problems when running with e.g. mod_python, but when using a standalone "paster serve ... --reload" etc., it has so far worked really well. repoze.bfg additionally has a setting for automatically reloading templates when they change. If the framework you use has that, it should be documented.
As for multithreading, that's usually used inside the Python web server. As CherryPy supports this, I guess you don't have to worry about it; it should be used automatically. You may just want to run some benchmarks to find out with what number of threads your application performs best.
Hope that helps.
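To illustrate what "multithreading inside the Python web server" means, here is a small sketch built on the standard library's wsgiref rather than CherryPy (my substitution, so it runs without extra packages). The handler deliberately waits on a barrier until two requests are being processed at the same moment, which can only succeed if the server handles each request in its own thread.

```python
import socketserver
import threading
import urllib.request
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server


class ThreadedWSGIServer(socketserver.ThreadingMixIn, WSGIServer):
    """wsgiref's server, but handling each request in its own thread."""
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # silence per-request logging for the demonstration


# Both requests must be inside the handler at the same time to get past this.
both_arrived = threading.Barrier(2, timeout=5)


def application(environ, start_response):
    both_arrived.wait()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


server = make_server("127.0.0.1", 0, application,
                     server_class=ThreadedWSGIServer, handler_class=QuietHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

results = []


def fetch():
    with urllib.request.urlopen(url, timeout=10) as response:
        results.append(response.read())


clients = [threading.Thread(target=fetch) for _ in range(2)]
for client in clients:
    client.start()
for client in clients:
    client.join()
server.shutdown()
print(results)
```

With the plain single-threaded `WSGIServer` the first request would sit at the barrier until it timed out, which is the same kind of stall as one slow request blocking a one-thread development server.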
+1 to MrTopf's answer, but I'll add some additional opinions.
Webserver
Apache is the webserver that will give you the most configurability. Avoid mod_python because it is basically unsupported. On the other hand, mod_wsgi is very well supported and gives you better stability (in other words, it is easier to configure so that CPU/memory usage stays stable rather than spiky and unpredictable).
Another huge benefit, you can configure mod_wsgi to reload your application if the wsgi application script is touched, no need to restart Apache. For development/testing servers you can even configure mod_wsgi to reload when any file in your application is changed. This is so helpful I even run Apache+mod_wsgi on my laptop during development.
Nginx and lighttpd are commonly used for webservers, either by serving Python apps directly through a fastCGI interface (don't bother with any WSGI interfaces on these servers yet) or by using them as a front end in front of Apache. Calls into the app get passed through (by proxy) to Apache+mod_wsgi and then nginx/lighttpd serve the static content directly.
Nginx has the added advantage of being able to serve content directly from memcached if you want to get that sophisticated. I've heard disparaging comments about lighttpd and it does seem to have some development problems, but there are certainly some big companies using it successfully.
Python stack
At the lowest level you can program to WSGI directly for the best performance. There are lots of helpful WSGI modules out there to help you in areas you don't want to develop yourself. At this level you'll probably want to pick third-party WSGI components to do things like URL resolving and HTTP request/response handling. A great request/response component is WebOb.
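For a sense of what programming to WSGI directly looks like, here is a minimal sketch with hand-rolled URL resolving. The route table and view functions are invented for illustration; a component such as WebOb would replace the manual request/response handling.

```python
def home(environ):
    return "200 OK", "home page\n"


def about(environ):
    return "200 OK", "about page\n"


# URL resolving reduced to a dictionary lookup on the request path.
ROUTES = {"/": home, "/about": about}


def application(environ, start_response):
    """A bare WSGI callable: look up the path, call the view, send the reply."""
    view = ROUTES.get(environ.get("PATH_INFO", "/"))
    if view is None:
        status, text = "404 Not Found", "not found\n"
    else:
        status, text = view(environ)
    body = text.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call(path):
    """Invoke the app the way a WSGI server would, without any network."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = application({"REQUEST_METHOD": "GET", "PATH_INFO": path}, start_response)
    return captured["status"], b"".join(chunks)


print(call("/about"))
print(call("/missing"))
```

Any WSGI server (mod_wsgi, or wsgiref for local testing) can serve `application` unchanged, which is the appeal of working at this level.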
If you look at Pylons you can see their idea of "best-of-breed" WSGI components and a framework that makes it easier than Django to choose your own components like templating engine.
Django might be overkill but I don't think that's a really good argument against. Django makes the easy stuff easier. When you start to get into very complicated applications is where you really need to look at moving to lower level frameworks.
Look at Google App Engine. From their website:
Google App Engine lets you run your web applications on Google's infrastructure. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs grow. With App Engine, there are no servers to maintain: You just upload your application, and it's ready to serve your users.
You can serve your app using a free domain name on the appspot.com domain, or use Google Apps to serve it from your own domain. You can share your application with the world, or limit access to members of your organization.
App Engine costs nothing to get started. Sign up for a free account, and you can develop and publish your application for the world to see, at no charge and with no obligation. A free account can use up to 500MB of persistent storage and enough CPU and bandwidth for about 5 million page views a month.
Best part of all: It includes Python support, including Django. Go to http://code.google.com/appengine/docs/whatisgoogleappengine.html
When you use mod_python on a threaded Apache server (the default on Windows), CherryPy runs in the same process as Apache. In that case, you almost certainly don't want CP to restart the process.
Solution: use mod_rewrite or mod_proxy so that CherryPy runs in its own process. Then you can autoreload to your heart's content. :)
