How to remote debug Flask request behind uWSGI in PyCharm - python

I've read some documentation online about how to do remote debugging with PyCharm - https://www.jetbrains.com/help/pycharm/remote-debugging.html
But there was one key issue with that approach for my setup - Nginx connecting to uWSGI, which then connects to my Flask app. Setting up something like
import sys
sys.path.append('pycharm-debug.egg')
import pydevd
pydevd.settrace('localhost', port=11211,
                stdoutToServer=True, stderrToServer=True,
                suspend=False)
print('connected')
from wsgi_configuration_module import app
My wsgi_configuration_module.py file is the uWSGI file used in Production, i.e. no debug.
connects the debugger to the main/master process of uWSGI, which runs only once, at uWSGI startup/reload. But if you set a breakpoint in your request-handling code, I've found it either skips over the breakpoint or hangs entirely without ever hitting it, and uWSGI shows a gateway error after the timeout.

The problem here, as far as I see it, is exactly that last point: the debugger connects to the uWSGI/application master process, which is not any of the individual request processes.
To solve this, in my situation, 2 things needed changing, one of which is the uWSGI configuration for my app. Our production file looks something like
[uwsgi]
...
master = true
enable-threads = true
processes = 5
But here, to give the debugger (and us) an easy time connecting to the request process, and staying connected, we change this to
[uwsgi]
...
master = true
enable-threads = false
processes = 1
Keep the master process, disable threads, and limit it to only 1 process - http://uwsgi-docs.readthedocs.io/en/latest/Options.html
Then, in the startup Python file, instead of connecting the debugger when the entire Flask app starts, you connect it in a function decorated with Flask's handy before_first_request hook (http://flask.pocoo.org/docs/0.12/api/#flask.Flask.before_first_request), so the startup script changes to something like
import sys
import wsgi_configuration_module
sys.path.append('pycharm-debug.egg')
import pydevd

app = wsgi_configuration_module.app

@app.before_first_request
def before_first_request():
    pydevd.settrace('localhost', port=11211,
                    stdoutToServer=True, stderrToServer=True,
                    suspend=False)
    print('connected')
So now, you've limited uWSGI to no threads and only 1 process, to limit the chance of any mix-up between them and the debugger, and set pydevd to connect only before the very first request. Now the debugger connects (for me) successfully once, at the first request in this function, prints 'connected' only once, and from then on breakpoints in any of your request endpoint functions hit without issue.


Using the Multiprocessing library with Flask. How to have access to the result of a process? [duplicate]

I'm using Flask for developing a website and while in development I run flask using the following file:
#!/usr/bin/env python
from datetime import datetime
from app import app
import config

if __name__ == '__main__':
    print('################### Restarting #', datetime.utcnow(), '###################')
    app.run(port=4004, debug=config.DEBUG, host='0.0.0.0')
When I start the server, or when it auto-restarts because files have been updated, it always shows the print line twice:
################### Restarting # 2014-08-26 10:51:49.167062 ###################
################### Restarting # 2014-08-26 10:51:49.607096 ###################
Although it is not really a problem (the rest works as expected), I simply wonder why it behaves like this? Any ideas?
The Werkzeug reloader spawns a child process so that it can restart that process each time your code changes. Werkzeug is the library that supplies Flask with the development server when you call app.run().
See the restart_with_reloader() function code; your script is run again with subprocess.call().
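As an illustration, the restart loop boils down to something like the following. This is a simplified sketch, not the actual Werkzeug source: the exit-code-3 convention and the WERKZEUG_RUN_MAIN marker are real, but the code is condensed, and the injectable `run` parameter is added here purely to make the sketch testable.

```python
import os
import subprocess
import sys

RESTART_EXIT_CODE = 3  # the child exits with 3 to request a restart


def restart_with_reloader(run=None):
    """Re-run sys.argv in a child process until it exits with a code other than 3.

    `run` is injectable for demonstration; by default it spawns a real subprocess.
    """
    if run is None:
        run = lambda env: subprocess.call([sys.executable] + sys.argv, env=env)
    while True:
        env = os.environ.copy()
        env['WERKZEUG_RUN_MAIN'] = 'true'  # marks the reloader's child process
        exit_code = run(env)
        if exit_code != RESTART_EXIT_CODE:
            return exit_code
```

This is why the script runs twice: the parent process only loops and respawns, while the child (marked with WERKZEUG_RUN_MAIN=true) actually serves requests.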
If you set use_reloader to False you'll see the behaviour go away, but then you also lose the reloading functionality:
app.run(port=4004, debug=config.DEBUG, host='0.0.0.0', use_reloader=False)
You can disable the reloader when using the flask run command too:
FLASK_DEBUG=1 flask run --no-reload
You can use the werkzeug.serving.is_running_from_reloader function if you want to detect when you are in the reloading child process:
from datetime import datetime
from werkzeug.serving import is_running_from_reloader

if is_running_from_reloader():
    print(f"################### Restarting # {datetime.utcnow()} ###################")
However, if you need to set up module globals, then you should instead use the @app.before_first_request decorator on a function and have that function set up those globals. It'll be called just once after every reload, when the first request comes in:
@app.before_first_request
def before_first_request():
    print(f"########### Restarted, first request # {datetime.utcnow()} ############")
Do take into account that if you run this on a full-scale WSGI server that uses forking or new subprocesses to handle requests, before_first_request handlers may be invoked for each new subprocess.
If you are using the modern flask run command, none of the options to app.run are used. To disable the reloader completely, pass --no-reload:
FLASK_DEBUG=1 flask run --no-reload
Also, __name__ == '__main__' will never be true because the app isn't executed directly. Use the same ideas from Martijn's answer, except without the __main__ block.
import os

if os.environ.get('WERKZEUG_RUN_MAIN') != 'true':
    pass  # do something only once, before the reloader starts

if os.environ.get('WERKZEUG_RUN_MAIN') == 'true':
    pass  # do something on each reload
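Pulling that together, here is a complete sketch that branches on the marker; the function names are illustrative, not part of any API:

```python
import os


def parent_only_setup():
    # Runs once, in the process you started by hand, before the reloader forks.
    return 'parent'


def child_setup():
    # Runs in the reloader's child process, i.e. once per restart.
    return 'child'


def setup():
    # WERKZEUG_RUN_MAIN is set by the reloader on its child process only.
    if os.environ.get('WERKZEUG_RUN_MAIN') == 'true':
        return child_setup()
    return parent_only_setup()
```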
I had the same issue, and I solved it by setting app.debug to False. Setting it to True was causing my __name__ == "__main__" block to run twice.
From Flask 0.11, it's recommended to run your app with flask run rather than python application.py. Using the latter could result in running your code twice.
As stated here:
... from Flask 0.11 onwards the flask method is recommended. The reason for this is that due to how the reload mechanism works there are some bizarre side-effects (like executing certain code twice...)
I am using the python-dotenv plugin,
and I put this in my config file - .flaskenv:
FLASK_RUN_RELOAD=False
This avoids Flask running twice for me.
One possible reason why the Flask app runs itself twice is the WEB_CONCURRENCY setting on Heroku. To set it to one, you can run in the console
heroku config:set WEB_CONCURRENCY=1
I had the same issue. I solved it by modifying my main block and inserting use_reloader=False into it. If anybody is here looking for a workaround, the code below will get you started; however, you will lose the automatic restart when code changes are detected. You will have to manually stop and restart your application after each edit.
if __name__ == '__main__':
    app.run(debug=True, use_reloader=False)

How do I run two python Flask application(project) on server in parallel

I have two different Flask projects. I want to run them on the server under different links.
Currently I see only one project running at a time.
I tried running them on the same port with different links, and also on different ports, but only one project runs at a time.
Project 1
if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5001, debug=True)
Project 2
I tried running
export FLASK_APP=app.py
flask run --host 0.0.0.0 --port 5000
Also this way
if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5000, debug=True)
I recently did a parallel threading setup on my own website in Flask, so I completely understand your confusion. I'll explain this to the best of my abilities.
When creating parallel operations, it's best to use multi-threading. Basically, multi-threading splits operations up and runs them simultaneously on the CPU. This must be supported by the CPU, and most CPUs today do support multi-threading.
Anyway, on to the application. I initialized the Flask application classes so the data is shared between all the threads, using the main thread as the memory handler. Afterwards, I created the pages. Then, within the initialization 'if statement' (if __name__ == '__main__', what other languages would call the driver block), I initialized and started the threads to do their parts of the application.
Notes:
Flask doesn't allow debug mode to be executed off the main thread. This basically means you cannot use debug mode with multi-threaded Flask apps, which is no great problem: VSCode has a great output console that gives me enough information to figure out issues in the application. Though tracking down thread errors can sometimes be painful, so it's best to watch your steps when debugging.
Another thing: you can still use the threaded option in Flask, which I like to enable on any Flask application I make because it allows better handling of clients. For example, with threaded disabled, a client connects and holds up the main thread, which holds it for a millisecond then releases it. With threaded enabled, clients can open and release multiple requests, instead of all clients piping through one thread.
Why is that important? Well, if a client runs a heavy script that performs operations on the host machine, that page's request will take a larger amount of time. In turn, that client holds the main thread, so no one else can connect.
My Code for your Issue:
import threading
from flask import Flask

# My typical setup for a Flask App.
# ./media is a folder that holds my JS, Imgs, CSS, etc.
app1 = Flask(__name__, static_folder='./media')
app2 = Flask(__name__, static_folder='./media')

@app1.route('/')
def index1():
    return 'Hello World 1'

@app2.route('/')
def index2():
    return 'Hello World 2'

# With Multi-Threading Apps, YOU CANNOT USE DEBUG!
# Though you can sub-thread.
def runFlaskApp1():
    app1.run(host='127.0.0.1', port=5000, debug=False, threaded=True)

def runFlaskApp2():
    app2.run(host='127.0.0.1', port=5001, debug=False, threaded=True)

if __name__ == '__main__':
    # Executing the threads separately.
    t1 = threading.Thread(target=runFlaskApp1)
    t2 = threading.Thread(target=runFlaskApp2)
    t1.start()
    t2.start()
PS: Run this app by doing python app.py instead of
export FLASK_APP=app.py
flask run --host 0.0.0.0 --port 5000
Hope this helps you, and happy developing!

WSGI - Why does it cache the output for : os.popen("date").read() in multi threads and picks one randomly?

I woke up and refreshed my WSGI script via the web.
Within this WSGI script there is this Python code:
import os
ooo = os.popen("date").read()
The system date was incorrect, so I refreshed the WSGI script.
The system time now showed something that was BEFORE the system time I saw earlier.
The more I refreshed the web browser, the more I noticed the output was rather random.
It was as if either Python or WSGI was caching it 10 times in 10 different threads and then displaying one randomly from those 10 cached threads.
Based on the information I found, it turns out Python is not doing the caching, and WSGI might be responsible for it.
Well, my understanding was that WSGI simply permits Python to work via the web; I had no idea it also did things such as threading and caching.
I even see suggestions that the WSGI script is loaded once and thus can only execute once.
Does this mean I have to reload the WSGI script every time I want a non-cached result?
So basically I might as well restart the whole Apache server every time I execute a WSGI script on my website?
I suppose this means I would be restarting Apache 1 million times daily if my website received that many hits per day?
How can I tell WSGI not to cache the output of
os.popen("date").read()
?
The 'caching' happens because the WSGI server/container only loads the application once. After that, for each request it calls the WSGI function. That means that any global (module level) variables will only be initialized once.
Take this simple example:
#!/usr/bin/python2
import time

startup_time = time.ctime()

def application(environ, start_response):
    current_time = time.ctime()
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [startup_time, '\n', current_time]

if __name__ == '__main__':
    from wsgiref import simple_server
    srv = simple_server.make_server('localhost', 8080, application)
    srv.serve_forever()
If you run this, you'll see that the first time stays constant while the second doesn't. The same is true if you run this example on a different WSGI server (like apache/mod_wsgi), except that usually multiple instances of the application are launched and used to serve different requests. That explains why you see different values.
So the solution is simple: everything that should be dynamic must be generated within the call to the WSGI function; don't use globals.
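For reference, here is the same demonstration written for Python 3, where a WSGI response body must be a sequence of bytes rather than str; `startup_time` is evaluated once at import, while everything inside `application` runs on every request:

```python
import time

startup_time = time.ctime()  # evaluated once, when the module is imported


def application(environ, start_response):
    current_time = time.ctime()  # evaluated on every request
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # Under Python 3, WSGI bodies must be bytes, hence the .encode() calls.
    return [startup_time.encode(), b'\n', current_time.encode()]
```

It can be served the same way, e.g. with wsgiref's simple_server.make_server('localhost', 8080, application).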

Gunicorn Internal Server Errors

I have a Gunicorn server running a Django application which has a tendency to crash quite frequently. Unfortunately when it crashes all the Gunicorn workers go down simultaneously and silently bypass Django's and django-sentry's logging. All the workers return "Internal Server Error" but the arbiter does not crash so supervisord does not register it as a crash and thus does not restart the process.
My question is, is there a way to hook onto a Gunicorn worker crash and possibly send an email or do a logging statement? Secondly is there a way to get supervisord to restart Gunicorn server that is returning nothing but 500's?
Thanks in advance.
I highly recommend using zc.buildout. Here is an example using the Superlance plugin for supervisord with buildout:
[supervisor]
recipe = collective.recipe.supervisor
plugins =
    superlance
...
programs =
    10 zeo ${zeo:location}/bin/runzeo ${zeo:location}
    20 instance1 ${instance1:location}/bin/runzope ${instance1:location} true
...
eventlisteners =
    Memmon TICK_60 ${buildout:bin-directory}/memmon [-p instance1=200MB]
    HttpOk TICK_60 ${buildout:bin-directory}/httpok [-p instance1 -t 20 http://localhost:8080/]
This will issue an HTTP request on every TICK_60 event (once a minute, with a 20-second timeout) and restart the process if the check fails.
http://pypi.python.org/pypi/collective.recipe.supervisor/0.16
