The Case:
We have some time-consuming functional/integration tests that rely on Flask's current_app for configuration (global variables, etc.) and some logging.
We are trying to distribute and parallelize those tests on a cluster (for the moment, a local "cluster" created from Dask's Docker image).
The Issue(s?):
Let's assume the following example:
A time-consuming function:
import time
from flask import current_app

def will_take_my_time(n):
    # Set 'TAKE_YOUR_TIME' in the config to however many seconds you want
    time.sleep(current_app.config['TAKE_YOUR_TIME'])
    return n
A time-consuming test:
def need_my_time_test(counter=None):
    print(f"Test No. {will_take_my_time(counter)}")
A Flask CLI command that creates a Dask Client to connect to the cluster and run need_my_time_test 10 times:
@app.cli.command()
def itests():
    with Client(processes=False) as dask_client:
        futures = dask_client.map(need_my_time_test, range(10))
        print(f"Futures: {futures}")
        print(f"Gathered: {dask_client.gather(futures)}")
EDIT: For convenience, here is an application factory for an easier reproducible example:
def create_app():
    app = Flask(__name__)
    app.config.from_mapping(
        SECRET_KEY='dev',
        DEBUG=True,
    )

    @app.route('/hello')
    def hello():
        return 'Hello, World!'

    @app.cli.command()
    def itests():
        with Client(processes=False) as dask_client:
            futures = dask_client.map(need_my_time_test, range(10))
            print(f"Futures: {futures}")
            print(f"Gathered: {dask_client.gather(futures)}")

    return app
Running the above with flask itests, we get the following error (described here):
RuntimeError: Working outside of application context.
This typically means that you attempted to use functionality that
needed to interface with the current application object in some way.
To solve this, set up an application context with app.app_context().
We have tried:
Pushing the app context (app.app_context().push()) when the app singleton is created.
Wrapping the CLI command, and some of the functions that use current_app, in with current_app.app_context():.
Sending the app context through a Dask Variable, but the context cannot be serialized.
All to no avail.
The questions:
Are there any suggestions on what we should try (including the "where" would be highly appreciated)?
Is there something we tried that was correct but misused, and that we should retry differently?
The current_app proxy assumes that the Flask app was created in the same process in which the proxy is used.
That is not the case when tasks submitted to the workers run: the tasks execute in isolation, away from the Flask app created in the process that submitted them.
Instead, create the Flask app and push an application context inside the task itself.
import time

from flask import Flask
from dask.distributed import Client

def _create_app():
    app = Flask(__name__)
    app.config.from_mapping(
        SECRET_KEY='dev',
        DEBUG=True,
        TAKE_YOUR_TIME=0.2,
    )
    return app

def will_take_my_time(n):
    # Sleep for the number of seconds configured under 'TAKE_YOUR_TIME'
    app = _create_app()
    with app.app_context():
        time.sleep(app.config['TAKE_YOUR_TIME'])
    return n

def need_my_time_test(counter=None):
    print(f"Test No. {will_take_my_time(counter)}")

def create_app():
    app = _create_app()

    @app.route('/hello')
    def hello():
        return 'Hello, World!'

    @app.cli.command()
    def itests():
        with Client(processes=False) as dask_client:
            futures = dask_client.map(need_my_time_test, range(10))
            print(f"Futures: {futures}")
            print(f"Gathered: {dask_client.gather(futures)}")

    return app

app = create_app()
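As an alternative sketch (not part of the answer above, and assuming a reasonably recent dask.distributed): a WorkerPlugin can build the Flask app once per worker, so tasks reuse it instead of recreating it on every call.

from dask.distributed import WorkerPlugin, get_worker

class FlaskAppPlugin(WorkerPlugin):
    # Runs once when each worker starts: build the app and keep it around
    def setup(self, worker):
        worker.flask_app = _create_app()

# After creating the client, register the plugin; tasks can then do:
#   app = get_worker().flask_app
#   with app.app_context():
#       ...
# dask_client.register_worker_plugin(FlaskAppPlugin())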
Related
I'm using an application factory pattern, and when I try to run my test, I get "Attempted to generate a URL without the application context being pushed". I created a fixture to create the application:
@pytest.fixture
def app():
    yield create_app()
but when I run my test
def test_get_activation_link(self, app):
    user = User()
    user.set_password(self.VALID_PASS)
    generated_link = user.get_activation_link()
I get the above error (from the line url = url_for("auth.activate")). I'm also trying to figure out how to have the app creation run for every test without having to import it into every test, but I can't seem to find out whether that's possible.
This works for my app:
import pytest
from xxx import create_app

@pytest.fixture
def client():
    app = create_app()
    app.config['TESTING'] = True
    with app.app_context():
        with app.test_client() as client:
            yield client

def test_smoke_homepage(client):
    """Basic test to make sure the test setup works."""
    rv = client.get("/")
    assert b"Login" in rv.data
So, you missed the application context.
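Applied to the fixture from your question, a minimal sketch (assuming the same create_app factory) would push the context inside the fixture itself:

@pytest.fixture
def app():
    app = create_app()
    # url_for() outside a request may also need SERVER_NAME configured,
    # e.g. app.config['SERVER_NAME'] = 'localhost'
    with app.app_context():
        yield app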
At this year's FlaskCon there was an excellent talk about the Flask context - I highly recommend this video:
https://www.youtube.com/watch?v=fq8y-9UHjyk
I'm using the Flask-SocketIO library, which works fine, but I need to send a notification with emit from outside a Socket.IO event handler, and it's a real pain. Looking at existing solutions, many people use RabbitMQ or Redis, but I don't know how to use them.
Here's my code:
from flask import Flask, render_template
from flaskwebgui import FlaskUI
from flask_socketio import SocketIO, emit

app = Flask(__name__)
async_mode = None
app.config['SECRET_KEY'] = 'hello'
socketio = SocketIO(app, async_mode=async_mode, message_queue='amqp:///socketio')

def run_sock():
    socketio.run(app, debug=True)

ui = FlaskUI(app, fullscreen=True, server=run_sock)

@app.route("/")
def index():
    return render_template('index.html')

@socketio.on('test', namespace='/test')
def test():
    print("test")

if __name__ == "__main__":
    ui.run()
    io = SocketIO(message_queue='amqp:///socketio')
    io.emit('test_emit', {'data': 'toto'}, namespace='/test')
My JS front-end never receives the test_emit message. What am I doing wrong?
The problem with your emit is that it appears below the ui.run() call, which does not return until you close the application. Move the emit into any function that executes while the server is running (such as a Flask view function) and it should work just fine.
Also, why do you have two SocketIO objects in the same process? The socketio instance defined near the top of the script can be used anywhere within the process; there is no need to create a second instance. You also do not need a message queue here, since all the Socket.IO usage lives in a single process.
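For example, a minimal sketch reusing the existing socketio instance from a view function (the /notify route is made up for illustration):

@app.route("/notify")
def notify():
    # Reuse the socketio instance created at the top of the script
    socketio.emit('test_emit', {'data': 'toto'}, namespace='/test')
    return 'sent'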
I have been trying to follow tutorials on getting Flask apps to run on Heroku, like this one: https://dev.to/emcain/how-to-set-up-a-twitter-bot-with-python-and-heroku-1n39.
They all tell you to put this in a file server.py:
from flask import Flask
app = Flask(__name__)
app.run(host='0.0.0.0')
And then run the app via the following command:
python3 server.py
But the tutorials don't explain how to connect the actual function you want to run to the app. In my case, I have a file testbot.py with the function test(arg1) that contains the code I want to execute:
def test(arg1):
    while True:
        # do stuff with arg1 on twitter
        ...
I want to do something like this:
from flask import Flask
from testbot import test
from threading import Thread
app = Flask(__name__)
app.addfunction(test(arg1='hardcodedparameter'))
app.run(host='0.0.0.0')
So that when the app runs, my test() function executes with the argument. Right now my server starts, but nothing happens.
Am I thinking about this correctly?
Edit: I got it working with the solution below, so my server.py now looks like this:
from flask import Flask
from threading import Thread
from testbot import test

def main_process():
    test("hardcodeparam")

app = Flask(__name__)
Thread(target=main_process).start()
app.run(debug=True, host='0.0.0.0')
And now test runs as expected.
Before app.run, register the function with a path, e.g.
@app.route('/')
def test():  # no argument
    # ... do one iteration
    return 'ok'
Then visiting the URL will trigger the function. Sites such as https://cron-job.org/ can automate that visiting on a regular basis for free, as suggested here.
If the regular intervals aren't good enough, then you could try:
@app.route('/')
def index():  # no argument
    return 'ok'

def test():
    while True:
        # do stuff
        ...

from threading import Thread
Thread(target=test).start()
app.run(...)
You will probably still need to have a job regularly visiting the URL so that Heroku sees that the server is alive and in use.
I am using Gunicorn to run my Flask application. I would like to register server hooks to perform some actions at the start of the application and before shutting down, but I am confused about how to pass variables to these functions and how to extract variables created within them.
In gunicorn.conf.py:
bind = "0.0.0.0:8000"
workers = 2
loglevel = "info"
preload = True

def on_starting(server):
    # register some variables here
    print("Starting Flask application")

def on_exit(server):
    # perform some clean-up tasks here using variables from the application
    print("Shutting down Flask application")
In app.py, the sample Flask application:
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/hello', methods=['POST'])
def hello_world():
    return jsonify(message='Hello World')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=9000, debug=False)
Running gunicorn like so: $ gunicorn -c gunicorn.conf.py app:app
A bit late, but you have access to the Flask application instance through server.app.wsgi(). It returns the same instance used by the workers (the one that is also returned by flask.current_app).
def on_exit(server):
    flask_app = server.app.wsgi()
    # do whatever you need with the Flask app
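As a usage sketch (the 'db' entry in flask_app.extensions is a hypothetical example of state your application might register, not something Gunicorn or Flask provides by default):

# gunicorn.conf.py
def on_exit(server):
    flask_app = server.app.wsgi()
    # Hypothetical clean-up: close a handle the app stored in its extensions
    db = flask_app.extensions.get('db')
    if db is not None:
        db.close()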
Put the data you need to pass to the hooks into environment variables, as sketched below.
You can also store the data to be passed to the hooks, and out of them, in files.
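A minimal sketch of the environment-variable approach (the MY_STARTED_AT name is made up for illustration); on_starting runs in the Gunicorn master before the workers fork, so the workers inherit the variable:

# gunicorn.conf.py
import os
import time

def on_starting(server):
    os.environ['MY_STARTED_AT'] = str(time.time())

# anywhere in the Flask app:
#   started_at = os.environ.get('MY_STARTED_AT')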
Beyond that, what you are trying to achieve is not possible, due to the WSGI interface and the way state is managed between requests.
I'm building a Flask application which relies on Celery to process some long-running tasks. Each task essentially appends a dictionary to a shared list once it has finished processing - this list is shared by the Celery workers and the routes of the Flask application. The Flask component essentially consists of a set of routes to retrieve the contents of the shared list and modify the order of its elements.
I think I have successfully shared the list between the Celery workers using a Manager from Python's multiprocessing module. However, the changes made to this list are not seen by the Flask application. Here is a minimal application which illustrates the issue:
import os
import json

from flask import Flask
from multiprocessing import Manager
from celery import Celery

application = Flask(__name__)

redis_url = os.environ.get('REDIS_URL')
if redis_url is None:
    redis_url = 'redis://localhost:6379/0'

# Set the secret key to enable cookies
application.secret_key = 'some secret key'
application.config['SESSION_TYPE'] = 'filesystem'

# Redis and Celery configuration
application.config['BROKER_URL'] = redis_url
application.config['CELERY_RESULT_BACKEND'] = redis_url

celery = Celery(application.name, broker=redis_url)
celery.conf.update(BROKER_URL=redis_url,
                   CELERY_RESULT_BACKEND=redis_url)

manager = Manager()
shared_queue = manager.list()  # THIS IS THE SHARED LIST

@application.route("/submit", methods=['GET'])
def submit_song():
    add_song_to_queue.delay()
    return 'Added a song to the queue'

@application.route("/playlist", methods=['GET', 'POST'])
def get_playlist():
    playlist = []
    i = 0
    queue_size = len(shared_queue)
    while i < queue_size:
        print(shared_queue[i])
        playlist.append(shared_queue[i])
        i += 1  # advance the index so the loop terminates
    return json.dumps(playlist)

@celery.task
def add_song_to_queue():
    shared_queue.append({'some': 'data!'})
    print(len(shared_queue))

if __name__ == "__main__":
    application.run(host='0.0.0.0', debug=True)
In the Celery logs I can clearly see that the dictionaries are being appended to the list and that the size of the list increases. However, when I access the /playlist route in my browser, I always get an empty list.
Does anyone know how I can get the list shared among all the workers and the Flask application?
I found a solution by moving away from Celery and instead using multiprocessing.Pool as a task queue, with shared memory through a Manager, as shown in the sample code in the question. (The Celery workers run as entirely separate processes that each execute the module's top level independently, so each process ends up with its own Manager and its own list.) This link has an excellent example of how this solution can be integrated with Flask: http://gouthamanbalaraman.com/blog/python-multiprocessing-as-a-task-queue.html
from multiprocessing import Pool
from flask import Flask

app = Flask(__name__)
_pool = None

def expensive_function(x):
    # import packages that are used in this function
    # do your expensive, time-consuming work here
    return x * x

@app.route('/expensive_calc/<int:x>')
def route_expcalc(x):
    f = _pool.apply_async(expensive_function, [x])
    r = f.get(timeout=2)
    return 'Result is %d' % r

if __name__ == '__main__':
    _pool = Pool(processes=4)
    try:
        # insert production server deployment code
        app.run()
    except KeyboardInterrupt:
        _pool.close()
        _pool.join()