Distributed python: Celery send_task gets COMMAND_INVALID - python

Context
I developed a Flask API that sends tasks to my computing environment.
To use this, you should make a post request to the API.
Then, the API received your request, process it and send necessary data, through the RABBITMQ broker, a message to be held by the computing environment.
At the end, it should send the result back to the API
Some code
Here is an example of my API and my Celery application:
#main.py
# Package
import time
from flask import Flask
from flask import request, jsonify, make_response
# Own module
from celery_app import celery_app
# Environment
app = Flask()
# Endpoint
#app.route("/test", methods=["POST"])
def test():
"""
Test route
Returns
-------
Json formatted output
"""
# Do some preprocessing in here
result = celery_app.send_task(f"tasks.Client", args=[1, 2])
while result.state == "PENDING":
time.sleep(0.01)
result = result.get()
if result["sucess"]:
result_code = 200
else:
result_code = 500
output = str(result)
return make_response(
jsonify(
text=output,
code_status=result_code, ),
result_code,
)
# Main thread
if __name__ == "__main__":
app.run()
In a different file, I have setup my celery application connected to RABBITMQ Queue
#celery_app.py
from celery import Celery, Task
celery_app = Celery("my_celery",
broker=f"amqp://{USER}:{PASSWORD}#{HOSTNAME}:{PORT}/{COLLECTION}",
backend="rpc://"
)
celery_app.conf.task_serializer = "pickle"
celery_app.conf.result_serializer = "pickle"
celery_app.conf.accept_content = ["pickle"]
celery_app.conf.broker_connection_max_retries = 5
celery_app.conf.broker_pool_limit = 1
class MyTask(Task):
def run(self, a, b):
return a + b
celery_app.register_task(MyTask())
To run it, you should launch:
python3 main.py
Do not forget to run the celery worker (after registering tasks in it)
Then you can make a post request on it:
curl -X POST http://localhost:8000/test
The problem to resolve
When this simple API is running, I am sending request on my endpoint.
Unfortunatly, it fails 1 time on 4.
I have 2 messages:
The first message is:
amqp.exceptions.PreconditionFailed: (0, 0): (406) PRECONDITION_FAILED - delivery acknowledgement on channel 1 timed out. Timeout value used: 1800000 ms. This timeout value can be configured, see consumers doc guide to learn more
Then, because of the time out, my server has lost the message so:
File "main.py", line x, in test
result = celery_app.send_task("tasks.Client", args=[1, 2])
amqp.exceptions.InvalidCommand: Channel.close_ok: (503) COMMAND_INVALID - unimplemented method
Resolve this error
There are 2 solutions to get around this problem
retry to send a tasks until it fails 5 times in a row (try / except amqp.exceptions.InvalidCommand)
change the timeout value.
Unfortunatly, it doesn't seems to be the best ways to solve it.
Can you help me ?
Regards
PS:
my_packages:
Flask==2.0.2
python==3.6
celery==4.4.5
rabbitmq==latest

1. PreconditionFailed
I change my RabbitMQ version from latest to 3.8.14.
Then, I set up a celery task timeout using time_limit and soft_time_limit.
And it works :)
2. InvalidCommand
To resolve this problem, I use this retryfunctionnaluity.
I setup:
# max_retries=3
# autoretry_for=(InvalidCommand,)

Related

Python Flask API with stomp listener multiprocessing

TLDR:
I need to setup a flask app for multiprocessing such that the API and stomp queue listener are running in separate processes and therefore not interfering with each other's operations.
Details:
I am building a python flask app that has API endpoints and also creates a message queue listener to connect to an activemq queue with the stomp package.
I need to implement multiprocessing such that the API and listener do not block each other's operation. That way the API will accept new requests and the listener will continue to listen for new messages and carry out tasks accordingly.
A simplified version of the code is shown below (some details are omitted for brevity).
Problem: The multiprocessing is causing the application to get stuck. The worker's run method is not called consistently, and therefore the listener never gets created.
# Start the worker as a subprocess -- this is not working -- app gets stuck before the worker's run method is called
m = Manager()
shared_state = m.dict()
worker = MyWorker(shared_state=shared_state)
worker.start()
After several days of troubleshooting I suspect the problem is due to the multiprocessing not being setup correctly. I was able to prove that this is the case because when I stripped out all of the multiprocessing code and called the worker's run method directly, the all of the queue management code is working correctly, the CustomWorker module creates the listener, creates the message, and picks up the message. I think this indicates that the queue management code is working correctly and the source of the problem is most likely due to the multiprocessing.
# Removing the multiprocessing and calling the worker's run method directly works without getting stuck so the issue is likely due to multiprocessing not being setup correctly
worker = MyWorker()
worker.run()
Here is the code I have so far:
App
This part of the code creates the API and attempts to create a new process to create the queue listener. The 'custom_worker_utils' module is a custom module that creates the stomp listener in the CustomWorker() class run method.
from flask import Flask, request, make_response, jsonify
from flask_restx import Resource, Api
import sys, os, logging, time
basedir = os.path.dirname(os.getcwd())
sys.path.append('..')
from custom_worker_utils.custom_worker_utils import *
from multiprocessing import Manager
# app.py
def create_app():
app = Flask(__name__)
app.config['BASE_DIR'] = basedir
api = Api(app, version='1.0', title='MPS Worker', description='MPS Common Worker')
logger = get_logger()
'''
This is a placeholder to trigger the sending of a message to the first queue
'''
#api.route('/initialapicall', endpoint="initialapicall", methods=['GET', 'POST', 'PUT', 'DELETE'])
class InitialApiCall(Resource):
#Sends a message to the queue
def get(self, *args, **kwargs):
mqconn = get_mq_connection()
message = create_queue_message(initial_tracker_file)
mqconn.send('/queue/test1', message, headers = {"persistent":"true"})
return make_response(jsonify({'message': 'Initial Test Call Worked!'}), 200)
# Start the worker as a subprocess -- this is not working -- app gets stuck before the worker's run method is called
m = Manager()
shared_state = m.dict()
worker = MyWorker(shared_state=shared_state)
worker.start()
# Removing the multiprocessing and calling the worker's run method directly works without getting stuck so the issue is likely due to multiprocessing not being setup correctly
#worker = MyWorker()
#worker.run()
return app
Custom worker utils
The run() method is called, connects to the queue and creates the listener with the stomp package
# custom_worker_utils.py
from multiprocessing import Manager, Process
from _datetime import datetime
import os, time, json, stomp, requests, logging, random
'''
The listener
'''
class MyListener(stomp.ConnectionListener):
def __init__(self, p):
self.process = p
self.logger = p.logger
self.conn = p.mqconn
self.conn.connect(_user, _password, wait=True)
self.subscribe_to_queue()
def on_message(self, headers, message):
message_data = json.loads(message)
ticket_id = message_data[constants.TICKET_ID]
prev_status = message_data[constants.PREVIOUS_STEP_STATUS]
task_name = message_data[constants.TASK_NAME]
#Run the service
if prev_status == "success":
resp = self.process.do_task(ticket_id, task_name)
elif hasattr(self, 'revert_task'):
resp = self.process.revert_task(ticket_id, task_name)
else:
resp = True
if (resp):
self.logger.debug('Acknowledging')
self.logger.debug(resp)
self.conn.ack(headers['message-id'], self.process.conn_id)
else:
self.conn.nack(headers['message-id'], self.process.conn_id)
def on_disconnected(self):
self.conn.connect('admin', 'admin', wait=True)
self.subscribe_to_queue()
def subscribe_to_queue(self):
queue = os.getenv('QUEUE_NAME')
self.conn.subscribe(destination=queue, id=self.process.conn_id, ack='client-individual')
def get_mq_connection():
conn = stomp.Connection([(_host, _port)], heartbeats=(4000, 4000))
conn.connect(_user, _password, wait=True)
return conn
class CustomWorker(Process):
def __init__(self, **kwargs):
super(CustomWorker, self).__init__()
self.logger = logging.getLogger("Worker Log")
log_level = os.getenv('LOG_LEVEL', 'WARN')
self.logger.setLevel(log_level)
self.mqconn = get_mq_connection()
self.conn_id = random.randrange(1,100)
for k, v in kwargs.items():
setattr(self, k, v)
def revert_task(self, ticket_id, task_name):
# If the subclass does not implement this,
# then there is nothing to undo so just return True
return True
def run(self):
lst = MyListener(self)
self.mqconn.set_listener('queue_listener', lst)
while True:
pass
Seems like Celery is excatly what you need.
Celery is a task queue that can distribute work across worker-processes and even across machines.
Miguel Grinberg created a great post about that, Showing how to accept tasks via flask and spawn them using Celery as tasks.
Good Luck!
To resolve this issue I have decided to run the flask API and the message queue listener as two entirely separate applications in the same docker container. I have installed and configured supervisord to start and the processes individually.
[supervisord]
nodaemon=true
logfile=/home/appuser/logs/supervisord.log
[program:gunicorn]
command=gunicorn -w 1 -c gunicorn.conf.py "app:create_app()" -b 0.0.0.0:8081 --timeout 10000
directory=/home/appuser/app
user=appuser
autostart=true
autorestart=true
stdout_logfile=/home/appuser/logs/supervisord_worker_stdout.log
stderr_logfile=/home/appuser/logs/supervisord_worker_stderr.log
[program:mqlistener]
command=python3 start_listener.py
directory=/home/appuser/mqlistener
user=appuser
autostart=true
autorestart=true
stdout_logfile=/home/appuser/logs/supervisord_mqlistener_stdout.log
stderr_logfile=/home/appuser/logs/supervisord_mqlistener_stderr.log

Deploy Flask app with celery worker on Google Cloud

I have a very simple Flask app example which uses a celery worker to process a task asynchronously:
app.py
app.config['CELERY_BROKER_URL'] = os.environ.get('REDISCLOUD_URL', 'redis://localhost:6379')
app.config['CELERY_RESULT_BACKEND']= os.environ.get('REDISCLOUD_URL', 'redis://localhost:6379')
app.config['SQLALCHEMY_DATABASE_URI'] = conn_str
celery = make_celery(app)
db.init_app(app)
#app.route('/')
def index():
return "Working"
#app.route('/test')
def test():
task = reverse.delay("hello")
return task.id
#celery.task(name='app.reverse')
def reverse(string):
return string[::-1]
if __name__ == "__main__":
app.run()
To run it locally, I run celery -A app.celery worker --loglevel=INFO
in one terminal, and python app.py in another terminal.
I'm wondering how can I deploy this application on Google Cloud? I don't want to use Task Queues since it is only compatible with Python 2. Is there a good piece of documentation available for doing something like this? Thanks
App engine task queues is the previous version of Google Cloud Tasks, this has full support for App Engine Flex/STD and Python 3.x runtimes.
You need to create a Cloud Task Queue and an App engine service to handle the tasks
Gcloud command to create a queue
gcloud tasks queues create [QUEUE_ID]
Task handler code
from flask import Flask, request
app = Flask(__name__)
#app.route('/example_task_handler', methods=['POST'])
def example_task_handler():
"""Log the request payload."""
payload = request.get_data(as_text=True) or '(empty payload)'
print('Received task with payload: {}'.format(payload))
return 'Printed task payload: {}'.format(payload)
Code to push a task
"""Create a task for a given queue with an arbitrary payload."""
from google.cloud import tasks_v2
client = tasks_v2.CloudTasksClient()
# replace with your values.
# project = 'my-project-id'
# queue = 'my-appengine-queue'
# location = 'us-central1'
# payload = 'hello'
parent = client.queue_path(project, location, queue)
# Construct the request body.
task = {
'app_engine_http_request': { # Specify the type of request.
'http_method': tasks_v2.HttpMethod.POST,
'relative_uri': '/example_task_handler'
}
}
if payload is not None:
# The API expects a payload of type bytes.
converted_payload = payload.encode()
# Add the payload to the request.
task['app_engine_http_request']['body'] = converted_payload
if in_seconds is not None:
timestamp = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
# Add the timestamp to the tasks.
task['schedule_time'] = timestamp
# Use the client to build and send the task.
response = client.create_task(parent=parent, task=task)
print('Created task {}'.format(response.name))
return response
requirements.txt
Flask==1.1.2
gunicorn==20.0.4
google-cloud-tasks==2.0.0
You can check this full example in GCP Python examples Github page

How to get flask request context in celery task?

I have a flask server running within a gunicorn.
In my flask application I want to handle large upload files (>20GB), so I plan on letting a celery task do the handling of the large file.
The problem is that retrieving the file from request.files already takes quite long, in the meantime gunicorn terminates the worker handling that request. I could increase the timeout time, but the maximum file size is currently unknown, so I don't know how much time I would need.
My plan was to make the request context available to the celery task, as it was described here: http://xion.io/post/code/celery-include-flask-request-context.html, but I cannot make it work
Q1 Is the signature right?
I set the signature with
celery.signature(handle_large_file, args={}, kwargs={})
and nothing is complaining. I get the arguments I pass from the flask request handler to the celery task, but that's it. Should I somehow get a handle to the context here?
Q2 how to use the context?
I would have thought if the flask request context was available I could just use request.files in my code, but then I get the warning that I am out of context.
Using celery 4.4.0
Code:
# in celery.py:
from flask import request
from celery import Celery
celery = Celery('celery_worker',
backend=Config.CELERY_RESULT_BACKEND,
broker=Config.CELERY_BROKER_URL)
#celery.task(bind=True)
def handle_large_file(task_object, data):
# do something with the large file...
# what I'd like to do:
files = request.files['upfile']
...
celery.signature(handle_large_file, args={}, kwargs={})
# in main.py
def create_app():
app = Flask(__name__.split('.')[0])
...
celery_worker.conf.update(app.config)
# copy from the blog
class RequestContextTask(Task):...
celery_worker.Task = RequestContextTask
# in Controller.py
#FILE.route("", methods=['POST'])
def upload():
data = dict()
...
handle_large_file.delay(data)
What am I missing?

Returning 'still loading' response with Flask API

I have a scikit-learn classifier running as a Dockerised Flask app, launched with gunicorn. It receives input data in JSON format as a POST request, and responds with a JSON object of results.
When the app is first launched with gunicorn, a large model (serialised with joblib) is read from a database, and loaded into memory before the app is ready for requests. This can take 10-15 minutes.
A reproducible example isn't feasible, but the basic structure is illustrated below:
from flask import Flask, jsonify, request, Response
import joblib
import json
def classifier_app(model_name):
# Line below takes 10-15 mins to complete
classifier = _load_model(model_name)
app = Flask(__name__)
#app.route('/classify_invoice', methods=['POST'])
def apicall():
query = request.get_json()
results = _build_results(query['data'])
return Response(response=results,
status=200,
mimetype='application/json')
print('App loaded!')
return app
How do I configure Flask or gunicorn to return a 'still loading' response (or suitable error message) to any incoming http requests while _load_model is still running?
Basically, you want to return two responses for one request. So there are two different possibilities.
First one is to run time-consuming task in background and ping server with simple ajax requests every two seconds to check if task is completed or not. If task is completed, return result, if not, return "Please standby" string or something.
Second one is to use websockets and flask-socketio extension.
Basic server code would be something like this:
from threading import Thread
from flask import Flask
app = Flask(__name__)
socketio = SocketIO(app)
def do_work():
result = your_heavy_function()
socketio.emit("result", {"result": result}, namespace="/test/")
#app.route("/api/", methods=["POST"])
def start():
socketio.start_background_task(target=do_work)
# return intermediate response
return Response()
On the client side you should do something like this
var socket = io.connect('http://' + document.domain + ':' + location.port + '/test/');
socket.on('result', function(msg) {
// Process your request here
});
For further details, visit this blog post, flask-socketio documentation for server-side reference and socketio documentation for client-side reference.
PS Using web-sockets this you can make progress-bar too.

Multithreading with Flask

I'd like to call generate_async_audio_service from a view and have it asynchronously generate audio files for the list of words using a threading pool and then commit them to a database.
I keep running into an error that I'm working out of the application context even though I'm creating a new polly and s3 instance each time.
How can I generate/upload multiple audio files at once?
from flask import current_app,
from multiprocessing.pool import ThreadPool
from Server.database import db
import boto3
import io
import uuid
def upload_audio_file_to_s3(file):
app = current_app._get_current_object()
with app.app_context():
s3 = boto3.client(service_name='s3',
aws_access_key_id=app.config.get('BOTO3_ACCESS_KEY'),
aws_secret_access_key=app.config.get('BOTO3_SECRET_KEY'))
extension = file.filename.rsplit('.', 1)[1].lower()
file.filename = f"{uuid.uuid4().hex}.{extension}"
s3.upload_fileobj(file,
app.config.get('S3_BUCKET'),
f"{app.config.get('UPLOADED_AUDIO_FOLDER')}/{file.filename}",
ExtraArgs={"ACL": 'public-read', "ContentType": file.content_type})
return file.filename
def generate_polly(voice_id, text):
app = current_app._get_current_object()
with app.app_context():
polly_client = boto3.Session(
aws_access_key_id=app.config.get('BOTO3_ACCESS_KEY'),
aws_secret_access_key=app.config.get('BOTO3_SECRET_KEY'),
region_name=app.config.get('AWS_REGION')).client('polly')
response = polly_client.synthesize_speech(VoiceId=voice_id,
OutputFormat='mp3', Text=text)
return response['AudioStream'].read()
def generate_polly_from_term(vocab_term, gender='m'):
app = current_app._get_current_object()
with app.app_context():
audio = generate_polly('Celine', vocab_term.term)
file = io.BytesIO(audio)
file.filename = 'temp.mp3'
file.content_type = 'mp3'
return vocab_term.id, upload_audio_file_to_s3(file)
def generate_async_audio_service(terms):
pool = ThreadPool(processes=12)
results = pool.map(generate_polly_from_term, terms)
# do something w/ results
This is not necessarily a fleshed-out answer, but rather than putting things into comments I'll put it here.
Celery is a task manager for python. The reason you would want to use this is if you have tasks pinging Flask, but they take longer to finish than the interval of tasks coming in, then certain tasks will be blocked and you won't get all of your results. To fix this, you hand it to another process. This goes like so:
1) Client sends a request to Flask to process audio files
2) The files land in Flask to be processed, Flask will send an asyncronous task to Celery.
3) Celery is notified of the task and stores its state in some sort of messaging system (RabbitMQ and Redis are the canonical examples)
4) Flask is now unburdened from that task and can receive more
5) Celery finishes the task, including the upload to your database
Celery and Flask are then two separate python processes communicating with one another. That should satisfy your multithreaded approach. You can also retrieve the state from a task through Flask if you want the client to verify that the task was/was not completed. The route in your Flask app.py would look like:
#app.route('/my-route', methods=['POST'])
def process_audio():
# Get your files and save to common temp storage
save_my_files(target_dir, files)
response = celery_app.send_tast('celery_worker.files', args=[target_dir])
return jsonify({'task_id': response.task_id})
Where celery_app comes from another module worker.py:
import os
from celery import Celery
env = os.environ
# This is for a rabbitMQ backend
CELERY_BROKER_URL = env.get('CELERY_BROKER_URL', 'amqp://0.0.0.0:5672/0')
CELERY_RESULT_BACKEND = env.get('CELERY_RESULT_BACKEND', 'rpc://')
celery_app = Celery('tasks', broker=CELERY_BROKER_URL, backend=CELERY_RESULT_BACKEND)
Then, your celery process would have a worker configured something like:
from celery import Celery
from celery.signals import after_task_publish
env = os.environ
CELERY_BROKER_URL = env.get('CELERY_BROKER_URL')
CELERY_RESULT_BACKEND = env.get('CELERY_RESULT_BACKEND', 'rpc://')
# Set celery_app with name 'tasks' using the above broker and backend
celery_app = Celery('tasks', broker=CELERY_BROKER_URL, backend=CELERY_RESULT_BACKEND)
#celery_app.task(name='celery_worker.files')
def async_files(path):
# Get file from path
# Process
# Upload to database
# This is just if you want to return an actual result, you can fill this in with whatever
return {'task_state': "FINISHED"}
This is relatively basic, but could serve as a starting point. I will say that some of Celery's behavior and setup is not always the most intuitive, but this will leave your flask app available to whoever wants to send files to it without blocking anything else.
Hopefully that's somewhat helpful

Categories