Celery: Connect to Remote Broker to Share Tasks - python

I have many time-consuming tasks that need to be shared across several machines. I currently have one master machine that uses Celery workers to do the work. RabbitMQ is the broker and redis is the result backend, and both run locally on that machine. The master machine is also responsible for dispatching tasks and returning results.
I wonder whether slave machines can connect remotely to the broker and result backend on the master machine to fetch jobs, so that all the machines work together. I think I just need to configure the RabbitMQ and redis settings somehow and then start Celery workers on the slave machines. Thanks a lot.

Nothing in the Celery documentation prevents your worker processes from connecting to RabbitMQ on a remote server instead of on localhost. Also take a look at the CELERY_QUEUE_HA_POLICY setting in the documentation.
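As a minimal sketch of what the slave machines would need, the helper below builds the two connection URLs so that they point at the master machine instead of localhost. The function name, the host address 192.0.2.10, and the "worker" account are hypothetical placeholders, not values from the question.

```python
from urllib.parse import quote


def remote_worker_settings(master_host, amqp_user, amqp_password,
                           amqp_port=5672, redis_port=6379, redis_db=0):
    """Build Celery connection URLs that point at the master machine."""
    # Percent-encode credentials so characters such as '@' or '/' stay valid in a URL.
    user = quote(amqp_user, safe="")
    password = quote(amqp_password, safe="")
    return {
        # The trailing "//" selects RabbitMQ's default virtual host "/".
        "broker_url": f"amqp://{user}:{password}@{master_host}:{amqp_port}//",
        "result_backend": f"redis://{master_host}:{redis_port}/{redis_db}",
    }


# Hypothetical address and account, for illustration only.
settings = remote_worker_settings("192.0.2.10", "worker", "s3cret")
print(settings["broker_url"])
print(settings["result_backend"])

# On each slave machine the same values would be handed to Celery, e.g.:
#   app = Celery("tasks", broker=settings["broker_url"],
#                backend=settings["result_backend"])
```

Two things usually need attention on the master for this to work: RabbitMQ's default guest account only accepts connections from localhost, so a separate user is needed for remote workers, and redis typically has to be configured to listen on a non-loopback interface.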

Related

Celery How to make a worker run only when other workers are broken?

I have two servers, there is one celery worker on each server. I use Redis as the broker to collaborate with workers.
My question is: how can I make only one worker run most of the time, and have the other worker start as a backup once the first one breaks?
Basically, I just want to keep one worker as a backup.
I know how to route a task to a specific worker by giving each worker its own queue, after reading the docs [http://docs.celeryproject.org/en/latest/userguide/routing.html#redis-message-priorities].
This is, in my humble opinion, completely against the point of having a distributed system to off-load CPU-heavy or long-running tasks, or to handle thousands of small tasks that you can't run elsewhere...
- You are running two servers anyway, so why keep the other one idle? More workers mean you will be able to process more tasks concurrently.
If you are not convinced and still want to do this, you need to write a tiny service on the machine with the idle Celery worker. This service periodically checks the health of the active worker, and if that check fails, it starts the Celery worker on the backup server.
Here is a question for you: why doesn't this service simply restart the Celery worker on the active server? That is entirely possible, so again I see no justification for keeping a machine completely idle. If you are on a cloud platform, you can easily spin up a new instance from an existing image of your Celery worker. This is the scenario I use in production.
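The "tiny service" mentioned above could look roughly like the following sketch. The function name and its two callables, is_healthy and start_backup, are hypothetical hooks supplied by the caller; the health check might, for example, wrap a ping of the active worker, and start_backup might launch the Celery worker process on the backup machine.

```python
import time


def watch_active_worker(is_healthy, start_backup, max_failures=3,
                        interval=30.0, sleep=time.sleep):
    """Poll the active worker; start the backup after repeated failed checks.

    is_healthy and start_backup are caller-supplied callables (hypothetical
    hooks, not part of Celery).
    """
    failures = 0
    while True:
        if is_healthy():
            failures = 0          # a successful check resets the counter
        else:
            failures += 1
            if failures >= max_failures:
                start_backup()    # e.g. launch the Celery worker on this machine
                return failures
        sleep(interval)
```

Requiring several consecutive failures before acting is a design choice that keeps a single dropped check from starting the backup worker unnecessarily.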

Cannot connect to rabbitmq message broker in flask-celery application

I have created a flask application to process GNSS data. Certain functions take a long time to execute, so I have integrated celery to run those functions as asynchronous tasks. I first tested the app on localhost with rabbitmq as the message broker:
app.config['CELERY_BROKER_URL']='amqp://localhost//'
app.config['CELERY_RESULT_BACKEND']='db+postgresql://username:pssword@localhost/DBname'
After fully testing the application in a virtualenv, I deployed it on heroku and added the rabbitmq addon. Then I changed the app.config as follows.
app.config['CELERY_BROKER_URL']='amqp://myUsername:Mypassowrd@small-fiver-23.bigwig.lshift.net:10123/FlGJwZfbz4TR'
app.config['CELERY_RESULT_BACKEND']='db+postgres://myusername:Mypassword@ec2-54-163-246-193.compute-1.amazonaws.com:5432/dhcbl58v8ifst/MYDB'
After changing the above I ran the celery worker
celery -A app.celery worker --loglevel=info
and get this error
[2018-03-16 11:21:16,796: ERROR/MainProcess] consumer: Cannot connect to amqp://SHt1Xvhb:**@small-fiver-23.bigwig.lshift.net:10123/FlGJwZfbz4TR: timed out.
How can I check from the Rabbitmq management console whether my heroku addon is working?
It seems that port 10123 is not reachable. Can you try telnet small-fiver-23.bigwig.lshift.net 10123 from the server where the worker runs and see whether the connection succeeds?
If not, that port has to be made accessible from the server you're connecting from.
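If telnet is not available on the server, the same reachability test can be sketched with Python's standard library. The helper name can_connect is an illustrative choice; it only checks that a TCP connection can be opened and says nothing about AMQP credentials or the virtual host.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


# Usage with the broker address from the question:
#   can_connect("small-fiver-23.bigwig.lshift.net", 10123)
```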

Python-rq with flask + uwsgi + Nginx : Do I need more uwsgi processes or redis workers?

I have a server with above configuration and I am processing long tasks but I have to update user about the process state, which I am doing through Firebase. To respond to the client immediately I enqueue the job in redis using python-rq.
I am using flask and uwsgi and Nginx. In uwsgi conf file, there is a field which asks for number of processes.
My question is: do I need to start multiple uwsgi processes, or more redis workers?
Will starting more uwsgi workers create more redis workers?
How would the scaling work? My server has 1 vCPU and 2GB of RAM, and I have aws autoscaling for production. Should I run more uwsgi workers, and how many redis workers should I run with only one queue?
I am starting the worker independently. The flask app is importing the connection and adding the job.
my startup script
my worker code
It depends on how you're running the rq workers. There are two cases:
1) Running rq workers from inside the app. Increasing the number of workers in the uwsgi settings will then automatically spawn num_rq_workers_in_app_conf * num_app_workers_in_uwsgi_conf rq workers.
2) Running rq workers outside the application, for example under supervisord. You can then control the number of rq workers independently of the app.
In my opinion, running rq workers under supervisord is the better option. It makes each worker easier to debug. Another issue I've encountered is that rq workers started via option 1 unregister themselves from rq every few weeks, i.e. they become dead as far as rq is concerned even though they keep running in the background.

celery launches more processes than configured

I'm running a celery machine, using redis as the broker with the following configuration:
celery -A project.tasks:app worker -l info --concurrency=8
When checking the number of celery running processes, I see more than 8.
Is there something that I am missing? Is there a limit for max concurrency?
This problem causes huge memory allocation, and is killing the machine.
With the default settings Celery will always start one more process than the number you ask for. This additional process is a kind of bookkeeping process that coordinates the other processes that are part of the worker. It communicates with the rest of Celery and dispatches tasks to the processes that actually run them.
Switching to a pool implementation other than the default "prefork" might reduce the number of processes created, but that opens a new can of worms.
For the concurrency problem, I have no suggestion.
For the memory problem, you can look at the redis configuration in ~/.redis/redis.conf. It has a maxmemory attribute which sets a limit on tasks…
See the Redis configuration

How can I communicate with Celery on Cloud Foundry?

I have a wsgi app with a celery component. Basically, when certain requests come in they can hand off relatively time-consuming tasks to celery. I have a working version of this product on a server I set up myself, but our client recently asked me to deploy it to Cloud Foundry. Since Celery is not available as a service on Cloud Foundry, we (me and the client's deployment team) decided to deploy the app twice – once as a wsgi app and once as a standalone celery app, sharing a rabbitmq service.
The code in the two apps is identical. The wsgi app responds correctly, returning the expected web pages. vmc logs celeryapp shows that celery appears to be up and running, but when I send requests to wsgi that should become celery tasks, they disappear as soon as they reach a .delay() statement. They appear neither in the celery logs nor as an error.
Attempts to debug:
I can't use celery.contrib.rdb in Cloud Foundry (to supply a telnet interface to pdb), as each app is sandboxed and port-restricted.
I don't know how to find the specific rabbitmq instance these apps are supposed to share, so I can see what messages it's passing.
Update: to corroborate the above statement about finding rabbitmq, here's what happens when I try to access the node that should be sharing celery tasks:
root@cf:~# export RABBITMQ_NODENAME=eecef185-e1ae-4e08-91af-47f590304ecc
root@cf:~# export RABBITMQ_NODE_PORT=57390
root@cf:~# ~/cloudfoundry/.deployments/devbox/deploy/rabbitmq/sbin/rabbitmqctl list_queues
Listing queues ...
=ERROR REPORT==== 18-Jun-2012::11:31:35 ===
Error in process <0.36.0> on node 'rabbitmqctl17951@cf' with exit value: {badarg,[{erlang,list_to_existing_atom,["eecef185-e1ae-4e08-91af-47f590304ecc@localhost"]},{dist_util,recv_challenge,1},{dist_util,handshake_we_started,1}]}
Error: unable to connect to node 'eecef185-e1ae-4e08-91af-47f590304ecc@cf': nodedown
diagnostics:
- nodes and their ports on cf: [{'eecef185-e1ae-4e08-91af-47f590304ecc',57390},
{rabbitmqctl17951,36032}]
- current node: rabbitmqctl17951@cf
- current node home dir: /home/cf
- current node cookie hash: 1igde7WRgkhAea8fCwKncQ==
How can I debug this and/or why are my tasks vanishing?
Apparently the problem was caused by a deadlock between the broker and the celery worker, such that the worker would never acknowledge the task as complete, and never accept a new task, but never crashed or failed either. The tasks weren't vanishing; they were simply staying in queue forever.
Update: The deadlock was caused by the fact that we were running celeryd inside a wrapper script that installed dependencies. (Literally pip install -r requirements.txt && ./celeryd -lINFO). Because of how Cloud Foundry manages process trees, Cloud Foundry would try to kill the parent process (bash), which would HUP celeryd, but ultimately lots of child processes would never die.
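One common way to avoid this kind of orphaned process tree, not necessarily what was done here, is to have the wrapper replace itself with the worker once the install step finishes, so that any signal sent to the wrapper's PID goes to the worker directly. The sketch below assumes a Python wrapper instead of bash; the function name install_then_exec and its injectable run and execvp parameters are illustrative.

```python
import os
import subprocess


def install_then_exec(install_cmd, worker_cmd,
                      run=subprocess.check_call, execvp=os.execvp):
    """Run the install step, then replace this process with the worker."""
    run(install_cmd)  # raises CalledProcessError if the install step fails
    # os.execvp does not return on success: the worker takes over this PID,
    # so there is no intermediate parent left to be killed separately.
    execvp(worker_cmd[0], worker_cmd)


# Example invocation, using the commands from the wrapper described above:
#   install_then_exec(
#       ["pip", "install", "-r", "requirements.txt"],
#       ["./celeryd", "-lINFO"],
#   )
```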
