Getting Celery task results using RPC backend - python

I'm struggling with getting results from the Celery task.
My app entry point looks like this:
from app import create_app,celery
celery.conf.task_default_queue = 'order_master'
order_app = create_app('../config.order_master.py')
Now, before I start the application, I start RabbitMQ and ensure it has no queues:
root@3d2e6b124780:/# rabbitmqctl list_queues
Timeout: 60.0 seconds ...
Listing queues for vhost / ...
root@3d2e6b124780:/#
Now I start the application. After the start I still see no queues in RabbitMQ. When I trigger the task from the application with jobs.add_together.delay(2, 3), I get the task ID:
ralfeus@web-2 /v/w/order (multiple-instances)> (order) curl localhost/test
{"result":"a2c07de4-f9f2-4b21-ae47-c6d92f2a7dfe"}
ralfeus@web-2 /v/w/order (multiple-instances)> (order)
At that moment I can see that my queue has one message:
root@3d2e6b124780:/# rabbitmqctl list_queues
Timeout: 60.0 seconds ...
Listing queues for vhost / ...
name messages
dd65ba89-cce9-3e0b-8252-c2216912a910 0
order_master 1
root@3d2e6b124780:/#
Now I start Celery worker:
ralfeus@web-2 /v/w/order (multiple-instances)>
/usr/virtualfish/order/bin/celery -A main_order_master:celery worker --loglevel=INFO -n order_master -Q order_master --concurrency 2
INFO:app:Blueprints are registered
-------------- celery@order_master v5.0.0 (singularity)
--- ***** -----
-- ******* ---- Linux-5.4.0-51-generic-x86_64-with-glibc2.29 2020-10-22 16:38:56
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app: app:0x7f374715c5b0
- ** ---------- .> transport: amqp://guest:**@172.17.0.1:5672//
- ** ---------- .> results: rpc://
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> order_master exchange=order_master(direct) key=order_master
[tasks]
. app.jobs.add_together
. app.jobs.post_purchase_orders
[2020-10-22 16:38:57,263: INFO/MainProcess] Connected to amqp://guest:**@172.17.0.1:5672//
[2020-10-22 16:38:57,304: INFO/MainProcess] mingle: searching for neighbors
[2020-10-22 16:38:58,354: INFO/MainProcess] mingle: all alone
[2020-10-22 16:38:58,375: INFO/MainProcess] celery@order_master ready.
[2020-10-22 16:38:58,377: INFO/MainProcess] Received task: app.jobs.add_together[f855bec7-307d-4570-ab04-3d036005a87b]
[2020-10-22 16:40:38,616: INFO/ForkPoolWorker-2] Task app.jobs.add_together[f855bec7-307d-4570-ab04-3d036005a87b] succeeded in 100.13561034202576s: 5
So the worker clearly picks up the task, executes it, and produces a result. However, I can't get that result. When I request it, I get the following:
curl localhost/test/f855bec7-307d-4570-ab04-3d036005a87b
{"state":"PENDING"}
ralfeus@web-2 /v/w/order (multiple-instance)> (order)
If I check the queues now I see that:
root@3d2e6b124780:/# rabbitmqctl list_queues
Timeout: 60.0 seconds ...
Listing queues for vhost / ...
name messages
dd65ba89-cce9-3e0b-8252-c2216912a910 1
65d80661-6195-3986-9fa2-e468eaab656e 0
celeryev.9ca5a092-9a0c-4bd5-935b-f5690cf9665b 0
order_master 0
celery@order_master.celery.pidbox 0
root@3d2e6b124780:/#
I see that the queue dd65ba89-cce9-3e0b-8252-c2216912a910 has one message, which, when I check it, contains the result. So why did the result end up there, and how do I get it? All the manuals say I just need to fetch the task by its ID, but in my case the task is still in the PENDING state.

According to the Celery documentation:
RPC Result Backend (RabbitMQ/QPid)
The RPC result backend (rpc://) is special as it doesn’t actually
store the states, but rather sends them as messages. This is an
important difference as it means that a result can only be retrieved
once, and only by the client that initiated the task. Two different
processes can’t wait for the same result.
So the rpc:// backend isn't suitable for retrieving a result later, from a different request handled by a different process.
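If results need to be fetched later by a separate web request, a persistent result backend such as Redis or a database is the usual fix. A minimal sketch, reusing the entry point from the question (the Redis URL is an assumption):
from app import create_app, celery
celery.conf.task_default_queue = 'order_master'
# Store task states/results in Redis instead of sending them as AMQP messages,
# so any process can look them up later by task ID.
celery.conf.result_backend = 'redis://localhost:6379/1'
order_app = create_app('../config.order_master.py')
With a persistent backend, AsyncResult(task_id).state moves from PENDING to SUCCESS once the worker finishes, regardless of which process asks.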

Related

Celery configuration in settings.py file

Can anyone explain these lines for Celery with RabbitMQ in Django? When are they used?
I ran two tasks (an addition operation and an endpoint in Django) with Celery and RabbitMQ without these lines, and they worked fine. So please explain when they are needed in settings.py for Celery with RabbitMQ:
CELERY_BROKER_URL = 'amqp://localhost'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'
__init__.py:
from .celery import app as celery_app
__all__ = ('celery_app',)
Thanks in advance
The reason your tasks still run even without those explicit settings is that Celery has default values for them, as described in its documentation:
https://docs.celeryproject.org/en/stable/userguide/configuration.html
To visualize this, here is a run where we don't set broker_url.
$ cat > tasks.py
from celery import Celery
app = Celery('my_app')
$ celery --app=tasks worker --loglevel=INFO
...
- ** ---------- [config]
- ** ---------- .> app: my_app:0x7f5a09295160
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 5 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
...
As you can see, even though we didn't set the broker explicitly, it defaulted to transport: amqp://guest:**@localhost:5672//, which is the default for RabbitMQ as stated in the docs:
broker_url
Default: "amqp://"
The transport part is the broker implementation to use, and the
default is amqp, (uses librabbitmq if installed or falls back to
pyamqp).
Here is a run where we explicitly set broker_url. To see the difference, let's say our RabbitMQ broker listens on port 666 of localhost (127.0.0.1) with a different password.
$ cat > tasks.py
from celery import Celery
app = Celery('my_app')
app.conf.broker_url = "amqp://guest:a-more-secure-password@127.0.0.1:666"
$ celery --app=tasks worker --loglevel=INFO
...
- ** ---------- [config]
- ** ---------- .> app: my_app:0x7fb02579f160
- ** ---------- .> transport: amqp://guest:**@127.0.0.1:666//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 5 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
...
Now the broker is set to our configured value: transport: amqp://guest:**@127.0.0.1:666//.
You only need to change those settings when your values differ from the defaults. For further details about each configurable setting, please refer to the docs.
One particular use case of overriding a default value is the broker_url example above, where we need to explicitly point Celery at the RabbitMQ running at amqp://guest:a-more-secure-password@127.0.0.1:666 instead of the default amqp://guest:guest@127.0.0.1:5672. Without that setting, the worker would fail with consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 104] Connection reset by peer. Trying again in 2.00 seconds... (1/100).
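The serializer settings from the question work the same way: you only set them when you deviate from the JSON defaults. A hedged sketch (the pickle values here are purely illustrative, not a recommendation):
# settings.py -- only needed because these values differ from Celery's defaults
CELERY_ACCEPT_CONTENT = ['json', 'pickle']   # default: ['json']
CELERY_TASK_SERIALIZER = 'pickle'            # default: 'json'
CELERY_RESULT_SERIALIZER = 'pickle'          # default: 'json'
If a worker receives a message whose content type is not listed in the accepted content, it refuses to process it, which is the usual symptom of these settings being mismatched between producer and worker.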
Other references:
Default RabbitMQ user guest:guest
Default RabbitMQ port 5672

Celery Worker don't execute cassandra queries

I'm using
celery == 4.1.0 (latentcall)
[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Python 2.7.14
I'm trying to execute a Cassandra query in a Celery worker function. The Celery worker receives the task but does not execute the query.
tasks.py
from cassandra.cluster import Cluster
from celery import Celery

app = Celery('<workername>', backend="rpc://", broker='redis://localhost:6379/0')
dbSession = Cluster().connect()

@app.task()
def get_data():
    query = "SELECT * FROM customers"
    CustomerObj = dbSession.execute(dbSession.prepare(query))
    return CustomerObj

get_data.delay()
I start the worker using:
$ celery worker -A <worker_name> -l INFO -c 1
-------------- celery@ubuntu v4.1.0 (latentcall)
---- **** -----
--- * *** * -- Linux-4.13.0-21-generic-x86_64-with-Ubuntu-17.10-artful 2018-04-20 14:31:41
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: Woker:0x7fa4a0e6f310
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: rpc://
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
-------------- [queues]
.> celery exchange=celery(direct) key=celery
[tasks]
. Worker.get_data
[2018-04-20 14:31:41,271: INFO/MainProcess] Connected to redis://localhost:6379/0
[2018-04-20 14:31:41,285: INFO/MainProcess] mingle: searching for neighbors
[2018-04-20 14:31:42,315: INFO/MainProcess] mingle: all alone
.............
[2018-04-20 14:31:42,332: INFO/MainProcess] celery@ubuntu ready.
[2018-04-20 14:31:43,823: INFO/MainProcess] Received task: <worker_name>.get_data[8de91fdf-1388-4d5c-bb22-8cb00c1c065e]
The worker process just stops there. It does not execute the SELECT query or return any data.
Can anyone suggest how I can get this code to execute Cassandra queries?
I think you can't define dbSession globally.
A Celery task can run in different worker processes, so the connection can't be global.
I can suggest two options:
Create the session within the task. It should work, though you'll create a new session for each task; something lazy (@LazyProperty) may help there (see the sketch after this list).
You can create the connection at the worker level: try to create your session when the worker starts, for example with the worker_init signal (ref). The problem here is that the concurrency level can be > 1 (depending on how you start the worker), and then you need a pool of sessions to serve more than one Celery task at a time (i.e. handle more than one Cassandra session at a time).
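A minimal sketch of the first option, creating the session lazily inside the worker process rather than at import time (the contact point, keyspace, and result handling are assumptions):
from cassandra.cluster import Cluster
from celery import Celery

app = Celery('worker', backend='rpc://', broker='redis://localhost:6379/0')

_session = None  # created lazily, once per worker process

def get_session():
    # Connect on first use inside the worker process, then reuse the session.
    global _session
    if _session is None:
        _session = Cluster(['127.0.0.1']).connect('my_keyspace')  # assumed contact point/keyspace
    return _session

@app.task()
def get_data():
    rows = get_session().execute("SELECT * FROM customers")
    # Return plain, serializable data rather than the driver's ResultSet object.
    return [dict(row._asdict()) for row in rows]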
By the way, you could also use the global keyword in Python; if you are running a single instance, that may fix it too.
Here is a related question that might help you: Celery Worker Database Connection Pooling
Good luck!
Celery doesn't use the application's connection instance, so initiate a new connection at Celery start-up. The snippet below follows the Cassandra documentation for Celery:
from celery import Celery
from celery.signals import worker_process_init, beat_init
from cassandra.cqlengine import connection
from cassandra.cqlengine.connection import (
    cluster as cql_cluster, session as cql_session)

def cassandra_init(**kwargs):
    """ Initialize a clean Cassandra connection. """
    if cql_cluster is not None:
        cql_cluster.shutdown()
    if cql_session is not None:
        cql_session.shutdown()
    # connection.setup() takes your contact points and default keyspace,
    # e.g. connection.setup(['127.0.0.1'], 'my_keyspace')
    connection.setup()

# Initialize worker context for both standard and periodic tasks.
worker_process_init.connect(cassandra_init)
beat_init.connect(cassandra_init)

app = Celery()
This worked for me

Celery AsyncResult is always PENDING

I'm working on a demo and the code is simple:
# The Config
class Config:
    BROKER_URL = 'redis://127.0.0.1:6379/0'
    CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/0'
    CELERY_ACCEPT_CONTENT = ['application/json']

# The Task
@celery_app.task()
def add(x, y):
    return x + y
To start the worker:
$ celery -A appl.task.celery_app worker --loglevel=info -broker=redis://localhost:6379/0
-------------- celery@ALBERTATMP v3.1.13 (Cipater)
---- **** -----
--- * *** * -- Linux-3.2.0-4-amd64-x86_64-with-debian-7.6
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: celery_test:0x293ffd0
- ** ---------- .> transport: redis://localhost:6379/0
- ** ---------- .> results: disabled
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
To schedule task:
>>> from appl.task import add
>>> r = add.delay(1, 2)
>>> r.id
'c41d4e22-ccea-408f-b48f-52e3ddd6bd66'
>>> r.task_id
'c41d4e22-ccea-408f-b48f-52e3ddd6bd66'
>>> r.status
'PENDING'
>>> r.backend
<celery.backends.redis.RedisBackend object at 0x1f35b10>
Then the worker will execute the task:
[2014-07-29 17:54:37,356: INFO/MainProcess] Received task: appl.task.add[beeef023-c582-42e1-baf7-9e19d9de32a0]
[2014-07-29 17:54:37,358: INFO/MainProcess] Task appl.task.add[beeef023-c582-42e1-baf7-9e19d9de32a0] succeeded in 0.00108124599865s: 3
But the result remains PENDING:
>>> res = add.AsyncResult(r.id)
>>> res.status
'PENDING'
I've tried the official FAQ, but it did not help.
>>> celery_app.conf['CELERY_IGNORE_RESULT']
False
What did I do wrong? Thanks!
It's been a while, but I'm leaving this more for others who come along with a similar issue.
In your worker output, you can see that results are disabled.
When you instantiate your celery instance, make sure that you have the right config inputs
from celery import Celery, Task
# here I'm using an AMQP broker with a memcached backend to store the results
celery = Celery('task1', broker='amqp://guest:guest@127.0.0.1:5672//', backend='cache+memcached://127.0.0.1:11211/')
For some reason I always have trouble getting the Celery instance parameterized through the config file, hence I explicitly pass in the broker and backend during instantiation, as shown above.
Now you'll see the results rightly configured, to memcached in my instance (it should be redis in yours). Also make sure that your task shows up in the list of tasks (task1.add).
If you still can't get it to work, try starting Celery with the debug option as below:
celery worker -A task1.celery -l debug
and see if something is going wrong in the information it spews out.
In my case it fixed your error: the result was set to SUCCESS and I was able to recover 3 with r.get().
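Putting that together for the question's setup, a minimal sketch that passes the Redis broker and result backend explicitly (the module and app names are assumptions):
from celery import Celery

# Passing broker and backend explicitly makes the worker banner show
# "results: redis://127.0.0.1:6379/0" instead of "results: disabled".
celery_app = Celery(
    'appl.task',
    broker='redis://127.0.0.1:6379/0',
    backend='redis://127.0.0.1:6379/0',
)

@celery_app.task()
def add(x, y):
    return x + y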
Try to change your broker to something else (like RabbitMQ) and check the status again.
Make sure your Redis server is up and accessible for Celery:
redis-cli
keys *
You should see some keys related to Celery; if not, it means there is an issue with your broker.
This works for me:
from celery.result import AsyncResult
celery_task_result = AsyncResult(task_id)
task_state = celery_task_result.state
and task_state can take all the usual statuses: 'FAILURE', 'SUCCESS', 'PENDING', etc.

Celery, task called from other task is not working

I'm trying to build an application with Celery. It should run on a few workers, and different workers consume from different queues. I've got something like this:
@celery.task
def task1():
    do_something()
    task2.delay()

@celery.task
def task2():
    do_something()
So task1, which runs on worker1, should call task2, which should be sent to the queue that worker2 consumes from. The problem is that it is not working: I receive the id of the AsyncResult, but the state of this task stays PENDING the whole time. When I call task2 manually from the Python console, it works fine.
Maybe I'm doing something wrong and it is not possible to run one task from another one?
And one more thing: worker1 executes task1 and sends task2 to a queue it does not consume from; only worker2 consumes from that queue.
Here's a simple example that I think accomplishes what you want.
from celery import Celery
import random
import string
celery = Celery('two_q', backend='amqp', broker='amqp://guest@localhost//')

@celery.task
def generate_rand_string(n):
    # n = number of characters
    rand_str = "".join([random.choice(string.lowercase) for i in range(n)])
    # calls the second task and adds it to the second queue
    reverse.apply_async((rand_str,), queue="q2")
    print rand_str
    return rand_str

@celery.task
def reverse(s):
    print s[::-1]
    return s[::-1]

generate_rand_string.apply_async((10,), queue="q1")
When the worker is started with the -Q argument, which specifies the list of queues,
celery worker --app=two_q -l info -Q q1,q2
it produces the following output:
pawel@iqmxma82x7:~/py/celery$ celery worker --app=two_q -l info -Q q1,q2
-------------- celery@iqmxma82x7 v3.0.23 (Chiastic Slide)
---- **** -----
--- * *** * -- Linux-3.2.0-54-generic-pae-i686-with-Ubuntu-12.04-precise
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> broker: amqp://guest@localhost:5672//
- ** ---------- .> app: cel_group:0x9bfef8c
- ** ---------- .> concurrency: 4 (processes)
- *** --- * --- .> events: OFF (enable -E to monitor this worker)
-- ******* ----
--- ***** ----- [queues]
-------------- .> q1: exchange:q1(direct) binding:q1
.> q2: exchange:q2(direct) binding:q2
[Tasks]
. two_q.generate_rand_string
. two_q.reverse
[2013-09-15 19:10:35,708: WARNING/MainProcess] celery@iqmxma82x7 ready.
[2013-09-15 19:10:35,716: INFO/MainProcess] consumer: Connected to amqp://guest@127.0.0.1:5672//.
[2013-09-15 19:10:40,731: INFO/MainProcess] Got task from broker: two_q.generate_rand_string[fa2ad56e-c66d-44a9-b908-2d95b2c9e5f3]
[2013-09-15 19:10:40,767: WARNING/PoolWorker-1] jjikjkepkc
[2013-09-15 19:10:40,768: INFO/MainProcess] Got task from broker: two_q.reverse[f52a8247-4674-4183-a826-d73cef1b64d4]
[2013-09-15 19:10:40,770: INFO/MainProcess] Task two_q.generate_rand_string[fa2ad56e-c66d-44a9-b908-2d95b2c9e5f3] succeeded in 0.0217289924622s: 'jjikjkepkc'
[2013-09-15 19:10:40,782: WARNING/PoolWorker-3] ckpekjkijj
[2013-09-15 19:10:40,801: INFO/MainProcess] Task two_q.reverse[f52a8247-4674-4183-a826-d73cef1b64d4] succeeded in 0.0195469856262s: 'ckpekjkijj'
You get two queues (q1, q2), and the two tasks are executed by different pool workers.
As a comparison, if you start the worker without the -Q argument, or with only one of the queues:
celery worker --app=two_q -l info
the "reverse" task will not be consumed, because q2, to which it is sent, will not be known to the worker.
Hope it helps.

Celery in Amazon Server

I have two nodes (two computers that are already connected) in the server, and every node has 4 cores. How can I cluster these two computers using Celery + RabbitMQ? When I start celeryd, it shows this:
-------------- celery@ip-10-77-137-41 v3.0.19 (Chiastic Slide)
---- **** -----
--- * *** * -- Linux-3.2.0-36-virtual-x86_64-with-Ubuntu-12.04-precise
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> broker: amqp://celeryuser@localhost:5672/celeryvhost
- ** ---------- .> app: default:0x29ecf90 (.default.Loader)
- ** ---------- .> concurrency: 2 (processes)
- *** --- * --- .> events: OFF (enable -E to monitor this worker)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery: exchange:celery(direct) binding:celery
Can I just set concurrency to 8 to use both of these nodes?
On each server, launch a Celery worker with a concurrency level of 4:
celery worker --concurrency=4
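For example (a sketch; the app module, credentials, and broker host are assumptions), each node points at the same RabbitMQ broker and gets a unique node name, giving 2 x 4 = 8 worker processes in total:
# on node 1
celery worker -A proj -b amqp://celeryuser:password@broker-host:5672/celeryvhost -n worker1@%h --concurrency=4
# on node 2
celery worker -A proj -b amqp://celeryuser:password@broker-host:5672/celeryvhost -n worker2@%h --concurrency=4
Both workers then consume from the same default celery queue, so the broker distributes tasks across the two nodes.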
