I get a deadlock when using a Celery task to save new customers from a CSV. This is what I have working so far:
for line in csv.reader(instance.data_file.read().splitlines()):
    for index, item in enumerate(line):
        number = int(item)
        # TODO: Turn into task
        Customer.objects.create_customer(
            mobile=number,
            campaign=instance.campaign,
            reward_group=instance.reward_group,
            company=instance.company,
        )
No errors.
However, when I add this same code to a Celery task I get the following error:
Deadlock found when trying to get lock; try restarting transaction
So this leads me to believe I have done something wrong with my Celery setup. Can anyone spot what?
Here is the new Celery task that gives the deadlock error. I'm using shared_task as these tasks will at some point run on a different machine without Django, but that should not matter for now.
The first row in the CSV import ok, then I get a deadlock error...
for line in csv.reader(instance.data_file.read().splitlines()):
    for index, item in enumerate(line):
        number = int(item)
        celery_app.send_task('test.tasks.create_customer_from_import',
                             args=[number, instance.id], kwargs={})
tasks.py
# Python imports
from __future__ import absolute_import
# Core Django imports
from celery import shared_task
from mgm.core.celery import app as celery_app
@shared_task
def create_customer_from_import(number, customer_upload_id):
    customer_upload = CustomerUpload.objects.get(pk=customer_upload_id)
    new_customer = Customer.objects.create_customer(
        mobile=number,
        campaign=customer_upload.campaign,
        reward_group=customer_upload.reward_group,
        company=customer_upload.company,
    )
    return new_customer
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'test.settings')
app = Celery('test-tasks')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
This is the CustomerManager:
class CustomerManager(models.Manager):
    def create_customer(self, mobile, campaign, reward_group, company, password=None):
        user = AppUser.objects.create_user(mobile=mobile)
        # Creates a new customer for a company and campaign
        customer = self.model(
            user=user,
            campaign=campaign,
            reward_group=reward_group,
            company=company,
        )
        customer.save(using=self._db)
        return customer
Your code doesn't look wrong, but you're probably getting the deadlock because of the concurrency of multiple celery workers. From http://celery.readthedocs.org/en/latest/faq.html#mysql-is-throwing-deadlock-errors-what-can-i-do:
MySQL has default isolation level set to REPEATABLE-READ, if you don’t
really need that, set it to READ-COMMITTED. You can do that by adding
the following to your my.cnf:
[mysqld]
transaction-isolation = READ-COMMITTED
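If changing the MySQL isolation level isn't an option, another common mitigation (not covered in the FAQ excerpt above, so treat this as a sketch) is to let the task retry when it hits a deadlock, since MySQL reports deadlocks as OperationalError and the retry usually succeeds. The model import path below is an assumption; use the same imports as your own tasks.py:

# tasks.py -- sketch: retry the task when MySQL reports a deadlock
from __future__ import absolute_import

from celery import shared_task
from django.db import OperationalError

from test.models import Customer, CustomerUpload  # assumed path; match your project


@shared_task(bind=True, max_retries=5, default_retry_delay=1)
def create_customer_from_import(self, number, customer_upload_id):
    try:
        customer_upload = CustomerUpload.objects.get(pk=customer_upload_id)
        return Customer.objects.create_customer(
            mobile=number,
            campaign=customer_upload.campaign,
            reward_group=customer_upload.reward_group,
            company=customer_upload.company,
        )
    except OperationalError as exc:
        # Deadlocks surface as OperationalError (MySQL error 1213);
        # re-queue the task and let it try again shortly.
        raise self.retry(exc=exc)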
Related
I can't import my celery app to run tasks from my main Python application. I want to be able to run celery tasks from the myprogram.py file.
My celery_app.py file is as follows:
import celery

app = celery.Celery('MyApp', broker='redis://localhost:6379/0')
app.conf.broker_url = 'redis://localhost:6379/0'
app.conf.result_backend = 'redis://localhost:6379/0'
app.autodiscover_tasks()

@app.task(ignore_result=True)
def task_to_run():
    print("Task Running")

# The following call runs a worker in celery
task_to_run.delay()

if __name__ == '__main__':
    app.start()
Application structure
projectfolder/core/celery_app.py # Celery app
projectfolder/core/myprogram.py # My Python application
projectfolder/core/other python files...
The file myprogram.py contains the following:
from .celery_app import task_to_run
task_to_run.delay()
Error:
Received unregistered task of type 'projectfolder.core.celery_app.task_to_run'.
The message has been ignored and discarded.
Did you remember to import the module containing this task?
Or maybe you're using relative imports?
strategy = strategies[type_]
KeyError: 'projectfolder.core.celery_app.task_to_run'
Thanks
Interesting, I didn't know about autodiscover_tasks; I guess it's new in 4.1.
As I see in the documentation, this function takes a list of packages to search. You might want to call it with:
app.autodiscover_tasks(['core.celery_app'])
or it might be better to extract the task to a separate file called tasks.py, and then it would be just:
app.autodiscover_tasks(['core'])
Alternatively, you can use the include parameter when creating the Celery instance:
app = celery.Celery('MyApp', broker='redis://localhost:6379/0', include=['core.celery_app']) or wherever your tasks are.
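For illustration, here is a minimal sketch of the tasks.py layout that autodiscover_tasks(['core']) expects; the module names mirror the project structure in the question and are otherwise assumptions:

# core/celery_app.py -- app definition only, no tasks
import celery

app = celery.Celery('MyApp', broker='redis://localhost:6379/0')
app.conf.result_backend = 'redis://localhost:6379/0'
app.autodiscover_tasks(['core'])  # looks for a core.tasks module

# core/tasks.py -- tasks live here so the worker can register them
from .celery_app import app

@app.task(ignore_result=True)
def task_to_run():
    print("Task Running")

myprogram.py can then do from .tasks import task_to_run and call task_to_run.delay() as before.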
Good luck
This is the first time I'm using Celery, and honestly, I'm not sure I'm doing it right. My system has to run on Windows, so I'm using RabbitMQ as the broker.
As a proof of concept, I'm trying to create a single object where one task sets the value, another task reads the value, and I also want to show the current value of the object when I go to a certain URL. However, I'm having problems sharing the object between all of them.
This is my celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE','cesGroundStation.settings')
app = Celery('cesGroundStation')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
The object I'm trying to share is:
class SchedulerQ():
    item = 0

    def setItem(self, item):
        self.item = item

    def getItem(self):
        return self.item
This is my tasks.py
from celery import shared_task
from time import sleep
from scheduler.schedulerQueue import SchedulerQ
schedulerQ = SchedulerQ()
@shared_task()
def SchedulerThread():
    print("Starting Scheduler")
    counter = 0
    while(1):
        counter += 1
        if(counter > 100):
            counter = 0
        schedulerQ.setItem(counter)
        print("In Scheduler thread - " + str(counter))
        sleep(2)
    print("Exiting Scheduler")

@shared_task()
def RotatorsThread():
    print("Starting Rotators")
    while(1):
        item = schedulerQ.getItem()
        print("In Rotators thread - " + str(item))
        sleep(2)
    print("Exiting Rotators")

@shared_task()
def setSchedulerQ(schedulerQueue):
    schedulerQ = schedulerQueue

@shared_task()
def getSchedulerQ():
    return schedulerQ
I'm starting my tasks in my apps.py. I'm not sure this is the right place, as the tasks don't seem to run until I start the workers in a separate console with celery -A cesGroundStation -l info.
from django.apps import AppConfig
from scheduler.schedulerQueue import SchedulerQ
from scheduler.tasks import SchedulerThread, RotatorsThread, setSchedulerQ, getSchedulerQ
class SchedulerConfig(AppConfig):
    name = 'scheduler'

    def ready(self):
        schedulerQ = SchedulerQ()
        setSchedulerQ.delay(schedulerQ)
        SchedulerThread.delay()
        RotatorsThread.delay()
In my views.py I have this:
def schedulerQ():
    queue = getSchedulerQ.delay()
    return HttpResponse("Your list: " + queue)
The Django app runs without errors, however the output from celery -A cesGroundStation -l info shows a problem (screenshot: Celery command output).
First, it seems to start multiple "SchedulerThread" tasks; second, the "SchedulerQ" object isn't being passed to the Rotators, as they never read the updated value.
And if I go to the URL that renders the views.schedulerQ view I get this error (screenshot: Django views error).
I have very, very little experience with Python, Django and Web Development in general, so I have no idea where to start with that last error. Solutions suggest using Redis to pass the object to the views, but I don't know how I'd do that using RabbitMQ. Later on the schedulerQ object will implement a queue and the scheduler and rotators will act as more of a producer/consumer dynamic with the view showing the contents of the queue, so I believe using the database might be too resource intensive. How can I share this object across all tasks, and is this even the right approach?
The right approach would be to use a persistence layer, such as a database or a result backend, to store the information you want to share between tasks (in this example, what you are currently putting in your class).
Celery operates on a distributed message-passing paradigm. A good way to distill that idea for this example is that your module is executed independently every time a task is dispatched. Whenever a task is dispatched to Celery, you must assume it is running in a separate interpreter and loaded independently of other tasks, so that SchedulerQ class is instantiated anew each time.
You can share information between tasks in the ways described in the docs linked previously, and some of the best-practice tips discuss data persistence concerns.
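As a concrete illustration of that persistence idea, here is a hedged sketch that shares the counter through Django's cache framework instead of a module-level object. It assumes a cross-process cache backend (for example Redis or Memcached) is configured in settings, and the cache key name is made up for the example:

# tasks.py -- sketch: share state through the cache rather than a Python object
from celery import shared_task
from django.core.cache import cache


@shared_task
def scheduler_tick(counter):
    # persist the latest value where any worker or the web process can see it
    cache.set('scheduler_item', counter, timeout=None)


@shared_task
def rotators_poll():
    # read whatever the scheduler last stored (0 if nothing stored yet)
    item = cache.get('scheduler_item', 0)
    print("In Rotators thread - " + str(item))
    return item

The view could then call cache.get('scheduler_item', 0) directly instead of dispatching a task and trying to read its return value.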
My Celery task isn't executing in the background in my Django 1.7/Python3 project.
# settings.py
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULTBACKEND = BROKER_URL
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
CELERY_ALWAYS_EAGER = False
I have celery.py in my root app module as such:
from __future__ import absolute_import
import os
import django
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')
django.setup()
app = Celery('my_app')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
and load the app in __init__.py in the root module:
from __future__ import absolute_import
from .celery import app as celery_app
My task is set up as a shared task in a tasks.py file in my app module:
from __future__ import absolute_import
from celery import shared_task
@shared_task
def update_statistics(profile, category):
    # more code
and I call the task as a group:
. . .
job = group([update_statistics(f.profile, category)
             for f in forecasts])
job.apply_async()
However, I'm not seeing any status updates in my task queue, which I am starting via:
$ celery -A my_app worker -l info
The task is being executed, just not in the background. If I add a print statement to the task code, I will see the output in my Django development server console instead of the Celery queue.
After the task runs in the foreground, I'm greeted with this exception:
'NoneType' object has no attribute 'app'
Here's the full traceback if you're interested: https://gist.github.com/alsoicode/0263d251e3744227ba46
You're calling the tasks directly in your list comprehension when you create the group, so they're executed then and there. You need to use the .subtask() method (or its shortcut, .s()) to create the subtasks without calling them:
job = group([update_statistics.s(f.profile, category) for f in forecasts])
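For context, here is a minimal sketch of how the corrected group is typically dispatched and, optionally, collected; the timeout is an arbitrary example and result.get() requires a configured result backend:

from celery import group

# .s() builds signatures; nothing executes locally at this point
job = group(update_statistics.s(f.profile, category) for f in forecasts)

# send everything to the workers and keep the GroupResult handle
result = job.apply_async()

# optional: block until every subtask has finished (needs a result backend)
# print(result.get(timeout=60))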
I have an issue with Celery queue routing when using current_app.send_task
I have two workers (one for each queue):
python manage.py celery worker -E -Q priority --concurrency=8 --loglevel=DEBUG
python manage.py celery worker -Q low --concurrency=8 -E -B --loglevel=DEBUG
I have two queues defined in celeryconfig.py file:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.core.exceptions import ImproperlyConfigured
from django.conf import settings
from celery import Celery
from kombu import Queue

try:
    app = Celery('proj', broker=getattr(settings, 'BROKER_URL', 'redis://'))
except ImproperlyConfigured:
    app = Celery('proj', broker='redis://')

app.conf.update(
    CELERY_TASK_SERIALIZER='json',
    CELERY_ACCEPT_CONTENT=['json'],
    CELERY_RESULT_SERIALIZER='json',
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
    CELERY_DEFAULT_EXCHANGE='tasks',
    CELERY_DEFAULT_EXCHANGE_TYPE='topic',
    CELERY_DEFAULT_ROUTING_KEY='task.priority',
    CELERY_QUEUES=(
        Queue('priority', routing_key='priority.#'),
        Queue('low', routing_key='low.#'),
    ),
    CELERY_IMPORTS=('mymodule.tasks',),
)

CELERY_ENABLE_UTC = True
CELERY_TIMEZONE = 'UTC'

if __name__ == '__main__':
    app.start()
In the task definitions, we use the decorator to make the queue explicit:
@task(name='mymodule.mytask', routing_key='low.mytask', queue='low')
def mytask():
    # does something
    pass
This task does indeed run in the low queue when it is called like this:
from mymodule.tasks import mytask
mytask.delay()
But that's not the case when it's run as follows (it ends up in the default queue, "priority"):
from celery import current_app
current_app.send_task('mymodule.mytask')
I wonder why this latter way doesn't route the task to the "low" queue!
P.S.: I use Redis.
send_task is a low-level method. It sends the task signature directly to the broker without going through your task decorator.
With this method, you can even send a task without loading the task code/module.
To solve your problem, you can fetch the routing_key/queue from configuration directly:
route = celery.amqp.routes[0].route_for_task("mymodule.mytask")
# Out[10]: {'queue': 'low', 'routing_key': 'low.mytask'}
celery.send_task("mymodule.mytask", queue=route['queue'], routing_key=route['routing_key'])
I'm running the First Steps with Celery Tutorial.
We define the following task:
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y
Then call it:
>>> from tasks import add
>>> add.delay(4, 4)
But I get the following error:
AttributeError: 'DisabledBackend' object has no attribute '_get_task_meta_for'
I'm running both the celery worker and the rabbit-mq server. Rather strangely, celery worker reports the task as succeeding:
[2014-04-22 19:12:03,608: INFO/MainProcess] Task test_celery.add[168c7d96-e41a-41c9-80f5-50b24dcaff73] succeeded in 0.000435483998444s: 19
Why isn't this working?
Just keep reading the tutorial; it is explained in the Keep Results chapter.
To start Celery you only need to provide the broker parameter, which is required to send messages about tasks. If you want to retrieve information about the state and the results returned by finished tasks, you need to set the backend parameter as well. You can find the full list with descriptions in the Configuration docs: CELERY_RESULT_BACKEND.
I suggest having a look at:
http://www.cnblogs.com/fangwenyu/p/3625830.html
There you will see that
instead of
app = Celery('tasks', broker='amqp://guest@localhost//')
you should be writing
app = Celery('tasks', backend='amqp', broker='amqp://guest@localhost//')
This is it.
In case anyone made the same easy to make mistake as I did: The tutorial doesn't say so explicitly, but the line
app = Celery('tasks', backend='rpc://', broker='amqp://')
is an EDIT of the line in your tasks.py file. Mine now reads:
app = Celery('tasks', backend='rpc://', broker='amqp://guest@localhost//')
When I run python from the command line I get:
$ python
>>> from tasks import add
>>> result = add.delay(4,50)
>>> result.ready()
False
All tutorials should be easy to follow, even when a little drunk. So far this one doesn't reach that bar.
What is not clear from the tutorial is that the tasks.py module needs to be edited so that you change the line:
app = Celery('tasks', broker='pyamqp://guest@localhost//')
to include the RPC result backend:
app = Celery('tasks', backend='rpc://', broker='pyamqp://')
Once done, Ctrl + C the celery worker process and restart it:
celery -A tasks worker --loglevel=info
The tutorial is confusing in that it leads you to assume the app object is created in the client testing session, which it is not.
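Once tasks.py has the backend set and the worker has been restarted, the round trip can be verified from the same Python shell (a minimal check; the numbers are just examples):

>>> from tasks import add
>>> result = add.delay(4, 50)
>>> result.ready()   # True once the worker has finished the task
True
>>> result.get(timeout=10)
54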
In your project directory find the settings file.
Then run the below command in your terminal:
sudo vim settings.py
copy/paste the below config into your settings.py:
CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend'
Note: This is your backend for storing the messages in the queue if you are using django-celery package for your Django project.
Celery relies on both a backend AND a broker.
This solved it for me using only Redis:
app = Celery("tasks", backend='redis://localhost',broker="redis://localhost")
Remember to restart worker in your terminal after changing the config
I solved this error by passing the app after the taskID:
response = AsyncResult(taskID, app=celery_app)
where celery_app = Celery('ANYTHING', broker=BROKER_URL, backend=BACKEND_URL).
If you want to get the status of the Celery task, to know whether it is "PENDING", "SUCCESS" or "FAILURE":
status = response.status
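Put together, a hedged sketch of looking up a task's state by its id; the broker and backend URLs are placeholders, substitute your own:

from celery import Celery
from celery.result import AsyncResult

# placeholder URLs -- use your real broker / result backend here
celery_app = Celery('ANYTHING',
                    broker='redis://localhost:6379/0',
                    backend='redis://localhost:6379/1')


def task_state(task_id):
    """Return the state of a task given the id from .delay()/.apply_async()."""
    response = AsyncResult(task_id, app=celery_app)
    return response.status  # 'PENDING', 'SUCCESS', 'FAILURE', ...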
My case was simple: I was using the interactive Python console and Python had cached the imported module. I killed the console and started it again, and everything worked as it should.
import celery

app = celery.Celery('tasks', broker='redis://localhost:6379',
                    backend='mongodb://localhost:27017/celery_tasks')

@app.task
def add(x, y):
    return x + y
In the Python console:
>>> from tasks import add
>>> result = add.delay(4, 4)
>>> result.ready()
True
Switching from Windows to Linux solved the issue for me.
Windows is not guaranteed to work; it's mentioned here.
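If switching operating systems isn't an option, a workaround that is often suggested (not part of the answer above, so verify it against your Celery version) is to run the worker with the solo pool on Windows:
celery -A tasks worker --loglevel=info --pool=solo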
I had the same issue; what resolved it for me was to import the celery file (celery.py) in the __init__.py of your app with something like:
from .celery import CELERY_APP as celery_app
__all__ = ('celery_app',)
if you use a celery.py file as described here