I'm working with Celery and I've encountered a problem.
I have two functions:
1) This function is triggered when the worker starts and then loops forever:
import time

from celery.signals import worker_ready

@worker_ready.connect
def message_poll_start(sender=None, headers=None, body=None, **kwargs):
    while True:
        time.sleep(2)
        print("hello")
2) This function should run every ten seconds and write a timestamp to a txt file:
import datetime
from datetime import timedelta
from celery.decorators import periodic_task

@periodic_task(run_every=timedelta(seconds=10))
def last_record_time_check():
    with open('file.txt', 'a') as file_text:
        file_text.write("===========" + str(datetime.datetime.now()) +
                        " =============== \n\n")
Finally, I run celeryd and celerybeat.
The first function works without problems, but the second one never runs at all, even though beat keeps sending the task:
[2018-02-06 16:43:17,802: INFO/MainProcess] beat: Starting...
[2018-02-06 16:43:27,947: INFO/MainProcess] Scheduler: Sending due task base.tasks.last_record_time_check (base.tasks.last_record_time_check)
[2018-02-06 16:43:37,925: INFO/MainProcess] Scheduler: Sending due task base.tasks.last_record_time_check (base.tasks.last_record_time_check)
[2018-02-06 16:43:47,926: INFO/MainProcess] Scheduler: Sending due task base.tasks.last_record_time_check (base.tasks.last_record_time_check)
It looks like your worker is stuck in the first function: it never leaves the while loop, so it never gets around to executing the tasks that beat keeps sending.
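One way around this, sketched below rather than taken from the question, is to keep the signal handler short and push the endless poll into a task of its own (message_poll is a name introduced here), so the worker stays free to consume what beat sends:
import time

from celery import shared_task
from celery.signals import worker_ready

@shared_task
def message_poll():
    # The long-running loop lives in its own task, executed by a pool worker.
    while True:
        time.sleep(2)
        print("hello")

@worker_ready.connect
def message_poll_start(sender=None, **kwargs):
    # The signal handler only enqueues the task and returns immediately.
    message_poll.delay()
Note that the polling task permanently occupies one pool slot, so the worker needs a concurrency of at least two (or the poll routed to its own queue and worker) for the periodic task to get through.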
I am new to Celery. I have defined a task like the one below:
import openpyxl
from celery import task

@task(name="process_xls_file")
def process_contact_file_task(file_path):
    wb = openpyxl.load_workbook(file_path)
    sheet_obj = wb.active
    cell_obj = sheet_obj.cell(row=2, column=1)
    return cell_obj.value
But if I change the return value of this method, the change is not reflected unless I restart the worker.
[2021-03-05 19:20:47,988: INFO/MainProcess] Received task: process_xls_file[8f2e38ed-e94f-4aa0-84f8-b332e49917af]
[2021-03-05 19:20:48,057: INFO/ForkPoolWorker-2] Task process_xls_file[8f2e38ed-e94f-4aa0-84f8-b332e49917af] succeeded in 0.0659001779999997s: 'Name'
[2021-03-05 19:21:16,795: INFO/MainProcess] Received task: process_xls_file[b8153adb-b6c5-4f9d-9177-730f941d82f5]
[2021-03-05 19:21:16,818: INFO/ForkPoolWorker-2] Task process_xls_file[b8153adb-b6c5-4f9d-9177-730f941d82f5] succeeded in 0.019803497000005166s: 'Name'
^C
I do not see new results unless I re-run celery -A contacts worker -l info
Celery does not support hot-reloading of task code; the auto-reload feature was removed in the v3 => v4 transition, so the worker has to be restarted after code changes.
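For development, a common workaround (not a Celery feature; it relies on the separate watchdog package, and the exact flags are worth double-checking against watchmedo --help) is to let a file watcher restart the worker whenever a .py file changes:
pip install watchdog
watchmedo auto-restart --directory=./ --pattern=*.py --recursive -- celery -A contacts worker -l info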
I've built a small web scraper function that fetches some data from the web and populates my database, and it works just fine.
Now I would like to fire this function periodically every 20 seconds using Celery periodic tasks.
I walked through the docs and everything seems to be set up for development (using Redis as the broker).
This is my tasks.py file in project/stocksapp where my periodically fired functions are:
# Celery imports
from celery.task.schedules import crontab
from celery.decorators import periodic_task
from celery.utils.log import get_task_logger
from datetime import timedelta

logger = get_task_logger(__name__)

# periodic functions
@periodic_task(
    run_every=(timedelta(seconds=20)),
    name="getStocksDataDax",
    ignore_result=True
)
def getStocksDataDax():
    print("fired")
Now when I start the worker, the function seems to be fired once and only once (the database gets populated). But after that, the function never runs again, even though the beat console suggests the task is still being sent:
C:\Users\Jonas\Desktop\CFD\CFD>celery -A CFD beat -l info
celery beat v4.4.2 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-05-15 23:06:29
Configuration ->
. broker -> redis://localhost:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]#%INFO
. maxinterval -> 5.00 minutes (300s)
[2020-05-15 23:06:29,990: INFO/MainProcess] beat: Starting...
[2020-05-15 23:06:30,024: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:06:50,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:10,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:30,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:50,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:10,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:30,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:50,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
project/project/celery.py
from __future__ import absolute_import, unicode_literals
import os

from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'CFD.settings')

app = Celery('CFD',
             broker='redis://localhost:6379/0',
             backend='amqp://',
             include=['CFD.tasks'])
app.conf.broker_transport_options = {'visibility_timeout': 3600}

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
The function itself takes about 1 second in total.
Where could the issue be in this setup that keeps the worker/Celery from firing the function every 20 seconds as intended?
celery -A CFD beat -l info only starts the Celery beat process. You should have a separate Celery worker process - in a different terminal run something like celery -A CFD worker -c 8 -O fair -l info.
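As a side note, celery.decorators.periodic_task is removed in Celery 5, so if you upgrade later, a rough equivalent is a plain task plus a beat_schedule entry; this is only a sketch, with file locations following the layout described in the question:
# stocksapp/tasks.py
from celery import shared_task

@shared_task(name="getStocksDataDax", ignore_result=True)
def getStocksDataDax():
    print("fired")

# CFD/celery.py, after app = Celery(...)
app.conf.beat_schedule = {
    "getStocksDataDax-every-20s": {
        "task": "getStocksDataDax",
        "schedule": 20.0,  # seconds
    },
}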
EDIT 1 (added by the OP): Actually, the print statements output to the Celery worker's terminal, not the terminal where the Python program is run, as @PatrickAllen indicated.
I've recently started to use Celery, but can't even get a simple test going where I print a line to the terminal after a 30 second wait.
In my tasks.py:
from celery import Celery

celery = Celery(__name__, broker='amqp://guest@localhost//', backend='amqp://guest@localhost//')

@celery.task
def test_message():
    print("schedule task says hello")
In the main module for my package, I have:
import tasks

if __name__ == '__main__':
    # <do something>
    tasks.test_message.apply_async(countdown=30)
I run it from terminal:
celery -A tasks worker --loglevel=info
The task runs correctly, but nothing shows up on the terminal of the main program. Celery output:
[2016-03-06 17:49:46,890: INFO/MainProcess] Received task: tasks.test_message[4282fa1a-8b2f-4fa2-82be-d8f90288b6e2] eta:[2016-03-06 06:50:16.785896+00:00]
[2016-03-06 17:50:17,890: WARNING/Worker-2] schedule task says hello
[2016-03-06 17:50:17,892: WARNING/Worker-2] The client is not currently connected.
[2016-03-06 17:50:18,076: INFO/MainProcess] Task tasks.test_message[4282fa1a-8b2f-4fa2-82be-d8f90288b6e2] succeeded in 0.18711688100120227s: None
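Since the task's print lands in the worker's terminal, one way to see something in the main program is to block on the task's result; this is a rough sketch that assumes the result backend is working and that waiting 30+ seconds in the caller is acceptable:
if __name__ == '__main__':
    result = tasks.test_message.apply_async(countdown=30)
    # Blocks until the worker has run the task, then prints its return value
    # (None here, unless test_message is changed to return the string instead of printing it).
    print("worker returned:", result.get(timeout=60))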
I am using celery's apply_async method to queue tasks. I expect about 100,000 such tasks to run everyday (number will only go up). I am using RabbitMQ as the broker. I ran the code a few days back and RabbitMQ crashed after a few hours. I noticed that apply_async creates a new queue for each task with x-expires set at 1 day. My hypothesis is that RabbitMQ chokes when so many queues are being created. How can I stop celery from creating these extra queues for each task?
I also tried giving the queue parameter to apply_async and assigned an x-message-ttl to that queue. Messages did go to this new queue, however they were immediately consumed and never reached the TTL of 30 seconds that I had set. And this did not stop Celery from creating those extra queues.
Here's my code:
views.py
from celery import chain

chain(task1.s(a), task2.s(b)).apply_async(
    link_error=error_handler.s(a), queue="async_tasks_queue"
)
tasks.py
from celery import shared_task
from celery.result import AsyncResult

@shared_task
def error_handler(uuid, a):
    # Handle error
    pass

@shared_task
def task1(a):
    # Do something
    return a

@shared_task
def task2(a, b):
    # Do something more
    pass
celery.py
from celery import Celery
from django.conf import settings

app = Celery(
    'app',
    broker=settings.QUEUE_URL,
    backend=settings.QUEUE_URL,
)
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
app.amqp.queues.add("async_tasks_queue",
                    queue_arguments={'durable': True, 'x-message-ttl': 30000})
From the celery logs:
[2016-01-05 01:17:24,398: INFO/MainProcess] Received task:
project.tasks.task1[615e094c-2ec9-4568-9fe1-82ead2cd303b]
[2016-01-05 01:17:24,834: INFO/MainProcess] Received task:
project.decorators.wrapper[bf9a0a94-8e71-4ad6-9eaa-359f93446a3f]
RabbitMQ had 2 new queues by the names "615e094c2ec945689fe182ead2cd303b" and "bf9a0a948e714ad69eaa359f93446a3f" when these tasks were executed
My code is running on Django 1.7.7, celery 3.1.17 and RabbitMQ 3.5.3.
Any other suggestions for executing tasks asynchronously are also welcome.
Try using a different result backend - I recommend Redis. When we tried using RabbitMQ as both broker and backend, we found that it was ill suited to the backend role.
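A minimal sketch of that split, keeping RabbitMQ as the broker and moving only the result backend to Redis (it assumes Redis is reachable on localhost at its default port):
from celery import Celery
from django.conf import settings

app = Celery(
    'app',
    broker=settings.QUEUE_URL,           # RabbitMQ stays the broker
    backend='redis://localhost:6379/1',  # results go to Redis, so no per-result amqp queues
)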
I use Celery to make requests to the server (in tasks). I have a hard limit: only 1 request per second (from one IP).
I read this, so it's what I want: 1/s.
In celeryconfig.py I have:
CELERY_DISABLE_RATE_LIMITS = False
CELERY_DEFAULT_RATE_LIMIT = "1/s"
But I get messages saying that I make too many requests per second.
In call.py I use groups.
I think rate_limit does not work because I have a mistake in celeryconfig.py.
How to fix that? Thanks!
When you start a celery worker with
celery -A your_app worker -l info
the default concurrency is equal to the number of cores your machine has. So, even though you set a rate limit of '1/s', it tries to process multiple tasks concurrently.
Also, setting a default rate limit in the Celery config is a bad idea: right now you have only one task, but if you add new tasks to your app, the same default limit will apply to all of them.
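If you do want Celery's own rate limiting, you can scope it to a single task instead; this is a minimal sketch based on this question's setup (note that rate_limit is enforced per worker instance, not across the whole cluster):
import time

from celery import Celery

app = Celery('tasks', backend='amqp', broker='amqp://guest@localhost//')

@app.task(rate_limit='1/s')  # this worker starts at most one task1 per second
def task1():
    time.sleep(1)
    return 'task1'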
A simple way to achieve one task per second is this:
tasks.py
import time
from celery import Celery

app = Celery('tasks', backend='amqp', broker='amqp://guest@localhost//')

@app.task()
def task1():
    time.sleep(1)
    return 'task1'
Now start your worker with a concurrency of ONE:
celery -A tasks worker -l info -c 1
This will execute only one task per second. Here is my log with the above code:
[2014-10-13 19:27:41,158: INFO/MainProcess] Received task: task1[209008d6-bb9d-4ce0-80d4-9b6c068b770e]
[2014-10-13 19:27:41,161: INFO/MainProcess] Received task: task1[83dc18e0-22ec-4b2d-940a-8b62006e31cd]
[2014-10-13 19:27:41,168: INFO/MainProcess] Received task: task1[e1b25558-0bb2-405a-8009-a7b58bbfa4e1]
[2014-10-13 19:27:41,171: INFO/MainProcess] Received task: task1[2d864be0-c969-4c52-8a57-31dbd11eb2d8]
[2014-10-13 19:27:42,335: INFO/MainProcess] Task task1[209008d6-bb9d-4ce0-80d4-9b6c068b770e] succeeded in 1.170940883s: 'task1'
[2014-10-13 19:27:43,457: INFO/MainProcess] Task task1[83dc18e0-22ec-4b2d-940a-8b62006e31cd] succeeded in 1.119711205s: 'task1'
[2014-10-13 19:27:44,605: INFO/MainProcess] Task task1[e1b25558-0bb2-405a-8009-a7b58bbfa4e1] succeeded in 1.1454614s: 'task1'
[2014-10-13 19:27:45,726: INFO/MainProcess] Task task1[2d864be0-c969-4c52-8a57-31dbd11eb2d8] succeeded in 1.119111023s: 'task1'