I'm trying to get a function to work in my Django project with celery beat. It imports a class-based method from a wrapper library. I've been reading that Celery doesn't work with classes too easily. My function login_mb doesn't take an argument, but when I register and call this task I get the error: Couldn't apply scheduled task login_mb: login_mb() takes 0 positional arguments but 1 was given
Is this because of self in the imported wrapper methods?
What could I do to get this to work with celery beat?
settings.py
from datetime import timedelta

CELERY_BEAT_SCHEDULE = {
    'login_mb': {
        'task': 'backend.tasks.login_mb',
        'schedule': timedelta(minutes=30),
    },
}
tasks.py
from matchbook.apiclient import APIClient
import logging
from celery import shared_task

log = logging.getLogger(__name__)


@shared_task(bind=True)
def login_mb():
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()
apiclient.py (wrapper library)
from matchbook.baseclient import BaseClient
from matchbook import endpoints
class APIClient(BaseClient):

    def __init__(self, username, password=None):
        super(APIClient, self).__init__(username, password)
        self.login = endpoints.Login(self)
        self.keep_alive = endpoints.KeepAlive(self)
        self.logout = endpoints.Logout(self)
        self.betting = endpoints.Betting(self)
        self.account = endpoints.Account(self)
        self.market_data = endpoints.MarketData(self)
        self.reference_data = endpoints.ReferenceData(self)
        self.reporting = endpoints.Reporting(self)

    def __repr__(self):
        return '<APIClient [%s]>' % self.username

    def __str__(self):
        return 'APIClient'
The error is not related to your wrapper library; there seems to be nothing wrong with your task itself.
The problem arises because you've defined your task with bind=True. When you do that, Celery automatically injects a parameter into the method containing information about the current task. So you can either remove bind=True, or add a parameter to your task method like so:
@shared_task(bind=True)
def login_mb(self):
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()
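Alternatively, a minimal sketch of the other option the answer mentions (dropping bind=True so the zero-argument signature stays as it was):

from celery import shared_task

@shared_task
def login_mb():
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()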
Related
I am trying to create a CustomBaseOperator that inherits from the Airflow BaseOperator. The CustomBaseOperator works fine, but when I try to create a ChildOperator that inherits from the CustomBaseOperator, Airflow treats it as a CustomBaseOperator.
The CustomBaseOperator has an execute method that looks like this:
    def make_request(self):
        print("Parent request")

    def execute(self, context):
        self.make_request()
In the Child Operator, I redefine make_request:
class ChildOperator(CustomBaseOperator):

    @apply_defaults
    def make_request(self):
        print("Child Request")
Whenever I run a task that uses ChildOperator, it prints "Parent request" and the Airflow UI still shows it as a CustomBaseOperator...
My operators live in an "operators" folder inside the "plugins" folder. I am guessing that I can only inherit from "official" operators when creating a custom operator in that folder.
Do you know how I could make the inheritance work?
This works:
import datetime

from airflow import DAG
from airflow.models import BaseOperator


class CustomBaseOperator(BaseOperator):

    def make_request(self):
        print("Parent request")

    def execute(self, context):
        self.make_request()


class ChildOperator(CustomBaseOperator):

    def make_request(self):
        print("Child Request")


with DAG(dag_id="test_dag", start_date=datetime.datetime(2022, 1, 1), schedule_interval=None) as dag:
    test = ChildOperator(task_id="test")
This will print Child Request and display ChildOperator in the Airflow UI. Also note that @apply_defaults is deprecated since Airflow 2.0; it is now applied automatically.
I found the issue: I was using a helper function that loops over locals() to set self.param = param for every param.
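For illustration only (not the exact code from the question), the problematic pattern looked roughly like a constructor that copies every local variable onto the instance:

class CustomBaseOperator(BaseOperator):

    def __init__(self, param_a=None, param_b=None, **kwargs):
        super().__init__(**kwargs)
        # hypothetical helper-style loop: assign every constructor local onto self
        for name, value in locals().items():
            if name not in ("self", "kwargs", "__class__"):
                setattr(self, name, value)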
Looking at the docs, I ended up using my app settings this way:
import config
...
@router.post('')
async def my_handler(
    ...
    settings: config.SettingsCommon = fastapi.Depends(config.get_settings),
):
    ...
But I am not satisfied with repeating import config and config.get_settings everywhere.
Is there a way to use settings in my handlers without repeating myself?
FastAPI cares about helping you minimize code repetition.
You can use Class Based Views from the fastapi_utils package.
As an example:
from logging import Logger

from fastapi import APIRouter, Depends, FastAPI
from fastapi_utils.cbv import cbv
from starlette import requests

import config
from auth import my_auth

router = APIRouter(
    tags=['Settings test'],
    dependencies=[Depends(my_auth)]  # injected into each query; my_auth's return value is ignored, but it can raise exceptions
)


@cbv(router)
class MyQueryCBV:
    settings: config.SettingsCommon = Depends(config.get_settings)  # you can get settings here

    def __init__(self, r: requests.Request):  # called for each query, after its dependencies have been evaluated
        self.logger: Logger = self.settings.logger
        self.logger.warning(str(r.headers))

    @router.get("/cbv/{test}")
    def test_cbv(self, test: str):
        self.logger.warning(f"test_cbv: {test}")
        return "test_cbv"

    @router.get("/cbv2")
    def test_cbv2(self):
        self.logger.warning("test_cbv2")
        return "test_cbv2"
It's not currently possible to inject the value returned by a global dependency into your handlers. You can still declare global dependencies, and the code inside them will run as normal.
See the docs on global dependencies for reference.
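For illustration (not from the original answer), declaring a global dependency looks roughly like this, reusing the config.get_settings getter from the question:

from fastapi import Depends, FastAPI

import config

# the dependency runs for every request, but its return value is not
# injected into individual handlers
app = FastAPI(dependencies=[Depends(config.get_settings)])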
Without any external package, I can think of three ways of sharing a dependency globally. First, you can store the dependency in a private module-level variable and return it from a getter function, as in the sketch below.
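A minimal sketch of that first approach, assuming the config.SettingsCommon class from the question can be constructed directly:

import config

_settings = None  # private module-level variable holding the shared instance


def get_settings() -> config.SettingsCommon:
    # build the settings object on first use, then hand back the same instance
    global _settings
    if _settings is None:
        _settings = config.SettingsCommon()
    return _settings

Handlers then depend on this getter with Depends(get_settings).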
You can also take the same approach without a module-level private variable by instead using a cache decorator (docs here), as sketched after this paragraph.
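A minimal sketch of the cached-getter variant, again assuming config.SettingsCommon:

from functools import lru_cache

import config


@lru_cache
def get_settings() -> config.SettingsCommon:
    # lru_cache ensures the settings object is created only once
    return config.SettingsCommon()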
Finally, you can implement the singleton pattern if using a class as a dependency.
Something like:
class Animal:
    _singleton = None

    @classmethod
    def singleton(cls) -> "Animal":
        if cls._singleton is None:
            cls._singleton = Animal()
        return cls._singleton
I'm trying to create some celery tasks as classes, but am having some difficulty. The classes are:
class BaseCeleryTask(app.Task):

    def is_complete(self):
        """Default method for checking whether the celery task has completed."""
        # simply return result (since by default tasks return a boolean indicating completion)
        try:
            return self.result
        except AttributeError:
            logger.error('Result not defined. Make sure task has run!')
            return False


class MacroReportTask(BaseCeleryTask):

    def run(self, params):
        """Override the default run method with the signal factory run."""
        # hold on to the factory
        process = MacroCountryReport(params)
        self.result = process.run()
        return self.result
but when I initialize the app and check app.tasks (or run a worker), the app doesn't seem to have these tasks in its registry. Other function-based tasks (using the app.task() decorator) are registered fine.
I run the above task as:
process = SignalFactoryTask()
process.delay(params)
Celery worker errors with the following message:
Received unregistered task of type None.
I think the issue I'm having is: how do I add custom classes to the task registry the way I do with regular function-based tasks?
Ran into the exact same issue; it took hours to find the solution because I'm 90% sure it's a bug. In your task classes, try the following:
class BaseCeleryTask(app.Task):

    def __init__(self):
        self.name = "[modulename].BaseCeleryTask"


class MacroReportTask(app.Task):

    def __init__(self):
        self.name = "[modulename].MacroReportTask"
It seems registering the class with the app still has a bug where the name isn't automatically configured. Let me know if that works.
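As a side note (not part of the original answer), newer Celery versions also let you register an instance of a task class explicitly via app.register_task, which avoids relying on automatic registration; a minimal sketch using the classes above:

# explicitly register an instance of the class-based task with the app
report_task = app.register_task(MacroReportTask())

# the returned task can then be called like any other registered task
report_task.delay(params)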
I have a task that looks like this:
from mybasetask_module import MyBaseTask

@task(base=MyBaseTask)
@my_custom_decorator
def my_task(*args, **kwargs):
    pass
and my base task looks like this
from celery import task, Task


class MyBaseTask(Task):
    abstract = True
    default_retry_delay = 10
    max_retries = 3
    acks_late = True
The problem I'm running into is that the celery worker registers the task under the name
'mybasetask_module.__inner'
The task is registered fine (under the package + module + function name) when I remove @my_custom_decorator from the task, or if I give the task an explicit name like this:
from mybasetask_module import MyBaseTask

@task(base=MyBaseTask, name='an_explicit_task_name')
@my_custom_decorator
def my_task(*args, **kwargs):
    pass
Is this behavior expected? Do I need to do something so that my tasks are registered with the default auto-generated name in the first case, when I have multiple decorators but no explicit task name?
Thanks,
Use the functools.wraps() decorator to ensure that the wrapper returned by my_custom_decorator has the correct name:
from functools import wraps

def my_custom_decorator(func):
    @wraps(func)
    def __inner(*args, **kwargs):
        return func(*args, **kwargs)
    return __inner
The task name is taken from the function that the task decorator wraps, but by inserting another decorator in between, you handed task your __inner wrapper function instead. The functools.wraps() decorator copies the necessary metadata (such as __name__ and __module__) from func over to the wrapper, so that task() can pick up the proper name.
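Putting it together, a short sketch of the expected outcome once the decorator uses wraps (module and decorator names as in the question):

from functools import wraps

from celery import task
from mybasetask_module import MyBaseTask

def my_custom_decorator(func):
    @wraps(func)
    def __inner(*args, **kwargs):
        return func(*args, **kwargs)
    return __inner

@task(base=MyBaseTask)
@my_custom_decorator
def my_task(*args, **kwargs):
    pass

print(my_task.name)  # now ends in '.my_task' instead of '.__inner'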
I'm creating a task (by subclassing celery.task.Task) that creates a connection to Twitter's streaming API. For the Twitter API calls I am using tweepy. As I've read in the celery documentation, 'a task is not instantiated for every request, but is registered in the task registry as a global instance.' I was expecting that whenever I call apply_async (or delay) for the task, I would be accessing the task that was originally instantiated, but that doesn't happen. Instead, a new instance of the custom task class is created. I need to be able to access the original custom task, since this is the only way I can terminate the original connection created by the tweepy API call.
Here's a piece of code, in case it helps:
from celery import registry
from celery.task import Task


class FollowAllTwitterIDs(Task):

    def __init__(self):
        # requirements for creation of the customstream
        # go here. The CustomStream class is a subclass
        # of the tweepy.streaming.Stream class.
        self._customstream = CustomStream(*args, **kwargs)

    @property
    def customstream(self):
        if self._customstream:
            # terminate the existing connection to Twitter
            self._customstream.running = False
        self._customstream = CustomStream(*args, **kwargs)
        return self._customstream

    def run(self):
        self._to_follow_ids = function_that_gets_list_of_ids_to_be_followed()
        self.customstream.filter(follow=self._to_follow_ids, async=False)


follow_all_twitterids = registry.tasks[FollowAllTwitterIDs.name]
And for the Django view
def connect_to_twitter(request):
    if request.method == 'POST':
        do_stuff_here()
        .
        .
        .
        follow_all_twitterids.apply_async(args=[], kwargs={})
    return
Any help would be appreciated. :D
EDIT:
For additional context: the CustomStream object creates an httplib.HTTPSConnection instance whenever the filter() method is called. This connection needs to be closed whenever there is another attempt to create one. The connection is closed by setting customstream.running to False.
The task should only be instantiated once. If you think it is not for some reason, I suggest you add
print("INSTANTIATE")
import traceback
traceback.print_stack()
to the Task.__init__ method, so you can tell where this is happening.
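For example (a debugging sketch, not part of the original answer), placed in the question's FollowAllTwitterIDs class:

class FollowAllTwitterIDs(Task):

    def __init__(self):
        # temporary debugging: show when and from where the task is instantiated
        print("INSTANTIATE")
        import traceback
        traceback.print_stack()
        # ... rest of __init__ as in the question ...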
I think your task could be better expressed like this:
from celery.task import Task, task


class TwitterTask(Task):
    _stream = None
    abstract = True

    def __call__(self, *args, **kwargs):
        try:
            return super(TwitterTask, self).__call__(*args, **kwargs)
        finally:
            if self._stream:
                self._stream.running = False

    @property
    def stream(self):
        if self._stream is None:
            self._stream = CustomStream()
        return self._stream


@task(base=TwitterTask)
def follow_all_ids():
    ids = get_list_of_ids_to_follow()
    follow_all_ids.stream.filter(follow=ids, async=False)
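With this structure, the stream property lazily creates a single CustomStream per worker process, and the __call__ override ensures the connection is stopped (running set to False) after every task invocation, which is what the customstream property in the question was trying to achieve.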