I am trying to create a CustomBaseOperator that inherits from the Airflow BaseOperator. The CustomBaseOperator works fine, but when I try to create a ChildOperator that inherits from the CustomBaseOperator, Airflow treats it as a CustomBaseOperator.
The CustomBaseOperator has an execute method that looks like this:
def make_request(self):
    print("Parent request")

def execute(self, context):
    self.make_request()
In the ChildOperator, I redefine make_request:
class ChildOperator(CustomBaseOperator):
    @apply_defaults
    def make_request(self):
        print("Child Request")
Whenever I run a task that uses ChildOperator, it prints "Parent request", and the UI legend shows it as a CustomBaseOperator...
My operators are in an "operators" folder in the "plugins" folder. I am guessing that I can only inherit from "official" operators when creating a custom operator in that folder.
Do you know how I could make the inheritance work?
This works:
import datetime

from airflow import DAG
from airflow.models import BaseOperator

class CustomBaseOperator(BaseOperator):
    def make_request(self):
        print("Parent request")

    def execute(self, context):
        self.make_request()

class ChildOperator(CustomBaseOperator):
    def make_request(self):
        print("Child Request")

with DAG(dag_id="test_dag", start_date=datetime.datetime(2022, 1, 1), schedule_interval=None) as dag:
    test = ChildOperator(task_id="test")
This will print Child Request and display ChildOperator in the Airflow UI. Also note that @apply_defaults is deprecated since Airflow 2.0; it is now applied automatically.
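For reference, an Airflow 2.x custom operator only needs to forward keyword arguments to super().__init__(); a minimal sketch (my_param is a hypothetical custom argument, not from the question):

class CustomBaseOperator(BaseOperator):
    def __init__(self, my_param="default", **kwargs):
        # no @apply_defaults needed in Airflow 2.x; defaults from
        # default_args are merged into **kwargs automatically
        super().__init__(**kwargs)
        self.my_param = my_param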
I found the issue: I was using a function that loops over locals() to set self.param = param for every param.
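For anyone hitting the same symptom: if such a helper assigns an instance attribute whose name matches a method, the instance attribute shadows the subclass override at lookup time. A minimal, hypothetical illustration of the mechanism (not the original helper):

class Parent:
    def greet(self):
        print("parent")

    def __init__(self):
        # an instance attribute named like a method wins over the
        # class hierarchy when the attribute is looked up
        self.greet = Parent.greet.__get__(self)

class Child(Parent):
    def greet(self):
        print("child")

Child().greet()  # prints "parent", not "child"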
I'm working on unit tests for a service I made that uses confluent-kafka. The goal is to test successful function calls, exception errors, etc. The problem I'm running into is that since I'm instantiating the client in the constructor of my service, the tests are failing, and I'm unsure how to patch a constructor. My question is how do I mock my service in order to properly test its functionality.
Example_Service.py:
from confluent_kafka.schema_registry import SchemaRegistryClient

class ExampleService:
    def __init__(self, config):
        self.service = SchemaRegistryClient(config)

    def get_schema(self):
        return self.service.get_schema()
Example_Service_tests.py:
from unittest import mock

@mock.patch.object(SchemaRegistryClient, "get_schema")
def test_get_schema_success(mock_client):
    schema_Id = ExampleService.get_schema()
    mock_service.assert_called()
The problem is that you aren't creating an instance of ExampleService; __init__ never gets called.
You can avoid patching anything by allowing your class to accept a client factory as an argument (which can default to SchemaRegistryClient):
class ExampleService:
    def __init__(self, config, *, client_factory=SchemaRegistryClient):
        self.service = client_factory(config)

    ...
Then in your test, you can simply pass an appropriate stub as an argument:
from unittest.mock import Mock

def test_get_schema_success():
    mock_client = Mock()
    service = ExampleService(some_config, client_factory=mock_client)
    mock_client.assert_called()
Two ways:

mock the entire class using @mock.patch(SchemaRegistryClient), OR
replace @mock.patch.object(SchemaRegistryClient, "get_schema") with

@mock.patch.object(SchemaRegistryClient, "__init__")
@mock.patch.object(SchemaRegistryClient, "get_schema")
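A sketch of how the stacked patches could look in a runnable test (the module name example_service is an assumption; __init__ has to be patched with return_value=None, and the mock arguments arrive bottom decorator first):

from unittest import mock

from confluent_kafka.schema_registry import SchemaRegistryClient

from example_service import ExampleService  # hypothetical module name

@mock.patch.object(SchemaRegistryClient, "__init__", return_value=None)
@mock.patch.object(SchemaRegistryClient, "get_schema")
def test_get_schema_success(mock_get_schema, mock_init):
    # __init__ is stubbed out, so no real client is configured
    service = ExampleService({"url": "http://localhost:8081"})
    service.get_schema()
    mock_get_schema.assert_called_once()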
I'm currently using pytest_addoption to run my API tests, so the tests should run against the environment the user chooses on the command line. In my test file, I'm trying to instantiate the UsersSupport class just once, passing the env argument. My code:
conftest.py
import pytest

# Environments
QA1 = 'https://qa1.company.com'
LOCALHOST = 'https://localhost'

def pytest_addoption(parser):
    parser.addoption(
        '--env',
        action='store',
        default='qa1'
    )

@pytest.fixture(scope='class')
def env(request):
    cmd_option = request.config.getoption('env')
    if cmd_option == 'qa1':
        chosen_env = QA1
    elif cmd_option == 'localhost':
        chosen_env = LOCALHOST
    else:
        raise UnboundLocalError('"--env" command line must use "qa1", "localhost"')
    return chosen_env
users_support.py
import requests

class UsersSupport:
    def __init__(self, env):
        self.env = env
        self.users_endpoint = '/api/v1/users'

    def create_user(self, payload):
        response = requests.post(
            url=f'{self.env}{self.users_endpoint}',
            json=payload,
        )
        return response
post_create_user_test.py
import pytest
from faker import Faker

from projects import UsersSupport
from projects import users_payload

class TestCreateUser:
    @pytest.fixture(autouse=True, scope='class')
    def setup_class(self, env):
        self.users_support = UsersSupport(env)
        self.fake = Faker()
        self.create_user_payload = users_payload.create_user_payload

    def test_create_user(self):
        created_user_res = self.users_support.create_user(
            payload=self.create_user_payload
        ).json()
        print(created_user_res)
The issue
When I run pytest projects/tests/post_create_user_test.py --env qa1, I get AttributeError: 'TestCreateUser' object has no attribute 'users_support'. If I remove the scope from the setup_class fixture, it runs before every test method instead of once for the whole class.
How can I use the env fixture in setup_class and instantiate UsersSupport once for use in all test methods?
If you use a fixture with class scope, the self parameter does not refer to the same instance the tests run on. You can, however, still access the class itself via self.__class__, so you can turn your instance variables into class variables.
Your code could look like this:
import pytest
from faker import Faker

from projects import UsersSupport
from projects import users_payload

class TestCreateUser:
    @pytest.fixture(autouse=True, scope='class')
    def setup_class(self, env):
        self.__class__.users_support = UsersSupport(env)
        self.__class__.fake = Faker()
        self.__class__.create_user_payload = users_payload.create_user_payload

    def test_create_user(self):
        created_user_res = self.users_support.create_user(
            payload=self.create_user_payload
        ).json()  # this now reads the class variable
        print(created_user_res)
During the test run, a new instance of the test class is created for each test.
If you have a default function-scoped fixture, it is called within the same instance as the current test, so the self arguments of the fixture and the test refer to the same instance.
In the case of a class-scoped fixture, the setup code runs on a separate instance before the test instances are created; that instance has to live until the end of all tests to be able to execute teardown code, so it is different from all test instances. As it is still an instance of the same test class, you can store your variables on the class itself in this case.
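A quick way to see this is to print the instance ids: the fixture and the tests live on different instances, which is why only class-level assignment survives (a self-contained sketch):

import pytest

class TestInstances:
    @pytest.fixture(autouse=True, scope='class')
    def setup(self):
        print(f'fixture runs on instance {id(self)}')
        self.__class__.shared = 'set once for the class'

    def test_one(self):
        print(f'test runs on instance {id(self)}')  # a different id
        assert self.shared == 'set once for the class'

    def test_two(self):
        assert self.shared == 'set once for the class'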
I'm trying to get a function to work in my Django project with celery beat; it imports a class-based function from a wrapper library. I've been reading that Celery doesn't work with classes too easily. My function login_mb doesn't take an argument, but when I register and call this task I get the error Couldn't apply scheduled task login_mb: login_mb() takes 0 positional arguments but 1 was given.
Is this because of self in the imported wrapper function?
What could I do to get this to work with celery beat?
settings.py
CELERY_BEAT_SCHEDULE = {
    'login_mb': {
        'task': 'backend.tasks.login_mb',
        'schedule': timedelta(minutes=30),
    },
}
tasks.py
import logging

from celery import shared_task

from matchbook.apiclient import APIClient

log = logging.getLogger(__name__)

@shared_task(bind=True)
def login_mb():
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()
apiclient.py (wrapper library)
from matchbook.baseclient import BaseClient
from matchbook import endpoints

class APIClient(BaseClient):
    def __init__(self, username, password=None):
        super(APIClient, self).__init__(username, password)
        self.login = endpoints.Login(self)
        self.keep_alive = endpoints.KeepAlive(self)
        self.logout = endpoints.Logout(self)
        self.betting = endpoints.Betting(self)
        self.account = endpoints.Account(self)
        self.market_data = endpoints.MarketData(self)
        self.reference_data = endpoints.ReferenceData(self)
        self.reporting = endpoints.Reporting(self)

    def __repr__(self):
        return '<APIClient [%s]>' % self.username

    def __str__(self):
        return 'APIClient'
The error is not related to your wrapper library; there is nothing wrong with your task itself.
The problem arises because you've defined your task with bind=True. When you do so, Celery automatically injects a parameter into the method containing information about the current task instance. So you can either remove bind=True, or add a parameter to your task method like so:
@shared_task(bind=True)
def login_mb(self):
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()
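Or, since the task body never uses the task instance, the first option is simply to drop bind=True (same placeholder credentials as in the question):

@shared_task
def login_mb():
    mb = APIClient('abc', '123')
    mb.login()
    mb.keep_alive()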
I'm trying to create some celery tasks as classes, but am having some difficulty. The classes are:
class BaseCeleryTask(app.Task):
    def is_complete(self):
        """Default method for checking if the celery task has completed."""
        # simply return result (since by default tasks return a boolean indicating completion)
        try:
            return self.result
        except AttributeError:
            logger.error('Result not defined. Make sure task has run!')
            return False

class MacroReportTask(BaseCeleryTask):
    def run(self, params):
        """Override the default run method with the signal factory run."""
        # hold on to the factory
        process = MacroCountryReport(params)
        self.result = process.run()
        return self.result
but when I initialize the app and check app.tasks (or run a worker), the app doesn't seem to have these tasks in its registry. Other function-based tasks (using the app.task() decorator) are registered fine.
I run the above task as:
process = SignalFactoryTask()
process.delay(params)
Celery worker errors with the following message:
Received unregistered task of type None.
I think the issue I'm having is: how do I add custom classes to the task registry as I do with regular function based tasks?
Ran into the exact same issue, took hours to find the solution because I'm 90% sure it's a bug. In your class tasks, try the following:
class BaseCeleryTask(app.Task):
    def __init__(self):
        self.name = "[modulename].BaseCeleryTask"

class MacroReportTask(app.Task):
    def __init__(self):
        self.name = "[modulename].MacroReportTask"
It seems registering it with the app still has a bug where the name isn't automatically configured. Let me know if that works.
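If you'd rather keep the class definitions untouched, another option (an assumption on my part: Celery 4+, where class-based tasks are no longer auto-registered) is to register an instance explicitly with the app and call the returned task:

# register the instance once, e.g. in the module that defines the tasks
macro_report_task = app.register_task(MacroReportTask())

# then call the registered instance rather than a fresh one
result = macro_report_task.delay(params)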
I have a task that looks like this:
from mybasetask_module import MyBaseTask

@task(base=MyBaseTask)
@my_custom_decorator
def my_task(*args, **kwargs):
    pass
and my base task looks like this:
from celery import task, Task

class MyBaseTask(Task):
    abstract = True
    default_retry_delay = 10
    max_retries = 3
    acks_late = True
The problem I'm running into is that the Celery worker registers the task with the name
'mybasetask_module.__inner'
The task is registered fine with its default name (which is package + module + function) when I remove @my_custom_decorator from the task, or if I provide an explicit name like this:
from mybasetask_module import MyBaseTask

@task(base=MyBaseTask, name='an_explicit_task_name')
@my_custom_decorator
def my_task(*args, **kwargs):
    pass
Is this behavior expected? Do I need to do something so that my tasks are registered with the default auto-generated name in the first case, when I have multiple decorators but no explicit task name?
Thanks,
Use the functools.wraps() decorator to ensure that the wrapper returned by my_custom_decorator has the correct name:
from functools import wraps

def my_custom_decorator(func):
    @wraps(func)
    def __inner(*args, **kwargs):
        return func(*args, **kwargs)
    return __inner
The task name is taken from the function that the task decorator wraps, but by inserting a decorator in between, you handed task() your __inner wrapper instead. The functools.wraps() decorator copies all the necessary metadata over from func to the wrapper so that task() can pick up the proper name.
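You can verify the effect without Celery at all; with @wraps the wrapper reports the wrapped function's metadata (a minimal check):

from functools import wraps

def my_custom_decorator(func):
    @wraps(func)
    def __inner(*args, **kwargs):
        return func(*args, **kwargs)
    return __inner

@my_custom_decorator
def my_task():
    pass

print(my_task.__name__)  # 'my_task'; without @wraps it would be '__inner'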