Airflow SimpleHttpOperator

Hi, I am seeing strange behavior from SimpleHttpOperator.
I have extended the operator like this:

    class EPOHttpOperator(SimpleHttpOperator):
        """
        Operator for retrieving data from EPO API, performs token validity check,
        gets a new one, if old one close to not valid.
        """

        @apply_defaults
        def __init__(self, entity_code, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.entity_code = entity_code
            self.endpoint = self.endpoint + self.entity_code

        def execute(self, context):
            try:
                token_data = json.loads(Variable.get(key="access_token_data", deserialize_json=False))
                if (datetime.now() - datetime.strptime(token_data["created_at"],
                                                       '%Y-%m-%d %H:%M:%S.%f')).seconds >= 19 * 60:
                    Variable.set(value=json.dumps(get_EPO_access_token(), default=str), key="access_token_data")
                self.headers = {
                    "Authorization": f"Bearer {token_data['token']}",
                    "Accept": "application/json"
                }
                super(EPOHttpOperator, self).execute(context)
            except HTTPError as http_err:
                logging.error(f'HTTP error occurred during getting EPO data: {http_err}')
                raise http_err
            except Exception as e:
                logging.error(e)
                raise e

And I have written a simple unit test:

    def test_get_EPO_data(requests_mock):
        requests_mock.get('http://ops.epo.org/rest-services/published-data/publication/epodoc/EP1522668',
                          text='{"text": "test"}')
        requests_mock.post('https://ops.epo.org/3.2/auth/accesstoken',
                           text='{"access_token":"test", "status": "we just testing"}')
        dag = DAG(dag_id='test_data', start_date=datetime.now())
        task = EPOHttpOperator(
            xcom_push=True,
            do_xcom_push=True,
            http_conn_id='http_EPO',
            endpoint='published-data/publication/epodoc/',
            entity_code='EP1522668',
            method='GET',
            task_id='get_data_task',
            dag=dag,
        )
        ti = TaskInstance(task=task, execution_date=datetime.now(), )
        task.execute(ti.get_template_context())
        assert ti.xcom_pull(task_ids='get_data_task') == {"text": "test"}

The test doesn't pass, though: the response value is never pushed as an XCom. I have checked that the code responsible for the push logic in the hook class gets called:

    ....
    if self.response_check:
        if not self.response_check(response):
            raise AirflowException("Response check returned False.")
    if self.xcom_push_flag:
        return response.text

What did I do wrong? Is this a bug?

I managed to make it work by explicitly storing the result of super(EPOHttpOperator, self).execute(context) instead of discarding it:

    def execute(self, context):
        try:
            .
            .
            .
            self.headers = {
                "Authorization": f"Bearer {token_data['token']}",
                "Accept": "application/json"
            }
            # before: super(EPOHttpOperator, self).execute(context)
            # after:
            Variable.set(value=super(EPOHttpOperator, self).execute(context), key='foo')

The documentation is somewhat misleading on this point; or am I doing something wrong after all?
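
A likely explanation, sketched without Airflow so that it runs anywhere: the task runner pushes whatever execute() returns as the XCom, and the overridden execute() in the question calls the parent but drops its return value, so the method returns None. The test also calls task.execute() directly, which bypasses the runner's push step. The names below (BaseOp, DroppingOp, ReturningOp, run_task) are hypothetical stand-ins for illustration, not Airflow APIs.

```python
class BaseOp:
    """Hypothetical stand-in for an operator whose execute() returns a payload."""

    def execute(self, context):
        return '{"text": "test"}'


class DroppingOp(BaseOp):
    """Mirrors the question: the parent's result is computed, then discarded."""

    def execute(self, context):
        super().execute(context)  # no return, so this method returns None


class ReturningOp(BaseOp):
    """Passes the parent's result through to the caller."""

    def execute(self, context):
        return super().execute(context)


def run_task(op, xcom):
    """Assumed runner behavior: store a non-None return value under a key."""
    result = op.execute(context={})
    if result is not None:
        xcom["return_value"] = result
    return xcom
```

With this model, run_task(DroppingOp(), {}) leaves the store empty, while run_task(ReturningOp(), {}) stores the response text. Adding `return` in front of the super().execute(context) call may be all the subclass needs.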

Related

FastAPI handling camelCase and PascalCase same time

How can I convert both PascalCase and camelCase request bodies to snake_case at the same time in a FastAPI app?
I tried to use a middleware and a custom route handler to convert camelCase to PascalCase, but it does not work well.

    class CustomRouteHandler(APIRoute):
        def get_route_handler(self) -> Callable:
            original_route_handler = super().get_route_handler()

            async def custom_route_handler(request: Request) -> Response:
                route_path = request.url.path
                body = await request.body()
                logger.info({"path": request.url.path, "request": request._body.decode("utf-8")})
                if body:
                    body = ujson.dumps(humps.pascalize(ujson.loads(body.decode("utf-8")))).encode("ascii")
                    request._body = body
                try:
                    return await original_route_handler(request)
                except ValidationError as e:
                    logger.exception(e, exc_info=True)
                    return UJSONResponse(status_code=200, content={"Success": False, "Message": e})

            return custom_route_handler

    router = APIRouter(prefix="/payments", route_class=CustomRouteHandler)

When I log this code, everything looks fine, but it still returns a ValidationError:

    request body: {"test": 12345}
    logger after pascalize: {"Test": 12345}
    ERROR: 1 validation error for Request\nbody -> Test
    none is not an allowed value (type=type_error.none.not_allowed)

First of all, note that the request body is not converted between naming conventions by default: it is validated against your model's field names as sent. So, if you want to handle both camelCase and PascalCase, you need to handle both cases explicitly. The easiest way is to add two routes, one for each case. For example:

    class CustomRouteHandler(APIRoute):
        def get_route_handler(self) -> Callable:
            original_route_handler = super().get_route_handler()

            async def custom_route_handler(request: Request) -> Response:
                route_path = request.url.path
                body = await request.body()
                logger.info({"Path": request.url.path, "Request": request._body.decode("utf-8")})
                if body:
                    body = ujson.dumps(humps.pascalize(ujson.loads(body.decode("utf-8")))).encode("ascii")
                    request._body = body
                try:
                    return await original_route_handler(request)
                except ValidationError as e:
                    logger.exception(e, exc_info=True)
                    return UJSONResponse(status_code=200, content={"Success": False, "Message": str(e)})

            return custom_route_handler

    router = APIRouter(prefix="/payments", route_class=CustomRouteHandler)

Then you can create a middleware which handles both cases (RouterMiddleware is not a FastAPI class as far as I know, so read this line as pseudocode):

    middleware = RouterMiddleware(prefix="/payments", route_class="custom_route_handler")

And then you can use your custom routing handler in your main route (again pseudocode; Routes.new is not a FastAPI call I can confirm):

    main_routes = Routes.new(Router())

The official FastAPI documentation has more examples of custom route classes.
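
Since the stated goal is snake_case, a simpler route may be to normalize the keys directly rather than converting one convention into the other. The sketch below is framework-independent and uses only the standard library; to_snake and normalize_keys are hypothetical helper names, and wiring them into a custom route class or a model validator is left as an assumption.

```python
import re

# Two passes: split before "Xyz" runs, then between a lowercase/digit and a capital.
_FIRST_PASS = re.compile(r"(.)([A-Z][a-z]+)")
_SECOND_PASS = re.compile(r"([a-z0-9])([A-Z])")


def to_snake(name: str) -> str:
    """Convert a camelCase or PascalCase identifier to snake_case."""
    partial = _FIRST_PASS.sub(r"\1_\2", name)
    return _SECOND_PASS.sub(r"\1_\2", partial).lower()


def normalize_keys(value):
    """Recursively rename dict keys to snake_case; other values pass through."""
    if isinstance(value, dict):
        return {to_snake(key): normalize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value
```

Both {"Test": 12345} and {"test": 12345} normalize to {"test": 12345}, so a single snake_case model can validate either form.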

Reuse methods from locust in different tests

I currently have this locust test:

    import logging
    from locust import HttpUser, TaskSet, task, constant

    log = logging.getLogger("rest-api-performance-test")

    def get_headers():
        headers = {
            "accept": "application/json",
            "content-type": "application/json",
        }
        return headers

    def headers_token(auth_token):
        headers = {
            "accept": "application/json",
            "content-type": "application/json",
            "auth-token": str(auth_token),
        }
        return headers

    class LoginTasks(TaskSet):
        def post_login(self):
            headers = get_headers()
            login_response = self.client.post("/api/auth",
                                              json={"key": "cstrong@tconsumer.com", "secret": "cstrong"},
                                              headers=headers, catch_response=False, name="Login")
            login_data = login_response.json()
            auth_token_value = login_data["results"][0]["auth-token"]
            return auth_token_value

    class UserTasks(LoginTasks):
        @task
        def test_get_list_of_workers(self):
            auth_token = self.post_login()
            try:
                with self.client.get("/api/client/workers",
                                     headers=headers_token(auth_token), catch_response=True,
                                     name="Get Worker Lists") as request_response:
                    if request_response.status_code == 200:
                        assert (
                            '"label": "Salena Mead S"' in request_response.text
                        )
                        request_response.success()
                        log.info("API call resulted in success.")
                    else:
                        request_response.failure(request_response.text)
                        log.error("API call resulted in failed.")
            except Exception as e:
                log.error(f"Exception occurred! details are {e}")

    class WebsiteUser(HttpUser):
        host = "https://test.com"
        wait_time = constant(1)
        tasks = [UserTasks]

The test runs as expected, but post_login is required by multiple tests, since it generates the authentication token used by most of the APIs I'm testing. Is there a way to avoid inheriting from LoginTasks and find a better solution? I want to avoid it because post_login is not the only method that will be reused many times, and I don't want to use multiple inheritance in my UserTasks class.
Any help is appreciated.

Move the function out of the class and pass in the client you want it to use.

    def post_login(client):
        headers = get_headers()
        login_response = client.post("/api/auth",
        …

You can then call it when you need it, the same way you call get_headers().

    auth_token = post_login(self.client)
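
A minimal sketch of that idea, runnable without locust: the helper only needs an object with a post() method, so it can be shared by any number of task sets and exercised with a stub. FakeClient and FakeResponse are hypothetical test doubles, and passing key and secret as parameters is an assumption added for illustration.

```python
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class FakeClient:
    """Hypothetical stand-in for self.client; records each POST it receives."""

    def __init__(self):
        self.calls = []

    def post(self, path, json=None, headers=None, **kwargs):
        self.calls.append((path, json, headers))
        return FakeResponse({"results": [{"auth-token": "abc123"}]})


def get_headers():
    return {"accept": "application/json", "content-type": "application/json"}


def post_login(client, key, secret):
    """Module-level helper: log in with the given client and return the token."""
    response = client.post("/api/auth", json={"key": key, "secret": secret},
                           headers=get_headers(), name="Login")
    return response.json()["results"][0]["auth-token"]
```

Inside a task the call stays a one-liner, for example post_login(self.client, key, secret), and no task set has to inherit from a login class.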

Mocking variable and replacing it with object

I'd like to test this piece of code:

    modify: UserModifyPort = _ports_.user_modify_port

    @_app_.route(f"/user", methods=["POST"])
    @headers_check({"Accept": "application/json", "Content-Type": "application/json"})
    def create_user():
        body_json = request.get_json()
        body = UserCreateRequest(body_json["username"], body_json["password"])
        cmd = UserCreateCmd(body.username, body.password)
        # modify usage
        user_id = modify.create_user(cmd)
        response = UserCreateResponse(user_id)
        return response.to_dict(), 201

In this test I need to mock the global variable modify and replace it with an object. I've been trying to do it like this:

    # TEST
    @mock.patch("application.user.user_rest_adapter.modify")
    def test_create_user_should_create(modify_mock, db_engine, client, user_config):
        modify_mock.return_value = DatabaseUserModifyAdapter(db_engine, user_config)
        response = client.post("/user", headers={"Accept": "application/json", "Content-Type": "application/json"},
                               json={"username": "GALJO", "password": "qwerty123"})

But it doesn't execute the real modify.create_user() function; it just returns a mock object:

    <MagicMock name='modify.create_user()' id='140375141136512'>

How can I make this function work?

I solved this issue with a workaround. Instead of mocking the entire object, I mocked only the function that I use. There is no need to run the real function, because it is covered by other tests, so I replaced its result with a constant value. I only check that the arguments passed to it are correct; everything else is the job of other tests.

    @mock.patch("application.user.user_rest_adapter.modify.create_user")
    def test_create_user_should_create(create_user_mock, client):
        # given
        user_id = "a20d7a48-7235-489b-8552-5a081d069078"
        create_user_mock.return_value = UUID(user_id)
        # when
        response = client.post("/user", headers={"Accept": "application/json", "Content-Type": "application/json"},
                               json={"username": "GALJO", "password": "qwerty123"})
        # then
        args = create_user_mock.call_args.args
        assert args[0].username == "GALJO"
        assert args[0].password == "qwerty123"
        assert response.json["userID"] == user_id
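
For the original goal of swapping in a real object, the likely cause of the MagicMock result is that return_value only controls what happens when the mock itself is called, not what its attributes return. Passing the replacement through the `new` argument of mock.patch or mock.patch.object installs the object directly. The sketch below uses a SimpleNamespace as a stand-in for the module holding the global; FakeModifyAdapter and adapter_module are hypothetical names.

```python
from types import SimpleNamespace
from unittest import mock


class FakeModifyAdapter:
    """Hypothetical adapter standing in for DatabaseUserModifyAdapter."""

    def create_user(self, username):
        return f"id-for-{username}"


# Stands in for the module that holds the global `modify`.
adapter_module = SimpleNamespace(modify=None)


def create_user(username):
    # Looks the global up at call time, as the view function does.
    return adapter_module.modify.create_user(username)


def call_with_return_value():
    """Reproduces the question: the attribute call yields another MagicMock."""
    with mock.patch.object(adapter_module, "modify") as modify_mock:
        modify_mock.return_value = FakeModifyAdapter()
        return create_user("GALJO")


def call_with_new_object():
    """Installs the real object in place of the global for the duration."""
    with mock.patch.object(adapter_module, "modify", new=FakeModifyAdapter()):
        return create_user("GALJO")
```

Note that when `new` is given to the decorator form, no extra mock argument is passed to the test function.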

pytest will not raise exception

I cannot get the exception to be raised in this unit test.

    def test_something(monkeypatch):
        # Arrange
        os.environ["ENTITY"] = "JURKAT"  # should monkeypatch this, but ignore for now.
        opgave_or_sag_id = "S7777777"
        url = "https://testdriven.io"
        auth = ("test", "test")
        tx = 999999999
        patch_called = False

        def mock_get(*args, **kwargs):
            nonlocal patch_called
            patch_called = True
            return MockResponseGet410()

        monkeypatch.setattr(Session, "get", mock_get)
        # Act
        with pytest.raises(Exception) as exc_info:
            dsu.fetch_entity_from_green(opgave_or_sag_id, url, auth, tx)
        # Assert
        assert patch_called
        assert exc_info.value.args[0] == "Expected status code 200 (or 410 with invalid opgaver), but got 410 for entity JURKAT. Error: Invalid opgave - It's gone - https://testdriven.io"

    class MockResponseGet410:
        def __init__(self):
            self.status_code = 410
            self.text = "the opgave has status: INVALID, it's gone now."
            self.reason = "It's gone"
            self.url = "https://testdriven.io"
            self.headers = {
                "Content-Type": "application/json",
                "Content-Length": "123",
                "Content-Location": "tx=123",
            }

    # From dsu
    def fetch_entity_from_green(opgave_or_sag_id, url, auth, tx):
        """Retrieve missing entity from green's api.

        Parameters
        ----------
        opgave_or_sag_id : string, required
        tx : int, required
            A tx number.
        url : str, required
        auth : AWSAuthObject, required

        Returns
        -------
        entity : dict
            A dict representing the entity from green.
        status_code : int
        """
        try:
            ENTITY = os.environ["ENTITY"]
            url_with_id = url + str(opgave_or_sag_id)
            s = fetch_request_session_with_retries()
            r = s.get(url_with_id, auth=auth)
            handle_non_200_response_for_invalid_opgaver(r, opgave_or_sag_id, ENTITY)
            # I deleted the rest, not relevant in this test as the above function should throw an exception.
        except Exception as e:
            print(f"An exception occured on request with id {opgave_or_sag_id}: {e}")

The exception should be raised in handle_non_200_response_for_invalid_opgaver, because mock_get returns a 410 status code and ENTITY is set to JURKAT:

    def handle_non_200_response_for_invalid_opgaver(request, opgave_or_sag_id, ENTITY):
        """
        Handles a non-200 response from the API but allows 410 responses on invalid opgaver.
        """
        # 410 because Team Green returns this for invalid opgaver, which becomes a valid response.
        if request.status_code != 200 and (
            request.status_code == 410 and ENTITY != "OPGAVER"
        ):
            print(f"Status code for request on {opgave_or_sag_id}: {request.status_code}")
            raise Exception(  # TODO be more explicit with exception.
                f"Expected status code 200 (or 410 with invalid opgaver), but got {request.status_code} for entity {ENTITY}. Error: {request.text} - {request.reason} - {request.url}"
            )

I can get an exception to be raised using pytest.raises(Exception) in a different test (see below), and that test passes, so I'm on the right track:

    def test_handle_non_200_response():
        # Arrange
        r = MockResponse()
        # Act
        with pytest.raises(Exception) as exc_info:
            handle_non_200_response(r)
        # Assert
        assert (
            exc_info.value.args[0]
            == "Expected status code 200, but got 504. Error: Gateway Timeout - Exceeded 30 seconds - https://testdriven.io"
        )

    class MockResponse:
        def __init__(self):
            self.status_code = 504
            self.text = "Gateway Timeout"
            self.reason = "Exceeded 30 seconds"
            self.url = "https://testdriven.io"

        def json(self):
            return {"id": 1}

    def handle_non_200_response(request):
        """
        Handles a non-200 response from the API.
        """
        if request.status_code != 200:
            print(f"Status code for request on {id}: {request.status_code}")
            raise Exception(
                f"Expected status code 200, but got {request.status_code}. Error: {request.text} - {request.reason} - {request.url}"
            )

Can you see where I have gone astray?
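
One thing worth checking: fetch_entity_from_green wraps the whole call in try/except Exception and only prints, so anything raised inside it never reaches pytest.raises. The passing test calls the handler directly, with no such wrapper in between. The sketch below reproduces the difference with hypothetical helper names (check_status, fetch_swallowing, fetch_reraising, raises) and no pytest dependency.

```python
def check_status(status_code):
    """Raise for any non-200 status, like the handler functions above."""
    if status_code != 200:
        raise Exception(f"Expected status code 200, but got {status_code}")


def fetch_swallowing(status_code):
    """Mirrors fetch_entity_from_green: the exception is printed and lost."""
    try:
        check_status(status_code)
    except Exception as e:
        print(f"An exception occurred: {e}")


def fetch_reraising(status_code):
    """Logs the exception, then lets it propagate to the caller."""
    try:
        check_status(status_code)
    except Exception as e:
        print(f"An exception occurred: {e}")
        raise


def raises(func, *args):
    """Tiny stand-in for pytest.raises: True if calling func raises."""
    try:
        func(*args)
    except Exception:
        return True
    return False
```

Here raises(fetch_swallowing, 410) is False and raises(fetch_reraising, 410) is True. A bare `raise` at the end of the except block (or removing the broad try/except) would let the test see the exception. Separately, the expected message in the test says "Invalid opgave", while the mock's text is "the opgave has status: INVALID, it's gone now.", so the final assertion would likely still need adjusting.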

Testing of async tornado RequestHandler method in a complex environment

I am trying to write unit tests for a subclass of tornado.web.RequestHandler that runs an aggregate query against the database. I have already spent several days trying to get the tests to work.
The tests use pytest and factoryboy. Many of the important tornado classes have factories for the tests.
This is the class that is being tested:

    class AggregateRequestHandler(StreamlyneRequestHandler):
        '''
        '''
        SUPPORTED_METHODS = (
            "GET", "POST", "OPTIONS")

        def get(self):
            self.aggregate()

        @auth.hmac_auth
        #@tornado.web.asynchronous
        @tornado.web.removeslash
        @tornado.gen.coroutine
        def aggregate(self):
            '''
            '''
            self.logger.info('api aggregate')
            data = self.data
            print("Data: {0}".format(data))
            pipeline = data['pipeline']
            self.logger.debug('pipeline : {0}'.format(pipeline))
            self.logger.debug('utc tz : {0}'.format(tz_util.utc))
            # execute pipeline query
            print(self.collection)
            try:
                cursor_future = self.collection.aggregate(pipeline, cursor={})
                print(cursor_future)
                cursor = yield cursor_future
                print("Cursor: {0}".format(cursor))
            except Exception as e:
                print(e)
            documents = yield cursor.to_list(length=None)
            self.logger.debug('results : {0}'.format(documents))
            # process MongoDB JSON extended
            results = json.loads(json_util.dumps(documents))
            pipeline = json.loads(json_util.dumps(pipeline))
            response_data = {
                'pipeline': pipeline,
                'results': results
            }
            self.respond(response_data)

The method used to test it is here:

    #@tornado.testing.gen_test
    def test_time_inside(self):
        current_time = gen_time()
        past_time = gen_time() - datetime.timedelta(minutes=20)
        test_query = copy.deepcopy(QUERY)
        oid = ObjectId("53a72de12fb05c0788545ed6")
        test_query[0]['$match']['attribute'] = oid
        test_query[0]['$match']['date_created']['$gte'] = past_time
        test_query[0]['$match']['date_created']['$lte'] = current_time
        request = produce.HTTPRequest(
            method="GET",
            headers=produce.HTTPHeaders(
                kwargs = {
                    "Content-Type": "application/json",
                    "Accept": "application/json",
                    "X-Sl-Organization": "test",
                    "Hmac": "83275edec557e2a339e0ec624201db604645e1e1",
                    "X-Sl-Username": "test@test.co",
                    "X-Sl-Expires": 1602011725
                }
            ),
            uri="/api/v1/attribute-data/aggregate?{0}".format(json_util.dumps({
                "pipeline": test_query
            }))
        )
        self.ARH = produce.AggregateRequestHandler(request=request)
        #io_loop = tornado.ioloop.IOLoop.instance()
        self.io_loop.run_sync(self.ARH.get)
        #def stop_test():
            #self.stop()
        #self.ARH.test_get(stop_test)
        #self.wait()
        output = self.ARH.get_written_output()
        assert output == ""

This is how I set up the factory for the request handler:

    class OutputTestAggregateRequestHandler(slapi.rest.AggregateRequestHandler, tornado.testing.AsyncTestCase):
        '''
        '''
        _written_output = []

        def write(self, chunk):
            print("Previously written: {0}".format(self._written_output))
            print("Len: {0}".format(len(self._written_output)))
            if self._finished:
                raise RuntimeError("Cannot write() after finish(). May be caused "
                                   "by using async operations without the "
                                   "@asynchronous decorator.")
            if isinstance(chunk, dict):
                print("Going to encode a chunk")
                chunk = escape.json_encode(chunk)
                self.set_header("Content-Type", "application/json; charset=UTF-8")
            chunk = escape.utf8(chunk)
            print("Writing")
            self._written_output = []
            self._written_output.append(chunk)
            print(chunk)

        def flush(self, include_footers=False, callback=None):
            pass

        def get_written_output(self):
            for_return = self._written_output
            self._written_output = []
            return for_return

    class AggregateRequestHandler(StreamlyneRequestHandler):
        '''
        '''
        class Meta:
            model = OutputTestAggregateRequestHandler

        model = slapi.model.AttributeDatum

When running the tests, the test simply stops in def aggregate(self): somewhere between print(cursor_future) and print("Cursor: {0}".format(cursor)).
In stdout you see

    MotorCollection(Collection(Database(MongoClient([]), u'test'), u'attribute_datum'))
    <tornado.concurrent.Future object at 0x7fbc737993d0>

and nothing else comes out of the test, which then fails on

    > assert output == ""
    E AssertionError: assert [] == ''

After a lot of time spent on documentation, examples and Stack Overflow, I managed to get a functioning test by adding the following code to OutputTestAggregateRequestHandler:

    def set_io_loop(self):
        self.io_loop = tornado.ioloop.IOLoop.instance()

    def ioloop(f):
        @functools.wraps(f)
        def wrapper(self, *args, **kwargs):
            print(args)
            self.set_io_loop()
            return f(self, *args, **kwargs)
        return wrapper

    def runTest(self):
        pass

I then copied all of the code from AggregateRequestHandler.aggregate into OutputTestAggregateRequestHandler, but with different decorators:

    @ioloop
    @tornado.testing.gen_test
    def _aggregate(self):
        ......

I then received the output:

    assert output == ""
    E AssertionError: assert ['{\n "pipeline": [\n {\n "$match": {\n "attribute": {\n "$oid"... "$oid": "53cec0e72dc9832c4c4185f2"\n }, \n "quality": 9001\n }\n ]\n}'] == ''

which is actually a success; I was just triggering an assertion error on purpose to see the output.
My main problem is how to achieve this outcome, that is, the output I got by adding the extra code and copying the aggregate method, without the copy.
Obviously, once the code is copied out of the aggregate method, the test stops being useful as soon as I change the actual method. How can I get the actual aggregate method to work properly in the tests, instead of seemingly stopping when it encounters asynchronous code?
Thanks for any help,
Cheers!
-Liam

In general, the intended way to test RequestHandlers is with AsyncHTTPTestCase, not AsyncTestCase. This will set up the HTTP client and server for you and everything will go through the HTTP plumbing. Using RequestHandlers outside of an Application and HTTP server is not fully supported, although in Tornado 4.0 it might be feasible to use a dummy HTTPConnection to avoid the full server stack. This might be faster, although it's kind of uncharted territory at this point.
