Mocking code run inside an rq SimpleWorker

I have code which uses Python requests to kick off a task which runs in a worker that is started with rq. (Actually, the GET request results in one task which itself starts a second task. But this complexity shouldn't affect things, so I've left that out of the code below.) I already have a test which uses rq's SimpleWorker class to cause the code to run synchronously. This works fine. But now I'm adding requests_ratelimiter to the second task, and I want to be sure it's behaving correctly. I think I need to somehow mock the time.sleep() function used by the rate limiter, and I can't figure out how to patch it.
routes.py
@app.route("/do_work/", methods=["POST"])
def do_work():
    rq_job = my_queue.enqueue("my_app.worker.do_work", job_timeout=3600, *args, **kwargs)
worker.py
from requests_ratelimiter import LimiterSession
@job('my_queue', connection=redis_conn, timeout=3600, result_ttl=24 * 60 * 60)
def do_work():
    session = LimiterSession(per_second=1)
    r = session.get(WORK_URL)
test.py
import requests_mock
def test_get(client):
    # call the Flask endpoint to kick off the task
    client.get("/do_work/")
    with requests_mock.Mocker() as m:
        # mock the return value of the requests.get() call in the worker
        response_success = {"result": "All good"}
        m.get(WORK_URL, json=response_success)
        worker = SimpleWorker([my_queue], connection=redis_conn)
        worker.work(burst=True)  # Work until the queue is empty
A test in requests_ratelimiter patches the sleep function using a target path of 'pyrate_limiter.limit_context_decorator.sleep', but that doesn't work for me because I don't import pyrate_limiter at all. I've tried mocking the time function and then passing that into the LimiterSession, and that sort of works:
worker.py
from requests_ratelimiter import LimiterSession
from time import time
@job('my_queue', connection=redis_conn, timeout=3600, result_ttl=24 * 60 * 60)
def do_work():
    session = LimiterSession(per_second=1, time_function=time)
    r = session.get(WORK_URL)
test.py
import requests_mock
def test_get(client):
    # call the Flask endpoint to kick off the task
    client.get("/do_work/")
    with patch("my_app.worker.time", return_value=None) as mock_time:
        with requests_mock.Mocker() as m:
            response_success = {"result": "All good"}
            m.get(WORK_URL, json=response_success)
            worker = SimpleWorker([my_queue], connection=redis_conn)
            worker.work(burst=True)  # Work until the queue is empty
    assert mock_time.call_count == 1
However, I then see time called many more times than sleep would be, so I don't get the info I need from it. And patching my_app.worker.time.sleep results in an error (which makes sense in hindsight: from time import time binds the time function in my module, not the time module, so it has no sleep attribute):
AttributeError: does not have the attribute 'sleep'
I have also tried patching the pyrate_limiter as the requests_ratelimiter testing code does:
with patch(
    "my_app.worker.requests_ratelimiter.pyrate_limiter.limit_context_decorator.sleep",
    return_value=None,
) as mock_sleep:
But this fails with:
ModuleNotFoundError: No module named 'my_app.worker.requests_ratelimiter'; 'my_app.worker' is not a package
How can I test and make sure the rate limiter is engaging properly?

The solution was indeed to use 'pyrate_limiter.limit_context_decorator.sleep', despite the fact that I wasn't importing it.
When I did that and made the mock return None, I discovered that sleep() was being called tens of thousands of times because it's in a while loop.
So in the end, I also needed to use freezegun and a side_effect on my mock_sleep to get the behavior I wanted. Now time is frozen, but sleep() jumps the test clock forward synchronously and instantly by the number of seconds passed as an argument.
from datetime import timedelta
from unittest.mock import patch

import requests_mock
from freezegun import freeze_time
from rq import SimpleWorker

def test_get(client):
    with patch("pyrate_limiter.limit_context_decorator.sleep") as mock_sleep:
        with freeze_time() as frozen_time:
            # Make sleep operate on the frozen time
            # See: https://github.com/spulec/freezegun/issues/47#issuecomment-324442679
            mock_sleep.side_effect = lambda seconds: frozen_time.tick(timedelta(seconds=seconds))
            with requests_mock.Mocker() as m:
                m.get(WORK_URL, json=response_success)
                worker = SimpleWorker([my_queue], connection=redis_conn)
                worker.work(burst=True)  # Work until the queue is empty
            # The worker will do enough to get rate limited once
            assert mock_sleep.call_count == 1
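As a follow-up check, the patched mock also records what it was called with, so you can sanity-check the sleep duration (a sketch; call_args.args needs Python 3.8+, and the one-second bound assumes the per_second=1 session above):

mock_sleep.assert_called_once()
(slept_seconds,) = mock_sleep.call_args.args
# with per_second=1, a single back-off should not exceed one second
assert 0 < slept_seconds <= 1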

Related

pytest skip portion of function

I have a function that does some processing and, if it fails, exits using sys.exit(myfun()). But when I test it with pytest, I don't want the myfun() inside sys.exit() to execute. Is there any way in pytest to skip myfun()?
mypyfile.py
import sys

def process():
    # do some logic
    if failed:
        sys.exit(myfun())  # this I don't want to execute when testing via pytest

def myfun():
    print("failed")
test_mypyfile.py
import mypyfile

def test_process():
    mypyfile.process()
You could mock myfun in your test. This way the function is not called in the test process. I'll change your example a bit so that it makes more sense to mock (and you'll be able to see the difference):
# mypyfile.py
import sys
from time import sleep

def process():
    # do some logic
    if failed:
        sys.exit(myfun())

def myfun():
    sleep(4000)

# test_mypyfile.py
from unittest import mock

import mypyfile

def test_process():
    with mock.patch('mypyfile.myfun'):
        mypyfile.process()
Here myfun is still called, but only as a mock, so you won't wait the 4000 seconds.
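For completeness, the same patch can be applied in decorator form, which starts and stops itself around the test automatically (a sketch reusing the module names above):

from unittest import mock

import mypyfile

@mock.patch('mypyfile.myfun')
def test_process(mock_myfun):
    mypyfile.process()
    # the real myfun body (the 4000 s sleep) never runs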

Python Patch not resetting between tests

I have a unit test class that tests the same method twice - once with happy path and once with a failure. If I run both tests individually then they pass, but if I run them together then the patch return_value from the first test is also applied to the second, and so one test will fail. What am I missing here?
import unittest
from unittest import mock
from unittest.mock import Mock
class MainTest(unittest.TestCase):
    def test_happy_path(self):
        with mock.patch('google.cloud.bigquery.Client') as bq_patch:
            bq_patch().insert_rows_json.return_value = None
            import main
            data = {'trigger': 'testval'}
            req = Mock(get_json=Mock(return_value=data), args=data)
            assert 200 == main.http_to_bq(req)
            bq_patch.reset_mock()
            req.reset_mock()

    def test_bigquery_error(self):
        with mock.patch('google.cloud.bigquery.Client') as bq_patch:
            bq_patch().insert_rows_json.return_value = 'BigQuery connection error found'
            import main
            data = {'trigger': 'testval'}
            req = Mock(get_json=Mock(return_value=data), args=data)
            assert 500 == main.http_to_bq(req)
            req.reset_mock()
I was concentrating on the patches when actually the problem was with import main. The patch is applied when main is imported for the first time, and subsequent import main calls do nothing. I imported:
from importlib import reload
and replaced both instances of
import main
with
import main
reload(main)
That allows main to be re-patched with the latest patch.
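For illustration, the first test then becomes (a sketch combining the pieces above):

from importlib import reload

def test_happy_path(self):
    with mock.patch('google.cloud.bigquery.Client') as bq_patch:
        bq_patch().insert_rows_json.return_value = None
        import main
        reload(main)  # re-executes main's module body while the patch is active
        data = {'trigger': 'testval'}
        req = Mock(get_json=Mock(return_value=data), args=data)
        assert 200 == main.http_to_bq(req)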

How to get tornado object?

I want to get the value of a tornado object with a key.
This is my code:
beanstalk = beanstalkt.Client(host='host', port=port)
beanstalk.connect()
print("ok1")
beanstalk.watch('contracts')
stateTube = beanstalk.stats_tube('contracts', callback=show)
print("ok2")
ioloop = tornado.ioloop.IOLoop.instance()
ioloop.start()
print("ok3")
And this is the function show():
def show(s):
    pprint(s['current-jobs-ready'])
    ioloop.stop
I looked at the documentation, and when I execute this code, I get:
ok1
ok2
3
In fact I get the result I wanted ("3"), but I don't understand why my program keeps running. Why doesn't the ioloop close? I never see ok3. How can I close the ioloop and reach ok3?
beanstalk.stats_tube is async: it returns a Future, which represents a future result that has not yet been resolved.
As the README says, your callback show will be executed with a dict that contains the resolved result. So you could define show like:
def show(stateTube):
    pprint(stateTube['current-jobs-ready'])

beanstalk.stats_tube('contracts', callback=show)

from tornado.ioloop import IOLoop
IOLoop.current().start()
Note that you pass show, not show(): you're passing the function itself, not calling the function and passing its return value.
The other way to resolve a Future, besides passing a callback, is to use it in a coroutine:
from tornado import gen
from tornado.ioloop import IOLoop

@gen.coroutine
def get_stats():
    stateTube = yield beanstalk.stats_tube('contracts')
    pprint(stateTube['current-jobs-ready'])

loop = IOLoop.current()
loop.spawn_callback(get_stats)
loop.start()

How to execute Tornado coroutine inside of synchronous environment?

I have some Tornado's coroutine related problem.
There is a Python model A which has the ability to execute some function. The function can be set from outside the model. I can't change the model itself, but I can pass in any function I want. I'm trying to teach it to work with Tornado's ioloop through my function, but I haven't managed to.
Here is the snippet:
import functools
import pprint
from tornado import gen
from tornado import ioloop
class A:
    f = None

    def execute(self):
        return self.f()

@gen.coroutine
def genlist():
    raise gen.Return(range(1, 10))

@gen.coroutine
def some_work():
    a = A()
    a.f = functools.partial(
        ioloop.IOLoop.instance().run_sync,
        lambda: genlist())
    print "a.f set"
    raise gen.Return(a)

@gen.coroutine
def main():
    a = yield some_work()
    retval = a.execute()
    raise gen.Return(retval)

if __name__ == "__main__":
    pprint.pprint(ioloop.IOLoop.current().run_sync(main))
So the thing is that I set the function in one part of the code, but execute it in another part, via the model's method.
Now, Tornado 4.2.1 gives me "IOLoop is already running", but in Tornado 3.1.1 it works (though I don't know exactly why).
I know these things:
I could create a new ioloop, but I would like to use the existing one.
I could wrap genlist with some function which knows that genlist's result is a Future, but I don't know how to block execution inside a synchronous function until the future's result is set.
Also, I can't use the result of a.execute() as a future object, because a.execute() can be called from other parts of the code, i.e. it should return a list instance.
So, my question is: is there any way to execute the asynchronous genlist from the synchronous model's method using the current IOLoop?
You cannot restart the outer IOLoop here. You have three options:
Use asynchronous interfaces everywhere: change a.execute() and everything up to the top of the stack into coroutines. This is the usual pattern for Tornado-based applications; trying to straddle the synchronous and asynchronous worlds is difficult and it's better to stay on one side or the other.
Use run_sync() on a temporary IOLoop. This is what Tornado's synchronous tornado.httpclient.HTTPClient does, which makes it safe to call from within another IOLoop. However, if you do it this way the outer IOLoop remains blocked, so you have gained nothing by making genlist asynchronous.
Run a.execute on a separate thread and call back to the main IOLoop's thread for the inner function. If a.execute cannot be made asynchronous, this is the only way to avoid blocking the IOLoop while it is running.
executor = concurrent.futures.ThreadPoolExecutor(8)

@gen.coroutine
def some_work():
    a = A()

    def adapter():
        # Convert the thread-unsafe tornado.concurrent.Future
        # to a thread-safe concurrent.futures.Future.
        # Note that everything including chain_future must happen
        # on the IOLoop thread.
        future = concurrent.futures.Future()
        ioloop.IOLoop.instance().add_callback(
            lambda: tornado.concurrent.chain_future(
                genlist(), future))
        return future.result()

    a.f = adapter
    print "a.f set"
    raise gen.Return(a)

@gen.coroutine
def main():
    a = yield some_work()
    retval = yield executor.submit(a.execute)
    raise gen.Return(retval)
Say your function looks something like this:
@gen.coroutine
def foo():
    # does slow things
or
@concurrent.run_on_executor
def bar(i=1):
    # does slow things
You can run foo() like so:
from tornado.ioloop import IOLoop
loop = IOLoop.current()
loop.run_sync(foo)
You can run bar(..), or any coroutine that takes args, like so:
from functools import partial
from tornado.ioloop import IOLoop
loop = IOLoop.current()
f = partial(bar, i=100)
loop.run_sync(f)

How do you unit test a Celery task?

The Celery documentation mentions testing Celery within Django but doesn't explain how to test a Celery task if you are not using Django. How do you do this?
It is possible to test tasks synchronously using any unittest library out there. I normally do two different test sessions when working with celery tasks. The first one (as I'm suggesting below) is completely synchronous and should be the one that makes sure the algorithm does what it should do. The second session uses the whole system (including the broker) and makes sure I'm not having serialization issues or any other distribution or communication problems.
So:
from celery import Celery

celery = Celery()

@celery.task
def add(x, y):
    return x + y
And your test:
from nose.tools import eq_

def test_add_task():
    rst = add.apply(args=(4, 4)).get()
    eq_(rst, 8)
Here is an update to my seven-year-old answer:
You can run a worker in a separate thread via a pytest fixture:
https://docs.celeryq.dev/en/v5.2.6/userguide/testing.html#celery-worker-embed-live-worker
According to the docs, you should not use "always_eager" (see the top of the page of the above link).
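A minimal sketch of that approach, close to the documented example (celery_app and celery_worker are fixtures provided by celery's bundled pytest plugin):

def test_mul(celery_app, celery_worker):
    @celery_app.task
    def mul(x, y):
        return x * y

    # goes through the embedded worker and the in-memory broker
    assert mul.delay(4, 4).get(timeout=10) == 16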
Old answer:
I use this:
with mock.patch('celeryconfig.CELERY_ALWAYS_EAGER', True, create=True):
    ...
Docs: https://docs.celeryq.dev/en/3.1/configuration.html#celery-always-eager
CELERY_ALWAYS_EAGER lets you run your task synchronously, and you don't need a celery server.
It depends on what exactly you want to be testing.
Test the task code directly: don't call task.delay(...), just call task(...) from your unit tests (see the sketch below).
Use CELERY_ALWAYS_EAGER. This will cause your tasks to be called immediately at the point you say task.delay(...), so you can test the whole path (but not any asynchronous behavior).
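A minimal sketch of the first option, assuming the add task defined in the first answer:

def test_add_directly():
    # calling the task like a plain function runs its body in-process,
    # with no broker or worker involved
    assert add(4, 4) == 8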
For those on Celery 4 it's:
@override_settings(CELERY_TASK_ALWAYS_EAGER=True)
The setting names have changed and need updating if you choose to upgrade; see
https://docs.celeryproject.org/en/latest/history/whatsnew-4.0.html?highlight=what%20is%20new#lowercase-setting-names
unittest
import unittest

from myproject.myapp import celeryapp

class TestMyCeleryWorker(unittest.TestCase):
    def setUp(self):
        celeryapp.conf.update(CELERY_ALWAYS_EAGER=True)
py.test fixtures
# conftest.py
import pytest

from myproject.myapp import celeryapp

@pytest.fixture(scope='module')
def celery_app(request):
    celeryapp.conf.update(CELERY_ALWAYS_EAGER=True)
    return celeryapp

# test_tasks.py
def test_some_task(celery_app):
    ...
Addendum: make send_task respect eager
from celery import current_app

def send_task(name, args=(), kwargs={}, **opts):
    # https://github.com/celery/celery/issues/581
    task = current_app.tasks[name]
    return task.apply(args, kwargs, **opts)

current_app.send_task = send_task
As of Celery 3.0, one way to set CELERY_ALWAYS_EAGER in Django is:
from django.test import TestCase, override_settings

from .foo import foo_celery_task

class MyTest(TestCase):
    @override_settings(CELERY_ALWAYS_EAGER=True)
    def test_foo(self):
        self.assertTrue(foo_celery_task.delay())
Since Celery v4.0, py.test fixtures are provided to start a celery worker just for the test and are shut down when done:
def test_myfunc_is_executed(celery_session_worker):
    # celery_session_worker: <Worker: gen93553@mymachine.local (running)>
    assert myfunc.delay().wait(3)
Among other fixtures described on http://docs.celeryproject.org/en/latest/userguide/testing.html#py-test, you can change the celery default options by redefining the celery_config fixture this way:
@pytest.fixture(scope='session')
def celery_config():
    return {
        'accept_content': ['json', 'pickle'],
        'result_serializer': 'pickle',
    }
By default, the test worker uses an in-memory broker and result backend. There is no need for a local Redis or RabbitMQ unless you are testing broker-specific features.
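If you prefer to be explicit, the same fixture can pin the in-memory transports; this is a sketch, and the values mirror the memory broker/backend settings shown in the Flask answer below:

@pytest.fixture(scope='session')
def celery_config():
    # select the in-memory broker and result backend explicitly
    return {
        'broker_url': 'memory://',
        'result_backend': 'cache+memory://',
    }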
Using pytest:
def test_add(celery_worker):
    mytask.delay()
If you use Flask, set the app config:
CELERY_BROKER_URL = 'memory://'
CELERY_RESULT_BACKEND = 'cache+memory://'
and in conftest.py
@pytest.fixture
def app():
    yield app  # Your actual Flask application

@pytest.fixture
def celery_app(app):
    from celery.contrib.testing import tasks  # need it
    yield celery_app  # Your actual Flask-Celery application
In my case (and I assume many others), all I wanted was to test the inner logic of a task using pytest.
TL;DR: I ended up mocking everything away (OPTION 2).
Example Use Case:
proj/tasks.py
@shared_task(bind=True)
def add_task(self, a, b):
    return a + b
tests/test_tasks.py
from proj import add_task

def test_add():
    assert add_task(1, 2) == 3, '1 + 2 should equal 3'
But since the shared_task decorator does a lot of celery internal logic, this isn't really a unit test.
So, for me, there were 2 options:
OPTION 1: Separate internal logic
proj/tasks_logic.py
def internal_add(a, b):
    return a + b
proj/tasks.py
from .tasks_logic import internal_add

@shared_task(bind=True)
def add_task(self, a, b):
    return internal_add(a, b)
This looks very odd and, apart from making the code less readable, it requires manually extracting and passing along attributes that are part of the request (for instance the task_id, in case you need it), which makes the logic less pure.
OPTION 2: mocks
mocking away celery internals
tests/__init__.py
# noinspection PyUnresolvedReferences
from celery import shared_task
from mock import patch

def mock_signature(**kwargs):
    return {}

def mocked_shared_task(*decorator_args, **decorator_kwargs):
    def mocked_shared_decorator(func):
        func.signature = func.si = func.s = mock_signature
        return func
    return mocked_shared_decorator

patch('celery.shared_task', mocked_shared_task).start()
This then allows me to mock the request object (again, in case you need things from the request, like the id or the retries counter):
tests/test_tasks.py
from proj import add_task

class MockedRequest:
    def __init__(self, id=None):
        self.id = id or 1

class MockedTask:
    def __init__(self, id=None):
        self.request = MockedRequest(id=id)

def test_add():
    mocked_task = MockedTask(id=3)
    assert add_task(mocked_task, 1, 2) == 3, '1 + 2 should equal 3'
This solution is much more manual, but it gives me the control I need to actually unit test, without repeating myself and without losing the celery scope.
I see a lot of CELERY_ALWAYS_EAGER = True in unit tests as a solution, but since version 5.0.5 there have been many changes that make most of the old answers deprecated (and, for me, a time-consuming dead end). So for everyone searching for a solution: go to the docs and read the well-documented unit test examples for the new version:
https://docs.celeryproject.org/en/stable/userguide/testing.html
And regarding eager mode with unit tests, here is a quote from the actual docs:
Eager mode
The eager mode enabled by the task_always_eager setting is by
definition not suitable for unit tests.
When testing with eager mode you are only testing an emulation of what
happens in a worker, and there are many discrepancies between the
emulation and what happens in reality.
Another option is to mock the task if you do not need the side effects of running it.
from unittest import mock
@mock.patch('module.module.task')
def test_name(self, mock_task):
    ...
