I've got a Python application using pytest. Several of my tests make API calls to Elasticsearch (using elasticsearch-dsl-py) that slow them down, and I'd like to:
prevent those calls unless a pytest marker is used;
if the marker is used, have it execute some code before the test runs, just like a fixture would if you used yield.
This is mostly inspired by pytest-django, where you have to use the django_db marker in order to make a connection to the database (though they throw an error if you try to connect to the DB, whereas I just don't want the call to happen in the first place).
For example:
def test_unintentionally_using_es():
    """I don't want a call going to Elasticsearch. But they just happen.
    Is there a way to "mock" the call? Or even just prevent the call from happening?"""

@pytest.mark.elastic
def test_intentionally_using_es():
    """I would like for this marker to perform some tasks beforehand (i.e. clear the indices)."""

# To replicate that second test, I currently use a fixture:
@pytest.fixture
def elastic():
    # Pre-test tasks
    yield something

I think that's a use case for markers, right? Mostly inspired by pytest-django.
Your initial approach of combining a fixture with a custom marker is the correct one; in the code below, I took the code from your question and filled in the gaps.
Suppose we have some dummy function to test that uses the official elasticsearch client:
# lib.py
from datetime import datetime
from elasticsearch import Elasticsearch

def f():
    es = Elasticsearch()
    es.indices.create(index='my-index', ignore=400)
    return es.index(
        index="my-index",
        id=42,
        body={"any": "data", "timestamp": datetime.now()},
    )
We add two tests: one is not marked with elastic and should operate on a fake client, the other is marked and needs access to a real client:
# test_lib.py
import pytest

from lib import f

def test_fake():
    resp = f()
    assert resp["_id"] == "42"

@pytest.mark.elastic
def test_real():
    resp = f()
    assert resp["_id"] == "42"
Now let's write the elastic() fixture that will mock the Elasticsearch class depending on whether the elastic marker was set:
from unittest.mock import MagicMock, patch

import pytest

@pytest.fixture(autouse=True)
def elastic(request):
    should_mock = request.node.get_closest_marker("elastic") is None
    if should_mock:
        patcher = patch('lib.Elasticsearch')
        fake_es = patcher.start()
        # this is just a mock example
        fake_es.return_value.index.return_value.__getitem__.return_value = "42"
    else:
        ...  # e.g. start the real server here etc.
    yield
    if should_mock:
        patcher.stop()
Notice the usage of autouse=True: the fixture will be executed on each test invocation, but it will only do the patching if the test is not marked. The presence of the marker is checked via request.node.get_closest_marker("elastic") is None. If you run both tests now, test_fake will pass because elastic mocks the Elasticsearch.index() response, while test_real will fail, assuming you don't have a server running on port 9200.
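One small addition: recent pytest versions warn about unknown marks, so it is worth registering the custom elastic marker as well. A minimal sketch, assuming the fixture above lives in conftest.py:

# conftest.py
def pytest_configure(config):
    # register the custom marker so pytest doesn't emit PytestUnknownMarkWarning
    config.addinivalue_line(
        "markers", "elastic: the test is allowed to talk to a real Elasticsearch"
    )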
Related
I am using pytest to test code that creates a Prometheus Python client to export metrics.
Between the test functions the Prometheus client does not reset (because it has internal state), which screws up my tests.
I am looking for a way to basically get a fresh Python runtime before each test function call. That way the internal state of the Prometheus client would hopefully reset to the state it had when the Python runtime started executing my tests.
I have already tried importlib.reload(), but that does not work.
If you want to start each test with a "clean" Prometheus client, then I think the best approach is to move its creation and teardown into a fixture with function scope (which is actually the default scope), like this:
@pytest.fixture(scope="function")
def prometheus_client():  # add whatever arguments/fixtures you need
    # create your client here
    yield client
    # remove your client here
Then you define your tests using this fixture:
def test_number_one(prometheus_client):
    # test body
This way the client is created from scratch in each test and torn down even if the test fails.
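If your code allows passing the registry in, an alternative along the same lines is to give every test its own CollectorRegistry instead of relying on the global default one. A rough sketch (the metric name here is made up for illustration):

import pytest
from prometheus_client import CollectorRegistry, Counter

@pytest.fixture
def registry():
    # a fresh, empty registry for every test instead of the global default
    return CollectorRegistry()

def test_counter_starts_from_zero(registry):
    counter = Counter('my_failures', 'Description of counter', registry=registry)
    counter.inc()
    assert registry.get_sample_value('my_failures_total') == 1.0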
Straightforward approach: memorize current metric values before the test
First of all, let me show how I believe you should work with metrics in tests (and how I do it in my projects). Instead of doing resets, keep track of the metric values before the test starts. After the test finishes, collect the metrics again and analyze the diff between the two values. Here is an example of a Django view test with a counter:
import copy

import prometheus_client
import pytest
from django.test import Client

def find_value(metrics, metric_name):
    return next((
        sample.value
        for metric in metrics
        for sample in metric.samples
        if sample.name == metric_name
    ), None)

@pytest.fixture
def metrics_before():
    yield copy.deepcopy(list(prometheus_client.REGISTRY.collect()))

@pytest.fixture
def client():
    return Client()

def test_counter_increases_by_one(client, metrics_before):
    # record the metric value before the HTTP client requests the view
    value_before = find_value(metrics_before, 'http_requests_total') or 0.0
    # do the request
    client.get('/my-view/')
    # collect the metric value again
    metrics_after = prometheus_client.REGISTRY.collect()
    value_after = find_value(metrics_after, 'http_requests_total')
    # the value should have increased by one
    assert value_after == value_before + 1.0
Now let's see what can be done with the registry itself. Note that this uses prometheus-client internals and is fragile by definition, so use it at your own risk!
Messing with prometheus-client internals: unregister all metrics
If you are sure your test code will invoke metrics registration from scratch, you can unregister all metrics from the registry before the test starts:
@pytest.fixture(autouse=True)
def clear_registry():
    collectors = tuple(prometheus_client.REGISTRY._collector_to_names.keys())
    for collector in collectors:
        prometheus_client.REGISTRY.unregister(collector)
    yield
Beware that this will only work if your test code invokes metrics registration again! Otherwise, you will effectively stop the metrics collection. For example, the built-in PlatformCollector will be gone until you explicitly register it again, e.g. by creating a new instance via prometheus_client.PlatformCollector().
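If you do need those built-in metrics afterwards, re-registering is just a matter of instantiating the collector again, e.g. (a small sketch against the default registry):

import prometheus_client

# instantiating a built-in collector registers it with the given registry,
# which brings the platform metrics back after the cleanup above
prometheus_client.PlatformCollector(registry=prometheus_client.REGISTRY)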
Messing with prometheus-client internals: reset (almost) all metrics
You can also reset the values of registered metrics:
@pytest.fixture(autouse=True)
def reset_registry():
    collectors = tuple(prometheus_client.REGISTRY._collector_to_names.keys())
    for collector in collectors:
        try:
            collector._metrics.clear()
            collector._metric_init()
        except AttributeError:
            pass  # built-in collectors don't inherit from MetricsWrapperBase
    yield
This will reinstantiate the values and metrics of all counters/gauges/histograms etc. The test from above could now be written as
def test_counter_increases_by_one(client):
    # do the request
    client.get('/my-view/')
    # collect the metric value
    metrics_after = prometheus_client.REGISTRY.collect()
    value_after = find_value(metrics_after, 'http_requests_total')
    # the value should be one
    assert value_after == 1.0
Of course, this will not reset any metrics of built-in collectors, like PlatformCollector, since it scrapes the values only once at instantiation, or ProcessCollector, because it doesn't store any values at all and instead reads them from the OS anew each time.
I'm writing a CLI to interact with elasticsearch using the elasticsearch-py library. I'm trying to mock elasticsearch-py functions in order to test my functions without calling my real cluster.
I read this question and this one but I still don't understand.
main.py

# Escli inherits from cliff's App class
class Escli(App):
    _es = elasticsearch5.Elasticsearch()
settings.py

from escli.main import Escli

class Settings:
    def get(self, sections):
        raise NotImplementedError()

class ClusterSettings(Settings):
    def get(self, setting, persistency='transient'):
        settings = Escli._es.cluster \
            .get_settings(include_defaults=True, flat_settings=True) \
            .get(persistency) \
            .get(setting)
        return settings
settings_test.py

from unittest import TestCase
from unittest.mock import patch

import escli.settings

class TestClusterSettings(TestCase):
    def setUp(self):
        self.patcher = patch('elasticsearch5.Elasticsearch')
        self.MockClass = self.patcher.start()

    def test_get(self):
        # Note this is an empty dict to show my point;
        # it will contain child dicts to allow my .get(persistency).get(setting)
        self.MockClass.return_value.cluster.get_settings.return_value = {}
        cluster_settings = escli.settings.ClusterSettings()
        ret = cluster_settings.get('cluster.routing.allocation.node_concurrent_recoveries', persistency='transient')
        # ret should contain a subset of my dict defined above
I want Escli._es.cluster.get_settings() to return what I want (a dict object) so that no real HTTP call is made, but it keeps making it.
What I know:
In order to mock an instance method I have to do something like
MagicMockObject.return_value.InstanceMethodName.return_value = ...
I cannot patch Escli._es.cluster.get_settings because Python tries to import Escli as a module, which cannot work. So I'm patching the whole lib.
I desperately tried putting return_value everywhere, but I cannot understand why I can't mock that thing properly.
You should be mocking with respect to where you are testing. Based on the example provided, this means that the Escli class you are using in the settings.py module needs to be mocked with respect to settings.py. So, more practically, your patch call would look like this inside setUp instead:
self.patcher = patch('escli.settings.Escli')
With this, you are now mocking what you want in the right place based on how your tests are running.
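For illustration, the test could then be shaped roughly like this (the nested dict and the value '5' are made up; they just have to match the .get(persistency).get(setting) chain):

from unittest import TestCase
from unittest.mock import patch

import escli.settings

class TestClusterSettings(TestCase):
    def setUp(self):
        # patch the name where it is looked up, i.e. inside escli.settings
        self.patcher = patch('escli.settings.Escli')
        self.MockClass = self.patcher.start()
        self.addCleanup(self.patcher.stop)

    def test_get(self):
        # Escli is now a mock, so _es.cluster.get_settings() is a mock call
        self.MockClass._es.cluster.get_settings.return_value = {
            'transient': {'cluster.routing.allocation.node_concurrent_recoveries': '5'}
        }
        cluster_settings = escli.settings.ClusterSettings()
        ret = cluster_settings.get(
            'cluster.routing.allocation.node_concurrent_recoveries',
            persistency='transient',
        )
        self.assertEqual(ret, '5')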
Furthermore, to add more robustness to your testing, you might want to consider speccing for the Elasticsearch instance you are creating in order to validate that you are in fact calling valid methods that correlate to Elasticsearch. With that in mind, you can do something like this, instead:
self.patcher = patch('escli.settings.Escli', Mock(Elasticsearch))
To read a bit more about what exactly is meant by spec, check the patch section in the documentation.
As a final note, if you are interested in exploring the great world of pytest, there is a pytest-elasticsearch plugin created to assist with this.
I need to mock elasticsearch calls, but I am not sure how to mock them in my python unit tests. I saw this framework called ElasticMock. I tried using it the way indicated in the documentation and it gave me plenty of errors.
It is here :
https://github.com/vrcmarcos/elasticmock
My question is: is there any other way to mock Elasticsearch calls?
This doesn't seem to have an answer either: Mock elastic search data.
And this one just suggests doing integration tests rather than unit tests, which is not what I want:
Unit testing elastic search inside Django app.
Can anyone point me in the right direction? I have never mocked things with Elasticsearch.
You have to mock the attr or method you need, for example:
import mock

with mock.patch("elasticsearch.Elasticsearch.search") as mocked_search, \
        mock.patch("elasticsearch.client.IndicesClient.create") as mocked_index_create:
    mocked_search.return_value = "pipopapu"
    mocked_index_create.return_value = {"acknowledged": True}
In order to know the path you need to mock, just explore the lib with your IDE. Once you know one, you can easily find the others.
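For example, inside the with block the patched method simply hands back whatever you configured, so a quick sanity check could look like this (a sketch reusing the made-up return value above):

import mock
from elasticsearch import Elasticsearch

with mock.patch("elasticsearch.Elasticsearch.search") as mocked_search:
    mocked_search.return_value = "pipopapu"
    es = Elasticsearch()
    # no HTTP request is made; the patched method answers instead
    assert es.search(index="my-index", body={"query": {"match_all": {}}}) == "pipopapu"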
After looking at the decorator source code, the trick for me was to reference Elasticsearch with the module:
import elasticsearch
...
elasticsearch.Elasticsearch(...
instead of
from elasticsearch import Elasticsearch
...
Elasticsearch(...
I'm going to give a very abstract answer because this applies to more than ES.
class ProductionCodeIWantToTest:
    def __init__(self):
        pass

    def do_something(self, data):
        es = ES()  # or some database or whatever
        es.post(data)  # or the right syntax
Now I can't test this.
With one small change, injecting a dependency:
class ProductionCodeIWantToTest:
    def __init__(self, database):
        self.database = database

    def do_something(self, data):
        self.database.save(data)  # or the right syntax
Now you can use the real db:

es = ES()  # or some database or whatever
thing = ProductionCodeIWantToTest(es)

or test it:

mock = ...  # up to you, it just needs a save method so far
thing = ProductionCodeIWantToTest(mock)
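With the dependency injected, the test double can be as simple as a MagicMock; a minimal sketch:

from unittest.mock import MagicMock

def test_do_something_saves_the_data():
    fake_db = MagicMock()  # stands in for the real database/ES client
    thing = ProductionCodeIWantToTest(fake_db)
    thing.do_something({"some": "data"})
    # the mock records every call, so we can assert on how it was used
    fake_db.save.assert_called_once_with({"some": "data"})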
I am a beginner at using pytest in Python and am trying to write test cases for the following method, which gets the user's address when a correct id is passed and otherwise raises the custom error BadId.
def get_user_info(id: str, host='127.0.0.1', port=3000) -> str:
    uri = 'http://{}:{}/users/{}'.format(host, port, id)
    result = Requests.get(uri).json()
    address = result.get('user', {}).get('address', None)
    if address:
        return address
    else:
        raise BadId
Can someone help me with this, and can you also suggest some good resources for learning pytest? TIA
Your test regimen might look something like this.
First I suggest creating a fixture to be used in your various method tests. The fixture sets up an instance of your class to be used in your tests rather than creating the instance in the test itself. Keeping tasks separated in this way helps to make your tests both more robust and easier to read.
from my_package import MyClass
import pytest

@pytest.fixture
def a_test_object():
    return MyClass()
You can pass the test object to your series of method tests:
def test_something(a_test_object):
    # do the test
However if your test object requires some resources during setup (such as a connection, a database, a file, etc etc), you can mock it instead to avoid setting up the resources for the test. See this talk for some helpful info on how to do that.
By the way: if you need to test several different states of the user defined object being created in your fixture, you'll need to parametrize your fixture. This is a bit of a complicated topic, but the documentation explains fixture parametrization very clearly.
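Just to give a flavour, a parametrized version of the fixture above might be sketched like this (the two states and the data attribute are made up for illustration):

import pytest
from my_package import MyClass

@pytest.fixture(params=['empty', 'with_data'])
def a_test_object(request):
    # tests using this fixture run once per param value
    obj = MyClass()
    if request.param == 'with_data':
        obj.data = {'user': {'address': 123}}  # hypothetical attribute
    return obj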
The other thing you need to do is make sure any .get calls to Requests are intercepted. This is important because it allows your tests to be run without an internet connection, and ensures they do not fail as a result of a bad connection, which is not the thing you are trying to test.
You can intercept Requests.get by using the monkeypatch feature of pytest. All that is required is to include monkeypatch as an input parameter to the test regimen functions.
You can employ another fixture to accomplish this. It might look like this:
import Requests
import pytest

@pytest.fixture
def patched_requests(monkeypatch):
    # store a reference to the old get method
    old_get = Requests.get

    def mocked_get(uri, *args, **kwargs):
        '''A method replacing Requests.get

        Returns either a mocked response object (with json method)
        or the default response object if the uri doesn't match
        one of those that have been supplied.
        '''
        _, id = uri.split('/users/', 1)
        try:
            # attempt to get the correct mocked json method
            json = dict(
                with_address1=lambda: {'user': {'address': 123}},
                with_address2=lambda: {'user': {'address': 456}},
                no_address=lambda: {'user': {}},
                no_user=lambda: {},
            )[id]
        except KeyError:
            # fall back to default behavior
            obj = old_get(uri, *args, **kwargs)
        else:
            # create a mocked requests object
            mock = type('MockedReq', (), {})()
            # assign mocked json to requests.json
            mock.json = json
            # assign obj to mock
            obj = mock
        return obj

    # finally, patch Requests.get with patched version
    monkeypatch.setattr(Requests, 'get', mocked_get)
This looks complicated until you understand what is happening: we have simply made some mocked json objects (represented by dictionaries) with pre-determined user ids and addresses. The patched version of Requests.get simply returns an object (of type MockedReq) with the corresponding mocked .json() method when its id is requested.
Note that Requests will only be patched in tests that actually use the above fixture, e.g.:
def test_something(patched_requests):
    # use patched Requests.get
Any test that does not use patched_requests as an input parameter will not use the patched version.
Also note that you could monkeypatch Requests within the test itself, but I suggest doing it separately. If you are using other parts of the requests API, you may need to monkeypatch those as well. Keeping all of this stuff separate is often going to be easier to understand than including it within your test.
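For completeness, patching inside the test itself would look roughly like this (a sketch that returns a single canned response):

import Requests

def test_get_user_info_inline(monkeypatch, a_test_object):
    class FakeResponse:
        def json(self):
            return {'user': {'address': 123}}

    # replace Requests.get only for the duration of this test
    monkeypatch.setattr(Requests, 'get', lambda uri, *args, **kwargs: FakeResponse())
    assert a_test_object.get_user_info('any-id') == 123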
Write your various method tests next. You'll need a different test for each aspect of your method. In other words, you will usually write a different test for the instance in which your method succeeds, and another one for testing when it fails.
First we test method success with a couple test cases.
@pytest.mark.parametrize('id, result', [
    ('with_address1', 123),
    ('with_address2', 456),
])
def test_get_user_info_success(patched_requests, a_test_object, id, result):
    address = a_test_object.get_user_info(id)
    assert address == result
Next we can test for raising the BadId exception using the with pytest.raises feature. Note that since an exception is raised, there is not a result input parameter for the test function.
@pytest.mark.parametrize('id', [
    'no_address',
    'no_user',
])
def test_get_user_info_failure(patched_requests, a_test_object, id):
    from my_package import BadId
    with pytest.raises(BadId):
        address = a_test_object.get_user_info(id)
As posted in my comment, here also are some additional resources to help you learn more about pytest:
link
link
Also be sure to check out Brian Okken's book and Bruno Oliveira's book. They are both very helpful for learning pytest.
I have an issue with my unit tests and the way Django manages transactions.
In my code I have a function:
def send():
    autocommit = transaction.set_autocommit(False)
    try:
        # stuff
        pass
    finally:
        transaction.rollback()
        transaction.set_autocommit(autocommit)
In my test I have:
class MyTest(TransactionTestCase):
    def test_send(self):
        send()
The issue I am having is that test_send passes successfully, but about 80% of my other tests do not.
It seems the transactions of the other tests are failing.
By the way, I am using py.test to run my tests.
EDIT:
To make things more clear: when I run only myapp.test.test_module.py, all 3 tests pass, but when I run all my tests, most of them fail. I will try to produce a test app.
Also, all my tests pass with the default test runner from Django.
EDIT2:
Here is a minimal example to test this issue:
class ManagementTestCase(TransactionTestCase):
    def test_transfer_ubl(self, MockExact):
        pass

class TestTestCase(TestCase):
    def test_1_user(self):
        get_user_model().objects.get(username="admin")
        self.assertEqual(get_user_model().objects.all().count(), 1)
Bear in mind there is a data migration that adds an "admin" user (the TestTestCase succeeds when run alone, but not when ManagementTestCase runs before it).
It seems autocommit has nothing to do with it.
The TestCase class wraps the tests inside two atomic blocks. Therefore it is not possible to use transaction.set_autocommit() or transaction.rollback() if you are inheriting from TestCase.
As the docs say, you should use TransactionTestCase if you are testing specific database transaction behaviour.
Having autocommit = transaction.set_autocommit(False) inside the send function feels wrong. Disabling the transaction is done here presumably for testing purposes, but the rule of thumb is to keep your test logic outside your code.
As @Alasdair has pointed out, the Django docs state that "Django's TestCase class also wraps each test in a transaction for performance reasons."
It is not clear from your question whether you're testing specific database transaction logic or not; if that is the case, then @Alasdair's answer of using TransactionTestCase is the way to go.
Otherwise, removing the transaction context switch from around the stuff inside your send function should help.
Since you mentioned pytest as your test runner, I would also recommend making use of it. The pytest-django plugin comes with nice features, such as selectively setting some of your tests to require transactions, using markers:
pytest.mark.django_db(transaction=False)
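For example, a pytest-style test that needs real transaction behaviour could be marked roughly like this (a sketch; transaction=True gives the test TransactionTestCase-like semantics):

import pytest

@pytest.mark.django_db(transaction=True)
def test_send():
    send()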
If installing a plugin is too much, then you could roll your own transaction-managing fixture, like this:
@pytest.fixture
def no_transaction(request):
    autocommit = transaction.set_autocommit(False)

    def rollback():
        transaction.rollback()
        transaction.set_autocommit(True)

    request.addfinalizer(rollback)
Your test_send will then require the no_transaction fixture.
def test_send(no_transaction):
    send()
For those who are still looking for a solution, the serialized_rollback option is the way to go:
class ManagementTestCase(TransactionTestCase):
    serialized_rollback = True

    def test_transfer_ubl(self, MockExact):
        pass

class TestTestCase(TestCase):
    def test_1_user(self):
        get_user_model().objects.get(username="admin")
        self.assertEqual(get_user_model().objects.all().count(), 1)
From the docs:
Django can reload that data for you on a per-testcase basis by setting the serialized_rollback option to True in the body of the TestCase or TransactionTestCase, but note that this will slow down that test suite by approximately 3x.
Unfortunately, pytest-django is still missing this feature.