Avoid unneeded imports with unit tests - python

The PyCharm IDE encourages me to write unit tests in the same module as the classes they test. I like the idea of every module being tested automatically as I develop, but what bothers me is that I end up with additional imports that are only used for these unit tests. I can live with import unittest, but consider:
from lxml import etree

class Foobar(object):
    def __init__(self):
        schema_root = etree.parse("schema/myschema.xsd")
        schema = etree.XMLSchema(schema_root)
        self.parser = etree.XMLParser(schema=schema)

    def valid(self, filename):
        try:
            etree.parse(filename, self.parser)
            return True
        except etree.XMLSyntaxError:
            return False
import unittest
from io import StringIO

class _FoobarTest(unittest.TestCase):
    def test_empty_object_is_valid(self):
        foobar = Foobar()
        self.assertTrue(foobar.valid(StringIO("<object />")))
I thought about instead doing it this way:
class _FoobarTest(unittest.TestCase):
    from io import StringIO as StringIO_

    def test_empty_object_is_valid(self):
        foobar = Foobar()
        self.assertTrue(foobar.valid(self.StringIO_("<object />")))
but that does not feel very natural to me. Since Python is a language that cares a lot about best practice, is there a somewhat official statement on this? I wasn't able to find anything in the PEP documents, which made me wonder whether it is a good idea to unit test in the same module at all.

Neither PyCharm nor the Python community encourages having unit tests in the same file. See the PyCharm tutorial on creating unit tests. As demonstrated in those instructions, the better way to write unit tests is to put them in separate files with a "test_" prefix, which can be discovered and run automatically both by the built-in unittest module and by other libraries.
If you look at the documentation for unittest you will find there is already a built-in system for test discovery that works great.
python -m unittest discover
will find all tests that match the pattern "test*.py" in the current directory. This is the built-in default, and best practice until you need something else.
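As a minimal sketch of that layout (assuming the Foobar class from the question lives in a module named foobar.py; the file name is an assumption), the test would move into its own test_foobar.py that discovery picks up automatically:
# test_foobar.py - found by "python -m unittest discover"
import unittest
from io import StringIO

from foobar import Foobar  # assumes the production class lives in foobar.py


class FoobarTest(unittest.TestCase):
    def test_empty_object_is_valid(self):
        foobar = Foobar()
        self.assertTrue(foobar.valid(StringIO("<object />")))


if __name__ == "__main__":
    unittest.main()
This keeps the lxml and io imports needed only for testing out of the production module entirely.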

Related

Mock an entire module in python

I have an application that imports a module from PyPI.
I want to write unittests for that application's source code, but I do not want to use the module from PyPI in those tests.
I want to mock it entirely (the testing machine will not contain that PyPI module, so any import will fail).
Currently, each time I try to load the class I want to test in the unittests, I immediately get an import error. So I thought about maybe wrapping the import like this:
try:
    ...
except ImportError:
    ...
and catching that import error, then using command_module.run().
This seems pretty risky/ugly and I was wondering if there's another way.
Another idea was writing an adapter to wrap that PyPI module, but I'm still working on that.
If you know any way I can mock an entire python package, I would appreciate it very much.
Thanks.
If you want to dig into the Python import system, I highly recommend David Beazley's talk.
As for your specific question, here is an example that tests a module when its dependency is missing.
bar.py - the module you want to test when my_bogus_module is missing
from my_bogus_module import foo

def bar(x):
    return foo(x) + 1
mock_bogus.py - a file alongside your tests that will load a mock module
from mock import Mock
import sys
import types

module_name = 'my_bogus_module'
bogus_module = types.ModuleType(module_name)
sys.modules[module_name] = bogus_module
bogus_module.foo = Mock(name=module_name + '.foo')
test_bar.py - tests bar.py when my_bogus_module is not available
import unittest

from mock_bogus import bogus_module  # must import before bar module
from bar import bar

class TestBar(unittest.TestCase):
    def test_bar(self):
        bogus_module.foo.return_value = 99

        x = bar(42)

        self.assertEqual(100, x)
You should probably make that a little safer by checking that my_bogus_module isn't actually available when you run your tests. You could also look at pydoc.locate(), which will try to import something and return None if it fails. It seems to be a public function, but it isn't really documented.
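For example, a minimal sketch of that safety check (the module name comes from the answer; the check itself is an assumption, not part of the original code) could go at the top of mock_bogus.py:
import pydoc

# Fail fast if the real package is importable: these tests are written for
# the case where my_bogus_module is absent.
if pydoc.locate('my_bogus_module') is not None:
    raise AssertionError(
        "my_bogus_module is importable; remove it before running these tests")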
While @Don Kirkby's answer is correct, you might want to look at the bigger picture. I borrowed the example from the accepted answer:
import pypilib

def bar(x):
    return pypilib.foo(x) + 1
Since pypilib is only available in production, it is not surprising that you have some trouble when you try to unit test bar. The function requires the external library to run, therefore it has to be tested with this library. What you need is an integration test.
That said, you might want to force unit testing, and that's generally a good idea because it will improve the confidence you (and others) have in the quality of your code. To widen the unit test area, you have to inject dependencies. Nothing prevents you (in Python!) from passing a module as a parameter (the type is types.ModuleType):
try:
    import pypilib  # production
except ImportError:
    pypilib = object()  # testing

def bar(x, external_lib=pypilib):
    return external_lib.foo(x) + 1
Now, you can unit test the function:
import unittest
from unittest.mock import Mock

class Test(unittest.TestCase):
    def test_bar(self):
        external_lib = Mock(foo=lambda x: 3 * x)
        self.assertEqual(10, bar(3, external_lib))

if __name__ == "__main__":
    unittest.main()
You might disapprove of the design. The try/except part is a bit cumbersome, especially if you use the pypilib module in several modules of your application. And you have to add a parameter to each function that relies on the external library.
However, the idea to inject a dependency to the external library is useful, because you can control the input and test the output of your class methods, even if the external library is not within your control. Especially if the imported module is stateful, the state might be difficult to reproduce in a unit test. In this case, passing the module as a parameter may be a solution.
But the usual way to deal with this situation is the dependency inversion principle (the D of SOLID): you should define the (abstract) boundaries of your application, i.e. what you need from the outside world. Here, this is bar and other functions, preferably grouped into one or more classes:
import pypilib
import other_pypilib

class MyUtil:
    """
    All I need from the outside world
    """
    @staticmethod
    def bar(x):
        return pypilib.foo(x) + 1

    @staticmethod
    def baz(x, y):
        return other_pypilib.foo(x, y) * 10.0

    ...
    # not every method has to be static
Each time you need one of these functions, just inject an instance of the class in your code:
class Application:
    def __init__(self, util: MyUtil):
        self._util = util

    def something(self, x, y):
        return self._util.baz(self._util.bar(x), y)
The MyUtil class must be as slim as possible, but must remain abstracted from the underlying library. It is a tradeoff. Obviously, Application can be unit tested (just inject a Mock instead of an instance of MyUtil) while, under some circumstances (a PyPI library not available during tests, a module that only runs inside a framework, etc.), MyUtil can only be tested within an integration test. If you need to unit test the boundaries of your application, you can use @Don Kirkby's method.
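A minimal sketch of that unit test (the names follow the Application and MyUtil classes above; the concrete return values are assumptions):
import unittest
from unittest.mock import Mock

class ApplicationTest(unittest.TestCase):
    def test_something(self):
        # stand-in for MyUtil; no real external libraries involved
        util = Mock()
        util.bar.return_value = 5
        util.baz.return_value = 42.0

        app = Application(util)

        self.assertEqual(42.0, app.something(3, 7))
        util.bar.assert_called_once_with(3)
        util.baz.assert_called_once_with(5, 7)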
Note that the second benefit, after unit testing, is that if you change the libraries you are using (deprecation, license issue, cost, ...), you just have to rewrite the MyUtil class, using some other libraries or coding it from scratch. Your application is protected from the wild outside world.
Clean Code by Robert C. Martin has a full chapter on the boundaries.
Summary: Before using @Don Kirkby's method or any other method, be sure to define the boundaries of your application, irrespective of the specific libraries you are using. This, of course, does not apply to the Python standard library...
For a more explicit and granular approach:
import unittest
from unittest.mock import MagicMock, patch

try:
    import bogus_module
except ModuleNotFoundError:
    bogus_module = MagicMock()

@patch.dict('sys.modules', bogus_module=bogus_module)
class PlatformTests(unittest.TestCase):
    ...
Using the patch.dict decorator gives you granular control: it only applies to the class / method it is applied to.

How to Mock Import for Read the Docs?

Problem:
I have been fighting with Read the Docs. An imported module interacts with I/O, so the documentation does not contain any text. But the build does not fail.
Attempted Solution:
I am trying to use mock or MagicMock in doc/conf.py but it isn't working.
Desired Solution
Basically, I would like to mock the entire import. So RTD does not attempt to run any of the code. Just generate documentation from the DocStrings.
I simply want to mock ALL of the elements for a module. The classes, functions, and variables. Anything with a DocString.
Currently I MUST install the project inside a virtualenv, to satisfy the import. I would like to avoid this, if it isn't necessary. Right now... If I don't, again the documentation does not contain any text. Again, the build does not fail.
Details
example.py
"""Basic DocSting Comments"""
from external.module import *
foo = module()
foo.connect()
"""
I want this to show up in RTD.
"""
My specific case can be found here.
docs/conf.py
import sys
from mock import MagicMock

MOCK_MODULES = ['external.module', 'external.module.module', 'external.module.module.connect']
for mod_name in MOCK_MODULES:
    sys.modules[mod_name] = MagicMock()
I have tried a dozen different things, with no luck. Using mock and MagicMock, different advanced settings in RTD. All with no luck.
An Ugly Hack:
I did come across an ugly hack, but it defeats the purpose of using DocStrings: if I have to write the code a second time so RTD can catch the DocStrings, I may as well write the documentation in a separate document.
if __name__ == "__main__":
    this = foo.connect()
    """
    This is where the real DocStrings go.
    """
else:
    this = 'this is the connect'
    """
    This is where the RTD DocStrings would go.
    """
I do not want to end up with twice the code, just to add some documentation.
MySQL Connector/Python
I would also like to use this with the MySQL Connector, since RTD also breaks when it encounters this package, and I cannot fix it with requirements.txt.
import mysql.connector as db

db_connection = db.connect(**my_config)
"""
Perhaps I want to include some details here.
"""
The following solution, which I found in this blog post, worked for me. I wanted to mock open3d and I was able to do that with:
import sys
from unittest import mock

# Mock open3d because it fails to build in readthedocs
MOCK_MODULES = ["open3d"]
for mod_name in MOCK_MODULES:
    sys.modules[mod_name] = mock.Mock()
Note that you need to import mock from unittest, because unittest.mock has been part of the standard library since Python 3.3 (source).
If you wanted to mock multiple packages, you could do something like:
import sys
from unittest import mock

# Mock modules that fail to build in readthedocs
MOCK_MODULES = ["open3d", "numpy", "matplotlib", "matplotlib.pyplot"]
for mod_name in MOCK_MODULES:
    sys.modules[mod_name] = mock.Mock()

Python unit test that uses an external data file

I have a Python project that I'm working on in Eclipse and I have the following file structure:
/Project
    /projectname
        module1.py
        module2.py
        # etc.
    /test
        testModule1.py
        # etc.
        testdata.csv
In one of my tests I create an instance of one of my classes giving 'testdata.csv' as a parameter. This object does open('testdata.csv') and reads the contents.
If I run just this single test file with unittest, everything works and the file is found and read properly. However, if I try to run all my unit tests (i.e. run by right-clicking the test directory rather than the individual test file), I get an error that the file could not be found.
Is there any way to get around this (other than providing an absolute path, which I'd prefer not to do)?
Usually what I do is define
THIS_DIR = os.path.dirname(os.path.abspath(__file__))
at the top of each test module. Then it doesn't matter what working directory you're in - the file path is always the same relative to where the test module sits.
Then I use something like this is in my test (or test setup):
my_data_path = os.path.join(THIS_DIR, os.pardir, 'data_folder/data.csv')
Or in your case, since the data source is in the test directory:
my_data_path = os.path.join(THIS_DIR, 'testdata.csv')
Edit: for modern Python, use pathlib:
from pathlib import Path
THIS_DIR = Path(__file__).parent
my_data_path = THIS_DIR.parent / 'data_folder/data.csv'
# or if it's in the same directory
my_data_path = THIS_DIR / 'testdata.csv'
Unit tests that access the file system are generally not a good idea. Tests should be self-contained; by making your test data external to the test, it is no longer immediately obvious which test the CSV file belongs to, or even whether it is still in use.
A preferable solution is to patch open and make it return a file-like object.
from unittest import TestCase
from unittest.mock import patch, mock_open
from textwrap import dedent

class OpenTest(TestCase):
    DATA = dedent("""
        a,b,c
        x,y,z
        """).strip()

    @patch("builtins.open", mock_open(read_data=DATA))
    def test_open(self):
        # Due to how the patching is done, any module accessing `open' for the
        # duration of this test gets access to a mock instead (not just the test
        # module).
        with open("filename", "r") as f:
            result = f.read()

        open.assert_called_once_with("filename", "r")
        self.assertEqual(self.DATA, result)
        self.assertEqual("a,b,c\nx,y,z", result)
In my opinion, the best way to handle these cases is to program via inversion of control.
In the two sections below I first show what a no-inversion-of-control solution looks like. The second section shows a solution with inversion of control, and how this code can be tested without a mocking framework.
At the end I state some personal pros and cons that do not at all claim to be correct or complete. Feel free to comment with additions and corrections.
No inversion of control (no dependency injection)
You have a class that uses the std open method from python.
class UsesOpen(object):
    def some_method(self, path):
        with open(path) as f:
            process(f)

# how the class is being used in the open
def main():
    uses_open = UsesOpen()
    uses_open.some_method('/my/path')
Here I have used open explicitly in my code, so the only way to write tests for it would be to use explicit test-data (files) or use a mocking-framework like Dunes suggests.
But there is still another way:
My suggestion: Inversion of control (with dependency injection)
Now I rewrote the class differently:
class UsesOpen(object):
    def __init__(self, myopen):
        self.__open = myopen

    def some_method(self, path):
        with self.__open(path) as f:
            process(f)

# how the class is being used in the open
def main():
    uses_open = UsesOpen(open)
    uses_open.some_method('/my/path')
In this second example I injected the dependency for open into the constructor (Constructor Dependency Injection).
Writing tests for inversion of control
Now I can easily write tests and use my test version of open when I need it:
EXAMPLE_CONTENT = """my file content
as an example
this can be anything"""

TEST_FILES = {
    '/my/long/fake/path/to/a/file.conf': EXAMPLE_CONTENT
}

class MockFile(object):
    def __init__(self, content):
        self.__content = content

    def read(self):
        return self.__content

    def __enter__(self):
        return self

    def __exit__(self, type, value, tb):
        pass

class MockFileOpener(object):
    def __init__(self, test_files):
        self.__test_files = test_files

    def open(self, path, *args, **kwargs):
        return MockFile(self.__test_files[path])

class TestUsesOpen(object):
    def test_some_method(self):
        test_opener = MockFileOpener(TEST_FILES)
        uses_open = UsesOpen(test_opener.open)

        # assert that uses_open.some_method('/my/long/fake/path/to/a/file.conf')
        # does the right thing
Pro/Con
Pro Dependency Injection
no need to learn a mocking framework for tests
complete control over the classes and methods that have to be faked
also, changing and evolving your code is easier in general
code quality normally improves, as one of the most important factors is being able to respond to changes as easily as possible
using dependency injection and a dependency injection framework is generally a respected way to work on a project: https://en.wikipedia.org/wiki/Dependency_injection
Con Dependency Injection
a little bit more code to write in general
tests are not as short as when patching a class via @patch
constructors can get overloaded with dependencies
you need to somehow learn to use dependency-injection
For test discovery it is recommended to make your test folder a package. In this case you can access resources in the test folder using importlib.resources (mind Python version compatibility of the individual functions; backports are available as importlib_resources), as described here, e.g. like:
import importlib.resources
test_file_path_str = str(importlib.resources.files('tests').joinpath('testdata.csv'))
test_function_expecting_filename(test_file_path_str)
Like this you do not need to rely on inferring file locations of your code.
Your tests should not open the file directly; every test should copy the file and work with its copy.
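A minimal sketch of that idea (the file name follows the question; setUp-based copying into a temporary directory is one possible way to do it, not the only one):
import os
import shutil
import tempfile
import unittest

THIS_DIR = os.path.dirname(os.path.abspath(__file__))

class CopiedDataTest(unittest.TestCase):
    def setUp(self):
        # copy the checked-in test data into a throwaway directory
        self.tmp_dir = tempfile.mkdtemp()
        self.addCleanup(shutil.rmtree, self.tmp_dir)
        self.data_path = os.path.join(self.tmp_dir, 'testdata.csv')
        shutil.copy(os.path.join(THIS_DIR, 'testdata.csv'), self.data_path)

    def test_uses_the_copy(self):
        # the object under test would be given self.data_path instead of the original
        with open(self.data_path) as f:
            self.assertTrue(f.read())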

python unittest faulty module

I have a module that I need to test in python.
I'm using the unittest framework but I ran into a problem.
The module has some method definitions, one of which is used when it's imported (readConfiguration) like so:
...
def readConfiguration(file="default.xml"):
    # do some reading from xml
    ...

readConfiguration()
This is a problem because when I try to import the module, it also tries to run the readConfiguration function, which fails the module and the program (a configuration file does not exist in the test environment).
I'd like to be able to test the module independent of any configuration files.
I didn't write the module and it cannot be re-factored.
I know I can include a dummy configuration file but I'm looking for a "cleaner", more elegant, solution.
As commenters have already pointed out, imports should never have side effects, so try to get the module changed if at all possible.
If you really, absolutely, cannot do this, there might be another way: let readConfiguration() be called, but stub out its dependencies. For instance, if it uses the builtin open() function, you could mock that, as demonstrated in the mock documentation:
>>> from unittest.mock import MagicMock, patch, sentinel
>>> mock = MagicMock(return_value=sentinel.file_handle)
>>> with patch('builtins.open', mock):
...     import the_broken_module
...     # do your testing here
Replace sentinel.file_handle with StringIO("<contents of mock config file>") if you need to supply actual content.
It's brittle as it depends on the implementation of readConfiguration(), but if there really is no other way, it might be useful as a last resort.
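As a hedged sketch of the StringIO variant (the module name the_broken_module comes from the answer; the XML content and the final assertion are assumptions):
import unittest
from io import StringIO
from unittest.mock import patch

class ImportWithStubbedConfigTest(unittest.TestCase):
    def test_module_imports_without_config_file(self):
        fake_config = StringIO("<configuration/>")
        # readConfiguration() receives the fake file instead of default.xml
        with patch('builtins.open', return_value=fake_config):
            import the_broken_module
        self.assertTrue(hasattr(the_broken_module, 'readConfiguration'))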

How do I mock the filesystem in Python unit tests?

Is there a standard way (without installing third party libraries) to do cross platform filesystem mocking in Python? If I have to go with a third party library, which library is the standard?
pyfakefs (homepage) does what you want – a fake filesystem; it’s third-party, though that party is Google. See How to replace file-access references for a module under test for discussion of use.
For mocking, unittest.mock is the standard library for Python 3.3+ (PEP 0417); for earlier version see PyPI: mock (for Python 2.5+) (homepage).
Terminology in testing and mocking is inconsistent; using the Test Double terminology of Gerard Meszaros, you’re asking for a “fake”: something that behaves like a filesystem (you can create, open, and delete files), but isn’t the actual file system (in this case it’s in-memory), so you don’t need to have test files or a temporary directory.
In classic mocking, you would instead mock out the system calls (in Python, mock out functions in the os module, like os.remove and os.listdir), but that's much more fiddly.
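A minimal sketch of the pyfakefs "fake" style (method names such as setUpPyfakefs and fs.create_file are taken from recent pyfakefs releases; treat them as an assumption and check the pyfakefs docs for your version):
from pyfakefs.fake_filesystem_unittest import TestCase

class FakeFilesystemTest(TestCase):
    def setUp(self):
        self.setUpPyfakefs()  # from here on, file calls hit the in-memory fake

    def test_create_and_read(self):
        self.fs.create_file('/data/config.txt', contents='hello')
        with open('/data/config.txt') as f:
            self.assertEqual('hello', f.read())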
pytest is gaining a lot of traction, and it can do all of this using tmpdir and monkeypatching (mocking).
You can use the tmpdir function argument, which will provide a temporary directory unique to the test invocation, created in the base temporary directory (which is by default a sub-directory of the system temporary directory).
import os

def test_create_file(tmpdir):
    p = tmpdir.mkdir("sub").join("hello.txt")
    p.write("content")
    assert p.read() == "content"
    assert len(tmpdir.listdir()) == 1
The monkeypatch function argument helps you to safely set/delete an attribute, dictionary item or environment variable or to modify sys.path for importing.
import os

def test_some_interaction(monkeypatch):
    monkeypatch.setattr(os, "getcwd", lambda: "/")
You can also pass it a function instead of using lambda.
import os.path

def getssh():  # pseudo application code
    return os.path.join(os.path.expanduser("~admin"), '.ssh')

def test_mytest(monkeypatch):
    def mockreturn(path):
        return '/abc'
    monkeypatch.setattr(os.path, 'expanduser', mockreturn)
    x = getssh()
    assert x == '/abc/.ssh'
    # You can still use lambda when passing arguments, e.g.
    # monkeypatch.setattr(os.path, 'expanduser', lambda x: '/abc')
If your application has a lot of interaction with the file system, then it might be easier to use something like pyfakefs, as mocking would become tedious and repetitive.
The standard mocking framework in Python 3.3+ is unittest.mock; you can use this for the filesystem or anything else.
You could also simply hand roll it by mocking via monkey patching:
A trivial example:
import os.path
os.path.isfile = lambda path: path == '/path/to/testfile'
A bit more full (untested):
import classtobetested
import unittest
import contextlib

@contextlib.contextmanager
def monkey_patch(module, fn_name, patch):
    unpatch = getattr(module, fn_name)
    setattr(module, fn_name, patch)
    try:
        yield
    finally:
        setattr(module, fn_name, unpatch)


class TestTheClassToBeTested(unittest.TestCase):
    def test_with_fs_mocks(self):
        with monkey_patch(classtobetested.os.path,
                          'isfile',
                          lambda path: path == '/path/to/file'):
            self.assertTrue(classtobetested.testable())
In this example, the actual mocks are trivial, but you could back them with something that has state, so that it can represent filesystem actions such as save and delete. Yes, this is all a bit ugly since it entails replicating/simulating a basic filesystem in code.
Note that you can't monkey patch python builtins. That being said...
For earlier versions, if at all possible use a third party library; I'd go with Michael Foord's awesome Mock, which has been unittest.mock in the standard library since 3.3 thanks to PEP 0417, and which you can get on PyPI for Python 2.5+. And, it can mock builtins!
Faking or Mocking?
Personally, I find that there are a lot of edge cases in filesystem things (like opening the file with the right permissions, string-vs-binary, read/write mode, etc), and using an accurate fake filesystem can find a lot of bugs that you might not find by mocking. In this case, I would check out the memoryfs module of pyfilesystem (it has various concrete implementations of the same interface, so you can swap them out in your code).
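A minimal sketch of the in-memory filesystem idea, assuming the current pyfilesystem2 (fs) package rather than the older pyfilesystem API mentioned above; check the pyfilesystem docs before relying on the exact names:
from fs.memoryfs import MemoryFS

def test_roundtrip_in_memory():
    mem_fs = MemoryFS()
    try:
        # write and read entirely in memory; nothing touches the real disk
        with mem_fs.open('hello.txt', 'w') as f:
            f.write('hello, world!')
        with mem_fs.open('hello.txt') as f:
            assert f.read() == 'hello, world!'
    finally:
        mem_fs.close()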
Mocking (and without Monkey Patching!):
That said, if you really want to mock, you can do that easily with Python's unittest.mock library:
import unittest.mock

# production code file; note the default parameter
def make_hello_world(path, open_func=open):
    with open_func(path, 'w+') as f:
        f.write('hello, world!')

# test code file
def test_make_hello_world():
    # MagicMock supports the context-manager protocol; make it yield itself
    file_mock = unittest.mock.MagicMock()
    file_mock.__enter__.return_value = file_mock
    open_mock = unittest.mock.Mock(return_value=file_mock)

    # When `make_hello_world()` is called
    make_hello_world('/hello/world.txt', open_func=open_mock)

    # Then expect the file was opened and written-to properly
    open_mock.assert_called_once_with('/hello/world.txt', 'w+')
    file_mock.write.assert_called_once_with('hello, world!')
The above example only demonstrates creating and writing to files via mocking the open() method, but you could just as easily mock any method.
The standard unittest.mock library has a mock_open() function which provides basic mocking of the file system.
Benefits: It's part of the standard library, and inherits the various features of Mocks, including checking call parameters & usage.
Drawbacks: It doesn't maintain filesystem state the way pytest or pyfakefs or mockfs does, so it's harder to test functions that do R/W interactions or interact with multiple files simultaneously.
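A minimal sketch of mock_open() in use (the function under test is an assumption, not from the answer):
from unittest.mock import mock_open, patch

def read_all(path):
    with open(path) as f:
        return f.read()

def test_read_all():
    m = mock_open(read_data='a,b,c\nx,y,z\n')
    # any code calling open() during the patch gets the mock instead
    with patch('builtins.open', m):
        assert read_all('ignored.txt') == 'a,b,c\nx,y,z\n'
    m.assert_called_once_with('ignored.txt')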
