How do I mock the filesystem in Python unit tests? - python

Is there a standard way (without installing third party libraries) to do cross platform filesystem mocking in Python? If I have to go with a third party library, which library is the standard?

pyfakefs (homepage) does what you want – a fake filesystem; it’s third-party, though that party is Google. See How to replace file-access references for a module under test for discussion of use.
For mocking, unittest.mock is the standard library for Python 3.3+ (PEP 0417); for earlier version see PyPI: mock (for Python 2.5+) (homepage).
Terminology in testing and mocking is inconsistent; using the Test Double terminology of Gerard Meszaros, you’re asking for a “fake”: something that behaves like a filesystem (you can create, open, and delete files), but isn’t the actual file system (in this case it’s in-memory), so you don’t need to have test files or a temporary directory.
In classic mocking, you would instead mock out the system calls (in Python, mock out functions in the os module, like os.rm and os.listdir), but that’s much more fiddly.

pytest is gaining a lot of traction, and it can do all of this using tmpdir and monkeypatching (mocking).
You can use the tmpdir function argument which will provide a temporary directory unique to the test invocation, created in the base temporary directory (which are by default created as sub-directories of the system temporary directory).
import os
def test_create_file(tmpdir):
p = tmpdir.mkdir("sub").join("hello.txt")
p.write("content")
assert p.read() == "content"
assert len(tmpdir.listdir()) == 1
The monkeypatch function argument helps you to safely set/delete an attribute, dictionary item or environment variable or to modify sys.path for importing.
import os
def test_some_interaction(monkeypatch):
monkeypatch.setattr(os, "getcwd", lambda: "/")
You can also pass it a function instead of using lambda.
import os.path
def getssh(): # pseudo application code
return os.path.join(os.path.expanduser("~admin"), '.ssh')
def test_mytest(monkeypatch):
def mockreturn(path):
return '/abc'
monkeypatch.setattr(os.path, 'expanduser', mockreturn)
x = getssh()
assert x == '/abc/.ssh'
# You can still use lambda when passing arguments, e.g.
# monkeypatch.setattr(os.path, 'expanduser', lambda x: '/abc')
If your application has a lot of interaction with the file system, then it might be easier to use something like pyfakefs, as mocking would become tedious and repetitive.

The standard mocking framework in Python 3.3+ is unittest.mock; you can use this for the filesystem or anything else.
You could also simply hand roll it by mocking via monkey patching:
A trivial example:
import os.path
os.path.isfile = lambda path: path == '/path/to/testfile'
A bit more full (untested):
import classtobetested
import unittest
import contextlib
#contextlib.contextmanager
def monkey_patch(module, fn_name, patch):
unpatch = getattr(module, fn_name)
setattr(module, fn_name)
try:
yield
finally:
setattr(module, fn_name, unpatch)
class TestTheClassToBeTested(unittest.TestCase):
def test_with_fs_mocks(self):
with monkey_patch(classtobetested.os.path,
'isfile',
lambda path: path == '/path/to/file'):
self.assertTrue(classtobetested.testable())
In this example, the actual mocks are trivial, but you could back them with something that has state so that can represent filesystem actions, such as save and delete. Yes, this is all a bit ugly since it entails replicating/simulating basic filesystem in code.
Note that you can't monkey patch python builtins. That being said...
For earlier versions, if at all possible use a third party library, I'd go with Michael Foord's awesome Mock, which is now unittest.mock in the standard library since 3.3+ thanks to PEP 0417, and you can get it on PyPI for Python 2.5+. And, it can mock builtins!

Faking or Mocking?
Personally, I find that there are a lot of edge cases in filesystem things (like opening the file with the right permissions, string-vs-binary, read/write mode, etc), and using an accurate fake filesystem can find a lot of bugs that you might not find by mocking. In this case, I would check out the memoryfs module of pyfilesystem (it has various concrete implementations of the same interface, so you can swap them out in your code).
Mocking (and without Monkey Patching!):
That said, if you really want to mock, you can do that easily with Python's unittest.mock library:
import unittest.mock
# production code file; note the default parameter
def make_hello_world(path, open_func=open):
with open_func(path, 'w+') as f:
f.write('hello, world!')
# test code file
def test_make_hello_world():
file_mock = unittest.mock.Mock(write=unittest.mock.Mock())
open_mock = unittest.mock.Mock(return_value=file_mock)
# When `make_hello_world()` is called
make_hello_world('/hello/world.txt', open_func=open_mock)
# Then expect the file was opened and written-to properly
open_mock.assert_called_once_with('/hello/world.txt', 'w+')
file_mock.write.assert_called_once_with('hello, world!')
The above example only demonstrates creating and writing to files via mocking the open() method, but you could just as easily mock any method.

The standard unittest.mock library has a mock_open() function which provides basic mocking of the file system.
Benefits: It's part of the standard library, and inherits the various features of Mocks, including checking call parameters & usage.
Drawbacks: It doesn't maintain filesystem state the way pytest or pyfakefs or mockfs does, so it's harder to test functions that do R/W interactions or interact with multiple files simultaneously.

Related

Making util file not accessible in python

I am building a python library. The functions I want available for users are in stemmer.py. Stemmer.py uses stemmerutil.py
I was wondering whether there is a way to make stemmerutil.py not accessible to users.
If you want to hide implementation details from your users, there are two routes that you can go. The first uses conventions to signal what is and isn't part of the public API, and the other is a hack.
The convention for declaring an API within a python library is to add all classes/functions/names that should be exposed into an __all__-list of the topmost __init__.py. It doesn't do that many useful things, its main purpose nowadays is a symbolic "please use this and nothing else". Yours would probably look somewhat like this:
urdu/urdu/__init__.py
from urdu.stemmer import Foo, Bar, Baz
__all__ = [Foo, Bar, Baz]
To emphasize the point, you can also give all definitions within stemmerUtil.py an underscore before their name, e.g. def privateFunc(): ... becomes def _privateFunc(): ...
But you can also just hide the code from the interpreter by making it a resource instead of a module within the package and loading it dynamically. This is a hack, and probably a bad idea, but it is technically possible.
First, you rename stemmerUtil.py to just stemmerUtil - now it is no longer a python module and can't be imported with the import keyword. Next, update this line in stemmer.py
import stemmerUtil
with
import importlib.util
import importlib.resources
# in python3.7 and lower, this is importlib_resources and needs to be installed first
stemmer_util_spec = importlib.util.spec_from_loader("stemmerUtil", loader=None)
stemmerUtil = importlib.util.module_from_spec(stemmer_util_spec)
with importlib.resources.path("urdu", "stemmerUtil") as stemmer_util_path:
with open(stemmer_util_path) as stemmer_util_file:
stemmer_util_code = stemmer_util_file.read()
exec(stemmer_util_code, stemmerUtil.__dict__)
After running this code, you can use the stemmerUtil module as if you had imported it, but it is invisible to anyone who installed your package - unless they run this exact code as well.
But as I said, if you just want to communicate to your users which part of your package is the public API, the first solution is vastly preferable.

What is the best way to mock os.system for unit test (PyTest)

I have a Python script that does multiple os.system calls. Asserting against the series of them as a list of strings will be easy (and relatively elegant).
What isn't so easy is intercepting (and blocking) the actual calls. In the script in question, I could abstract os.system in the SUT (*) like so:
os_system = None
def main():
return do_the_thing(os.system)
def do_the_thing(os_sys):
global os_system
os_system = os_sys
# all other function should use os_system instead of os.system
My test invokes my_script.do_the_thing() instead of my_script.main() of course (leaving a tiny amount of untested code).
Alternate option: I could leave the SUT untouched and replace os.system globally in the test method before invoking main() in the SUT.
That leaves me with new problems in that that's a global and lasting change. Fine, so I'd use a try/finally in the same test method, and replace the original before leaving the test method. That'd work whether the test method passes or fails.
Is there a safe and elegant setup/teardown centric way of doing this for PyTest, though?
Additional complications: I want to do the same for stdout and stderr. Yes, it really is a main() script that I am testing.
SUT == System Under Test
The Python 3 (>= 3.3) standard library has a great tutorial about Mock in the official documentation. For Python 2, you can use the backported library: Mock on PyPi.
Here is a sample usage. Say you want to mock the call to os.system in this function:
import os
def my_function(src_dir):
os.system('ls ' + src_dir)
To do that, you can use the unittest.mock.patch decorator, like this:
import unittest.mock
#unittest.mock.patch('os.system')
def test_my_function(os_system):
# type: (unittest.mock.Mock) -> None
my_function("/path/to/dir")
os_system.assert_called_once_with('ls /path/to/dir')
This test function will patch the os.system call during its execution. os.system is restored at the end.
Then, there are several "assert" method to check the calls, the parameters, and the results. You can also check that an exception is raised in certain circonstances.
Just want to add an important detail.
If your code uses import system for example like this:
myls.py:
import os
def do_ls():
os.system('ls')
Then the patch in your test should look like this:
test_myls.py:
from unittest.mock import patch
#patch('os.system')
def test_do_ls(mock_system):
do_ls()
mock_system.assert_called()
However, if the code uses from os import system, e.g. like this:
myls.py:
from os import system
def do_ls():
system('ls')
Then the patch in your test should look like this:
test_myls.py:
from unittest.mock import patch
#patch('myls.system')
def test_do_ls(mock_system):
do_ls()
mock_system.assert_called()
This eluded me for a bit because I had forgotten to read the section on where to patch as I originally intended. If patching does not seem to work, this is one of the points to look at.

Mock an entire module in python

I have an application that imports a module from PyPI.
I want to write unittests for that application's source code, but I do not want to use the module from PyPI in those tests.
I want to mock it entirely (the testing machine will not contain that PyPI module, so any import will fail).
Currently, each time I try to load the class I want to test in the unittests, I immediately get an import error. so I thought about maybe using
try:
except ImportError:
and catch that import error, then use command_module.run().
This seems pretty risky/ugly and I was wondering if there's another way.
Another idea was writing an adapter to wrap that PyPI module, but I'm still working on that.
If you know any way I can mock an entire python package, I would appreciate it very much.
Thanks.
If you want to dig into the Python import system, I highly recommend David Beazley's talk.
As for your specific question, here is an example that tests a module when its dependency is missing.
bar.py - the module you want to test when my_bogus_module is missing
from my_bogus_module import foo
def bar(x):
return foo(x) + 1
mock_bogus.py - a file in with your tests that will load a mock module
from mock import Mock
import sys
import types
module_name = 'my_bogus_module'
bogus_module = types.ModuleType(module_name)
sys.modules[module_name] = bogus_module
bogus_module.foo = Mock(name=module_name+'.foo')
test_bar.py - tests bar.py when my_bogus_module is not available
import unittest
from mock_bogus import bogus_module # must import before bar module
from bar import bar
class TestBar(unittest.TestCase):
def test_bar(self):
bogus_module.foo.return_value = 99
x = bar(42)
self.assertEqual(100, x)
You should probably make that a little safer by checking that my_bogus_module isn't actually available when you run your test. You could also look at the pydoc.locate() method that will try to import something, and return None if it fails. It seems to be a public method, but it isn't really documented.
While #Don Kirkby's answer is correct, you might want to look at the bigger picture. I borrowed the example from the accepted answer:
import pypilib
def bar(x):
return pypilib.foo(x) + 1
Since pypilib is only available in production, it is not suprising that you have some trouble when you try to unit test bar. The function requires the external library to run, therefore it has to be tested with this library. What you need is an integration test.
That said, you might want to force unit testing, and that's generally a good idea because it will improve the confidence you (and others) have in the quality of your code. To widen the unit test area, you have to inject dependencies. Nothing prevents you (in Python!) from passing a module as a parameter (the type is types.ModuleType):
try:
import pypilib # production
except ImportError:
pypilib = object() # testing
def bar(x, external_lib = pypilib):
return external_lib.foo(x) + 1
Now, you can unit test the function:
import unittest
from unittest.mock import Mock
class Test(unittest.TestCase):
def test_bar(self):
external_lib = Mock(foo = lambda x: 3*x)
self.assertEqual(10, bar(3, external_lib))
if __name__ == "__main__":
unittest.main()
You might disapprove the design. The try/except part is a bit cumbersome, especially if you use the pypilib module in several modules of your application. And you have to add a parameter to each function that relies on the external library.
However, the idea to inject a dependency to the external library is useful, because you can control the input and test the output of your class methods, even if the external library is not within your control. Especially if the imported module is stateful, the state might be difficult to reproduce in a unit test. In this case, passing the module as a parameter may be a solution.
But the usual way to deal with this situation is called dependency inversion principle (the D of SOLID): you should define the (abstract) boundaries of your application, ie what you need from the outside world. Here, this is bar and other functions, preferably grouped in one or many classes:
import pypilib
import other_pypilib
class MyUtil:
"""
All I need from outside world
"""
#staticmethod
def bar(x):
return pypilib.foo(x) + 1
#staticmethod
def baz(x, y):
return other_pypilib.foo(x, y) * 10.0
...
# not every method has to be static
Each time you need one of these functions, just inject an instance of the class in your code:
class Application:
def __init__(self, util: MyUtil):
self._util = util
def something(self, x, y):
return self._util.baz(self._util.bar(x), y)
The MyUtil class must be as slim as possible, but must remain abstract from the underlying library. It is a tradeoff. Obviously, Application can be unit tested (just inject a Mock instead of an instance of MyUtil) while, under some circumstances (like a PyPi library not available during tests, a module that runs inside a framework only, etc.), MyUtil can be only tested within an integration test. If you need to unit test the boundaries of your application, you can use #Don Kirkby's method.
Note that the second benefit, after unit testing, is that if you change the libraries you are using (deprecation, license issue, cost, ...), you just have to rewrite the MyUtil class, using some other libraries or coding it from scratch. Your application is protected from the wild outside world.
Clean Code by Robert C. Martin has a full chapter on the boundaries.
Summary Before using #Don Kirkby's method or any other method, be sure to define the boundaries of your application irrespective of the specific libraries you are using. This, of course, does not apply to the Python standard library...
For a more explicit and granular approach:
import unittest
from unittest.mock import MagicMock, patch
try:
import bogus_module
except ModuleNotFoundError:
bogus_module = MagicMock()
#patch.dict('sys.modules', bogus_module=bogus_module)
class PlatformTests(unittest.TestCase):
...
Using the patch.dict decorator gives you granular control: it only applies to the class / method it is applied to.

How to mock imported library methods with unittest.mock in Python 3.5?

Is it possible to mock methods of imported modules with unittest.mock in Python 3.5?
# file my_function.py
import os
my_function():
# do something with os, e.g call os.listdir
return os.listdir(os.getcwd())
# file test_my_function.py
test_my_function():
os = MagickMock()
os.listdir = MagickMock(side_effect=["a", "b"])
self.assertEqual("a", my_function())
I expected that the the os.listdir method returns the specified side_effect "a" on the first call, but inside of my_function the unpatched os.listdir is called.
unittest.mock have two main duties:
Define Mock objects: object designed to follow your screenplay and record every access to your mocked object
patching references and recover the original state
In your example, you need both functionalities: Patching os.listdir reference used in production code by a mock where you can have complete control of how it will respond. There are a lot of ways to use patch, some details to take care on how use it and cavelets to know.
In your case you need to test my_function() behaviour and you need to patch both os.listdir() and os.getcwd(). Moreover what you need is control the return_value (take a look to the pointed documentation for return_value and side_effect differences).
I rewrote your example a little bit to make it more complete and clear:
my_function(nr=0):
l = os.listdir(os.getcwd())
l.sort()
return l[nr]
#patch("os.getcwd")
#patch("os.listdir", return_value=["a","b"])
def test_my_function(mock_ld, mock_g):
self.assertEqual("a", my_function(0))
mock_ld.assert_called_with(mock_g.return_value)
self.assertEqual("a", my_function())
self.assertEqual("b", my_function(1))
with self.assertRaises(IndexError):
my_function(3)
I used the decorator syntax because I consider it the cleaner way to do it; moreover, to avoid the introduction of too many details I didn't use autospecing
that I consider a very best practice.
Last note: mocking is a powerful tool but use it and not abuse of it, patch just you need to patch and nothing more.

python unittest faulty module

I have a module that I need to test in python.
I'm using the unittest framework but I ran into a problem.
The module has some method definitions, one of which is used when it's imported (readConfiguration) like so:
.
.
.
def readConfiguration(file = "default.xml"):
# do some reading from xml
readConfiguration()
This is a problem because when I try to import the module it also tries to run the "readConfiguration" method which fails the module and the program (a configuration file does not exist in the test environment).
I'd like to be able to test the module independent of any configuration files.
I didn't write the module and it cannot be re-factored.
I know I can include a dummy configuration file but I'm looking for a "cleaner", more elegant, solution.
As commenters have already pointed out, imports should never have side effects, so try to get the module changed if at all possible.
If you really, absolutely, cannot do this, there might be another way: let readConfiguration() be called, but stub out its dependencies. For instance, if it uses the builtin open() function, you could mock that, as demonstrated in the mock documentation:
>>> mock = MagicMock(return_value=sentinel.file_handle)
>>> with patch('builtins.open', mock):
... import the_broken_module
... # do your testing here
Replace sentinel.file_handle with StringIO("<contents of mock config file>") if you need to supply actual content.
It's brittle as it depends on the implementation of readConfiguration(), but if there really is no other way, it might be useful as a last resort.

Categories