Python unittest howto

I'd like to know how I could unit-test the following module.
import os
import urllib2

def download_distribution(url, tempdir):
    """ Method which downloads the distribution from PyPI """
    print "Attempting to download from %s" % (url,)
    try:
        url_handler = urllib2.urlopen(url)
        distribution_contents = url_handler.read()
        url_handler.close()
        filename = get_file_name(url)
        file_handler = open(os.path.join(tempdir, filename), "w")
        file_handler.write(distribution_contents)
        file_handler.close()
        return True
    except (ValueError, IOError):
        return False

Unit test proponents will tell you that unit tests should be self-contained, that is, they should not access the network or the filesystem (especially not in writing mode). Network and filesystem tests are beyond the scope of unit tests (though you might subject them to integration tests).
Generally speaking, for such a case I'd extract the urllib and file-writing code into separate functions (which would not be unit-tested) and inject mock functions during unit testing.
I.e. (slightly abbreviated for better reading):
def get_web_content(url):
    # Extracted code
    url_handler = urllib2.urlopen(url)
    content = url_handler.read()
    url_handler.close()
    return content

def write_to_file(content, filename, tempdir):
    # Extracted code
    file_handler = open(os.path.join(tempdir, filename), "w")
    file_handler.write(content)
    file_handler.close()

def download_distribution(url, tempdir):
    # Original code, after extractions
    distribution_contents = get_web_content(url)
    filename = get_file_name(url)
    write_to_file(distribution_contents, filename, tempdir)
    return True
And, in the test file:
import unittest

import module_I_want_to_test

def mock_web_content(url):
    return """Some fake content, useful for testing"""

def mock_write_to_file(content, filename, tempdir):
    # In this case, do nothing, as we don't do filesystem meddling while unit testing
    pass

module_I_want_to_test.get_web_content = mock_web_content
module_I_want_to_test.write_to_file = mock_write_to_file

class SomeTests(unittest.TestCase):
    # And so on...
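If you'd rather not overwrite the module's attributes for good, the same injection can be done with mock.patch (unittest.mock on Python 3, the mock package on Python 2), which restores the real functions after each test. A minimal sketch, assuming the module above (the patched return values are arbitrary illustration data):
import unittest
from unittest import mock  # on Python 2: import mock

import module_I_want_to_test

class SomeTests(unittest.TestCase):
    @mock.patch("module_I_want_to_test.write_to_file")
    @mock.patch("module_I_want_to_test.get_file_name", return_value="pkg-1.0.tar.gz")
    @mock.patch("module_I_want_to_test.get_web_content",
                return_value="Some fake content, useful for testing")
    def test_download_distribution(self, mock_content, mock_name, mock_write):
        # the patched helpers are swapped in only for the duration of this test
        result = module_I_want_to_test.download_distribution("http://example.com/pkg", "/tmp")
        self.assertTrue(result)
        mock_write.assert_called_with("Some fake content, useful for testing",
                                      "pkg-1.0.tar.gz", "/tmp")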
And I second Daniel's suggestion: you should read some more in-depth material on unit testing.

Vague question. If you're just looking for a primer for unit testing in general with a Python slant, I recommend Mark Pilgrim's "Dive Into Python" which has a chapter on unit testing with Python. Otherwise you need to clear up what specific issues you are having testing that code.

To mock urlopen you can pre-fetch some example responses that you can then use in your unit tests. Here's an example to get you started:
def urlopen(url):
    urlclean = url.split('?')[0]  # ignore GET parameters
    files = {
        'http://example.com/foo.xml': 'foo.xml',
        'http://example.com/bar.xml': 'bar.xml',
    }
    return open(files[urlclean])

yourmodule.urllib.urlopen = urlopen
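A usage sketch for this canned-response approach (the test class, the yourmodule.parse_feed function and the local foo.xml file are placeholders, not part of the original answer): the fake urlopen above serves a pre-fetched file from disk, so the test never touches the network.
import unittest

import yourmodule

class FetchTests(unittest.TestCase):
    def setUp(self):
        # swap the real urlopen for the canned-response version defined above
        yourmodule.urllib.urlopen = urlopen

    def test_parses_prefetched_response(self):
        # GET parameters are stripped by the fake, so this URL maps to foo.xml
        result = yourmodule.parse_feed('http://example.com/foo.xml?page=1')
        self.assertIsNotNone(result)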

Related

Python unit test for a method that contains a request

I have a method which contains external REST API calls.
For example:
import requests

def get_dataset():
    url = requests.get("http://api:5001/get_trainingdata")
    filename = url.text[0]
    return filename
When I use @patch for this function, I am able to write a unit test, but coverage does not cover the whole function.
How can I write a unit test case for this method with full coverage?
My test case:
@mock.patch('api.get_dataset')
def test_using_decorator1(self, mocked_get_dataset):
    file = [{"file": "ddddd"}]
    mocked_get_dataset.return_value = Mock()
    mocked_get_dataset.return_value.json.return_value = file
    filename = file[0]
    self.assertEqual(filename, file[0])
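Patching get_dataset itself replaces the very function whose lines you want covered, so its body never executes. A minimal sketch of the usual fix, assuming get_dataset lives in a module named api (the name taken from the patch target above): patch requests.get inside that module so get_dataset itself runs.
from unittest import mock
import unittest

import api  # the module assumed to define get_dataset

class GetDatasetTests(unittest.TestCase):
    @mock.patch('api.requests.get')
    def test_get_dataset(self, mocked_get):
        # give the fake response the attribute the real code reads
        mocked_get.return_value.text = "training.csv"
        filename = api.get_dataset()
        mocked_get.assert_called_once_with("http://api:5001/get_trainingdata")
        # get_dataset returns url.text[0], i.e. the first character
        self.assertEqual(filename, "t")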

Run pytest for each file in directory

I'm trying to build a routine that calls a pytest class for each PDF document in the current directory... Let me explain.
Let's say I have this test file:
import pytest

class TestHeader:
    # asserts...

class TestBody:
    # asserts...
This script needs to test each PDF document in my cwd.
Here is my best attempt:
import glob
import pytest

class TestHeader:
    # asserts...

class TestBody:
    # asserts...

filelist = glob.glob('*.pdf')
for file in filelist:
    # magically call pytest for each file
How would I approach this?
EDIT: Complementing my question.
I have a huge function that extracts each document's data, let's call it extract_pdf.
This function returns a tuple (header, body).
My current attempt looks like this:
import glob
import pytest

class TestHeader:
    # asserts...

class TestBody:
    # asserts...

filelist = glob.glob('*.pdf')
for file in filelist:
    header, body = extract_pdf(file)
    pytest.main(<pass header and body as args for pytest>)
I need to parse each document prior to testing. Can it be done this way?
The best way to do this is through dynamic parametrization of the test cases.
This can be achieved using the pytest_generate_tests hook:
import glob

def pytest_generate_tests(metafunc):
    filelist = glob.glob('*.pdf')
    metafunc.parametrize("fileName", filelist)
NOTE: fileName should be one of the arguments of your test function.
This will result in the test case being executed for each file in the directory, and the test cases will be reported like:
TestFunc[File1]
TestFunc[File2]
TestFunc[File3]
...and so on.
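A minimal sketch of how the pieces fit together (the conftest.py placement, the my_parser import, and the placeholder asserts are assumptions for illustration): the hook goes into conftest.py, and every test that takes a fileName argument is run once per PDF.
# conftest.py
import glob

def pytest_generate_tests(metafunc):
    if "fileName" in metafunc.fixturenames:
        metafunc.parametrize("fileName", glob.glob('*.pdf'))

# test_documents.py
from my_parser import extract_pdf  # wherever extract_pdf is defined (assumed)

class TestHeader:
    def test_header_present(self, fileName):
        header, _ = extract_pdf(fileName)
        assert header  # real header asserts go here

class TestBody:
    def test_body_present(self, fileName):
        _, body = extract_pdf(fileName)
        assert body  # real body asserts go here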
This is expanding on the existing answer by @ArunKalirajaBaskaran.
The problem is that you have different test classes that want to use the same data, but you want to parse the data only once. If it is OK for you to read all of the data at once, you could read it into global variables and use these for parametrizing your tests:
import glob

def extract_data():
    filenames = []
    headers = []
    bodies = []
    for filename in glob.glob('*.pdf'):
        header, body = extract_pdf(filename)
        filenames.append(filename)
        headers.append(header)
        bodies.append(body)
    return filenames, headers, bodies

filenames, headers, bodies = extract_data()

def pytest_generate_tests(metafunc):
    if "header" in metafunc.fixturenames:
        # use the filename as ID for better test names
        metafunc.parametrize("header", headers, ids=filenames)
    elif "body" in metafunc.fixturenames:
        metafunc.parametrize("body", bodies, ids=filenames)

class TestHeader:
    def test_1(self, header):
        ...

    def test_2(self, header):
        ...

class TestBody:
    def test_1(self, body):
        ...
This is the same as using:
class TestHeader:
    @pytest.mark.parametrize("header", headers, ids=filenames)
    def test_1(self, header):
        ...

    @pytest.mark.parametrize("header", headers, ids=filenames)
    def test_2(self, header):
        ...
pytest_generate_tests just adds a bit of convenience so you don't have to repeat the parametrize decorator for each test.
The downside of this is of course that you will read in all of the data at once, which may cause a problem with memory usage if there are a lot of files. Your approach with pytest.main will not work, because that is the same as calling pytest on the command line with the given parameters. Parametrization can be done at the fixture level or at the test level (like here), but both need the parameters already evaluated at load time, so I don't see a possibility to do this lazily (apart from putting it all into one test). Maybe someone else has a better idea...

Create a gzip file-like object for unit testing

I want to test a Python function that reads a gzip file and extracts something from the file (using pytest).
import gzip

def my_function(file_path):
    output = []
    with gzip.open(file_path, 'rt') as f:
        for line in f:
            output.append('something from line')
    return output
Can I create a gzip file-like object that I can pass to my_function? The object should have defined content and should work with gzip.open().
I know that I can create a temporary gzip file in a fixture but this depends on the filesystem and other properties of the environment. Creating a file-like object from code would be more portable.
You can use the io and gzip libraries to create in-memory file objects. Example:
import io
import gzip

def inmem():
    stream = io.BytesIO()
    with gzip.open(stream, 'wb') as f:
        f.write(b'spam\neggs\n')
    stream.seek(0)
    return stream
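gzip.open() accepts an existing file object as well as a path, so the in-memory stream can be handed straight to the function under test. A short usage sketch, assuming the my_function from the question:
def test_my_function_with_inmem_gzip():
    stream = inmem()  # gzip-compressed bytes for two lines: spam and eggs
    output = my_function(stream)
    assert output == ['something from line', 'something from line']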
You should never try to test outside code in a unit test. Only test the code you wrote. If you find yourself testing gzip, something is wrong (the gzip developers should be writing their own unit tests). Instead, do something like this:
from unittest import mock

# 'my_module' stands in for whatever module defines my_function
@mock.patch('my_module.gzip')
def test_my_function(mock_gzip):
    mock_gzip.open.return_value = b'<whatever you expect to be returned from gzip>'
    file_path = 'testpath'
    output = my_function(file_path=file_path)
    mock_gzip.open.assert_called_with(file_path)
    assert output == b'<whatever you expect to be returned from your method>'
That's your whole unit test. All you want to know is that gzip.open() was called (and you assume it works, or else gzip is failing and that's their problem) and that you got back what you expected from the method being tested. You specify what gzip returns based on what you expect it to return, but you never actually invoke the real gzip in your test.
It's a bit verbose but I'd do something like this (I have assumed that you saved my_function to a file called patch_one.py):
import patch_one  # this is the file with my_function in it
from unittest.mock import patch
from unittest import TestCase

class MyTestCase(TestCase):

    def test_my_function(self):
        # because you used "with open(...) as f", we need a mock context
        class MyContext:
            def __enter__(self, *args, **kwargs):
                return [1, 2]  # note the two items

            def __exit__(self, *args, **kwargs):
                return None

        # in case we want to know the arguments to open()
        open_args = None

        def f(*args, **kwargs):
            def my_open(*args, **kwargs):
                nonlocal open_args
                open_args = args
                return MyContext()
            return my_open

        # patch the gzip.open in our file under test
        with patch('patch_one.gzip.open', new_callable=f):
            # finally, we can call the function we want to test
            ret_val = patch_one.my_function('not a real file path')

        # note the two items, corresponding to the list in __enter__()
        self.assertListEqual(['something from line', 'something from line'], ret_val)
        # check the arguments, just for fun
        self.assertEqual('rt', open_args[1])
If you want to try anything more complicated, I would recommend reading the unittest mock docs because how you import the "patch_one" file matters as does the string you pass to patch().
There will definitely be a way to do this with Mock or MagicMock but I find them a bit hard to debug so I went the long way round.
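For reference, a shorter version of the same idea is possible with MagicMock (a sketch under the same patch_one.py assumption as above; the context-manager behaviour comes from MagicMock's built-in magic-method support):
from unittest import mock
from unittest import TestCase

import patch_one

class MyMagicMockTestCase(TestCase):
    def test_my_function(self):
        fake_file = mock.MagicMock()
        # entering the "with" block yields an iterable of two fake lines
        fake_file.__enter__.return_value = ['line one\n', 'line two\n']
        with mock.patch('patch_one.gzip.open', return_value=fake_file) as fake_open:
            ret_val = patch_one.my_function('not a real file path')
        fake_open.assert_called_once_with('not a real file path', 'rt')
        self.assertListEqual(['something from line', 'something from line'], ret_val)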

pytest fixture which takes over error reporting

I'm writing a small fixture for implementing regression tests. The function under test does not contain any assert statements but produces output which is compared to a recorded output which is assumed to be correct.
This is a simplified snippet to demonstrate what I'm doing:
@pytest.yield_fixture()
def regtest(request):
    fp = cStringIO.StringIO()

    yield fp

    reset, full_path, id_ = _setup(request)
    if reset:
        _record_output(fp.getvalue(), full_path)
    else:
        failed = _compare_output(fp.getvalue(), full_path, request, id_)
        if failed:
            pytest.fail("regression test %s failed" % id_, pytrace=False)
In general my approach works, but I want to improve error reporting so that the fixture indicates the failure of a test and not the testing function itself: this implementation always prints a "." because the testing function does not raise any exception, and then an extra "E" if pytest.fail is called in the last line.
So what I want is to suppress the "." triggered by the function under test and let my fixture code output the appropriate character.
Update:
I was able to improve the output, but I still get too many "." characters in the output when the tests are running. The package is uploaded at https://pypi.python.org/pypi/pytest-regtest
You can find the repository at https://sissource.ethz.ch/uweschmitt/pytest-regtest/tree/master
Sorry for posting links, but the files got a bit bigger now.
Solution:
I came up with a solution by implementing a hook wrapper which handles the regtest result. The code is then (simplified):
@pytest.yield_fixture()
def regtest(request):
    fp = cStringIO.StringIO()
    yield fp

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_call(item):
    try:
        outcome = yield
    except Exception:
        raise
    else:
        # we only handle the regtest fixture if no other exception came up during testing:
        if outcome.excinfo is not None:
            return
        regtest = item.funcargs.get("regtest")
        if regtest is not None:
            _handle_regtest_result(regtest)
And _handle_regtest_result either stores the recorded values or does the appropriate checks. The plugin is now available at https://pypi.python.org/pypi/pytest-regtest
You are mixing two things there: the fixture itself (setting up conditions for your test) and the expected behaviour, _compare_output(a, b). You are probably looking for something along the lines of:
import pytest

@pytest.fixture()
def file_fixture():
    fp = cStringIO.StringIO()
    return fp.getvalue()

@pytest.fixture()
def request_fixture(request, file_fixture):
    return _setup(request)

def test_regression(request, request_fixture, file_fixture):
    reset, full_path, id_ = request_fixture
    if reset:
        _record_output(file_fixture, full_path)
    else:
        failed = _compare_output(file_fixture, full_path, request, id_)
        assert failed is True, "regression test %s failed" % id_

Proper way to organize testcases that involve a data file for each testcase?

I'm writing a module that involves parsing html for data and creating an object from it. Basically, I want to create a set of testcases where each case is an html file paired with a golden/expected pickled object file.
As I make changes to the parser, I would like to run this test suite to ensure that each html page is parsed to equal the 'golden' file (essentially a regression suite)
I can see how to code this as a single test case, where I would load all file pairs from some directory and then iterate through them. But I believe this would end up being reported as a single test case, pass or fail, whereas I want a report that says, for example, 45/47 pages parsed successfully.
How do I arrange this?
I've done similar things with the unittest framework by writing a function which creates and returns a test class. This function can then take in whatever parameters you want and customise the test class accordingly. You can also customise the __doc__ attribute of the test function(s) to get customised messages when running the tests.
I quickly knocked up the following example code to illustrate this. Instead of doing any actual testing, it uses the random module to fail some tests for demonstration purposes. When created, the classes are inserted into the global namespace so that a call to unittest.main() will pick them up. Depending on how you run your tests, you may wish to do something different with the generated classes.
import os
import unittest

# Generate a test class for an individual file.
def make_test(filename):
    class TestClass(unittest.TestCase):
        def test_file(self):
            # Do the actual testing here.
            # parsed = do_my_parsing(filename)
            # golden = load_golden(filename)
            # self.assertEquals(parsed, golden, 'Parsing failed.')

            # Randomly fail some tests.
            import random
            if not random.randint(0, 10):
                self.assertEquals(0, 1, 'Parsing failed.')

        # Set the docstring so we get nice test messages.
        test_file.__doc__ = 'Test parsing of %s' % filename

    return TestClass

# Create a single file test.
Test1 = make_test('file1.html')

# Create several tests from a list.
for i in range(2, 5):
    globals()['Test%d' % i] = make_test('file%d.html' % i)

# Create them from a directory listing.
for dirname, subdirs, filenames in os.walk('tests'):
    for f in filenames:
        globals()['Test%s' % f] = make_test('%s/%s' % (dirname, f))

# If this file is being run, run all the tests.
if __name__ == '__main__':
    unittest.main()
A sample run:
$ python tests.py -v
Test parsing of file1.html ... ok
Test parsing of file2.html ... ok
Test parsing of file3.html ... ok
Test parsing of file4.html ... ok
Test parsing of tests/file5.html ... ok
Test parsing of tests/file6.html ... FAIL
Test parsing of tests/file7.html ... ok
Test parsing of tests/file8.html ... ok
======================================================================
FAIL: Test parsing of tests/file6.html
----------------------------------------------------------------------
Traceback (most recent call last):
File "generic.py", line 16, in test_file
self.assertEquals(0, 1, 'Parsing failed.')
AssertionError: Parsing failed.
----------------------------------------------------------------------
Ran 8 tests in 0.004s
FAILED (failures=1)
The nose testing framework supports this. http://www.somethingaboutorange.com/mrl/projects/nose/
Also see here: How to generate dynamic (parametrized) unit tests in python?
Here's what I would do (untested):
import os
import unittest

files = os.listdir("/path/to/dir")

class SomeTests(unittest.TestCase):
    def _compare_files(self, file_name):
        with open('/path/to/dir/%s-golden' % file_name, 'r') as golden:
            with open('/path/to/dir/%s-trial' % file_name, 'r') as trial:
                assert golden.read() == trial.read()

def test_generator(file_name):
    def test(self):
        self._compare_files(file_name)
    return test

if __name__ == '__main__':
    for file_name in files:
        test_name = 'test_%s' % file_name
        test = test_generator(file_name)
        setattr(SomeTests, test_name, test)
    unittest.main()
