Mock_open CSV file not getting any data - python

I am trying to unit test a piece of code:
def _parse_results(self, file_name):
    results_file = open(file_name)
    results_data = list(csv.reader(results_file))
    index = len(results_data[1])-1
    results_file.close()
    return float(results_data[1][index])
by using mock_open like so:
@mock.patch('path.open', mock.mock_open(read_data='test, test2, test3, test4'))
def test_parse_results(self):
    cut = my_class(emulate=True)
    self.assertEqual(VAL, cut._parse_results('file'))
The problem I am running into is that I do not get any data when running csv.reader. If I run results_file.readlines() I get 'test, test2, test3, test4' which means that mock_open is working properly. But when I run csv.reader(results_file) I lose all the data.

This is because mock_open doesn't implement every feature that a file has, and notably not some of the ones that csv needs.
mock_open implements the methods read(), readline() and readlines(), and works both as a function and when called as a context manager (https://docs.python.org/3/library/unittest.mock.html#mock-open), whereas csv.reader works with…
any object which supports the iterator protocol and returns a string each time its __next__() method is called — file objects and list objects are both suitable
— https://docs.python.org/3/library/csv.html#csv.reader
Note that mock_open, at least in the Python version this was observed on, doesn't implement the __next__() method and doesn't raise StopIteration when the end is reached, so it won't work with csv.reader.
The solution, as @Emily points out in her answer, is to turn the file into a list of its lines. This is possible because mock_open implements readlines(), and the resulting list is suitable for passing to csv.reader, as the documentation says.
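As a rough illustration of that workaround, here is a minimal sketch. It assumes Python 3 (so the patch target is builtins.open rather than the 'path.open' from the question), and the standalone parse_results function and the two-row sample data are made up for the example:

```python
import csv
from unittest import mock


def parse_results(file_name):
    # Hypothetical standalone version of the method from the question.
    with open(file_name) as results_file:
        # readlines() is implemented by mock_open, and a list of lines
        # is a valid input for csv.reader.
        rows = list(csv.reader(results_file.readlines()))
    return float(rows[1][-1])


sample = "a,b,c\n1,2,3.5\n"  # made-up data: header row plus one data row
with mock.patch("builtins.open", mock.mock_open(read_data=sample)):
    value = parse_results("file")

print(value)
```

Handing csv.reader a list of lines sidesteps the question of whether the mock handle itself is iterable.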

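This really got me too, and was a nightmare to pinpoint. To use your example code, this returns data (although passing a single string makes csv.reader treat each character as its own line, so the rows come out wrong)
results_data = list(csv.reader(results_file.read()))
and this works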
results_data = list(csv.reader(results_file.readlines()))
but this doesn't work
results_data = list(csv.reader(results_file))
using Python 3.4.
It seems counter to the documented interface of csv.reader so maybe an expert can elaborate on why.
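One possible piece of the explanation, sketched with made-up sample text: csv.reader iterates over whatever it is given, and iterating over a str yields single characters, so the read() variant returns data but not the rows you would expect. This runs on Python 3 without any mocking:

```python
import csv

text = "a,b"  # made-up sample line

# Iterating a str yields one character at a time, so each character
# is parsed as if it were a whole line.
rows_from_string = list(csv.reader(text))

# A list of lines (what readlines() or splitlines() gives you) is parsed
# line by line, which is usually what is wanted.
rows_from_lines = list(csv.reader(text.splitlines()))

print(rows_from_string)  # [['a'], ['', ''], ['b']]
print(rows_from_lines)   # [['a', 'b']]
```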

Related

Mocking a file containing JSON data with mock.patch and mock_open

I'm trying to test a method that requires the use of json.load in Python 3.6.
And after several attempts, I tried running the test "normally" (with the usual unittest.main() from the CLI), and in the iPython REPL.
Having the following function (simplified for purpose of the example)
def load_metadata(name):
    with open("{}.json".format(name)) as fh:
        return json.load(fh)
with the following test:
class test_loading_metadata(unittest2.TestCase):
    @patch('builtins.open', new_callable=mock_open(read_data='{"disabled":True}'))
    def test_load_metadata_with_disabled(self, filemock):
        result = load_metadata("john")
        self.assertEqual(result,{"disabled":True})
        filemock.assert_called_with("john.json")
Executing the test file yields a heartbreaking:
TypeError: the JSON object must be str, bytes or bytearray, not 'MagicMock'
While executing the same thing in the command line, gives a successful result.
I tried in several ways (patching with with, as decorator), but the only thing that I can think of, is the unittest library itself, and whatever it might be doing to interfere with mock and patch.
Also checked versions of python in the virtualenv and ipython, the versions of json library.
I would like to know why what looks like the same code, works in one place
and doesn't work in the other.
Or at least a pointer in the right direction to understand why this could be happening.
json.load() simply calls fh.read(), but fh is not a mock_open() object. It's a mock_open()() object, because new_callable is called before patching to create the replacement object:
>>> from unittest.mock import patch, mock_open
>>> with patch('builtins.open', new_callable=mock_open(read_data='{"disabled":True}')) as filemock:
...     with open("john.json") as fh:
...         print(fh.read())
...
<MagicMock name='open()().__enter__().read()' id='4420799600'>
Don't use new_callable, you don't want your mock_open() object to be called! Just pass it in as the new argument to @patch() (this is also the second positional argument, so you can leave off the new= here):
@patch('builtins.open', mock_open(read_data='{"disabled":True}'))
def test_load_metadata_with_disabled(self, filemock):
at which point you can call .read() on it when used as an open() function:
>>> with patch('builtins.open', mock_open(read_data='{"disabled":True}')) as filemock:
...     with open("john.json") as fh:
...         print(fh.read())
...
{"disabled":True}
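The new argument is the object that'll replace the original when patching. If left to the default, new_callable() is used instead. You don't want new_callable() here. (Separately, note that json.load() would still reject this particular read_data, because JSON spells the literal true, not True.)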
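To make the difference concrete, here is a small sketch of two ways to wire this up, assuming Python 3. The read_data uses lowercase true so that it is valid JSON, which is a change from the question's data. The second variant relies on patch() forwarding extra keyword arguments to new_callable:

```python
import json
from unittest.mock import mock_open, patch


def load_metadata(name):
    with open("{}.json".format(name)) as fh:
        return json.load(fh)


# Variant 1: pass the mock_open() result as `new` (second positional argument).
with patch("builtins.open", mock_open(read_data='{"disabled": true}')) as filemock:
    result_new = load_metadata("john")
    filemock.assert_called_with("john.json")

# Variant 2: pass the mock_open function itself (not called) as new_callable;
# extra keyword arguments given to patch() are forwarded to it.
with patch("builtins.open", new_callable=mock_open, read_data='{"disabled": true}'):
    result_callable = load_metadata("john")

print(result_new, result_callable)
```

In both variants the object that replaces open is the mock_open() mock itself, so calling it once yields the file handle whose read() returns the data.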

Mocking "with open()"

I am trying to unit test a method that reads the lines from a file and process it.
with open([file_name], 'r') as file_list:
    for line in file_list:
        # Do stuff
I tried several approaches described in other questions, but none of them seems to work for this case. I don't quite understand how Python uses the file object as an iterable over its lines; does it internally use file_list.readlines()?
This way didn't work:
with mock.patch('[module_name].open') as mocked_open: # also tried with __builtin__ instead of module_name
    mocked_open.return_value = 'line1\nline2'
I got an
AttributeError: __exit__
Maybe because the with statement needs this special attribute to close the file?
This code makes file_list a MagicMock. How do I store data on this MagicMock so that I can iterate over it?
with mock.patch("__builtin__.open", mock.mock_open(read_data="data")) as mock_file:
Best regards
The return value of mock_open (until Python 3.7.1) doesn't provide a working __iter__ method, which may make it unsuitable for testing code that iterates over an open file object.
Instead, I recommend refactoring your code to take an already opened file-like object. That is, instead of
def some_method(file_name):
    with open(file_name, 'r') as file_list:
        for line in file_list:
            # Do stuff
...
some_method(file_name)
write it as
def some_method(file_obj):
    for line in file_obj:
        # Do stuff
...
with open(file_name, 'r') as file_obj:
    some_method(file_obj)
This turns a function that has to perform IO into a pure(r) function that simply iterates over any file-like object. To test it, you don't need to mock open or hit the file system in any way; just create a StringIO object to use as the argument:
def test_it(self):
    f = StringIO.StringIO("line1\nline2\n")  # Python 2; on Python 3 use io.StringIO
    some_method(f)
(If you still feel the need to write and test a wrapper like
def some_wrapper(file_name):
    with open(file_name, 'r') as file_obj:
        some_method(file_obj)
note that you don't need the mocked open to do anything in particular. You test some_method separately, so the only thing you need to do to test some_wrapper is verify that the return value of open is passed to some_method. open, in this case, can be a plain old mock with no special behavior.)
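A minimal sketch of both tests, assuming Python 3 (io.StringIO, builtins.open). The body of some_method is a made-up stand-in for "Do stuff", and the wrapper returns the result only so the example has something to check:

```python
import io
from unittest import mock


def some_method(file_obj):
    # Stand-in for "Do stuff": collect the lines without their newlines.
    return [line.rstrip("\n") for line in file_obj]


def some_wrapper(file_name):
    with open(file_name, "r") as file_obj:
        return some_method(file_obj)


# Testing the pure function needs no mocking at all.
lines = some_method(io.StringIO("line1\nline2\n"))

# Testing the wrapper: a plain MagicMock stands in for open, and we only
# decide what the `with` block hands over as file_obj.
with mock.patch("builtins.open") as mocked_open:
    mocked_open.return_value.__enter__.return_value = io.StringIO("line3\n")
    wrapped = some_wrapper("some_file")
    mocked_open.assert_called_once_with("some_file", "r")

print(lines, wrapped)
```

The wrapper test only has to show that whatever open hands back reaches some_method; the line-by-line behavior is already covered by the first test.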

How do I get python unittest to test that a function returns a csv.reader object?

I'm using python 2.7 and delving into TDD. I'm trying to test a simple function that uses the csv module and returns a csv.reader object. I want to test that the correct type of object is being returned with the assertIsInstance test however I'm having trouble figuring out how to make this work.
#!/usr/bin/python
import os, csv
def importCSV(fileName):
    '''importCSV brings in the CSV transaction file to be analyzed'''
    try:
        if not(os.path.exists("data")):
            os.makedirs("data")
    except(IOError):
        return "Couldn't create data directory!"
    try:
        fileFullName = os.path.join("data", fileName)
        return csv.reader(file(fileFullName))
    except(IOError):
        return "File not found!"
The test currently looks like this....
#!/usr/bin/python
from finaImport import finaImport
import unittest, os, csv
class testImport(unittest.TestCase):
    '''Tests for importing a CSV file'''
    def testImportCSV(self):
        ''' Test a good file and make sure importCSV returns a csv reader object '''
        readerObject = finaImport.importCSV("toe")
        self.assertTrue(str(type(readerObject))), "_csv.reader")
I really don't think wrapping "toe" in a str and type function is correct. When I try something like...
self.assertIsInstance(finaImport.importCSV("toe"), csv.reader)
It returns an error like...
TypeError: isinstance() arg2 must be a class, type, or tuple of classes and types
Help???
self.assertTrue(str(type(readerObject)), "_csv.reader")
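I don't think that the idea behind your first test (above) is so bad (I fixed a small typo there; you had an extra closing parenthesis). Be aware, though, that assertTrue only checks that its first argument is truthy and treats the second argument as the failure message, so as written the test passes for any object; actually comparing the type name needs something like assertEqual. On the other hand, the underscore in "_csv" tells you that this object is internal to the csv module. In general, you shouldn't be concerned about that.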
Your attempt at the assertIsInstance test is flawed in that csv.reader is a function object. If you try it in the REPL, you see:
>>> import csv
>>> csv.reader
<built-in function reader>
Often, we care less about the type of an object and more about whether it implements a certain interface. In this case, the help for csv.reader says:
>>> help(csv.reader)
... The returned object is an iterator. ...
So, you could do the following test (instead of, or in addition to, your other one):
self.assertIsInstance(readerObject, collections.Iterator)
You'll need an import collections for that, of course (on Python 3 the class lives in collections.abc instead). And, you might want to test that the iterator returns lists of strings, or something like this. That would allow you to use something else under the hood later and the test would still pass.
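Here is a small sketch of that interface-style check. Note the assumptions: it targets Python 3 (the question is on 2.7), so it uses collections.abc.Iterator, and it reads made-up in-memory data through io.StringIO instead of a real file:

```python
import collections.abc
import csv
import io

# Made-up CSV content; a real test would go through importCSV instead.
reader = csv.reader(io.StringIO("a,b\n1,2\n"))

# Check the interface rather than the concrete (private) _csv type.
is_iterator = isinstance(reader, collections.abc.Iterator)

# Check what the iterator yields: lists of strings.
rows = list(reader)
all_string_lists = all(
    isinstance(row, list) and all(isinstance(cell, str) for cell in row)
    for row in rows
)

print(is_iterator, rows, all_string_lists)
```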

Dynamically instantiating objects

I'm attempting to instantiate an object from a string. Specifically, I'm trying to change this:
from node.mapper import Mapper
mapper = Mapper(file)
mapper.map(src, dst)
into something like this:
with open('C:.../node/mapper.py', 'r') as f:
    mapping_script = f.read()
eval(mapping_script)
mapper = Mapper(file)
mapper.map(src, dst)
The motivation for this seemingly bizarre task is to be able to store different versions of mapping scripts in a database and then retrieve/use them as needed (with emphasis on the polymorphism of the map() method).
The above does not work. For some reason, eval() throws SyntaxError: invalid syntax. I don't understand this since it's the same file that's being imported in the first case. Is there some reason why eval() cannot be used to define classes?
I should note that I am aware of the security concerns around eval(). I would love to hear of alternative approaches if there are any. The only other thing I can think of is to fetch the script, physically save it into the node package directory, and then import it, but that seems even crazier.
You need to use exec:
exec(mapping_script)
eval() works only for expressions. exec() works for statements. A typical Python script contains statements.
For example:
code = """class Mapper: pass"""
exec(code)
mapper = Mapper()
print(mapper)
Output:
<__main__.Mapper object at 0x10ae326a0>
Make sure you call exec() (a function in Python 3; in Python 2 it is a statement) at the module level. When you call it in a function, you need to add globals(), for example exec(code, globals()), to make the objects available in the global scope and to the rest of the function.
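As a variation on the above, here is a sketch that runs the script in its own dictionary instead of globals(), which keeps each stored version of the script separate. It assumes Python 3; the source string and the body of map() are made up for illustration:

```python
# Made-up stand-in for a mapping script fetched from the database.
source = (
    "class Mapper:\n"
    "    def map(self, src, dst):\n"
    "        return (src, dst)\n"
)

# eval() only accepts expressions, so a class statement is a SyntaxError.
try:
    eval(source)
    eval_failed = False
except SyntaxError:
    eval_failed = True

# exec() runs statements; names defined by the script land in `namespace`.
namespace = {}
exec(source, namespace)
Mapper = namespace["Mapper"]
result = Mapper().map("a", "b")

print(eval_failed, result)
```

The security concerns about running stored code apply to exec() just as they do to eval().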

How do you check if an object is an instance of 'file'?

It used to be in Python (2.6) that one could ask:
isinstance(f, file)
but in Python 3.0 file was removed.
What is the proper method for checking to see if a variable is a file now? The What'sNew docs don't mention this...
def read_a_file(f):
    try:
        contents = f.read()
    except AttributeError:
        # f is not a file
        pass
Substitute whatever methods you plan to use for read. This is optimal if you expect that you will get passed a file-like object more than 98% of the time. If you expect that you will be passed a non-file-like object more often than 2% of the time, then the correct thing to do is:
def read_a_file(f):
    if hasattr(f, 'read'):
        contents = f.read()
    else:
        # f is not a file
        pass
This is exactly what you would do if you did have access to a file class to test against. (and FWIW, I too have file on 2.6) Note that this code works in 3.x as well.
In python3 you could refer to io instead of file and write
import io
isinstance(f, io.IOBase)
Typically, you don't need to check an object type, you could use duck-typing instead i.e., just call f.read() directly and allow the possible exceptions to propagate -- it is either a bug in your code or a bug in the caller code e.g., json.load() raises AttributeError if you give it an object that has no read attribute.
If you need to distinguish between several acceptable input types; you could use hasattr/getattr:
def read(file_or_filename):
    readfile = getattr(file_or_filename, 'read', None)
    if readfile is not None: # got file
        return readfile()
    with open(file_or_filename) as file: # got filename
        return file.read()
If you want to support a case when file_or_filename may have a read attribute that is set to None then you could use try/except over file_or_filename.read -- note: no parens, the call is not made -- e.g., ElementTree._get_writer().
If you want to check certain guarantees e.g., that only one single system call is made (io.RawIOBase.read(n) for n > 0) or there are no short writes (io.BufferedIOBase.write()) or whether read/write methods accept text data (io.TextIOBase) then you could use isinstance() function with ABCs defined in io module e.g., look at how saxutils._gettextwriter() is implemented.
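To see the two approaches side by side, here is a short sketch assuming Python 3. The temporary file and the sample text are made up for the example; read() is the function from this answer:

```python
import io
import os
import tempfile


def read(file_or_filename):
    readfile = getattr(file_or_filename, "read", None)
    if readfile is not None:  # got a file-like object
        return readfile()
    with open(file_or_filename) as f:  # got a filename
        return f.read()


# Duck typing: anything with a read() method is accepted.
from_object = read(io.StringIO("hello"))

# A filename goes through open(); a throwaway temp file is used here.
fd, path = tempfile.mkstemp(text=True)
try:
    with os.fdopen(fd, "w") as tmp:
        tmp.write("hello")
    from_name = read(path)
finally:
    os.remove(path)

# isinstance() against the io ABCs: file objects pass, plain strings do not.
checks = (
    isinstance(io.StringIO(), io.IOBase),
    isinstance(io.BytesIO(), io.IOBase),
    isinstance("name.txt", io.IOBase),
)

print(from_object, from_name, checks)
```

Keep in mind that the isinstance() check rejects file-like objects that do not derive from the io classes (a mock, for example), which is one reason the duck-typing version is often the more forgiving choice.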
Works for me on python 2.6... Are you in a strange environment where builtins aren't imported by default, or where somebody has done del file, or something?
