Can a MagicMock object be iterated over? - python

What I would like to do is this...
x = MagicMock()
x.iter_values = [1, 2, 3]
for i in x:
    i.method()
I am trying to write a unit test for this function but I am unsure about how to go about mocking all of the methods called without calling some external resource...
def wiktionary_lookup(self):
    """Looks up the word in wiktionary with urllib2, only to be used for inputting data"""
    wiktionary_page = urllib2.urlopen(
        "http://%s.wiktionary.org/wiki/%s" % (self.language.wiktionary_prefix, self.name))
    wiktionary_page = fromstring(wiktionary_page.read())
    definitions = wiktionary_page.xpath("//h3/following-sibling::ol/li")
    print definitions.text_content()
    defs_list = []
    for i in definitions:
        print i
        i = i.text_content()
        i = i.split('\n')
        for j in i:
            # Takes out an annoying "[quotations]" at the end of the string, sometimes.
            j = re.sub(ur'\u2003\[quotations \u25bc\]', '', j)
            if len(j) > 0:
                defs_list.append(j)
    return defs_list
EDIT:
I may be misusing mocks, I am not sure. I am trying to unit-test this wiktionary_lookup method without calling external services, so I mock urlopen and I mock fromstring.xpath(), but as far as I can see I also need to iterate through the return value of xpath() and call a method text_content() on each item, so that is what I am trying to do here.
If I have totally misunderstood how to unittest this method then please tell me where I have gone wrong...
EDIT (adding current unittest code)
@patch("lang_api.models.urllib2.urlopen")
@patch("lang_api.models.fromstring")
def test_wiktionary_lookup_2(self, fromstring, urlopen):
    """Looking up a real word in wiktionary, should return a list"""
    fromstring().xpath.return_value = MagicMock(
        content=["test", "test"], return_value='test\ntest2')
    # A real word should give an output of definitions
    output = self.things.model['word'].wiktionary_lookup()
    self.assertEqual(len(output), 2)

What you want is not a Mock with return_value=[]; you want the xpath call to return a list of Mock objects. Here is a snippet of your test code with the correct components, plus a small example showing how to test one iteration of your loop:
@patch('d.fromstring')
@patch('d.urlopen')
def test_wiktionary(self, urlopen_mock, fromstring_mock):
    urlopen_mock.return_value = Mock()
    urlopen_mock.return_value.read.return_value = "some_string_of_stuff"
    mocked_xpath_results = [Mock()]
    fromstring_mock.return_value.xpath.return_value = mocked_xpath_results
    mocked_xpath_results[0].text_content.return_value = "some string"
So, to dissect the above code to explain what was done to correct your problem:
The first thing to help us with testing the code in the for loop is to create a list of mock objects per:
mocked_xpath_results = [Mock()]
Then, as you can see from
fromstring_mock.return_value.xpath.return_value = mocked_xpath_results
We are setting the return_value of the xpath call to our list of mocks, mocked_xpath_results.
As an example of how to handle the items inside that list, I also showed how to mock a call made within the loop:
mocked_xpath_results[0].text_content.return_value = "some string"
In unittests (this might be a matter of opinion) I like to be explicit, so I'm accessing the list item explicitly and determining what should happen.
Hope this helps.
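If it helps to see the iteration in isolation, here is a small runnable sketch (plain Python, no patching; the node values are made up) showing that a list of Mock objects can be looped over just like real lxml elements:

from unittest.mock import Mock  # on Python 2, use: from mock import Mock

# Build an iterable of mocks; each element stands in for an lxml node.
mocked_nodes = [Mock(), Mock()]
mocked_nodes[0].text_content.return_value = "first definition"
mocked_nodes[1].text_content.return_value = "second definition"

# The production loop `for i in definitions: i.text_content()` works unchanged:
results = [node.text_content() for node in mocked_nodes]
assert results == ["first definition", "second definition"]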

Related

Is this unit test correct for what I am trying to achieve?

I have a list CountryList[] defined in my function and I want to check that it is not empty. I initialize it as empty, but later on in the function data is put into it.
This is the unit test I have typed.
def assertEmpty(self, CountryList):
    self.assertFalse(CountryList)

def assertNotEmpty(self, CountryList):
    self.assertTrue(CountryList)
This is the method in my program.
def onCountry(self, doc_id):
    if(doc_id==None):
        return
    output_list = self.findBySubjectDocId(doc_id)
    country_list=[]
    for x in output_list:
        country_id=x["visitor_country"]
        if(SHOW_FULL_NAMES):
            country_id=pc.country_alpha2_to_country_name(country_id)
        country_list.append(country_id)
    ts=pd.Series(country_list).value_counts().plot(kind='bar',color='purple')
    plt.xticks(rotation='horizontal')
    plt.xlabel('Country')
    plt.ylabel('Number of Viewers')
    plt.title("Viewers based on Country")
    ts.plot()
    plt.show()
    print("Countries of Visitors:")
    x = []
    y = []
    for k,v in Counter(country_list).items():
        x.append(k)
        y.append(v)
        print(k,"-",v)
Do you suggest I test this code some other way, or is the above testing acceptable?
Here are several suggestions:
To properly use Python's unittest package, you need to create a class which extends unittest.TestCase.
Name your test methods test_* rather than assert*.
Tests can only take one argument self.
You can define a setUp() method which runs before each test_* method. Use this to create data that is common to all tests.
Tests are code just like any other code and follow all the same rules.
First describe in words the scenario you are trying to test. Often this is of the form "When I call function F with parameters P, then the result will be R". Then write the test to replicate this specific scenario.
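Putting those points together, a bare-bones skeleton (all names here are placeholders, not taken from your code) would look something like this:

import unittest

class CountryListTest(unittest.TestCase):
    def setUp(self):
        # data shared by every test_* method
        self.sample_records = [{"visitor_country": "GB"}, {"visitor_country": "FR"}]

    def test_sample_records_not_empty(self):
        self.assertTrue(self.sample_records)

if __name__ == "__main__":
    unittest.main()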
Since country_list is a local variable that isn't returned, there's not an easy way for your unit test to check it directly; your test helpers seem fine, but you don't seem to have any actual testing that uses them, because the code has been written in a way that's difficult to test.
One way to make this easier to test would be to refactor the function into two pieces like this:
def build_country_list(self, doc_id):
    output_list = self.findBySubjectDocId(doc_id)
    country_list = []
    for x in output_list:
        country_id = x["visitor_country"]
        if SHOW_FULL_NAMES:
            country_id = pc.country_alpha2_to_country_name(country_id)
        country_list.append(country_id)
    return country_list

def onCountry(self, doc_id):
    if doc_id is None:
        return
    country_list = self.build_country_list(doc_id)
    ts = pd.Series(country_list).value_counts().plot(kind='bar', color='purple')
    plt.xticks(rotation='horizontal')
    plt.xlabel('Country')
    plt.ylabel('Number of Viewers')
    plt.title("Viewers based on Country")
    ts.plot()
    plt.show()
    print("Countries of Visitors:")
    x = []
    y = []
    for k, v in Counter(country_list).items():
        x.append(k)
        y.append(v)
        print(k, "-", v)
Now you can write a unit test for build_country_list that verifies that your logic of building the list is correct, separately from the onCountry logic that also plots and prints the data, something like:
def test_build_country_list(self):
    # probably need to have done some setUp that sets up a test_doc_id?
    self.assertNotEmpty(self.test_instance.build_country_list(self.test_doc_id))
If you want to also test the output logic, you'll need to mock out the output functions and write a test that verifies that they get called with the correct arguments when onCountry is called.
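For instance, a test for build_country_list could look roughly like the sketch below (the ViewerStats class name is an assumption, and it assumes SHOW_FULL_NAMES is False in the module under test so the raw country codes pass through unchanged):

import unittest
from unittest.mock import patch

class BuildCountryListTest(unittest.TestCase):
    def test_one_country_per_record(self):
        stats = ViewerStats()  # hypothetical class that defines build_country_list
        fake_records = [{"visitor_country": "GB"}, {"visitor_country": "FR"}]
        # Fake only the data source; the list-building logic stays real.
        with patch.object(stats, "findBySubjectDocId", return_value=fake_records):
            self.assertEqual(stats.build_country_list("doc-1"), ["GB", "FR"])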

Modify a Python AST parsed from a method code node in order to rewrite certain specific calls

I'm writing some tooling for online programming contests.
Part of it is a test-case checker which, based on a set of (input, output) file pairs, checks whether the solution method actually works.
Basically, the solution method is expected to be defined as follow:
def solution(Nexter: inputs):
    # blahblah some code here and there
    n = inputs.next_int()
    sub_process(inputs)
    # simulating a print something
    yield str(n)
can then be translated (once the AST modifications are applied) into:
def solution():
    # blahblah some code here and there
    n = int(input())
    sub_process()
    print(str(n))
Note: Nexter is a class defined either to generate user input() calls or to carry the expected inputs, plus some other goodies.
I'm aware of the issues related to converting back to source code from the AST (it requires relying on 3rd-party stuff). I also know that there is a NodeTransformer class:
http://greentreesnakes.readthedocs.io/en/latest/manipulating.html
https://docs.python.org/3/library/ast.html#ast.NodeTransformer
But its use remains unclear to me; I don't know whether I'm better off checking calls, expressions, etc.
Here is below what I've ended up with:
signature = inspect.signature(iterative_greedy_solution)
if len(signature.parameters) == 1 and "inputs" in signature.parameters:
    parameter = signature.parameters["inputs"]
    annotation = parameter.annotation
    if Nexter == annotation:
        source = inspect.getsource(iterative_greedy_solution)
        tree = ast.parse(source)
        NexterInputsRewriter().generic_visit(tree)

class NexterInputsRewriter(ast.NodeTransformer):
    def visit(self, node):
        # ???
This is definitely not the best design ever. Next time, I would probably go the other way around (i.e. have a definition that uses plain user input() (and output, i.e. print(...)) and replace those calls with test-case inputs) when passing the function to a tester class that asserts whether the actual outputs match the expected ones.
To sum up, this is what I would like to achieve, and I don't really know exactly how (apart from subclassing NodeTransformer):
Get rid of the solution function arguments
Modify the inputs calls in the method body (as well as in sub-calls of methods that also take Nexter: inputs) in order to replace them with their actual user input() implementation, e.g. inputs.next_int() = int(input())
EDIT
I found a tool (https://python-ast-explorer.com/) that helps a lot to visualize which ast.AST node types are used for a given function.
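For a quick, text-only view of the same information from the standard library, ast.dump works too (the indent argument needs Python 3.9+):

import ast

print(ast.dump(ast.parse("n = inputs.next_int()"), indent=4))
# Shows the Assign/Call/Attribute nodes you would need to match on.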
You can probably use NodeTransformer + ast.unparse(), though it won't be as effective as some other 3rd-party solutions, considering it won't preserve any of your comments.
Here is an example transformation done by refactor (I'm the author), which is a wrapper layer around ast.unparse for doing easy source-to-source transformations through the AST:
import ast

import refactor
from refactor import ReplacementAction

class ReplaceNexts(refactor.Rule):
    def match(self, node):
        # We need a call
        assert isinstance(node, ast.Call)

        # on an attribute (inputs.xxx)
        assert isinstance(node.func, ast.Attribute)

        # where the name for the attribute is `inputs`
        assert isinstance(node.func.value, ast.Name)
        assert node.func.value.id == "inputs"

        target_func_name = node.func.attr.removeprefix("next_")

        # make a call to target_func_name (e.g. int) with input()
        target_func = ast.Call(
            ast.Name(target_func_name),
            args=[
                ast.Call(ast.Name("input"), args=[], keywords=[]),
            ],
            keywords=[],
        )
        return ReplacementAction(node, target_func)

session = refactor.Session([ReplaceNexts])

source = """\
def solution(Nexter: inputs):
    # blahblah some code here and there
    n = inputs.next_int()
    sub_process(inputs)
    st = inputs.next_str()
    sub_process(st)
"""

print(session.run(source))
$ python t.py
def solution(Nexter: inputs):
    # blahblah some code here and there
    n = int(input())
    sub_process(inputs)
    st = str(input())
    sub_process(st)
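If you prefer to stay with the standard library, here is a rough NodeTransformer + ast.unparse() sketch of the same substitution (Python 3.9+ for ast.unparse and str.removeprefix; unlike refactor, comments and formatting are lost):

import ast

class NexterCallRewriter(ast.NodeTransformer):
    def visit_Call(self, node):
        self.generic_visit(node)
        # match `inputs.next_xxx(...)` calls
        if (isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "inputs"
                and node.func.attr.startswith("next_")):
            conv = node.func.attr.removeprefix("next_")  # e.g. "int", "str"
            # replace with `int(input())`, `str(input())`, ...
            return ast.Call(
                func=ast.Name(conv, ctx=ast.Load()),
                args=[ast.Call(func=ast.Name("input", ctx=ast.Load()),
                               args=[], keywords=[])],
                keywords=[],
            )
        return node

tree = NexterCallRewriter().visit(ast.parse("n = inputs.next_int()"))
ast.fix_missing_locations(tree)
print(ast.unparse(tree))  # -> n = int(input())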

Python - multiple functions - output of one to the next

I know this is super basic and I have been searching everywhere, but I am still very confused by everything I'm seeing, am not sure of the best way to do this, and am having a hard time wrapping my head around it.
I have a script with multiple functions. I would like the first function to pass its output to the second, then the second to pass its output to the third, etc. Each does its own step in an overall process on the starting dataset.
For example, very simplified with bad names but this is to just get the basic structure:
#!/usr/bin/python
# script called process.py
import sys

infile = sys.argv[1]

def function_one():
    do things
    return function_one_output

def function_two():
    take output from function_one, and do more things
    return function_two_output

def function_three():
    take output from function_two, do more things
    return/print function_three_output
I want this to run as one script and print the output/write to a new file or whatever, which I know how to do. I am just unclear on how to pass the intermediate outputs of each function to the next.
infile -> function_one -> (intermediate1) -> function_two -> (intermediate2) -> function_three -> final result/outfile
I know I need to use return, but I am unsure how to call this at the end to get my final output
Individually?
function_one(infile)
function_two()
function_three()
or within each other?
function_three(function_two(function_one(infile)))
or within the actual function?
def function_one():
    do things
    return function_one_output

def function_two():
    input_for_this_function = function_one()
    # etc etc etc
Thank you friends, I am over complicating this and need a very simple way to understand it.
You could define a data streaming helper function
from functools import reduce

def flow(seed, *funcs):
    return reduce(lambda arg, func: func(arg), funcs, seed)

flow(infile, function_one, function_two, function_three)

# for example
flow('HELLO', str.lower, str.capitalize, str.swapcase)
# returns 'hELLO'
edit
I would now suggest that a more "pythonic" way to implement the flow function above is:
def flow(seed, *funcs):
    for func in funcs:
        seed = func(seed)
    return seed
As ZdaR mentioned, you can run each function and store the result in a variable then pass it to the next function.
def function_one(file):
    do things on file
    return function_one_output

def function_two(myData):
    doThings on myData
    return function_two_output

def function_three(moreData):
    doMoreThings on moreData
    return/print function_three_output

def Main():
    firstData = function_one(infile)
    secondData = function_two(firstData)
    function_three(secondData)
This is assuming your function_three would write to a file or doesn't need to return anything. Another method, if these three functions will always run together, is to call them inside function_three. For example...
def function_three(file):
    firstStep = function_one(file)
    secondStep = function_two(firstStep)
    doThings on secondStep
    return/print to file
Then all you have to do is call function_three in your main and pass it the file.
For safety, readability and debugging ease, I would temporarily store the results of each function.
def function_one():
    do things
    return function_one_output

def function_two(function_one_output):
    take function_one_output and do more things
    return function_two_output

def function_three(function_two_output):
    take function_two_output and do more things
    return/print function_three_output

result_one = function_one()
result_two = function_two(result_one)
result_three = function_three(result_two)
The added benefit here is that you can then check that each function is correct. If the end result isn't what you expected, just print the results you're getting or perform some other check to verify them. (also if you're running on the interpreter they will stay in namespace after the script ends for you to interactively test them)
result_one = function_one()
print result_one
result_two = function_two(result_one)
print result_two
result_three = function_three(result_two)
print result_three
Note: I used multiple result variables, but as PM 2Ring notes in a comment you could just reuse the name result over and over. That'd be particularly helpful if the results would be large variables.
It's always better (for readability, testability and maintainability) to keep your functions as decoupled as possible, and to write them so the output only depends on the input whenever possible.
So in your case, the best way is to write each function independently, ie:
def function_one(arg):
    do_something()
    return function_one_result

def function_two(arg):
    do_something_else()
    return function_two_result

def function_three(arg):
    do_yet_something_else()
    return function_three_result
Once you're there, you can of course directly chain the calls:
result = function_three(function_two(function_one(arg)))
but you can also use intermediate variables and try/except blocks if needed for logging / debugging / error handling etc:
r1 = function_one(arg)
logger.debug("function_one returned %s", r1)
try:
    r2 = function_two(r1)
except SomePossibleException as e:
    logger.exception("function_two raised %s for %s", e, r1)
    # either return, re-raise, ask the user what to do, etc.
    return 42 # when in doubt, always return 42 !
else:
    r3 = function_three(r2)
    print "Yay ! result is %s" % r3
As an extra bonus, you can now reuse these three functions anywhere, each on its own and in any order.
NB : of course there ARE cases where it just makes sense to call a function from another function... Like, if you end up writing:
result = function_three(function_two(function_one(arg)))
everywhere in your code AND it's not an accidental repetition, it might be time to wrap the whole in a single function:
def call_them_all(arg):
    return function_three(function_two(function_one(arg)))
Note that in this case it might be better to decompose the calls, as you'll find out when you'll have to debug it...
I'd do it this way:
def function_one(x):
    # do things
    output = x ** 1
    return output

def function_two(x):
    output = x ** 2
    return output

def function_three(x):
    output = x ** 3
    return output
Note that I have modified the functions to accept a single argument, x, and added a basic operation to each.
This has the advantage that each function is independent of the others (loosely coupled) which allows them to be reused in other ways. In the example above, function_two() returns the square of its argument, and function_three() the cube of its argument. Each can be called independently from elsewhere in your code, without being entangled in some hardcoded call chain such as you would have if called one function from another.
You can still call them like this:
>>> x = function_one(3)
>>> x
3
>>> x = function_two(x)
>>> x
9
>>> x = function_three(x)
>>> x
729
which lends itself to error checking, as others have pointed out.
Or like this:
>>> function_three(function_two(function_one(2)))
64
if you are sure that it's safe to do so.
And if you ever wanted to calculate the square or cube of a number, you can call function_two() or function_three() directly (but, of course, you would name the functions appropriately).
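For example, reusing the same definitions on their own:

>>> function_two(5)
25
>>> function_three(2)
8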
With d6tflow you can easily chain together complex data flows and execute them. You can quickly load input and output data for each task. It makes your workflow very clear and intuitive.
import d6tflow

class Function_one(d6tflow.tasks.TaskCache):
    def run(self):
        function_one_output = do_things()
        self.save(function_one_output) # instead of return

@d6tflow.requires(Function_one)
class Function_two(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_one = self.inputLoad() # load function input
        function_two_output = do_more_things()
        self.save(function_two_output)

@d6tflow.requires(Function_two)
class Function_three(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_two = self.inputLoad()
        function_three_output = do_more_things()
        self.save(function_three_output)

d6tflow.run(Function_three()) # executes all functions

function_one_output = Function_one().outputLoad() # get function output
function_three_output = Function_three().outputLoad()
It has many more useful features like parameter management, persistence, intelligent workflow management. See https://d6tflow.readthedocs.io/en/latest/
This way, function_three(function_two(function_one(infile))) would be the best: you do not need global variables, and each function is completely independent of the others.
Edited to add:
I would also say that function_three should not print anything; if you want to print the returned results, use:
print function_three(function_two(function_one(infile)))
or something like:
output = function_three(function_two(function_one(infile)))
print output
Use parameters to pass the values:
def function1():
    foo = do_stuff()
    return function2(foo)

def function2(foo):
    bar = do_more_stuff(foo)
    return function3(bar)

def function3(bar):
    baz = do_even_more_stuff(bar)
    return baz

def main():
    thing = function1()
    print thing

Python: can I change function behaviour whether output is assigned or not?

In Matlab, nargout is a variable that tells you if the output is assigned, so
x = f(2);
and
f(2);
can behave differently.
Is it possible to do similar in Python?
I have a function that plots to screen and returns a matplotlib figure object. I want that if output is assigned to a variable then do not plot to screen.
Here's a way you can do it (not that I'd advise it), but there are many cases where it won't work. To make it work you'd essentially need to parse the Python code on the calling line and see what it is doing, which would be possible down to some level, but there are likely always going to be ways to get around it.
import inspect, re

def func(x, noCheck=False):
    if not noCheck:
        # Get the line the function was called on.
        _, _, _, _, lines, _ = inspect.getouterframes(inspect.currentframe())[1]

        # Now we need to search through `line` to see how the function is called.
        line = lines[0].split("#")[0] # Get rid of any comments at the end of the line.
        match = re.search(r"[a-zA-Z0-9]+ *= *func\(.*\)", line) # Search for instances of `func` being called after an equals sign
        try:
            variable, functioncall = match.group(0).split("=")
            print variable, "=", functioncall, "=", eval(functioncall.strip()[:-1] + ", noCheck=True)")
        except:
            pass # print "not assigned to a variable"

    # Actually make the function do something
    return 3*x**2 + 2*x + 1

func(1) # x = func(1)
x = func(1)
Another way to do it would be to examine all of the set local variables when you call the code, and check if any of them have been set to the result of your function, then use that information to help parse the python.
Or you could look at object IDs and try to do things that way, but that's not going to be straightforward, as not all objects work the same way (e.g. do a = 10 and c = 10 and then have a look at each object's ID: they're the same even though a and c are separate names. The same happens with short strings too).
If you can think up a way to do this that would work universally, I'd be interested to know how you do it; I'd presume it will need to be done by digging around in inspect, though, rather than by parsing the actual code.
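A quick illustration of the object-ID point above (CPython caches small integers, which is exactly why this approach is unreliable):

a = 10
c = 10
print(id(a) == id(c))  # True, even though a and c are separate names
print(a is c)          # also True for small cached ints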
Others have mentioned that this is complex, but can be done with inspect. You may want a simple approach by having a separate function to plot it, or pass an extra variable that says to plot.
def create_plot(x):
    return plot

def display(plot):
    # show the plot
    ...

x = create_plot(2)
display(x)
Or pass a flag variable that says whether to show the plot:
def plot(x, show=False):
    # create the plot
    if show:
        # show the plot
        ...

plot(2, True)
x = plot(2)
It is probably not worth the time and easier to just create the two functions.
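As a rough matplotlib-flavoured sketch of the two-function approach (the plotting details are placeholders):

import matplotlib.pyplot as plt

def create_plot(data):
    fig, ax = plt.subplots()
    ax.plot(data)
    return fig            # caller decides what to do with the figure

def show_plot(fig):
    fig.show()            # display only when explicitly requested

fig = create_plot([1, 4, 9])  # nothing appears on screen yet
show_plot(fig)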
Personally, I think this is ugly, nasty, and I do not believe that functionality should be based on something catching the return value. However, I was curious, and I found a way. You could probably turn this into a decorator if you want to use it in the future, but I still suggest that you use two separate methods instead of checking for an output.
import inspect

def f(val):
    has_output = False
    frame = inspect.currentframe()
    name = frame.f_code.co_name
    outer = inspect.getouterframes(frame)[1] # may want to loop through available frames.
    for i in range(len(outer)):
        item = str(outer[i]).replace(" ", "")
        check = "="+name+"("
        if check in item and "="+check not in item: # also check assignment vs equality
            # Your method has an output
            has_output = True
            break
    if has_output:
        print("Something catches the output")
    return val*val
# end f
In many cases this will not work either. You will have to write a really good regex for the check if you always want it to work.
import my_lib
x = my_lib.f(2)

Unit testing objects in Python - Object is not over written in setup

I'm unit testing classes in Python using unittest. As I understand it, unittest calls the setUp function before each test so that the state of the unit-test objects is the same and the order the tests are executed in wouldn't matter.
Now I have this class I'm testing...
#! usr/bin/python2

class SpamTest(object):
    def __init__(self, numlist = []):
        self.__numlist = numlist

    @property
    def numlist(self):
        return self.__numlist

    @numlist.setter
    def numlist(self, numlist):
        self.__numlist = numlist

    def add_num(self, num):
        self.__numlist.append(num)

    def incr(self, delta):
        self.numlist = map(lambda x: x + 1, self.numlist)

    def __eq__(self, st2):
        i = 0
        limit = len(self.numlist)
        if limit != len(st2.numlist):
            return False
        while i < limit:
            if self.numlist[i] != st2.numlist[i]:
                return False
            i += 1
        return True
with the following unit tests...
#! usr/bin/python2

from test import SpamTest
import unittest

class Spammer(unittest.TestCase):
    def setUp(self):
        self.st = SpamTest()
        #self.st.numlist = [] <--TAKE NOTE OF ME!
        self.st.add_num(1)
        self.st.add_num(2)
        self.st.add_num(3)
        self.st.add_num(4)

    def test_translate(self):
        eggs = SpamTest([2, 3, 4, 5])
        self.st.incr(1)
        self.assertTrue(self.st.__eq__(eggs))

    def test_set(self):
        nl = [1, 4, 1, 5, 9]
        self.st.numlist = nl
        self.assertEqual(self.st.numlist, nl)

if __name__ == "__main__":
    tests = unittest.TestLoader().loadTestsFromTestCase(Spammer)
    unittest.TextTestRunner(verbosity = 2).run(tests)
This test fails for test_translate.
I can do two things to make the tests succeed:
(1) Uncomment the second line in the setUp function. Or,
(2) Change the names of the tests such that translate occurs first. I noticed that unittest executes tests in alphabetical order. Changing translate to, say, atranslate so that it executes first makes all tests succeed.
For (1), I can't imagine how this affects the tests since at the very first line of setUp we create a new object for self.st. As for (2), my complaint is similar since, hey, in setUp I assign a new object to self.st, so whatever I do to self.st in test_set shouldn't affect the outcome of test_translate.
So, what am I missing here?
Without studying the details of your solution, you should read Default Parameter Values in Python by Fredrik Lundh.
It likely explains your problem with the empty list as a default argument. The reason is that the list is empty only the first time, unless you make it empty explicitly later. The initially empty default list is a single instance of the list type that is reused whenever no explicit argument is passed.
It is a good idea to read the above article to fix your thinking about default arguments. The reasons are logical, but may be unexpected.
The generally recommended fix is to use None as the default value of the __init__ and set the empty list inside the body if the argument is not passed, like this:
class SpamTest(object):
    def __init__(self, numlist=None):
        if numlist is None:
            numlist = [] # this is the new instance -- the empty list
        self.__numlist = numlist
This is due to the way default parameters behave in Python when using mutable objects like lists: Default Parameter Values in Python.
In the line:
def __init__(self, numlist = []):
The default parameter for numlist is only evaluated once, so you only have one list instance which is shared across all instances of the SpamTest class.
So even though the test setUp is called for every test, it never creates a fresh empty list, and your tests which work upon that list instance end up stepping on each other's toes.
The fix is to have something like this instead, using a non-mutable object like None:
def __init__(self, numlist = None):
    if numlist is None:
        numlist = []
    self.__numlist = numlist
The reason it works when setting the property is that you provide a brand new empty list there, replacing the list created in the constructor.
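The same pitfall in isolation, for reference:

def append_to(value, target=[]):   # the default list is created once, at def time
    target.append(value)
    return target

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2]  <- the same list object was reused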
