Using Pytest to test a Python Program

I am quite new to Python programming and have a question on testing using Pytest. At a high level, I have a program that takes 3 pieces of user input and generates a text file at the end. For my tests, I want to basically compare the file my program outputs with what it should be.
Now, I am not sure how to go about testing. The program itself takes no arguments and just relies on 3 pieces of user input, which I'll use monkeypatch to simulate. Do I create a new Python file called program_test.py with methods that call the original program? I have tried this, but I'm having trouble actually calling the original program and sending in the simulated inputs. Or do I put the tests in the original program (which doesn't make much sense to me)?
I want something like this:
import my_program

def test_1():
    inputs = iter(['input1', 'input2', 'input3'])
    monkeypatch.setattr('builtins.input', lambda x: next(inputs))
    my_program
    # now do some assertion with some file comparison
    # pseudocode
    assert filecompare.cmp(expectedfile, actualfile)
This just seems to be running the original program, and I think it's to do with the import statement, i.e. it is never running test_1(), probably because I never call it? Any help would be appreciated!

Without providing your my_program code it's hard to tell what's going on.
Since you are mentioning import problems, I guess you're not defining main() and an if __name__ == "__main__" guard.
Here's a little example of how you can test that.
First, structure my_program to have a main function containing the code, and then add an if __name__ == "__main__" guard. This lets you run the main function when my_program is executed directly, but also import my_program as a module into other files without running it (for more information please see: What does if __name__ == "__main__": do?).
my_program:

def main():
    x = input()
    y = input()
    z = input()
    with open("test", "w") as f_out:
        f_out.write(f"{x}-{y}-{z}")

if __name__ == "__main__":
    main()
Now you can create a test.py file and test the main function of my_program:
import os
import filecmp
import my_program
def test_success(monkeypatch):
    inputs = ["input1", "input2", "input3"]
    # create the iterator once: next(iter(inputs)) on every call would
    # always return the first item again
    it = iter(inputs)
    monkeypatch.setattr("builtins.input", lambda: next(it))
    my_program.main()
    with open("expected", "w") as f_out:
        f_out.write("-".join(inputs))
    assert filecmp.cmp("expected", "test")
    os.remove("test")
    os.remove("expected")
def test_fail(monkeypatch):
    inputs = ["input1", "input2", "input3"]
    it = iter(inputs)
    monkeypatch.setattr("builtins.input", lambda: next(it))
    my_program.main()
    with open("expected", "w") as f_out:
        f_out.write("something-else-test")
    assert not filecmp.cmp("expected", "test")
    os.remove("test")
    os.remove("expected")
This is just an example, so I used os.remove to delete the files. Ideally you would define fixtures in your tests that use tempfile to generate random temporary files, which are automatically deleted after your tests.
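As a rough sketch of that idea using only the standard library (pytest's tmp_path fixture would do the directory handling for you; the main() stand-in and file names below are illustrative, not the asker's real program):

```python
import filecmp
import os
import tempfile
from unittest import mock

def main(out_path):
    # stand-in for my_program.main(): reads three inputs, writes one file
    x, y, z = input(), input(), input()
    with open(out_path, "w") as f_out:
        f_out.write(f"{x}-{y}-{z}")

def run_comparison():
    inputs = iter(["input1", "input2", "input3"])
    # the directory (and both files in it) is removed when the block exits
    with tempfile.TemporaryDirectory() as tmp_dir:
        actual = os.path.join(tmp_dir, "actual")
        expected = os.path.join(tmp_dir, "expected")
        with mock.patch("builtins.input", lambda: next(inputs)):
            main(actual)
        with open(expected, "w") as f_out:
            f_out.write("input1-input2-input3")
        return filecmp.cmp(expected, actual, shallow=False)

result = run_comparison()
```

With pytest, the same test would take tmp_path and monkeypatch as fixture arguments instead of building the temporary directory and the patch by hand.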

Related

How can I import the value of a variable into another Python file? [duplicate]

I have a Python program I'm building that can be run in either of 2 ways: the first is to call python main.py which prompts the user for input in a friendly manner and then runs the user input through the program. The other way is to call python batch.py -file- which will pass over all the friendly input gathering and run an entire file's worth of input through the program in a single go.
The problem is that when I run batch.py, it imports some variables/methods/etc from main.py, and when it runs this code:
import main
at the first line of the program, it immediately errors because it tries to run the code in main.py.
How can I stop Python from running the code contained in the main module which I'm importing?
Because this is just how Python works - keywords such as class and def are not declarations. Instead, they are real live statements which are executed. If they were not executed your module would be empty.
The idiomatic approach is:
# stuff to run always here such as class/def
def main():
    pass

if __name__ == "__main__":
    # stuff only to run when not called via 'import' here
    main()
It does require source control over the module being imported, however.
Due to the way Python works, it is necessary for it to run your modules when it imports them.
To prevent code in the module from being executed when imported, but only when run directly, you can guard it with this if:
if __name__ == "__main__":
    # this won't be run when imported
You may want to put this code in a main() method, so that you can either execute the file directly, or import the module and call the main(). For example, assume this is in the file foo.py.
def main():
    print("Hello World")

if __name__ == "__main__":
    main()
This program can be run either by going python foo.py, or from another Python script:
import foo
...
foo.main()
Use the if __name__ == '__main__' idiom -- __name__ is a special variable whose value is '__main__' if the module is being run as a script, and the module name if it's imported. So you'd do something like
# imports
# class/function definitions

if __name__ == '__main__':
    # code here will only run when you invoke 'python main.py'
Unfortunately, you don't. That is part of how the import syntax works and it is important that it does so -- remember def is actually something executed, if Python did not execute the import, you'd be, well, stuck without functions.
Since you probably have access to the file, though, you might be able to look and see what causes the error. It might be possible to modify your environment to prevent the error from happening.
Put the code inside a function and it won't run until you call the function. You should have a main function in your main.py, with the statement:

if __name__ == '__main__':
    main()
Then, if you call python main.py the main() function will run. If you import main.py, it will not. Also, you should probably rename main.py to something else for clarity's sake.
There was a Python enhancement proposal, PEP 299, which aimed to replace the if __name__ == '__main__': idiom with def __main__:, but it was rejected. It's still a good read to know what to keep in mind when using if __name__ == '__main__':.
You may write your "main.py" like this:

#!/usr/bin/env python

__all__ = ["somevar", "do_something"]

somevar = ""

def do_something():
    pass  # blahblah

if __name__ == "__main__":
    do_something()
I did a simple test:

# test.py
x = 1

print("1, has it been executed?")

def t1():
    print("hello")
    print("2, has it been executed?")

def t2():
    print("world")
    print("3, has it been executed?")

def main():
    print("Hello World")
    print("4, has it been executed?")

print("5, has it been executed?")
print(x)

# while True:
#     t2()

if x == 1:
    print("6, has it been executed?")

# test2.py
import test

When executing test2.py, the output is:

1, has it been executed?
5, has it been executed?
1
6, has it been executed?
Conclusion: when the imported module does not use an if __name__ == "__main__": guard, importing it runs the module. Top-level code outside any function is executed sequentially, while code inside functions is only executed when those functions are called.
In addition:

def main():
    # Put all the code you need to execute when this script runs directly.
    pass

if __name__ == '__main__':
    main()
else:
    # Put code you need to be executed only when the module is imported.
    pass
A minor error that can happen (at least it happened to me), especially when distributing Python scripts/functions that carry out a complete analysis, is calling the function directly at the end of the .py file that defines it. The only things a user needed to modify were the input files and parameters.
With that call in place, the function runs immediately as soon as you import the file. For proper behavior, simply remove the call inside the module and reserve it for the real calling file/function/portion of code.
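A minimal sketch of that fix (the module and parameter names here are made up for illustration):

```python
# analysis.py: a hypothetical module that performs a complete analysis
def run_analysis(input_file="data.csv", threshold=0.5):
    # the parameters a user would normally edit
    return f"analysed {input_file} with threshold {threshold}"

# Bad: a bare call here runs the analysis on every `import analysis`
# run_analysis()

# Good: reserve the call for direct execution of the file
if __name__ == "__main__":
    print(run_analysis())
```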
Another option is to use a binary environment variable, e.g. let's call it run_code. If run_code = 0 (False), structure main.py to bypass the code (the temporarily bypassed function will still be imported as part of the module). Later, when you are ready to use the imported function, set the environment variable run_code = 1 (True). Use os.environ to set and retrieve the variable, but be sure to convert it to an integer when retrieving it (or restructure the if statement to read a string value).
In main.py:

import os

# set environment variable to 0 (False):
os.environ['run_code'] = '0'

def binary_module():
    # retrieve environment variable, convert to integer
    run_code_val = int(os.environ['run_code'])
    if run_code_val == 0:
        print('nope. not doing it.')
    if run_code_val == 1:
        print('executing code...')
        # [do something]

...in whatever script is loading main.py:

import os, main

main.binary_module()
# OUTPUT: nope. not doing it.

# now flip the on switch!
os.environ['run_code'] = '1'
main.binary_module()
# OUTPUT: executing code...

*Note: the above code presumes main.py and whatever script imports it exist in the same directory.
Although you cannot use import without running the code, there is quite a swift way to pass in your variables: numpy.savez, which stores variables as numpy arrays in a .npz file. Afterwards you can load the variables using numpy.load.
See the full description in the SciPy documentation.
Please note this only works for variables and arrays of variables, not for methods, etc.
Try just importing the functions needed from main.py? So,
from main import SomeFunction
It could be that you've named a function in batch.py the same as one in main.py, and when you import main.py the program runs the main.py function instead of the batch.py function; doing the above should fix that. I hope.


Writing a unit test for multiple Python files

I'm trying to write a testing program to test many(identical) student assignments. I have a test written using the unittest library. The documentation seems to indicate that each test should be associated with one file. Instead, I'd like to have one test file and use command line arguments to point the test to the file it should test.
I know I can do this by using the argparse module in my unit tests, but is there a better way? It seems like this behavior should be supported in unittest, but I can't find anything in the documentation...
Create a main test directory and add sub test packages. Have a test runner created for you pointing to the test directory; it can act as a suite. I have attached a piece of code that I have used for my test suite.
import os
import unittest

def main(test_path, test_pattern):
    print('Discovering tests in : {}'.format(test_path))
    suite = unittest.TestLoader().discover(test_path, test_pattern)
    unittest.TextTestRunner(verbosity=2).run(suite)

if __name__ == '__main__':
    root_path = os.path.abspath('.')
    test_path = os.path.join(root_path, 'src/tests/')
    test_pattern = 'test_*'
    main(test_path, test_pattern)
Generally speaking, unittest is used to test module-level Python code, not interactions Python code has with external programs. AFAIK, writing to stdout (i.e. print) means you are either debugging or passing information to another program.
In your case, I don't think unittest is really necessary, unless you want to give assignments that are to "pass this unittest" (which is common in the wild).
Instead I would simply iterate over the directory that contains the assignments, check the stdout using subprocess, then write the results to a csv file:
import subprocess
import os
import csv

ASSIGNMENT_DIR = '/path/to/assignments'
expected_stdout = 'Hello World!'

def _determine_grade(stdout):
    if stdout == expected_stdout:
        return '100%'
    return '0%'

grades = []
for assignment in os.listdir(ASSIGNMENT_DIR):
    filepath = os.path.join(ASSIGNMENT_DIR, assignment)
    # decode the bytes and strip the trailing newline so the comparison
    # against expected_stdout works
    stdout = subprocess.check_output(
        f'python3 {filepath}', shell=True).decode().strip()
    grade = _determine_grade(stdout)
    grades.append({'assignment': assignment, 'grade': grade})

with open('/path/to/grades.csv', 'w+') as f:
    w = csv.DictWriter(f, ('assignment', 'grade'))
    w.writeheader()
    w.writerows(grades)
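If you do want to stay with unittest rather than a plain script, one way to "point the test at a file" without touching sys.argv is an environment variable read inside the test. This is a sketch, not a unittest feature; the variable name ASSIGNMENT_FILE is invented for illustration:

```python
import os
import subprocess
import sys
import tempfile
import unittest

class TestAssignment(unittest.TestCase):
    def test_prints_hello(self):
        # the file under test is chosen by the caller, not hard-coded
        target = os.environ["ASSIGNMENT_FILE"]
        out = subprocess.check_output([sys.executable, target], text=True)
        self.assertEqual(out.strip(), "Hello World!")

# usage sketch: fake a student submission, point the test at it, run the suite
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write('print("Hello World!")\n')
os.environ["ASSIGNMENT_FILE"] = f.name
suite = unittest.TestLoader().loadTestsFromTestCase(TestAssignment)
result = unittest.TextTestRunner(verbosity=0).run(suite)
os.unlink(f.name)
```

From a shell, the same idea would look like ASSIGNMENT_FILE=student1.py python -m unittest test_assignment.py, run once per submission.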

Making a Python Program Update to the Version on Disk Immediately When there is a Change

So I am trying to make a Python program that writes some code to itself, then runs the code it created before the session ends, like this:
for i in range(10):
    with open('test.py','a') as f:
        f.writelines('print("Hello World!")\n')
So, the 3rd line creates a 4th line which would print 'Hello World!'; the next iteration it would print it twice (because then it would have created yet another line saying the same thing), all the way up to 10 iterations. So, in the end, it looks like this:
for i in range(10):
    with open('test.py','a') as f:
        f.writelines('print("Hello World!")\n')
print("Hello World!")
print("Hello World!")
print("Hello World!")
... up to 10
(However, I mainly want to store outputted data into a variable from this, not prints or anything along those lines).
The problem is that it doesn't update fast enough. When you run the program for the first time you get nothing, then if you close and reopen the code you see all ten 'print('Hello World!')'s. I have no clue how to solve this...
Thanks!
The way you want your program to be dynamically written is possible. (But be mindful that there are surely better alternatives than this answer provides.)
First, you need to have an empty 'program.py' (or any other name) in your import path.
Then it is possible to modify program.py on the fly. The trick is that you can reload your program as a module in python, which will execute it.
import os
from importlib import reload
from io import StringIO
from unittest import mock

import program

def write_program(list_that_contains_strings_that_are_lines, force_new=False):
    mode = 'w' if not os.path.exists('program.py') or force_new else 'a+'
    with open('program.py', mode) as f:
        f.writelines(list_that_contains_strings_that_are_lines)

lines = ['print("Hello!")\n' for _ in range(5)]
out_stream = StringIO()
with mock.patch('sys.stdout', out_stream):
    write_program(lines)
    reload(program)
out_stream.getvalue()
and program.py will have 5 print statements in the end.
Also, take a look at How can I make one python file run another? for a detailed explanation.
Edit: You can redirect stdout stream to some buffer.

Pytest mock global variable in subprocess.call()

A global variable can be easily mocked following these answers. Great. However, this does not work when trying to mock a variable in a script that you call with subprocess.call() in a test with Pytest.
Here is my simplified script in a file called so_script.py:
import argparse

INCREMENTOR = 4

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('input_nr', type=int, help='An int to increment')
    args = parser.parse_args()
    with open('test.txt', 'w+') as f:
        f.write(str(args.input_nr + INCREMENTOR))
Now, say I want to mock the value of INCREMENTOR in my tests to be 1. If I do this:
from subprocess import call
from unittest import mock

def test_increments_with_1():
    with mock.patch('so_script.INCREMENTOR', 1):
        call(['python', 'so_script.py', '3'])
        with open('test.txt', 'r+') as f:
            assert f.read() == '4'
The test will fail, because the value of INCREMENTOR remains 4, even though I tried to patch it to 1. So what gets written to the file is 7 instead of 4.
So my question is: how do I mock the INCREMENTOR global variable in my so_script.py file so that, when calling subprocess.call() on it, it remains mocked?
Because the so_script.py script and pytest are executed in different processes, one cannot mock objects in so_script.py while the latter is being called as a different process in tests.
The best solution I found was to put everything from the if __name__ == '__main__': block in a function and test that function with Pytest, mocking whatever I needed to mock. And, to have 100% test coverage (which was my initial intent in calling the script as a subprocess), I applied this solution.
So I dropped using subprocess.call() in my tests and wrote an init() function that checks __name__ == '__main__', then mocked __name__ in the tests to exercise the function, just as the article advises. This got me 100% test coverage and full mocking capabilities.
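A rough sketch of that pattern: in this toy version init() takes the module name as a parameter purely so both cases can be shown side by side; in the real script it would read the global __name__, which is what the test then mocks. The file-writing from so_script.py is replaced by a return value here to keep the sketch self-contained:

```python
INCREMENTOR = 4

def main(input_nr):
    # the body of the old `if __name__ == '__main__':` block
    return input_nr + INCREMENTOR

def init(module_name):
    # run main() only when the script is executed directly
    if module_name == "__main__":
        return main(3)
    return None

# imported as a module: nothing runs, so importing it in tests is safe
assert init("so_script") is None

# executed as a script: main() runs in-process, where INCREMENTOR is
# reachable by mock.patch, unlike in a subprocess
assert init("__main__") == 3 + INCREMENTOR
```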
