Emulating user input in python for testing - python

I have written a simple python program that takes user input, with the use of input(). There are some different commands available.
I want to make sure that all available commands function as intended and that the program catches invalid commands. Since there are quite a few different commands, this is very time-consuming to do manually (i.e., start the program, and enter all commands, one by one). (I have separate test functions for the actual execution of all commands, but I'm struggling to find a nice way to test this functionality together with the input() loop.)
How can I automate the process of giving (predetermined) user inputs, without messing up the rest of the code? In addition to using this to test the program, it would also serve as an example to the user, in order to see the possible usage of the program.
My current solution is that I have two versions of the main() function, which is basically just an infinite loop that takes inputs until an exit command is given. The first version, "main()", is the version intended for use and takes inputs from input(), until the user decides to quit. The second version, "main_test()" is only used for testing, and takes the inputs from a predetermined list, specified in the code. This does the job, but I do not want the main_test() code in the final version. I also do not want to "pollute" main() by adding things only used for testing.
def main():
while True:
user_input = input()
...
def main_test():
test_input = [...]
test_iter = 0
while True:
user_input = test_input[test_iter]
test_iter += 1
...
I have not been able to find a nice way to do this in python, although I'm sure there must be a smart way. I'd prefer a way that does not need any additional imports. But if there is a nice way to do it with additional imports, I'm all ears.
Anyway, striking out with python, my next thought was to specify the commands in a Makefile, where I would start the program and feed the program text input, emulating the user. The main benefit of this is that I would only need the "main()" function, and I would not have to change anything in the python code. The disadvantage is that the example/test is specified outside of the *.py files, which may confuse the user.

If you are looking to confirm that X input results in Y output, then what you are looking for is called unit testing; this methodology allows you to define functions that assess the result of your function against your expected results.
Module options:
- pytest
- unittest

First of all, I would like to point out that there are many methods for testing your code and some are better practice than others, while each one can be preferred by the programmer for different reasons. You should study how to test your code using the existing ways. Depending on the structure of your code, patterns that can be exploited, and your implementation principles that you follow (dependency injections, mvc, mvvm, etc).
As for your specific case, I would recommend to unittest your fraction of the code that uses the user input and define a collection of cases with predefined outputs that your can assert. Then, use the same code (maybe add cases) whenever you edit this fraction, in order to be sure that your program continues to work fluently.
Using Python's Unittest package you can implement a nice script that checks your code consistently and incorporate it to your project. See also Wikipedia page about unit testing to learn more about the principles you have to follow. You can also check Tensorflow's testing page for directions on how to incorporate this kind of testing
in a bigger project, and use these ideas in yours.
Good luck!

The answer from here should work if you don't mind importing unittest, python mocking raw input in unittests.
The main_test() part can then be removed from you projects and taken care of in the testing script. The "return_values" then correspond to you mock inputs.

Related

Can pytest or any testing tool in python generate the unit test automatically?

I started writing pytest for the source code I have. For this I see the code and in test I check if that piece of code returns expected value.
Is there any way to automatically generate test cases for python code, positive and parameterized so that it doesn't take much time if the source code is huge
You can't generate test cases using Pytest, but you can use tools like hypothesis to randomise the particular values that your test cases are given. You still need to specify what cases you want to test though, as well as giving a definition for how it should generate the parameters.
Generally, computers aren't smart enough to know what we want a function to do, so there's no way they can tell whether we intentionally added certain behaviours into our code or whether they are just bugs. Tools like mypy and flake8 can help catch questionable code, but it's still up to you whether that code is correct or not - they can't know for certain.

How does one control all source of randomness in python reliably with the aid of the user?

tl;dr: I want to be able to seed the functions/code a user provides to me assuming the user wants this functionality and tells me all the random sources of library he/she is using.
The specific scenario I was considering was that I am receiving a function (pointer/handle) and the function is implementing some (nearly) arbitrary program f without threading. The program is using any pseudorandom functions inside of it except for the seeding functions (or functions that might change the random state). There is going to be a master program that runs f and needs to be able to essentially change the random seed of f at will. So essentially my goal is to run the following pseudocode:
seed_all_of_python_reliably(seed=current_seed)
controlled_random_output = f(seed=current_seed) # now f works properly according to my seeding
I've had some ideas to make this work properly:
have the user that wrote f also write seed_all_of_python_reliably with some code that makes sure before f is ran that its code gets seeded properly too.
have the user tell the master code (somehow the part I don't know how to implement in practice) about all the random seeding code it might be using (from numpy, scipy, tensorflow, standard random, scikit etc). Something like in the file where f is defined have tell_about_random_seeding(random.seed, numpy.random.seed, ..., etc) then the master can just go through the seeding it was told and do the seeding for them.
similar to the above but instead have it in a config file where f lies (or in reality there might be lots of f's and they might all share the same random code). Then the master can just go through the config file before it goes through all the f's
The other solution is I guess to provide an API for using random libraries and a API for seeding things for the user. However, although I don't know how to implement this (i.e. it seems I would have to "inherit" the code from many possible libraries and wrap them with my own which already seems like an ugly solution) but even if I did, it seems every time a user wants to use some library they would have to tell the master at runtime and then the master can inherit it and wrap it or before it runs in the github code the developers can do the wrapping per request. This seems the hardest (due to the massive inheritance) and ugliest (providing a duplicate API for other libraries seem unnecessarily ugly) solution of the four.
Essentially I needed help implementing 2 and 3. Is the way to write this code something as simple as sending function pointers of the seeding function and then just have the master do it? I am hoping there is nothing weird I might be overlooking.
The one thing we are assuming is that the user inside of f will not call the seeding function in case 2 and 3. I was thinking that its possible to monkey patch them with a NOP, however, this can only be done if the user tells us about the seeding functions its using. The assumption is that the user is writing f because it wants the help of the master code, so the user isn't Byzantine and won't call seeding functions. So if this assumptions is what I have I can't see why I would need to monkey patch it with NOPs if we trust the user.
Also as an addition to help users, if I decided to do the seeding for them with method 2 or 3 then we could further help the user by seeding the common libraries for them.

Understand programmatically a python code without executing it

I am implementing a workflow management system, where the workflow developer overloads a little process function and inherits from a Workflow class. The class offers a method named add_component in order to add a component to the workflow (a component is the execution of a software or can be more complex).
My Workflow class in order to display status needs to know what components have been added to the workflow. To do so I tried 2 things:
execute the process function 2 times, the first time allow to gather all components required, the second one is for the real execution. The problem is, if the workflow developer do something else than adding components (add element in a databases, create a file) this will be done twice!
parse the python code of the function to extract only the add_component lines, this works but if some components are in a if / else statement and the component should not be executed, the component apears in the monitoring!
I'm wondering if there is other solution (I thought about making my workflow being an XML or something to parse easier but this is less flexible).
You cannot know what a program does without "executing" it (could be in some context where you mock things you don't want to be modified but it look like shooting at a moving target).
If you do a handmade parsing there will always be some issues you miss.
You should break the code in two functions :
a first one where the code can only add_component(s) without any side
effects, but with the possibility to run real code to check the
environment etc. to know which components to add.
a second one that
can have side effects and rely on the added components.
Using an XML (or any static format) is similar except :
you are certain there are no side effects (don't need to rely on the programmer respecting the documentation)
much less flexibility but be sure you need it.

Use user provided python code during runtime

I'm developing a system that operates on (arbitrary) data from databases. The data may need some preprocessing before the system can work with it. To allow the user the specify possibly complex rules I though of giving the user the possibility to input Python code which is used to do this task. The system is pure Python.
My plan is to introduce the tables and columns as variables and let the user to anything Python can do (including access to the standard libs). Now to my problem:
How do I take a string (the user entered), compile it to Python (after adding code to provide the input data) and get the output. I think the easiest way would be to use the user-entered data a the body of a method and take the return value of that function a my new data.
Is this possible? If yes, how? It's unimportant that the user may enter malicious code since the worst thing that could happen is, that he screws up his own system, which is thankfully not my problem ;)
Python provides an exec() statement which should do what you want. You will want to pass in the variables that you want available as the second and/or third arguments to the function (globals and locals respectively) as those control the environment that the exec is run in.
For example:
env = {'somevar': 'somevalue'}
exec(code, env)
Alternatively, execfile() can be used in a similar way, if the code that you want executed is stored in its own file.
If you only have a single expression that you want to execute, you can also use eval.
Is this possible?
If it doesn't involve time travel, anti-gravity or perpetual motion the answer to this question is always "YES". You don't need to ask that.
The right way to proceed is as follows.
You build a framework with some handy libraries and packages.
You build a few sample applications that implement this requirement: "The data may need some preprocessing before the system can work with it."
You write documentation about how that application imports and uses modules from your framework.
You turn the framework, the sample applications and the documentation over to users to let them build these applications.
Don't waste time on "take a string (the user entered), compile it to Python (after adding code to provide the input data) and get the output".
The user should write applications like this.
from your_framework import the_file_loop
def their_function( one_line_as_dict ):
one_line_as_dict['field']= some stuff
the_file_loop( their_function )
That can actually be the entire program.
You'll have to write the_file_loop, which will look something like this.
def the_file_loop( some_function ):
with open('input') as source:
with open('output') as target:
for some_line in source:
the_data = make_a_dictionary( some_line )
some_function( the_data )
target.write( make_a_line( the_data ) )
By creating a framework, and allowing users to write their own programs, you'll be a lot happier with the results. Less magic.
2 choices:
You take his input and put it in a file, then you execute it.
You use exec()
If you just want to set some local values and then provide a python shell, check out the code module.
You can start an instance of a shell that is similar to the python shell, as well as initialize it with whatever local variables you want. This would assume that whatever functionality you want to use the resulting values is built into the classes you are passing in as locals.
Example:
shell = code.InteractiveConsole({'foo': myVar1, 'bar': myVar2})
What you actually want is exec, since eval is limited to taking an expression and returning a value. With exec, you can have code blocks (statements) and work on arbitrarily complex data, passed in as the globals and locals of the code.
The result is then returned by the code via some convention (like binding it to result).
well, you're describing compile()
But... I think I'd still implement this using regular python source files. Add a special location to the path, say '~/.myapp/plugins', and just __import__ everything there. Probably you'll want to provide some convenient base classes that expose the interface you're trying to offer, so that your users can inherit from them.

python coding speed and cleanest

Python is pretty clean, and I can code neat apps quickly.
But I notice I have some minor error someplace and I dont find the error at compile but at run time. Then I need to change and run the script again. Is there a way to have it break and let me modify and run?
Also, I dislike how python has no enums. If I were to write code that needs a lot of enums and types, should I be doing it in C++? It feels like I can do it quicker in C++.
"I don't find the error at compile but at run time"
Correct. True for all non-compiled interpreted languages.
"I need to change and run the script again"
Also correct. True for all non-compiled interpreted languages.
"Is there a way to have it break and let me modify and run?"
What?
If it's a run-time error, the script breaks, you fix it and run again.
If it's not a proper error, but a logic problem of some kind, then the program finishes, but doesn't work correctly. No language can anticipate what you hoped for and break for you.
Or perhaps you mean something else.
"...code that needs a lot of enums"
You'll need to provide examples of code that needs a lot of enums. I've been writing Python for years, and have no use for enums. Indeed, I've been writing C++ with no use for enums either.
You'll have to provide code that needs a lot of enums as a specific example. Perhaps in another question along the lines of "What's a Pythonic replacement for all these enums."
It's usually polymorphic class definitions, but without an example, it's hard to be sure.
With interpreted languages you have a lot of freedom. Freedom isn't free here either. While the interpreter won't torture you into dotting every i and crossing every T before it deems your code worthy of a run, it also won't try to statically analyze your code for all those problems. So you have a few choices.
1) {Pyflakes, pychecker, pylint} will do static analysis on your code. That settles the syntax issue mostly.
2) Test-driven development with nosetests or the like will help you. If you make a code change that breaks your existing code, the tests will fail and you will know about it. This is actually better than static analysis and can be as fast. If you test-first, then you will have all your code checked at test runtime instead of program runtime.
Note that with 1 & 2 in place you are a bit better off than if you had just a static-typing compiler on your side. Even so, it will not create a proof of correctness.
It is possible that your tests may miss some plumbing you need for the app to actually run. If that happens, you fix it by writing more tests usually. But you still need to fire up the app and bang on it to see what tests you should have written and didn't.
You might want to look into something like nosey, which runs your unit tests periodically when you've saved changes to a file. You could also set up a save-event trigger to run your unit tests in the background whenever you save a file (possible e.g. with Komodo Edit).
That said, what I do is bind the F7 key to run unit tests in the current directory and subdirectories, and the F6 key to run pylint on the current file. Frequent use of these allows me to spot errors pretty quickly.
Python is an interpreted language, there is no compile stage, at least not that is visible to the user. If you get an error, go back, modify the script, and try again. If your script has long execution time, and you don't want to stop-restart, you can try a debugger like pdb, using which you can fix some of your errors during runtime.
There are a large number of ways in which you can implement enums, a quick google search for "python enums" gives everything you're likely to need. However, you should look into whether or not you really need them, and if there's a better, more 'pythonic' way of doing the same thing.

Categories