Are there any Python libraries to help test external Python scripts?

I'd like to test some Python scripts.
Are there any Python libraries to help test external system behavior (running scripts, checking the contents of external files, managing input/output files, and similar actions)?
I've also tried making the scripts more API-like so they can be imported rather than called directly, which allows more unit-test-style tests. Changes include making the scripts easier to run interactively (factoring logic into functions and module-level values, making the code less procedural, adding a parameter to silence stdout, passing optional arguments to main) and serializing results in addition to the usual output formats (even though the functions that generate the output files contain a fair amount of logic).
Is this a good strategy, or is it better to test the scripts black-box style by running them and examining their output?

Test library
I'll go ahead and suggest unittest (even though it's the top Google hit for "python unit testing" and you probably already know of it). It's a very nice, easy-to-use, feature-rich library for unit testing.
Test strategy
Writing testable code is hard. Testing things like side-effects, environments, and file output can take the unit right out of unit test.
What I typically try to do is structure the code so that as little of it as possible does I/O or other nasty things. Then all of that code can usually be straightforwardly unit-tested.
For the parts that are hard to break into units, such as the command-line interface, I test for file output etc.
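As a rough sketch of that split (the function names here are made up for illustration): keep the pure logic free of I/O so it can be tested directly, and keep the file-handling wrapper as thin as possible.

    import unittest

    # pure logic: no I/O, easy to unit-test
    def summarize(lines):
        return {"count": len(lines), "longest": max(lines, key=len)}

    # thin I/O wrapper: kept as small as possible
    def summarize_file(path):
        with open(path) as f:
            return summarize(f.read().splitlines())

    # the unit test only exercises the pure part
    class SummarizeTest(unittest.TestCase):
        def test_summarize(self):
            result = summarize(["a", "bbb", "cc"])
            self.assertEqual(result["count"], 3)
            self.assertEqual(result["longest"], "bbb")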
Conclusion
use unit tests as much as possible
otherwise, use black-box tests
constantly refactor code to make writing unit tests easier & more effective

Related

Should you use error handling even if your program is static?

When designing an application that is static, where no input comes from outside the program, is it worthwhile to have error handling, even in a language like Python that doesn't need to be compiled?
Is it just a best practice?
I use python as an example because of its duck-typing nature.
Definitely: unit testing is for testing your own program's logic. If you have a program with functions doing different tasks, agnostic of whether the data comes from a user or from another function, it should be tested. Neither of the two points you bring up affects unit testability.
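For example, a purely internal function with no outside input is still worth a test of its own logic (the names here are illustrative):

    import unittest

    def apply_discount(price, rate):
        # internal logic with no user input, still worth testing
        return round(price * (1 - rate), 2)

    class DiscountTest(unittest.TestCase):
        def test_ten_percent_off(self):
            self.assertEqual(apply_discount(100.0, 0.10), 90.0)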

How to unit test helper methods in Python scheduled job?

I've read quite a few answers on here about testing helper methods (not necessarily private) in my unit tests and I'm still not quite sure what the best approach should be for my current situation.
I currently have a block of logic that runs as a scheduled job. It does a number of mostly related things like update local repositories, convert file types, commit these to other repos, clean up old repos, etc. I need all of this code to run in a specific order, so rather than setting a bunch of scheduled jobs, I took a lot of these small methods and put them into one large method that would enforce the order in which the code is run:
def mainJob():
    sync_repos()
    convert_files()
    commit_changes()
...and so on. Now I'm not sure how to write my tests for this thing. It's frustrating to test the entire mainJob() function because it does so many things, and it's really more of a reliability feature anyway. I see a lot of people saying I should only test the public interface, but I worry that some code will never be directly verified.
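One way to keep testing at the public interface while still checking the ordering is a sketch like the following. It assumes mainJob() and its helpers live in a module named jobs (a made-up name) and uses mock to record the calls:

    import unittest
    from unittest import mock  # on Python 2, install and import the `mock` package instead

    import jobs  # hypothetical module containing mainJob() and its helpers

    class MainJobTest(unittest.TestCase):
        def test_runs_steps_in_order(self):
            manager = mock.Mock()
            with mock.patch('jobs.sync_repos', manager.sync_repos), \
                 mock.patch('jobs.convert_files', manager.convert_files), \
                 mock.patch('jobs.commit_changes', manager.commit_changes):
                jobs.mainJob()
            # mock_calls records the child calls in the order they happened
            names = [name for name, args, kwargs in manager.mock_calls]
            self.assertEqual(names, ['sync_repos', 'convert_files', 'commit_changes'])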

Setting up environment for testing in Python

I'm writing integration tests using plain unittest in Python (import unittest) and am creating stubs for some external services. Now I want to run the same tests with a real implementation, but also keep the stubs. That way I can run the tests with and without the stubs and compare behaviour.
I'm running my tests both from SetupTools and through PyCharm. Is there some generic way for me to set/inject/bootstrap a parameter which tells my code whether to use the stub or the real implementation? Command line preferable. Any pointers appreciated. :)
It sounds like you are looking for a mocking framework. Mocking frameworks allow you to create a 'stub' for the method from within your test. This is good because you don't want to be inserting any test specific code into your actual code.
One of the more popular mocking frameworks for Python 2.x is mock (it is included in Python 3.3+ as unittest.mock), so you can write the code as:
from mock import MagicMock  # on Python 3.3+: from unittest.mock import MagicMock
import unittest

class FooTests(unittest.TestCase):
    def test_foo_mocked(self):
        bar = MagicMock()
        bar.return_value = 'fake_val'
        self.assertEqual(bar(), 'fake_val')

    def test_foo_real(self):
        # here `bar` is the real implementation, e.g. imported from your production code
        self.assertEqual(bar(), 'real_val')
Side Note:
I would really recommend that you think of these as completely unrelated tests. There are many benefits to keeping your integration tests separate from your unit tests. Thinking of them as two different ways of running the 'same test' may encourage you to write bad tests. Unit tests should be able to test things that would be difficult or impossible to test through integration tests and vice versa.
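That said, if you do want a single switch between the stub and the real implementation, one simple mechanism that works from both setuptools and a PyCharm run configuration is an environment variable. The module and class names below are placeholders:

    import os
    import unittest

    from myapp.services import RealService   # hypothetical real implementation
    from tests.stubs import StubService      # hypothetical stub

    USE_STUBS = os.environ.get('USE_STUBS', '1') == '1'

    class ServiceTest(unittest.TestCase):
        def setUp(self):
            # run with USE_STUBS=0 to exercise the real service instead of the stub
            self.service = StubService() if USE_STUBS else RealService()

        def test_lookup(self):
            self.assertEqual(self.service.lookup('key'), 'expected value')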

How do I test whether a module is imported in Python for Test-Driven Development of a game?

I am about to make a game using python and libtcod roguelike game library.
More to the point, I am using PyMock because I am just starting to learn Test-Driven Development, and I am determined not to cheat. I really want to get into the habit of doing it properly, and according to TDD I need a failing unit test before I write my first line of code.
I figure my first test of my "production" code should be that its dependency, libtcodpy, is imported.
My testing file:
#!/usr/bin/python
import pymock  # for mocking and unit testing
import game    # my (empty) production code file, game.py

class InitializeTest(pymock.PyMockTestCase):
    def test_libtcod_is_imported(self):
        # How do I test that my production file imports the libtcodpy module?
        pass

if __name__ == "__main__":
    import unittest
    unittest.main()
Please:
1) (python people) How do I test that the module is loaded?
2) (TDD people) Should I be unit testing something this basic? If not, what is the first thing I should be testing?
1) 'your_module' in sys.modules.
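For completeness, a minimal sketch of such a check with plain unittest (importing game is assumed to trigger its top-level import of libtcodpy):

    import sys
    import unittest
    import game  # importing game runs its top-level `import libtcodpy`

    class InitializeTest(unittest.TestCase):
        def test_libtcod_is_imported(self):
            self.assertIn('libtcodpy', sys.modules)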
Don't actually use that, though:
2)
What should your library do?
Is it “have a dependency on libtcodpy”? I think not.
You've just made a design choice that wasn't test-driven!
Write a test that demonstrates how you want to use the library. Don't think about how you're going to implement it. For example:
player = my_lib.PlayerCharacter()
assert player.position == (0, 0) # or whatever assert syntax `pymock` uses
press_key('k')
assert player.position == (0, 1)
Or something similar. (I don't know what you want your library to do, or how much libtcod provides.)
The way I usually think about TDD (and BDD) is at two levels of development: acceptance-testing level, and unit-testing level.
The first thing I would do is write stories (acceptance criteria). What is the core feature of your application? Define an end-to-end scenario that exercises one feature from start to finish. That's your first story. Write a test for it, using an acceptance-testing (or integration-testing) framework. I don't know the Python tools well, but in Java I would use JBehave or FitNesse. It should be something very high level, far away from the code, that treats your application as a black box. Something like: "When my input parameters are xxx, I run my application, the expected output is yyyy".
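In Python, even plain unittest plus subprocess is enough for a black-box test in that style; the script name, arguments, and expected output below are placeholders:

    import subprocess
    import unittest

    class AcceptanceTest(unittest.TestCase):
        def test_end_to_end(self):
            # run the application as a black box with known inputs
            output = subprocess.check_output(
                ['python', 'my_app.py', '--input', 'fixtures/sample.txt'])
            self.assertIn(b'expected result', output)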
Run this test; it will fail because the underlying application doesn't exist yet. Create the minimal set of classes to make it go properly red (failing assertions rather than throwing exceptions). That's when you start the second phase of TDD: unit-TDD. It's basically a descending analysis, from the top level down to the details, and this phase will contain a lot of red-green-refactor cycles, bringing a lot of different units into play.
From time to time, re-run your original acceptance test, or refine it if your growing architecture and analysis forced you to make changes to specifications (theoretically, it shouldn't happen at that stage, but in practice it does, very often). When your acceptance test is completely green, you're done with that story, rinse and repeat.
All of that brings me to my point: pure TDD (I mean pure unit-TDD) is not practical. I really like TDD, but trying to follow it religiously will be more of a hassle than a help in the long run. Sometimes you will go and spike an approach to see whether it fits the rest of your project, without writing tests first, and potentially rewrite it using TDD afterwards. But as long as you have acceptance tests to cover the whole lot, you're fine.
Even if there is a way to test that, I'd recommend not doing it.
Test from the client perspective (outside-in): what behavior is provided by your SUT (Game)? Your tests (or your users) don't need to know or care that you expose this behavior using a library. As long as the behavior isn't broken, your tests should pass.
Also like another answer says, maybe you don't need the dependency - there may be a simpler solution (e.g. a hashtable might do where you instinctively jumped on a relational database). Listen to the tests... let the tests pull in behavior.
This also leaves you free to change the dependency in the future without having to fix a bunch of tests.

Non-critical unittest failures

I'm using Python's built-in unittest module and I want to write a few tests that are not critical.
I mean, if my program passes such tests, that's great! However, if it doesn't pass, it's not really a problem, the program will still work.
For example, my program is designed to work with a custom type "A". If it fails to work with "A", then it's broken. However, for convenience, most of it should also work with another type "B", but that's not mandatory. If it fails to work with "B", then it's not broken (because it still works with "A", which is its main purpose). Failing to work with "B" is not critical, I will just miss a "bonus feature" I could have.
Another (hypothetical) example is when writing an OCR. The algorithm should recognize most images in the tests, but it's okay if some of them fail. (And no, I'm not writing an OCR.)
Is there any way to write non-critical tests in unittest (or other testing framework)?
As a practical matter, I'd probably use print statements to indicate failure in that case. A more correct solution is to use warnings:
http://docs.python.org/library/warnings.html
You could, however, use the logging facility to generate a more detailed record of your test results (i.e. set your "B" class failures to write warnings to the logs).
http://docs.python.org/library/logging.html
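A minimal sketch of the warnings idea; the handle_b function here is a made-up placeholder for the optional "B" code path:

    import unittest
    import warnings

    def handle_b(value):
        # placeholder for the optional "B" code path
        raise NotImplementedError("B support not finished")

    class TypeBTest(unittest.TestCase):
        def test_works_with_type_b(self):
            try:
                result = handle_b("b-value")
            except Exception as exc:
                warnings.warn("non-critical: type B support broken: %s" % exc)
                return  # warn instead of failing the suite
            self.assertEqual(result, "expected")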
Edit:
The way we handle this in Django is that we have some tests we expect to fail, and we have others that we skip based on the environment. Since we can generally predict whether a test SHOULD fail or pass (i.e. if we can't import a certain module, the system doesn't have it, and so the test won't work), we can skip failing tests intelligently. This means that we still run every test that will pass, and have no tests that "might" pass. Unit tests are most useful when they do things predictably, and being able to detect whether or not a test SHOULD pass before we run it makes this possible.
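With a recent enough unittest, that kind of environment-driven skipping can look like the sketch below (PIL is just an example of an optional dependency):

    import unittest

    try:
        import PIL  # optional dependency, used here only as an example
        HAS_PIL = True
    except ImportError:
        HAS_PIL = False

    class ImageTests(unittest.TestCase):
        @unittest.skipUnless(HAS_PIL, "PIL is not installed")
        def test_thumbnail_generation(self):
            # real image assertions would go here
            self.assertTrue(HAS_PIL)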
Asserts in unit tests are binary: they will work or they will fail; there's no middle ground.
Given that, to create those "non-critical" tests you should not use assertions when you don't want the tests to fail. You should do this carefully so you don't compromise the "usefulness" of the test.
My advice for your OCR example is to record the success rate in your test code and then have a single assertion like assert success_rate > 0.85; that should give the effect you desire.
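A sketch of that idea; recognize and the sample set below are placeholders for the real OCR call and test images:

    import unittest

    def recognize(image):
        # placeholder for the real OCR call
        return image.upper()

    SAMPLES = [("a", "A"), ("b", "B"), ("c", "X")]  # (input, expected) pairs

    class OcrAccuracyTest(unittest.TestCase):
        def test_success_rate(self):
            hits = sum(1 for img, expected in SAMPLES if recognize(img) == expected)
            success_rate = float(hits) / len(SAMPLES)
            self.assertGreater(success_rate, 0.5)  # tune the threshold to taste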
Thank you for the great answers. No single answer was really complete, so I'm writing here a combination of all the answers that helped me. If you like this answer, please vote up the people who were responsible for it.
Conclusions
Unit tests (or at least unit tests in the unittest module) are binary. As Guilherme Chapiewski says: they will work or they will fail; there's no middle ground.
Thus, my conclusion is that unit tests are not exactly the right tool for this job. It seems that unit tests are more concerned with "keep everything working, no failure is expected", and thus it's hard (or at least not easy) to have non-binary tests.
So unit tests don't seem to be the right tool if I'm trying to improve an algorithm or an implementation, because unit tests can't tell me how much better one version is than another (assuming both are correctly implemented, both will pass all the unit tests).
My final solution
My final solution is based on ryber's idea and the code shown in wcoenen's answer. I'm basically extending the default TextTestRunner and making it less verbose. Then my main code calls two test suites: the critical one using the standard TextTestRunner, and the non-critical one with my own less verbose version.
import sys
import unittest

class _TerseTextTestResult(unittest._TextTestResult):
    def printErrorList(self, flavour, errors):
        for test, err in errors:
            #self.stream.writeln(self.separator1)
            self.stream.writeln("%s: %s" % (flavour, self.getDescription(test)))
            #self.stream.writeln(self.separator2)
            #self.stream.writeln("%s" % err)

class TerseTextTestRunner(unittest.TextTestRunner):
    def _makeResult(self):
        return _TerseTextTestResult(self.stream, self.descriptions, self.verbosity)

if __name__ == '__main__':
    sys.stderr.write("Running non-critical tests:\n")
    non_critical_suite = unittest.TestLoader().loadTestsFromTestCase(TestSomethingNonCritical)
    TerseTextTestRunner(verbosity=1).run(non_critical_suite)

    sys.stderr.write("\n")
    sys.stderr.write("Running CRITICAL tests:\n")
    suite = unittest.TestLoader().loadTestsFromTestCase(TestEverythingImportant)
    unittest.TextTestRunner(verbosity=1).run(suite)
Possible improvements
It would still be useful to know if there is any testing framework with non-binary tests, like Kathy Van Stone suggested. I probably won't use it for this simple personal project, but it might be useful on future projects.
I'm not totally sure how unittest works, but most unit-testing frameworks have something akin to categories. I suppose you could categorize such tests, mark them to be ignored, and then run them only when you're interested in them. But I know from experience that ignored tests very quickly become just that: ignored tests that nobody ever runs, which makes them a waste of the time and energy it took to write them.
My advice is for your app to do, or do not, there is no try.
From the unittest documentation you link:
Instead of unittest.main(), there are other ways to run the tests with a finer level of control, less terse output, and no requirement to be run from the command line. For example, the last two lines may be replaced with:
suite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions)
unittest.TextTestRunner(verbosity=2).run(suite)
In your case, you can create separate TestSuite instances for the critical and non-critical tests. You could control which suite is passed to the test runner with a command-line argument. Test suites can also contain other test suites, so you can create big hierarchies if you want.
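A minimal sketch of that approach, with placeholder test-case names and a made-up command-line flag:

    import sys
    import unittest

    class CriticalTests(unittest.TestCase):
        def test_type_a(self):
            self.assertTrue(True)  # stand-in for the real "A" tests

    class NonCriticalTests(unittest.TestCase):
        def test_type_b(self):
            self.assertTrue(True)  # stand-in for the optional "B" tests

    if __name__ == '__main__':
        loader = unittest.TestLoader()
        suite = unittest.TestSuite()
        suite.addTests(loader.loadTestsFromTestCase(CriticalTests))
        if '--with-non-critical' in sys.argv:  # made-up flag for this sketch
            suite.addTests(loader.loadTestsFromTestCase(NonCriticalTests))
        unittest.TextTestRunner(verbosity=2).run(suite)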
Python 2.7 (and 3.1) added support for skipping some test methods or test cases, as well as marking some tests as expected failure.
http://docs.python.org/library/unittest.html#skipping-tests-and-expected-failures
Tests marked as expected failure won't be counted as failure on a TestResult.
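For example, with the 2.7/3.1 decorators (convert_b is a placeholder for the optional "B" code path):

    import unittest

    def convert_b(value):
        # placeholder for the optional "B" conversion, not implemented yet
        raise NotImplementedError

    class TypeBTests(unittest.TestCase):
        @unittest.expectedFailure
        def test_works_with_type_b(self):
            self.assertEqual(convert_b("x"), "expected")

        @unittest.skip("type B support is optional for now")
        def test_b_edge_case(self):
            pass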
There are some test systems that allow warnings rather than failures, but unittest is not one of them (I don't know which ones do, offhand) unless you want to extend it (which is possible).
You can make the tests so that they log warnings rather than fail.
Another way to handle this is to separate out the tests and only run them to get the pass/fail reports and not have any build dependencies (this depends on your build setup).
Take a look at Nose : http://somethingaboutorange.com/mrl/projects/nose/0.11.1/
There are plenty of command line options for selecting tests to run, and you can keep your existing unittest tests.
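For instance, nose's bundled attrib plugin lets you tag tests and select them from the command line; the attribute name and the handle_b function below are placeholders:

    from nose.plugins.attrib import attr

    def handle_b():
        # placeholder for the optional "B" code path
        return 'expected'

    @attr('non_critical')
    def test_works_with_type_b():
        assert handle_b() == 'expected'

You would then run nosetests -a non_critical to run only the tagged tests, or nosetests -a '!non_critical' to leave them out.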
Another possibility is to create a "B" branch (you ARE using some sort of version control, right?) and have your unit tests for "B" in there. That way, you keep your release version's unit tests clean (Look, all dots!), but still have tests for B. If you're using a modern version control system like git or mercurial (I'm partial to mercurial), branching/cloning and merging are trivial operations, so that's what I'd recommend.
However, I think you're using tests for something they're not meant to do. The real question is "How important to you is it that 'B' works?" Because your test suite should only have tests in it that you care whether they pass or fail. Tests that, if they fail, it means the code is broken. That's why I suggested only testing "B" in the "B" branch, since that would be the branch where you are developing the "B" feature.
You could test using logger or print commands, if you like. But if you don't care enough that it's broken to have it flagged in your unit tests, I'd seriously question whether you care enough to test it at all. Besides, that adds needless complexity (extra variables to set debug level, multiple testing vectors that are completely independent of each other yet operate within the same space, causing potential collisions and errors, etc, etc). Unless you're developing a "Hello, World!" app, I suspect your problem set is complicated enough without adding additional, unnecessary complications.
You could write your tests so that they count the success rate.
With OCR, you could throw 1000 images at the code and require that 95% are recognized successfully.
If your program must work with type A, then the test should fail when that fails. If it's not required to work with B, what is the value of such a test?
