Recently I've been experimenting with TDD while developing a GUI application in Python. I find it very reassuring to have tests that verify the functionality of my code, but it's been tricky to follow some of the recommended practices of TDD. Namely, writing tests first has been hard, and I'm finding it difficult to make my tests readable (due to extensive use of a mocking library).
I chose a mocking library called mocker. I use it a lot since much of the code I'm testing makes calls to (a) other methods in my application that depend on system state or (b) ObjC/Cocoa objects that cannot exist without an event loop, etc.
Anyway, I've got a lot of tests that look like this:
def test_current_window_controller():
    def test(config):
        ac = AppController()
        m = Mocker()
        ac.iter_window_controllers = iwc = m.replace(ac.iter_window_controllers)
        expect(iwc()).result(iter(config))
        with m:
            result = ac.current_window_controller()
        assert result == (config[0] if config else None)
    yield test, []
    yield test, [0]
    yield test, [1, 0]
Notice that this is actually three tests; all use the same parameterized test function. Here's the code that is being tested:
def current_window_controller(self):
    try:
        # iter_window_controllers() iterates in z-order starting
        # with the controller of the top-most window
        # assumption: the top-most window is the "current" one
        wc = self.iter_window_controllers().next()
    except StopIteration:
        return None
    return wc
One of the things I've noticed with using mocker is that it's easier to write the application code first and then go back and write the tests second, since most of the time I'm mocking many method calls and the syntax to write the mocked calls is much more verbose (thus harder to write) than the application code. It's easier to write the app code and then model the test code off of that.
I find that with this testing method (and a bit of discipline) I can easily write code with 100% test coverage.
I'm wondering if these tests are good tests? Will I regret doing it this way down the road when I finally discover the secret to writing good tests?
Am I violating the core principles of TDD so much that my testing is in vain?
If you are writing your tests after you've written your code and making them pass, you are not doing TDD (nor are you getting any of the benefits of Test-First or Test-Driven Development; check out SO questions for the definitive books on TDD).
One of the things I've noticed with using mocker is that it's easier to write the application code first and then go back and write the tests second, since most of the time I'm mocking many method calls and the syntax to write the mocked calls is much more verbose (thus harder to write) than the application code. It's easier to write the app code and then model the test code off of that.
Of course it's easier, because you are just testing that the sky is orange after you made it orange by painting it with a specific kind of brush.
This is retrofitting tests (for self-assurance). Mocks are good, but you should know how and when to use them. As the saying goes, 'when you have a hammer, everything looks like a nail.' It's also easy to write a whole load of unreadable and not-as-helpful-as-can-be tests. The time spent understanding what a test is about is time lost that could be used to fix broken ones.
And the point is:
Read 'Mocks Aren't Stubs' by Martin Fowler if you haven't already. Google some documented instances of good ModelViewPresenter-patterned GUIs (fake/mock out the UIs if necessary).
Study your options and choose wisely. I'll play the guy with the halo on your left shoulder, dressed in white, saying 'Don't do it.' Read this question for my reasons. St. Justin is on your right shoulder; I believe he also has something to say. :)
Unit tests are really useful when you refactor your code (i.e. completely rewrite or move a module). As long as you have unit tests before you make the big changes, you'll have confidence that you haven't forgotten to move or include something when you finish.
Please remember that TDD is not a panacea. It's hard, it's supposed to be hard, and it's especially hard to write mocking tests "in advance".
So I would say: do what works for you, even if it's not "certified TDD". I do basically the same thing.
You may want to provide your own API for the GUI that sits between your controller code and the GUI library code. That layer could be easier to mock, or you could even add some testing hooks to it.
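For instance, a thin layer along these lines (the names are illustrative, not your actual classes) gives the controller logic a single seam that a test can fill with a trivial fake instead of a mocked Cocoa object:

class WindowAPI(object):
    # Thin layer over the GUI toolkit; only this class would touch Cocoa.
    def iter_window_controllers(self):
        raise NotImplementedError

class FakeWindowAPI(WindowAPI):
    # Test double: no event loop, no ObjC objects needed.
    def __init__(self, controllers):
        self.controllers = controllers
    def iter_window_controllers(self):
        return iter(self.controllers)

def current_window_controller(windows):
    # Controller logic written against the small API, not against Cocoa.
    for wc in windows.iter_window_controllers():
        return wc
    return None

assert current_window_controller(FakeWindowAPI([1, 0])) == 1
assert current_window_controller(FakeWindowAPI([])) is None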
Last but not least, your code doesn't look too unreadable to me. Code using mocks is generally harder to understand; fortunately, in Python mocking is much easier and cleaner than in other languages.
Related
A bit of a theoretical question that comes up with Python, since we can access almost anything we want even if it is prefixed with an underscore to mark it as "private".
def main_function():
    _helper_function_()
    ...
    _other_helper_function()
Doing it with TDD, you follow the Red-Green-Refactor cycle. A test looks like this now:
def test_main_function_for_something_only_helper_function_does():
    # tedious setup
    ...
    main_function()
    assert something
The problem is that my main_function had so much setup steps that I've decided to test the helper functions for those specific cases:
from main_package import _helper_function

def test_helper_function_works_for_this_specific_input():
    # no tedious setup
    ...
    _helper_function_(some_input)
    assert helper function does exactly something I expect
But this seems to be a bad practice. Should I even "know" about any inner/helper functions?
I refactored the main function to be more readable by moving out parts into these helper functions. So I've rewritten tests to actually test these smaller parts and created another test that the main function indeed calls them. This also seems counter-productive.
On the other hand, I dislike the idea of a lot of lingering inner/helper functions with no dedicated unit tests, only happy-path-like ones for the main function. I guess if I had covered the original function before the refactoring, my old tests would have been good enough.
Also if the main function breaks this would mean many additional tests for the helper ones are breaking too.
What is the better practice to follow?
The problem is that my main_function had so much setup steps that I've decided to test the helper functions for those specific cases
Excellent, that's exactly what's supposed to happen (the tests "driving" you to decompose the whole into smaller pieces that are easier to test).
Should I even "know" about any inner/helper functions?
Tradeoffs.
Yes, part of the point of modules is that they afford information hiding, allowing you to later change how the code does something without impacting clients, including test clients.
But also there are benefits to testing the internal modules directly; test design becomes simpler, with less coupling to irrelevant details. Fewer tests are coupled to each decision, which means that the blast radius is smaller when you need to change one of them.
My usual thinking goes like this: I should know that there are testable inner modules, and I can know that an outer module behaves like it is coupled to an inner module, but I shouldn't necessarily know that the outer module is coupled to the inner module.
assert X.result(A,B) == Y.sort([C,D,E])
If you squint at this, you'll see that it implies that X.result and Y.sort have some common requirement today, but it doesn't necessarily promise that X.result calls Y.sort.
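To make that concrete with a throwaway Python example (none of these names come from the question), the test records that the two functions agree on a given input, without promising that one calls the other:

def _top_scores(scores, n):
    # inner/helper module: sorting and truncating
    return sorted(scores, reverse=True)[:n]

def leaderboard(scores):
    # outer module: today it happens to delegate to the helper
    return _top_scores(scores, 3)

def test_leaderboard_agrees_with_top_scores():
    scores = [10, 50, 30, 20]
    # "behaves like it is coupled", not "is coupled":
    assert leaderboard(scores) == _top_scores(scores, 3)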
So I've rewritten tests to actually test these smaller parts and created another test that the main function indeed calls them. This also seems counter-productive.
A works, and B works, and C works, and now here you are writing a test for f(A, B, C)... yeah, things go sideways.
The desired outcome of TDD is "Clean code that works" (Jeffries); and the truth of things is that you can get clean code that works without writing every test in the world.
Tests are most important in code where faults are most probable - straight line code where we are just wiring things together doesn't benefit nearly as much from the red-green-refactor cycle as code that has a lot of conditionals and branching.
There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. (C. A. R. Hoare)
For sections of code that are "so simple that there are obviously no deficiencies", a suite of automated programmer tests is not a great investment. Get two people to perform a manual review, and sign off on it.
Too many private/helper functions are often a sign of missing abstraction.
Maybe you should consider applying the 'Extract Class' refactoring. This refactoring will resolve your confusion, as the private members will end up becoming public members of the extracted class.
Please note, I am not suggesting that you create a class for every private member, but rather that you play with the model a bit to find a better design.
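As a loose sketch of what that could look like (all names invented, not the poster's code), the former private helpers become public methods of a small extracted class, each with an obvious place for dedicated tests:

class ReportBuilder(object):
    # Former _helper_function / _other_helper_function, now public.
    def prepare_input(self, raw):
        return raw.strip().lower()

    def format_line(self, value):
        return "item: %s" % value

def main_function(raw):
    builder = ReportBuilder()
    return builder.format_line(builder.prepare_input(raw))

def test_prepare_input_strips_and_lowercases():
    assert ReportBuilder().prepare_input("  Hello ") == "hello"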
I have been trying to get the hang of TDD and unit testing (in python, using nose) and there are a few basic concepts which I'm stuck on. I've read up a lot on the subject but nothing seems to address my issues - probably because they're so basic they're assumed to be understood.
The idea of TDD is that unit tests are written before the code they test. Unit test should test small portions of code (e.g. functions) which, for the purposes of the test, are self-contained and isolated. However, this seems to me to be highly dependent on the implementation. During implementation, or during a later bugfix it may become necessary to abstract some of the code into a new function. Should I then go through all my tests and mock out that function to keep them isolated? Surely in doing this there is a danger of introducing new bugs into the tests, and the tests will no longer test exactly the same situation?
From my limited experience in writing unit tests, it appears that completely isolating a function sometimes results in a test that is longer and more complicated than the code it is testing. So if the test fails, all it tells you is that there is either a bug in the code or a bug in the test, but it's not obvious which. Not isolating it may mean a much shorter and easier-to-read test, but then it's not a unit test...
Often, once isolated, unit tests seem to be merely repeating the function. E.g. if there is a simple function which adds two numbers, then the test would probably look something like assert add(a, b) == a + b. Since the implementation is simply return a + b, what's the point in the test? A far more useful test would be to see how the function works within the system, but this goes against unit testing because it is no longer isolated.
My conclusion is that unit tests are good in some situations, but not everywhere, and that system tests are generally more useful. The approach this implies is to write system tests first and then, if they fail, isolate portions of the system into unit tests to pinpoint the failure. The problem with this, obviously, is that it's not so easy to test corner cases. It also means that development is not fully test-driven, as unit tests are only written as needed.
So my basic questions are:
Should unit tests be used everywhere, however small and simple the function?
How does one deal with changing implementations? I.e. should the implementation of the tests change continuously too, and doesn't this reduce their usefulness?
What should be done when the test gets more complicated than the code it's testing?
Is it always best to start with unit tests, or is it better to start with system tests, which at the start of development are much easier to write?
Regarding your conclusion first: unit tests and system tests (integration tests) both have their uses, and in my opinion they are equally useful. During development I find it easier to start with unit tests, but for testing legacy code I find your approach, where you start with the integration tests, easier. I don't think there's a right or wrong way of doing this; the goal is to build a safety net that allows you to write solid and well-tested code, not the method itself.
I find it useful to think about each function as an API in this context. The unit test is testing the API, not the implementation. If the implementation changes, the test should remain the same, this is the safety net that allows you to refactor your code with confidence. Even if refactoring means taking part of the implementation out to a new function, I will say it's ok to keep the test as it is without stubbing or mocking the part that was refactored out. You will probably want a new set of tests for the new function however.
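As a toy illustration of that (not the poster's code), the test below only exercises the public function, so moving part of the implementation into a new helper later doesn't force the test to change:

def test_normalize():
    # Written against behaviour only; it passed before the refactoring
    # (when normalize() did all the work itself) and still passes after.
    assert normalize("  ada lovelace ") == "Ada Lovelace"

def _strip(name):
    # Extracted during refactoring; can get its own tests if worthwhile.
    return name.strip()

def normalize(name):
    return _strip(name).title()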
Unit tests are not a holy grail! Test code should be fairly simple, in my opinion, and there should be little reason for the test code itself to fail. If the test becomes more complex than the function it tests, it probably means you need to refactor the code differently. An example from my own past: I had some code that took some input and produced output stored as XML. Parsing the XML to verify that the output was correct caused a lot of complexity in my tests. However, realizing that the XML representation was not the point, I was able to refactor the code so that I could test the output without messing with the details of XML.
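A sketch of that kind of split (an invented example, not the original code): build a plain data structure first, keep the XML serialization as a thin separate step, and let most tests assert on the data instead of parsing XML:

import xml.etree.ElementTree as ET

def build_report(orders):
    # Pure logic: easy to test, no XML involved.
    return {"count": len(orders), "total": sum(orders)}

def report_to_xml(report):
    # Thin serialization layer; needs only one or two tests of its own.
    root = ET.Element("report")
    for key, value in sorted(report.items()):
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root)

def test_build_report():
    assert build_report([10, 20, 5]) == {"count": 3, "total": 35}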
Some functions are so trivial that a separate test for them adds no value. In your example you're not really testing your code, but that the '+' operator in your language works as expected. This should be tested by the language implementer, not you. However that function won't need to get very much more complex before adding a test for it is worthwhile.
In short, I think your observations are very relevant and point towards a pragmatic approach to testing. Following some rigorous definition too closely will often get in the way, even though the definitions themselves may be necessary for the purpose of having a way to communicate about the ideas they convey. As said, the goal is not the method, but the result; which for testing is to have confidence in your code.
1) Should unit tests be used everywhere, however small and simple the function?
No. If a function has no logic in it (if, while-loops, appends, etc...) there's nothing to test.
This means that an add function implemented like:
def add(a, b):
    return a + b
has nothing to test. But if you really want to build a test for it, then:
assert add(a, b) == a + b # Worst test ever!
is the worst test one could ever write. The main problem is that the tested logic must NOT be reproduced in the testing code, because:
If there's a bug in there, it will be reproduced as well.
You're no longer testing the function, but rather that a + b works the same way in two different files.
So something like this would make more sense:
assert add(1, 2) == 3
But once again, this is just an example, and this add function shouldn't even be tested.
2) How does one deal with changing implementations?
It depends on what changes. Keep in mind that:
You're testing the API (roughly speaking, that for a given input you get a specific output/effect).
You're not repeating the production code in your testing code (as explained before).
So, unless you're changing the API of your production code, the testing code will not be affected in any way.
3) What should be done when the test gets more complicated than the code it's testing?
Yell at whoever wrote those tests! (And re-write them).
Unit tests are simple and don't have any logic in them.
4a) Is it always best to start with unit tests, or is it better to start with system tests?
If we are talking about TDD, then one shouldn't even have this problem, because even before writing one tiny little function, a good TDD developer would have written unit tests for it.
If you have already working code without tests whatsoever, I'd say that unit tests are easier to write.
4b) Which at the start of development are much easier to write?
Unit tests! Because you don't even have the root of your code, how could you write system tests?
I'd like to test some python scripts.
Are there any Python libraries to help test external system behaviors (running scripts, testing the contents of external files, managing input/output files, and similar actions)?
Also, I tried making the scripts more API-like, to allow imports rather than calling them directly, for more unit-test-like tests. Changes include making the scripts easier to run interactively (factoring lots of stuff into functions/module values and making them less procedural, adding a parameter to silence stdout, passing optional args to main), and also serializing results in addition to the usual output formats (even though the functions that generate the output files have a medium amount of logic in them).
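Roughly, the shape I mean is something like this (a simplified, hypothetical example): the work sits behind an importable main() that returns data, and the command-line entry point is only a thin shell:

import sys

def count_lines(text):
    # pure logic, easy to unit test
    return len(text.splitlines())

def main(argv=None, quiet=False):
    argv = sys.argv[1:] if argv is None else argv
    with open(argv[0]) as f:
        result = count_lines(f.read())
    if not quiet:
        print(result)
    return result  # returning the result lets tests import and call main()

if __name__ == "__main__":
    main()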
Is this a good strategy, or is it better to test the scripts black-box style by running them and examining the output?
Test library
I'll go ahead and suggest unittest (even though it's the top Google hit for "python unit testing" and you probably already know of it). It's a very nice, easy to use, feature-ful library for unit testing.
Test strategy
Writing testable code is hard. Testing things like side-effects, environments, and file output can take the unit right out of unit test.
What I typically try to do is structure the code so that as little of it as possible does I/O or other nasty things. Then all of that code can usually be straightforwardly unit-tested.
For the parts that are hard to break into units, such as the command-line interface, I test for file output etc.
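For example (a generic sketch, not tied to any particular project): the parsing/summarizing stays pure and is unit-tested with in-memory data, while the file-touching wrapper stays tiny and is covered by a couple of black-box checks:

def summarize(lines):
    # Pure: unit-test this directly with in-memory data.
    numbers = [int(line) for line in lines if line.strip()]
    return min(numbers), max(numbers)

def summarize_file(path):
    # The "nasty" part: small enough that a few end-to-end checks suffice.
    with open(path) as f:
        return summarize(f)

def test_summarize():
    assert summarize(["3", "", "7", "5"]) == (3, 7)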
Conclusion
use unit tests as much as possible
otherwise, use black-box tests
constantly refactor code to make writing unit tests easier & more effective
I'm about to begin my third medium-sized project and would like (for the first time in my life, I admit) to start using unit tests.
I have no idea, though, which method to use: unittest or doctests.
Which of the methods is the most efficient, or which should a beginner choose to implement?
Thanks
I happen to prefer unittests, but both are excellent and well-developed methods of testing, and both are well supported by Django (see here for details). In short, there are some key advantages and disadvantages to each:
Pros of unittests
unittests allows for easy creation of more complicated tests. If you have a test that involves calling multiple helper functions, iterations, and other analyses, doctests can feel limiting. unittests, on the other hand, is just writing Python code: anything you can do in Python you can do comfortably there. Take this code (a modified version of a unittest I once wrote):
def basic_tests(self, cacheclass, outer=10, inner=100, hit_rate=None):
    c = cacheclass(lambda x: x + 1)
    for n in xrange(outer):
        for i in xrange(inner):
            self.assertEqual(c(i), i + 1)
    if hit_rate != None:
        self.assertEqual(c.hit_rate(), hit_rate)

def test_single_cache(self):
    self.basic_tests(SingleCache, outer=10, inner=100, hit_rate=0)
    sc = SingleCache(lambda x: x + 1)
    for input in [0, 1, 2, 2, 2, 2, 1, 1, 0, 0]:
        self.assertEqual(sc(input), input + 1)
    self.assertEqual(sc.hit_rate(), .5)
I use the basic_tests method to run some tests on a class, then run an assertion within a for loop. There are ways to do this in doctests, but they require a good deal of thought; doctests are best at checking that specific individual calls to a function return the values they should. (This is especially true within Django, which has fantastic tools for unit testing; see django.test.client.)
doctests can clutter up your code. When I'm writing a class or method, I put as much documentation into the docstrings as I need to make it clear what the method does. But if your docstrings are 20+ lines long, you can end up having as much documentation within your code as you have code. This adds to the difficulty of reading and editing it (one of my favorite things about Python as a programming language is its compactness).
Pros of doctests
Your tests are associated with particular classes and methods. This means that if a test fails, you immediately know which class and method failed. You can also use tools to determine the coverage of your tests across your classes. (Of course, this can be limiting as well, if you want a test to cover many different parts of your code).
Your tests are right next to the code, meaning it is easier to keep them in sync. When I make changes to a class or method, I often forget to make the corresponding changes to the test cases (though of course I am soon helpfully reminded when I run them). Having the doctests right next to your method declaration and code makes this easy.
Tests serve as a kind of documentation. People who look through your code can have pre-included examples of how to call and use each method.
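For example, a toy doctest (not from the original answer) shows those last two points in action: the examples sit in the docstring, document the call, and run as tests:

def add(a, b):
    """Return the sum of a and b.

    >>> add(1, 2)
    3
    >>> add(-1, 1)
    0
    """
    return a + b

if __name__ == "__main__":
    import doctest
    doctest.testmod()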
Conclusion: I certainly prefer unittests, but there is a great case to be made for either.
I am about to make a game using Python and the libtcod roguelike game library.
More to the point, I am using PyMock because I am just starting to learn Test-Driven Development, and I am determined not to cheat. I really want to get into the habit of doing it properly, and according to TDD I need a failing unit test before I write my first line of code.
I figure my first test of my "production" code should be that its dependency, libtcodpy, is imported.
My testing file:
#!/usr/bin/python
import pymock # for mocking and unit testing
import game # my (empty) production code file, game.py

class InitializeTest(pymock.PyMockTestCase):
    def test_libtcod_is_imported(self):
        # How do I test that my production file imports the libtcodpy module?
        pass

if __name__ == "__main__":
    import unittest
    unittest.main()
Please:
1) (python people) How do I test that the module is loaded?
2) (TDD people) Should I be unit testing something this basic? If not, what is the first thing I should be testing?
1) 'your_module' in sys.modules.
Don't actually use that, though:
2)
What should your library do?
Is it "have a dependency on libtcodpy"? I think not.
You've just made a design choice that wasn't test-driven!
Write a test that demonstrates how you want to use the library. Don't think about how you're going to implement it. For example:
player = my_lib.PlayerCharacter()
assert player.position == (0, 0) # or whatever assert syntax `pymock` uses
press_key('k')
assert player.position == (0, 1)
Or something similar. (I don't know what you want your library to do, or how much libtcod provides.)
The way I usually think about TDD (and BDD) is at two levels of development: acceptance-testing level, and unit-testing level.
First thing I would do is write stories (acceptance criteria). What is the core feature of your application? Define an end-to-end scenario that exercises one feature, and goes end-to-end with it. That's your first story. Write a test for it, using an acceptance-testing (or integration-testing) framework. Unfortunately, I don't know the Python tools, but in Java I would use JBehave or FitNesse. It would be something very high-level, far away from the code, that considers your application as a "black box". Something like "When my input parameters are xxx, I run my application, the expected output is yyyy".
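As a rough sketch of the same idea in Python (every name and expected string below is an invented placeholder), even a plain test that shells out to the program can serve as that black-box acceptance test:

import subprocess

def test_first_story_end_to_end():
    # Treat the application as a black box: "when my input parameters are
    # xxx, I run my application, the expected output is yyyy".
    # game.py, --moves, and the expected text are placeholders.
    output = subprocess.check_output(["python", "game.py", "--moves", "kkjj"])
    assert b"position: (0, 2)" in output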
Run this test; it will fail because the underlying application doesn't exist. Create the minimal set of classes to make it go red (and not throw an exception anymore). That's when you need to start the second phase of TDD: unit-TDD. It's basically a "descending analysis", from top level to details, and this phase will contain a lot of red-green-refactor cycles, bringing a lot of different units into the game.
From time to time, re-run your original acceptance test, or refine it if your growing architecture and analysis forced you to make changes to specifications (theoretically, it shouldn't happen at that stage, but in practice it does, very often). When your acceptance test is completely green, you're done with that story, rinse and repeat.
All of that brings me to my point: pure TDD (I mean unit-TDD) is not practical. I mean I really like TDD, but trying to follow it religiously will be more of a hassle than a help in the long run. Sometimes you will go and spike an approach to see if it fits with the rest of your project, without writing tests first for it, and potentially rewrite it using TDD. But as long as you have acceptance tests to cover the whole lot, you're fine.
Even if there is a way to test that, I'd recommend not doing it.
Test from the client perspective (outside-in), what behavior is provided by your SUT (Game). Your tests (or your users) don't need to know (/care) that you expose this behavior using a library. As long as the behavior isn't broken, your tests should pass.
Also like another answer says, maybe you don't need the dependency - there may be a simpler solution (e.g. a hashtable might do where you instinctively jumped on a relational database). Listen to the tests... let the tests pull in behavior.
This also leaves you free to change the dependency in the future without having to fix a bunch of tests.