I like using Lettuce to define test cases. In many cases, it's easy to write Lettuce scenarios so that they can be run either atomically or as part of other scenarios in a feature. However, I find that Lettuce is also a useful tool for reasoning about and implementing more complex integration tests. In those cases, it makes sense to break the tests up into scenarios but declare a dependency on a previous scenario. That way I can run a scenario without having to explicitly specify which other scenarios need to be run, and the dependency is made clear in the scenario definition. It might look something like this:
Scenario: Really long scenario
    Given some condition
    Given another condition
    Then something
    ...

Scenario: A dependent scenario
    Given the scenario "Really long scenario" has been run
    Given new condition
    Then some stuff
    ...
Then I could do something like:
@step('Given the scenario "([^"]*)" has been run')
def check_scenario(step, sentence):
    scenario = get_scenario(sentence)  # This is what I don't know how to do
    if not scenario.ran:
        scenario.run()
How do you handle this situation? Are there any gotchas I'm missing with this approach? From a quick look through the API docs and the source code, it doesn't seem like there is an easy way to retrieve a scenario by its string.
The only thing I know of is defining new steps that call previously defined steps; check out the tutorial on that topic. Maybe that can be a good workaround for your problem.
You can use world in order to store data between your scenarios.
@before.each_feature
def feature_setup(feature):
    ...
    world.feature_data = dict()
    ...
You can access this data from anywhere you have access to world.
You can combine this with your terrain.py file in order to save data between steps, scenarios, and features.
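To make that concrete, here is a minimal terrain.py sketch along those lines. The before/after hooks, world, and scenario.name are Lettuce's; the ran_scenarios bookkeeping and the step that merely asserts the dependency (rather than running it, since Lettuce has no public way to fetch a scenario by name) are my own illustration:

# terrain.py - a sketch, not a drop-in solution
from lettuce import after, before, step, world

@before.each_feature
def feature_setup(feature):
    # Track which scenarios of the current feature have already run.
    world.ran_scenarios = set()

@after.each_scenario
def record_scenario(scenario):
    world.ran_scenarios.add(scenario.name)

@step(r'the scenario "([^"]*)" has been run')
def require_scenario(step, name):
    # Only assert the dependency; triggering the other scenario from here
    # would require internals Lettuce does not expose publicly.
    assert name in world.ran_scenarios, (
        'Scenario "%s" must run before this one' % name)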
A bit of a theoretical question that comes up with Python, since we can access almost anything we want, even if it is underscored to mark it as "private".
def main_function():
    _helper_function()
    ...
    _other_helper_function()
Doing it with TDD, you follow the Red-Green-Refactor cycle. A test looks like this now:
def test_main_function_for_something_only_helper_function_does():
    # tedious setup
    ...
    main_function()
    assert something
The problem is that my main_function had so many setup steps that I've decided to test the helper functions for those specific cases:
from main_package import _helper_function

def test_helper_function_works_for_this_specific_input():
    # no tedious setup
    ...
    _helper_function(some_input)
    assert the_helper_does_exactly_what_I_expect
But this seems to be a bad practice. Should I even "know" about any inner/helper functions?
I refactored the main function to be more readable by moving parts out into these helper functions. So I've rewritten the tests to actually test these smaller parts and created another test that checks the main function indeed calls them. This also seems counter-productive.
On the other hand, I dislike the idea of a lot of lingering inner/helper functions with no dedicated unit tests, only happy-path-like ones for the main function. I guess if I had covered the original function before the refactoring, my old tests would have been good enough.
Also, if the main function breaks, that would mean many additional tests for the helpers break too.
What is the better practice to follow?
The problem is that my main_function had so many setup steps that I've decided to test the helper functions for those specific cases
Excellent, that's exactly what's supposed to happen (the tests "driving" you to decompose the whole into smaller pieces that are easier to test).
Should I even "know" about any inner/helper functions?
Tradeoffs.
Yes, part of the point of modules is that they afford information hiding, allowing you to later change how the code does something without impacting clients, including test clients.
But also there are benefits to testing the internal modules directly; test design becomes simpler, with less coupling to irrelevant details. Fewer tests are coupled to each decision, which means that the blast radius is smaller when you need to change one of them.
My usual thinking goes like this: I should know that there are testable inner modules, and I can know that an outer module behaves like it is coupled to an inner module, but I shouldn't necessarily know that the outer module is coupled to the inner module.
assert X.result(A,B) == Y.sort([C,D,E])
If you squint at this, you'll see that it implies that X.result and Y.sort have some common requirement today, but it doesn't necessarily promise that X.result calls Y.sort.
So I've rewritten tests to actually test these smaller parts and created another test that the main function indeed calls them. This also seems counter-productive.
A works, and B works, and C works, and now here you are writing a test for f(A,B,C).... yeah, things go sideways.
The desired outcome of TDD is "Clean code that works" (Jeffries); and the truth of things is that you can get clean code that works without writing every test in the world.
Tests are most important in code where faults are most probable - straight line code where we are just wiring things together doesn't benefit nearly as much from the red-green-refactor cycle as code that has a lot of conditionals and branching.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. (C. A. R. Hoare)
For sections of code that are "so simple that there are obviously no deficiencies", a suite of automated programmer tests is not a great investment. Get two people to perform a manual review, and sign off on it.
Too many private/helper functions are often a sign of missing abstraction.
Maybe you should consider applying the 'Extract Class' refactoring. This refactoring will resolve your confusion, as the private members will end up becoming public members of the extracted class.
Please note, I am not suggesting creating a class for every private member, but rather playing with the model a bit to find a better design.
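A minimal sketch of what that might look like, using hypothetical names: the former helpers become public methods of the extracted class and can be tested directly, while the original function only orchestrates.

class ReportBuilder:
    # Hypothetical extracted class: the former private helpers are now
    # public, independently testable methods.

    def prepare(self, raw_items):
        # formerly _helper_function()
        return [item for item in raw_items if item is not None]

    def summarize(self, items):
        # formerly _other_helper_function()
        return len(items)


def main_function(raw_items):
    builder = ReportBuilder()
    return builder.summarize(builder.prepare(raw_items))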
I'd like to write a set of "fuzzy" unit tests in Python. So far I've been using testtools, but switching to a different framework would be fine.
My test suite is aimed at testing the performance of image-processing algorithms. I'd like tests to be able to report fuzzy pass states; in other words, the results are "good enough", but it might still be useful to investigate them.
I have something like this:
suite = unittest.TestLoader().loadTestsFromTestCase(TestMyAlgorithm)
result = testtools.TestResult()
result.startTestRun()
try:
    suite.run(result)
finally:
    result.stopTestRun()
I'd like to use information in the result object to generate a report, but it looks like all of the information associated with passed tests has been tossed.
I'm wondering if I'm abusing the notion of a unit test to fit this sort of investigation.
Is there a standard way to perform this sort of testing in python?
Assuming your goal here is really reporting, get a tool which can generate a detailed report in XML format (e.g. nosetests; py.test likely has similar support), and process the report however you want in a second step.
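For example, a sketch assuming pytest's --junitxml option and a hypothetical report.xml path: run the suite once to produce the XML, then post-process it however you like.

# Step 1: run the suite and write a machine-readable report:
#   pytest --junitxml=report.xml
# Step 2: post-process the report in a separate script:
import xml.etree.ElementTree as ET

tree = ET.parse("report.xml")
for case in tree.iter("testcase"):
    # Pull out whatever you need for your "good enough" report.
    print(case.get("classname"), case.get("name"), case.get("time"))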
My Python project has the ability to perform operations on two different destinations, let's call them SF and LA. Which is the better way to accomplish this?
Option A:
destinations.py
LA = 1
SF = 2
example_operation.py
import destinations
run_operation(destination=destinations.LA)
def run_operation(destination):
    assert destination in [destinations.LA, destinations.SF]
    ...
OR
Option B:
example_operation.py
run_operation(destination='LA')
def run_operation(destination):
    assert destination in ['LA', 'SF']
    ...
I realize I can also use a dictionary or many other methods to accomplish this. I'd like to know which is the best practice for declaring and validating these.
Since it’s very subjective, I’ll avoid commenting on which way would be better. You could try to argue from a performance standpoint (integers are faster than strings, but variable lookups are slower than constants?), from a code-completion view (editors can auto-complete the variables), or from extensibility (you can easily use a new string for something new), but in the end, it doesn’t really matter much: it’s mostly personal preference.
What I want to try to comment on however is the validation question: How to validate such arguments? My usual answer to that is: Don’t.
Python is usually used without many fail-safes. For example, Python doesn’t have true private members, and large parts of the stdlib even go without hiding their internals completely. If you want, you can use those internals, mess with all the things—but if something breaks, it’s your problem. So often, you would just expect users to use your code correctly: If they pass in a parameter your function doesn’t expect, then, well, something will fail. Of course it is not bad to have some kind of validation but you usually don’t need to put asserts everywhere.
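For completeness, and purely as an illustration that is not part of the answer above, a common middle ground is Python's enum module, which gives the auto-completion of Option A with names as readable as Option B; a minimal sketch:

from enum import Enum

class Destination(Enum):
    LA = 1
    SF = 2

def run_operation(destination):
    # The member carries both a readable name and a value; passing a plain
    # string or int will typically fail loudly as soon as .name is accessed.
    print("running operation against %s" % destination.name)

run_operation(Destination.LA)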
For those who know perl, I'm looking for something similar to Test::Deep::is_deeply() in Python.
In Python's unittest I can conveniently compare nested data structures already, if I expect them to be equal:
self.assertEqual(os.walk('some_path'),
                 my.walk('some_path'),
                 "compare os.walk with my own implementation")
However, in the wanted test, the order of files in the respective sublist of the os.walk tuple shall be of no concern.
If it was just this one test it would be ok to code an easy solution. But I envision several tests on differently structured nested data. And I am hoping for a general solution.
I checked Python's own unittest documentation, looked at pyUnit, and at nose and its plugins. Active maintenance would also be an important factor.
The ultimate goal for me would be to have a set of descriptive types like UnorderedIterable, SubsetOf, SupersetOf, etc which can be called to describe a nested data structure, and then use that description to compare two actual sets of data.
In the os.walk example I'd like something like:
comparison = OrderedIterable(
    OrderedIterable(
        str,
        UnorderedIterable(),
        UnorderedIterable()
    )
)
The above describes the kind of data structure that list(os.walk()) would return. For comparison of data A and data B in a unit test, the current path names would be cast into a str(), and the dir and file lists would be compared ignoring the order with:
self.assertDeep(A, B, comparison, msg)
Is there anything out there? Or is it such a trivial task that people write their own? I feel comfortable doing it, but I don't want to reinvent, and especially would not want to code the full orthogonal set of types, tests for those, etc. In short, I wouldn't publish it and thus the next one has to rewrite again...
Python Deep seems to be a project to reimplement Perl's Test::Deep. It is written by the author of Test::Deep himself. The last development happened in early 2016.
Update (2018/Aug): Latest release (2016/Feb) is located on PyPi/Deep
I have done some Py3k porting work on GitHub.
Not a solution, but the currently implemented workaround to solve the particular example listed in the question:
os_walk = list(os.walk('some_path'))
dt_walk = list(my.walk('some_path'))
self.assertEqual(len(dt_walk), len(os_walk), "walk() same length")
for ((osw, osw_dirs, osw_files), (dt, dt_dirs, dt_files)) in zip(os_walk, dt_walk):
    self.assertEqual(dt, osw, "walk() currentdir")
    self.assertSameElements(dt_dirs, osw_dirs, "walk() dirlist")
    self.assertSameElements(dt_files, osw_files, "walk() fileList")
As this example implementation shows, that's quite a bit of code. It also shows that Python's unittest has most of the ingredients required.
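For reference, here is a rough sketch of the kind of generic helper the question is after, supporting only one descriptive notion (ignore element order at chosen nesting depths). It is a hypothetical helper, meant only to show that a home-grown version doesn't have to be large:

def assert_deep_equal(testcase, actual, expected, unordered_levels=()):
    # Compare nested lists/tuples; at the nesting depths listed in
    # unordered_levels, element order is ignored.
    def normalize(value, depth=0):
        if isinstance(value, (list, tuple)):
            items = [normalize(v, depth + 1) for v in value]
            return sorted(items, key=repr) if depth in unordered_levels else items
        return value
    testcase.assertEqual(normalize(actual), normalize(expected))

# For the os.walk() case the dir and file lists sit two levels deep:
# assert_deep_equal(self, list(my.walk(p)), list(os.walk(p)), unordered_levels=(2,))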
Recently I've been experimenting with TDD while developing a GUI application in Python. I find it very reassuring to have tests that verify the functionality of my code, but it's been tricky to follow some of the recommended practices of TDD. Namely, writing tests first has been hard. And I'm finding it difficult to make my tests readable (due to extensive use of a mocking library).
I chose a mocking library called mocker. I use it a lot since much of the code I'm testing makes calls to (a) other methods in my application that depend on system state or (b) ObjC/Cocoa objects that cannot exist without an event loop, etc.
Anyway, I've got a lot of tests that look like this:
def test_current_window_controller():
    def test(config):
        ac = AppController()
        m = Mocker()
        ac.iter_window_controllers = iwc = m.replace(ac.iter_window_controllers)
        expect(iwc()).result(iter(config))
        with m:
            result = ac.current_window_controller()
        assert result == (config[0] if config else None)
    yield test, []
    yield test, [0]
    yield test, [1, 0]
Notice that this is actually three tests; all use the same parameterized test function. Here's the code that is being tested:
def current_window_controller(self):
    try:
        # iter_window_controllers() iterates in z-order starting
        # with the controller of the top-most window
        # assumption: the top-most window is the "current" one
        wc = self.iter_window_controllers().next()
    except StopIteration:
        return None
    return wc
One of the things I've noticed with using mocker is that it's easier to write the application code first and then go back and write the tests second, since most of the time I'm mocking many method calls and the syntax to write the mocked calls is much more verbose (thus harder to write) than the application code. It's easier to write the app code and then model the test code off of that.
I find that with this testing method (and a bit of discipline) I can easily write code with 100% test coverage.
I'm wondering if these tests are good tests? Will I regret doing it this way down the road when I finally discover the secret to writing good tests?
Am I violating the core principles of TDD so much that my testing is in vain?
If you are writing your tests after you've written your code and making them pass, you are not doing TDD (nor are you getting any of the benefits of test-first or test-driven development; check out the SO questions on definitive books about TDD).
One of the things I've noticed with using mocker is that it's easier to write the application code first and then go back and write the tests second, since most of the time I'm mocking many method calls and the syntax to write the mocked calls is much more verbose (thus harder to write) than the application code. It's easier to write the app code and then model the test code off of that.
Of course it's easier, because you are just testing that the sky is orange after you made it orange by painting it with a specific kind of brush.
This is retrofitting tests (for self-assurance). Mocks are good, but you should know how and when to use them; as the saying goes, 'when you have a hammer, everything looks like a nail.' It's also easy to write a whole load of unreadable and not-as-helpful-as-they-could-be tests. The time spent understanding what a test is about is time lost that could be spent fixing broken ones.
And the point is:
Read 'Mocks Aren't Stubs' by Martin Fowler if you haven't already. Google some documented examples of well-designed Model-View-Presenter GUIs (fake/mock out the UI if necessary).
Study your options and choose wisely. I'll play the guy in white with the halo on your left shoulder saying 'Don't do it.' Read this question for my reasons; St. Justin is on your right shoulder, and I believe he also has something to say. :)
Unit tests are really useful when you refactor your code (i.e. completely rewrite or move a module). As long as you have unit tests before you make the big changes, you'll have confidence that you haven't forgotten to move or include something when you finish.
Please remember that TDD is not a panacea. It's hard, it's supposed to be hard, and it's especially hard to write mocking tests "in advance".
So I would say: do what works for you, even if it's not "certified TDD". I do basically the same thing.
You may want to provide your own API for the GUI that sits between your controller code and the GUI library code. That could be easier to mock, or you could even add some testing hooks to it.
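A minimal sketch of that idea, with hypothetical names and a deliberately simplified controller: the controller talks only to the thin wrapper, and the wrapper (or a hand-rolled fake) is what you swap out in tests.

class WindowGateway:
    # Hypothetical thin wrapper around the GUI toolkit; the only GUI-facing
    # object the controller ever sees.
    def __init__(self, toolkit):
        self._toolkit = toolkit

    def windows_in_z_order(self):
        return self._toolkit.ordered_windows()


class SimpleAppController:
    def __init__(self, gateway):
        self._gateway = gateway

    def current_window(self):
        windows = self._gateway.windows_in_z_order()
        return windows[0] if windows else None


# In a test, a hand-rolled fake replaces the gateway; no mocking library needed.
class FakeGateway:
    def __init__(self, windows):
        self._windows = windows

    def windows_in_z_order(self):
        return self._windows


assert SimpleAppController(FakeGateway([])).current_window() is None
assert SimpleAppController(FakeGateway(["top", "bottom"])).current_window() == "top"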
Last but not least, your code doesn't look too unreadable to me. Code using mocks is generally harder to understand. Fortunately, in Python mocking is much easier and cleaner than in other languages.