Ordering tests using trial twisted - python

We are thinking of using trial from Twisted for our testing. Looking at the documentation, there does not seem to be a way to order the tests explicitly; trial follows its own ordering based on test names. Is there any way to order the tests without relying on naming conventions?

One can only order alphabetically or "toptobottom" (which executes your tests in the order they're in the code) using trial --order. So if you wanted to execute the tests based on the order they're written in your code, then you'd execute something similar to:
trial --order=toptobottom myunittests.py
This is the extent of trial's ordering functionality, though. You may be able to get other kinds of ordering with pytest, but I have no clue how to do that.
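If switching to pytest is acceptable, one possibility (this is our assumption, not part of trial or core pytest) is the third-party pytest-order plugin, which adds an ordering marker; a minimal sketch:
# Sketch assuming the third-party pytest-order plugin is installed
# (pip install pytest-order); the marker below comes from that plugin.
import pytest

@pytest.mark.order(2)
def test_runs_second():
    assert True

@pytest.mark.order(1)
def test_runs_first():
    # Runs before test_runs_second despite appearing later in the file.
    assert True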

Related

Can pytest or any testing tool in python generate the unit test automatically?

I started writing pytest tests for the source code I have. For this I read the code, and in each test I check whether that piece of code returns the expected value.
Is there any way to automatically generate test cases for Python code, positive and parameterized, so that it doesn't take much time if the source code is huge?
You can't generate test cases using Pytest, but you can use tools like hypothesis to randomise the particular values that your test cases are given. You still need to specify what cases you want to test though, as well as giving a definition for how it should generate the parameters.
Generally, computers aren't smart enough to know what we want a function to do, so there's no way they can tell whether we intentionally added certain behaviours into our code or whether they are just bugs. Tools like mypy and flake8 can help catch questionable code, but it's still up to you whether that code is correct or not - they can't know for certain.
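To illustrate, here is a minimal sketch of a Hypothesis property test; add is a hypothetical stand-in for your own code under test:
from hypothesis import given, strategies as st

def add(a, b):
    # Hypothetical function under test; replace with your own code.
    return a + b

@given(st.integers(), st.integers())
def test_add_is_commutative(a, b):
    # Hypothesis generates the integer arguments; you still define the property.
    assert add(a, b) == add(b, a)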

Is it wise to use two completely separate unit testing suites?

My project has existing (relatively low-coverage; maybe 50%, and a couple of them can't actually test the result, only that the process completes) tests using Python's built-in unittest suite. I've worked with hypothesis before and I'd like to use that as well - but I'm not sure I want to throw out the existing tests.
Has anyone tried having two completely separate testing frameworks and test sets on a project? Is this a good idea, or is it going to cause unexpected problems down the line?
IMO, if your current framework supports attribute-based categorization, you can separate them by adding separate categories, so that you get separate results for the old and the new tests.
On the other hand, you can also go for multiple frameworks if they're both supported by your project's test runner and there is no conflict of interest (e.g. asserts, test reports). But in that case you'll end up with two separate reports from your test executions.
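For the unittest-plus-Hypothesis case in particular, the two can share a single TestCase, which keeps one report; a minimal sketch (class and method names are made up for illustration):
import unittest
from hypothesis import given, strategies as st

class TestSorting(unittest.TestCase):
    def test_plain_example(self):
        # Existing unittest-style test.
        self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])

    @given(st.lists(st.integers()))
    def test_sorting_is_idempotent(self, xs):
        # New property-based test running under the same runner.
        self.assertEqual(sorted(sorted(xs)), sorted(xs))

if __name__ == "__main__":
    unittest.main()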

Order of tests in python unittest

I was looking at similar questions and I couldn't find an answer to my problem.
I wrote tests in a Python class that derives from unittest.TestCase:
class TestEffortFormula(unittest.TestCase):
I need to give an order to the tests (please, do not tell me that I shouldn't rely on test's order, I just do).
Before I needed to order the tests, the command I used to run them was:
unittest.main(testRunner=TeamcityTestRunner())
Then, to make the default alphabetical ordering disappear, I tried the following:
loader = unittest.TestLoader()
loader.sortTestMethodsUsing(None)
loader.loadTestsFromTestCase(TestEffortFormula)
suite = loader.suiteClass()
but from here I don't know how to run the tests, especially with testRunner=TeamcityTestRunner()
as I did before.
Appreciate your help
Option 1.
One solution to this (as a workaround) was given here: write the tests as numbered methods step1, step2, etc., then collect them via dir(self) and yield them to one test_ method which tries each in turn.
Not ideal, but it does what you expect. Each test sequence has to be a single test class (or you can adapt the method given there to have more than one sequence-generating method).
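A minimal sketch of that pattern (method names here are illustrative, not taken verbatim from the linked answer):
import unittest

class TestSequence(unittest.TestCase):
    def step1_create(self):
        self.data = [1, 2, 3]

    def step2_use(self):
        self.assertEqual(sum(self.data), 6)

    def _steps(self):
        # Collect the stepN methods in name order.
        for name in sorted(dir(self)):
            if name.startswith("step"):
                yield name, getattr(self, name)

    def test_steps(self):
        # Run all steps in sequence inside a single test method,
        # failing with the name of the first step that breaks.
        for name, step in self._steps():
            try:
                step()
            except Exception as e:
                self.fail("{0} failed ({1})".format(name, e))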
Option 2.
Another solution, also in the linked question, is to name your tests so that they sort alphabetically and numerically into the order you want them to execute.
But in both cases you end up writing monolithic tests, each in its own test class.
P.S. I agree with all the comments that say unit testing shouldn't be done this way; but there are situations where unit test frameworks (like unittest and pytest) get used to do integration tests which need modular independent steps to be useful. Also, if QA can't influence Dev to write modular code, these kinds of things have to be done.
I've searched a long time to solve this problem myself.
One of the answers in this question does exactly what you need.
Applied to your code:
# Sort the test methods by the line number where they are defined,
# instead of alphabetically.
ln = lambda name: getattr(TestEffortFormula, name).__code__.co_firstlineno
lncmp = lambda _, a, b: (ln(a) > ln(b)) - (ln(a) < ln(b))  # cmp-style comparator
unittest.TestLoader.sortTestMethodsUsing = lncmp
suite = unittest.TestLoader().loadTestsFromTestCase(TestEffortFormula)
unittest.TextTestRunner(failfast=True).run(suite)
Unfortunately, setting unittest.TestLoader.sortTestMethodsUsing=None does not work, although it is documented that this should avoid sorting the tests alphabetically.

How do I compare two nested data structures for unittesting?

For those who know perl, I'm looking for something similar to Test::Deep::is_deeply() in Python.
In Python's unittest I can conveniently compare nested data structures already, if I expect them to be equal:
self.assertEqual(os.walk('some_path'),
                 my.walk('some_path'),
                 "compare os.walk with my own implementation")
However, in the wanted test, the order of files in the respective sublist of the os.walk tuple shall be of no concern.
If it was just this one test it would be ok to code an easy solution. But I envision several tests on differently structured nested data. And I am hoping for a general solution.
I checked Python's own unittest documentation, looked at pyUnit, and at nose and its plugins. Active maintenance would also be an important aspect for usage.
The ultimate goal for me would be to have a set of descriptive types like UnorderedIterable, SubsetOf, SupersetOf, etc which can be called to describe a nested data structure, and then use that description to compare two actual sets of data.
In the os.walk example I'd like something like:
comparison = OrderedIterable(
    OrderedIterable(
        str,
        UnorderedIterable(),
        UnorderedIterable()
    )
)
The above describes the kind of data structure that list(os.walk()) would return. For comparison of data A and data B in a unit test, the current path names would be cast into a str(), and the dir and file lists would be compared ignoring the order with:
self.assertDeep(A, B, comparison, msg)
Is there anything out there? Or is it such a trivial task that people write their own? I feel comfortable doing it, but I don't want to reinvent, and especially would not want to code the full orthogonal set of types, tests for those, etc. In short, I wouldn't publish it and thus the next one has to rewrite again...
Python Deep seems to be a project to reimplement perl's Test::Deep. It is written by the author of Test::Deep himself. Last development happened in early 2016.
Update (2018/Aug): Latest release (2016/Feb) is located on PyPi/Deep
I have done some Py3k porting work on GitHub.
Not a solution, but the currently implemented workaround to solve the particular example listed in the question:
os_walk = list(os.walk('some_path'))
dt_walk = list(my.walk('some_path'))
self.assertEqual(len(dt_walk), len(os_walk), "walk() same length")
for ((osw, osw_dirs, osw_files), (dt, dt_dirs, dt_files)) in zip(os_walk, dt_walk):
    self.assertEqual(dt, osw, "walk() currentdir")
    self.assertSameElements(dt_dirs, osw_dirs, "walk() dirlist")
    self.assertSameElements(dt_files, osw_files, "walk() fileList")
As this example implementation shows, that's quite a bit of code. It also shows that Python's unittest has most of the ingredients required.
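For the os.walk case specifically, a small normalizing helper keeps the assertion to one line; a minimal sketch (normalize_walk is our own name, not a library function):
def normalize_walk(walk_result):
    # Sort the dir and file lists inside each (path, dirs, files) tuple so that
    # two walks of the same tree compare equal regardless of listing order.
    return [(path, sorted(dirs), sorted(files)) for path, dirs, files in walk_result]

# In a TestCase method:
#     self.assertEqual(normalize_walk(my.walk('some_path')),
#                      normalize_walk(os.walk('some_path')),
#                      "compare os.walk with my own implementation")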

Non-critical unittest failures

I'm using Python's built-in unittest module and I want to write a few tests that are not critical.
I mean, if my program passes such tests, that's great! However, if it doesn't pass, it's not really a problem, the program will still work.
For example, my program is designed to work with a custom type "A". If it fails to work with "A", then it's broken. However, for convenience, most of it should also work with another type "B", but that's not mandatory. If it fails to work with "B", then it's not broken (because it still works with "A", which is its main purpose). Failing to work with "B" is not critical, I will just miss a "bonus feature" I could have.
Another (hypothetical) example is when writing an OCR. The algorithm should recognize most images from the tests, but it's okay if some of them fail. (And no, I'm not writing an OCR.)
Is there any way to write non-critical tests in unittest (or other testing framework)?
As a practical matter, I'd probably use print statements to indicate failure in that case. A more correct solution is to use warnings:
http://docs.python.org/library/warnings.html
You could, however, use the logging facility to generate a more detailed record of your test results (i.e. set your "B" class failures to write warnings to the logs).
http://docs.python.org/library/logging.html
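A minimal sketch of the warnings approach suggested above; process_b is a hypothetical stand-in for the optional "B" feature:
import unittest
import warnings

class TestTypeBSupport(unittest.TestCase):
    def test_works_with_type_b(self):
        try:
            result = process_b("sample input")  # hypothetical optional feature
        except Exception as exc:
            # Record the degradation without failing the suite.
            warnings.warn("type B support is broken: %s" % exc)
        else:
            self.assertIsNotNone(result)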
Edit:
The way we handle this in Django is that we have some tests we expect to fail, and we have others that we skip based on the environment. Since we can generally predict whether a test SHOULD fail or pass (i.e. if we can't import a certain module, the system doesn't have it, and so the test won't work), we can skip failing tests intelligently. This means that we still run every test that will pass, and have no tests that "might" pass. Unit tests are most useful when they do things predictably, and being able to detect whether or not a test SHOULD pass before we run it makes this possible.
Asserts in unit tests are binary: they will work or they will fail, there's no mid-term.
Given that, to create those "non-critical" tests you should not use assertions when you don't want the tests to fail. You should do this carefully so you don't compromise the "usefulness" of the test.
My advice for your OCR example is to use something in your test code to record the success rate and then create one assertion like "assert success_rate > 8.5"; that should give the effect you desire.
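A minimal sketch of that success-rate idea; load_test_images and recognize are hypothetical stand-ins for your own fixtures and OCR code:
import unittest

class TestOCRAccuracy(unittest.TestCase):
    def test_recognition_rate(self):
        samples = load_test_images()                  # hypothetical fixture loader
        hits = sum(1 for image, expected in samples
                   if recognize(image) == expected)   # hypothetical OCR call
        success_rate = hits / float(len(samples))
        # Individual misses are tolerated; only the overall rate is asserted.
        self.assertGreater(success_rate, 0.95)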
Thank you for the great answers. No single answer was really complete on its own, so I'm writing here a combination of all the answers that helped me. If you like this answer, please upvote the people who were responsible for it.
Conclusions
Unit tests (or at least unit tests in unittest module) are binary. As Guilherme Chapiewski says: they will work or they will fail, there's no mid-term.
Thus, my conclusion is that unit tests are not exactly the right tool for this job. It seems that unit tests are more concerned with "keep everything working, no failure is expected", and thus I can't (or it's not easy to) have non-binary tests.
So, unit tests don't seem like the right tool if I'm trying to improve an algorithm or an implementation, because unit tests can't tell me how much better one version is compared to the other (supposing both of them are correctly implemented, in which case both will pass all unit tests).
My final solution
My final solution is based on ryber's idea and the code shown in wcoenen's answer. I'm basically extending the default TextTestRunner and making it less verbose. Then, my main code calls two test suites: the critical one using the standard TextTestRunner, and the non-critical one with my own less verbose version.
class _TerseTextTestResult(unittest._TextTestResult):
    def printErrorList(self, flavour, errors):
        for test, err in errors:
            #self.stream.writeln(self.separator1)
            self.stream.writeln("%s: %s" % (flavour, self.getDescription(test)))
            #self.stream.writeln(self.separator2)
            #self.stream.writeln("%s" % err)


class TerseTextTestRunner(unittest.TextTestRunner):
    def _makeResult(self):
        return _TerseTextTestResult(self.stream, self.descriptions, self.verbosity)


if __name__ == '__main__':
    sys.stderr.write("Running non-critical tests:\n")
    non_critical_suite = unittest.TestLoader().loadTestsFromTestCase(TestSomethingNonCritical)
    TerseTextTestRunner(verbosity=1).run(non_critical_suite)
    sys.stderr.write("\n")

    sys.stderr.write("Running CRITICAL tests:\n")
    suite = unittest.TestLoader().loadTestsFromTestCase(TestEverythingImportant)
    unittest.TextTestRunner(verbosity=1).run(suite)
Possible improvements
It would still be useful to know if there is any testing framework with non-binary tests, as Kathy Van Stone suggested. Probably I won't use it in this simple personal project, but it might be useful on future projects.
I'm not totally sure how unittest works, but most unit testing frameworks have something akin to categories. I suppose you could just categorize such tests, mark them to be ignored, and then run them only when you're interested in them. But I know from experience that ignored tests very quickly become just that: ignored tests that nobody ever runs, and which are therefore a waste of the time and energy it took to write them.
My advice is for your app to do, or do not, there is no try.
From the unittest documentation, which you linked:
"Instead of unittest.main(), there are other ways to run the tests with a finer level of control, less terse output, and no requirement to be run from the command line. For example, the last two lines may be replaced with:"
suite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions)
unittest.TextTestRunner(verbosity=2).run(suite)
In your case, you can create separate TestSuite instances for the critical and non-critical tests. You could control which suite is passed to the test runner with a command line argument. Test suites can also contain other test suites, so you can create big hierarchies if you want.
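A minimal sketch of that suite-splitting idea, reusing the TestCase names from the combined answer above; the 'all' command-line switch is purely illustrative:
import sys
import unittest

if __name__ == '__main__':
    loader = unittest.TestLoader()
    suites = [loader.loadTestsFromTestCase(TestEverythingImportant)]
    if 'all' in sys.argv[1:]:
        # Only run the non-critical suite when explicitly asked for.
        suites.append(loader.loadTestsFromTestCase(TestSomethingNonCritical))
    unittest.TextTestRunner(verbosity=2).run(unittest.TestSuite(suites))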
Python 2.7 (and 3.1) added support for skipping some test methods or test cases, as well as marking some tests as expected failure.
http://docs.python.org/library/unittest.html#skipping-tests-and-expected-failures
Tests marked as expected failure won't be counted as failure on a TestResult.
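For example, a minimal sketch using those standard decorators (the test bodies and handle_type_b are placeholders):
import sys
import unittest

class TestOptionalFeatures(unittest.TestCase):
    @unittest.expectedFailure
    def test_type_b_support(self):
        # Reported as an expected failure, not a failure, if this breaks.
        self.assertEqual("B-result", handle_type_b())  # hypothetical call

    @unittest.skipIf(sys.platform == 'win32', "type B backend unavailable on Windows")
    def test_platform_specific_path(self):
        self.assertTrue(True)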
There are some test systems that allow warnings rather than failures, but unittest is not one of them (I don't know offhand which ones do), unless you want to extend it (which is possible).
You can make the tests so that they log warnings rather than fail.
Another way to handle this is to separate out the tests and only run them to get the pass/fail reports and not have any build dependencies (this depends on your build setup).
Take a look at Nose : http://somethingaboutorange.com/mrl/projects/nose/0.11.1/
There are plenty of command line options for selecting tests to run, and you can keep your existing unittest tests.
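One hedged possibility with nose is its attrib plugin (an assumption on our part, not something spelled out above): tag non-critical tests with an attribute and exclude them on the command line.
# Sketch assuming nose's attrib plugin; run e.g.
#     nosetests -a '!non_critical'
# to skip the tagged tests, or plain nosetests to run everything.
from nose.plugins.attrib import attr

@attr('non_critical')
def test_type_b_support():
    assert True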
Another possibility is to create a "B" branch (you ARE using some sort of version control, right?) and have your unit tests for "B" in there. That way, you keep your release version's unit tests clean (Look, all dots!), but still have tests for B. If you're using a modern version control system like git or mercurial (I'm partial to mercurial), branching/cloning and merging are trivial operations, so that's what I'd recommend.
However, I think you're using tests for something they're not meant to do. The real question is "How important to you is it that 'B' works?" Because your test suite should only have tests in it that you care whether they pass or fail. Tests that, if they fail, it means the code is broken. That's why I suggested only testing "B" in the "B" branch, since that would be the branch where you are developing the "B" feature.
You could test using logger or print commands, if you like. But if you don't care enough that it's broken to have it flagged in your unit tests, I'd seriously question whether you care enough to test it at all. Besides, that adds needless complexity (extra variables to set debug level, multiple testing vectors that are completely independent of each other yet operate within the same space, causing potential collisions and errors, etc, etc). Unless you're developing a "Hello, World!" app, I suspect your problem set is complicated enough without adding additional, unnecessary complications.
You could write your tests so that they count the success rate.
With OCR you could throw 1000 images at the code and require that 95% of them are recognized successfully.
If your program must work with type A, then a failure there means the test fails. If it's not required to work with B, what is the value of such a test?
