Overriding the way Python unittest prints results

I am utterly confused by the unittest documentation: TestResult, TestLoader, testing framework, etc.
I just want to tweak the way the final results of a test run are printed out.
I have a specific thing I need to do: I am in fact using Jython, so when a bit of code raises an ExecutionException I need to dig down into the cause of this exception (ExecutionException.getCause()) to find the "real" exception which occurred, where it occurred, etc. At the moment I am just getting the location of the Future.get() which raises such an exception, and the message from the original exception (with no location). Useful, but could be improved.
Shouldn't it (in principle) be really simple to find out the object responsible for outputting the results of the testing and override some method like "print_result"...
There is another question here: Overriding Python Unit Test module for custom output? [code updated]
... this has no answers, and although the questioner said 9 months ago that he had "solved" it, he hasn't provided an answer. In any event it looks horribly complicated for what is a not unreasonable way of wishing to tweak things mildly... isn't there a simple way to do this?
Later, in answer to MartinBroadhurst's question about documenting during the run:
In fact I could laboriously surround all sorts of bits of code with try...except followed by a documentation function... but if I don't do that, any unexpected exceptions obviously propagate upwards, ultimately being caught by the testing framework.
In fact I have a decorator which I've made, @vigil(is_EDT) (boolean param), which I use to decorate most methods and functions, the primary function of which is to check that the decorated method is being called in the "right kind of thread" (i.e. either the EDT or a non-EDT thread). This could be extended to trap any kinds of exceptions ... which is something I did previously as a solution to this problem of mine. This then printed out the exception details there and then, which was fine: the stuff was obviously not printed out at the same time as the results of the unittest run, but it was useful.
But in fact I shouldn't need to resort to my vigil function in this "make-and-mend" way! It really should be possible to tweak the unittest classes to override the way an exception is handled! Ultimately, unless some unittest guru can answer this question of mine, I'm going to have to examine the unittest source code and find out a way that way.
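For reference, here is a minimal sketch of the kind of override being asked about, assuming a Python 2.7-style unittest (as shipped with Jython 2.7) and that the raised exception object exposes a Java-style getCause() method; the class name is made up:

import unittest

class CauseAwareResult(unittest.TextTestResult):
    """Hypothetical result class that also reports a wrapped Java cause."""
    def addError(self, test, err):
        exc_type, exc_value, tb = err
        # Under Jython, a java.util.concurrent.ExecutionException wraps the
        # "real" exception; getCause() (if present) exposes it.
        cause = getattr(exc_value, 'getCause', lambda: None)()
        if cause is not None:
            self.stream.writeln("underlying cause: %r" % cause)
        super(CauseAwareResult, self).addError(test, err)

if __name__ == '__main__':
    unittest.main(testRunner=unittest.TextTestRunner(resultclass=CauseAwareResult))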
In a previous question of mine I asked about what appear to be a couple of non-functioning methods of unittest.TestResult... and it does regrettably appear this is not implemented as the Python documentation claims. Similarly, a little bit of additional experimentation just now seems to suggest more misdocumentation: on the Python documentation page for unittest they appear to have incorrectly documented TestResult.startTest(), stopTest(), etc.: the parameter "test" should not be there (the convention in this documentation appears to be to omit the self param, and each of these methods takes only the self param).
In short, the whole unittest module is surprisingly unwieldy and dodgy... I'm surprised not least because I would have thought others in more influential positions than me would have got things changed...

Related

PyTest and controlled test failure for GitHub Workflow/Automation

Although I've been using Python for a number of years now, I realised that working predominantly on personal projects, I never needed to do Unit testing before, so apologies for the obvious questions or wrong assumptions I might make.
My goal is to understand how I can make tests and possibly combine everything with the GitHub workflow to create some automation.
I've seen that Failures and Errors (which are conceptually different) thrown locally are not treated differently once online. But before I go on, I have some doubts that I want to clarify.
From reading online, my initial understanding seems to be that a test should always SUCCEED, even if it contains errors or failures.
But if it succeeds, how can I then record a failure or an error? So I'm tempted to say I'm capturing this in the wrong way?
I appreciate that in an Agile environment, some would like to say it's a controlled process, and errors can be intercepted while looking into the code. But I'm not sure this is the best approach. And this leads me to the second question.
Say I have a function accepting dates, and I know that it cannot accept anything other than that.
Would it make sense to write a test that, say, passes in strings (and gets a failure)?
Or should I test only for the expected circumstances?
Say case 1) is a best practice; what should I do in the context of running these tests? Should I let the test fail and get a long list of errors? Or should I decorate functions with @pytest.mark.xfail() (a sort of soft fail, where I can use a try...except)?
And last question (for now): would an xfail decorator let the workflow automation consider the test as "passed"? Probably not, but at this stage, I've so much confusion in my head that any clarity from experienced users could help.
Thanks for your patience in reading.
The question is a bit fuzzy, but I will have a shot.
The notion that tests should always succeed even if they have errors is probably a misunderstanding. Failing tests are errors and should be shown as such (with the exception of tests known to fail, but that is a special case, see below). From the comment I guess what was actually meant is that other tests shall continue to run even if one test failed - that certainly makes sense, especially in CI tests, where you want to get the whole picture.
If you have a function accepting dates and nothing else, it shall be tested that it indeed only accepts dates, and raises an exception or something in case an invalid date is given. What I meant in the comment is that if your software ensures that only a date can be passed to that function, and this is also ensured via tests, it would not be necessary to test this again, but in general - yes, this should be tested.
So, to give a few examples: if your function is specified to raise an exception on invalid input, this has to be tested using something like pytest.raises - it would fail, if no exception is raised. If your function shall handle invalid dates by logging an error, the test shall verify that the error is logged. If an invalid input should just be ignored, the test shall ensure that no exception is raised and the state does not change.
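For illustration, a minimal sketch of the pytest.raises case, where process_date is a hypothetical function that accepts only datetime.date objects:

import datetime
import pytest

def process_date(d):
    # Hypothetical function under test: accepts only datetime.date instances.
    if not isinstance(d, datetime.date):
        raise TypeError("expected a date, got %r" % type(d))
    return d.isoformat()

def test_rejects_strings():
    # The test passes only if the expected exception is actually raised.
    with pytest.raises(TypeError):
        process_date("2021-01-01")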
For xfail, I just refer you to the pytest documentation, where this is described nicely:
An xfail means that you expect a test to fail for some reason. A common example is a test for a feature not yet implemented, or a bug not yet fixed. When a test passes despite being expected to fail (marked with pytest.mark.xfail), it’s an xpass and will be reported in the test summary.
So a passing xfail test will be shown as passed indeed. You can easily test this yourself:
import pytest

@pytest.mark.xfail
def test_fails():
    assert False

@pytest.mark.xfail
def test_succeeds():
    assert True
gives something like:
============================= test session starts =============================
collecting ... collected 2 items
test_xfail.py::test_fails
test_xfail.py::test_succeeds
======================== 1 xfailed, 1 xpassed in 0.35s ========================
and the test is considered passed (e.g. has the exit code 0).

Is there any good reason to catch exceptions in unittest transactions?

The unittest module is extremely good at detecting problems in code.
I understand the idea of isolating and testing parts of code with assertions:
self.assertEqual(web_page_view.func, web_page_url)
But besides these assertions you might also have some logic before them, in the same test method, that could have problems.
I am wondering if manual exception handling is something to ever take into account inside methods of a TestCase subclass.
Because if I wrap a block in a try/except and something fails, the test returns OK and does not fail:
def test_simulate_requests(self):
    """
    Simulate requests to a url
    """
    try:
        response = self.client.get('/adress/of/page/')
        self.assertEqual(response.status_code, 200)
    except Exception as e:
        print("error: ", e)
Should exception handling be always avoided in such tests?
First part of answer:
As you correctly say, there needs to be some logic before the actual test. The code belonging to a unit-test can be clustered into four parts (I use Meszaros' terminology in the following): setup, exercise, verify, teardown. Often the code of a test case is structured such that the code for the four parts are cleanly separated and come in that precise order - this is called the four phase test pattern.
The exercise phase is the heart of the test, where the functionality is executed that shall be checked in the test. The setup ensures that this happens in a well defined context. So, what you have described is, in this terminology, the situation that during setup something fails. Which means that the preconditions required for a meaningful execution of the functionality to be tested are not met.
This is a common situation and it means that you in fact need to be able to distinguish three outcomes of a test: A test can pass successfully, it can fail, or it can just be meaningless.
Fortunately, there is an answer for this in Python: you can skip tests, and if a test is skipped this is recorded, but neither as a failure nor as a success. Skipping tests would probably be a better way to handle the situation that you have shown in your example. Here is a small code piece that demonstrates one way of skipping tests:
import unittest

class TestException(unittest.TestCase):
    def test_skipTest_shallSkip(self):
        self.skipTest("Skipped because skipping shall be demonstrated.")
Second part of answer:
Your test seems to have some non-deterministic elements. The self.client.get can throw exceptions (but only sometimes - sometimes it doesn't). This means you do not have the context during the test execution under control. In unit-testing this is a situation you should try to avoid. Your tests should have a deterministic behavior.
One typical way to achieve this is to isolate your code from the components that are responsible for the nondeterminism and during testing replace these components by mocks. The behaviour of the mocks is under full control of the test code. Thus, if your code uses some component for network accesses, you would mock that component. Then, in some test cases you can instruct the mock to simulate a successful network communication to see how your component handles this, and in other tests you instruct the mock to simulate a network failure to see how your component copes with this situation.
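As a rough sketch of that idea, using unittest.mock to replace the network-facing component; load_status and its fetch parameter are made up for illustration:

import unittest
from unittest import mock

def load_status(fetch):
    # Hypothetical code under test: network access is hidden behind `fetch`.
    try:
        return fetch('/address/of/page/').status_code
    except ConnectionError:
        return None

class TestLoadStatus(unittest.TestCase):
    def test_successful_request(self):
        # The mock simulates a successful network response.
        fake = mock.Mock(return_value=mock.Mock(status_code=200))
        self.assertEqual(load_status(fake), 200)

    def test_network_failure_is_handled(self):
        # The mock simulates a network failure.
        fake = mock.Mock(side_effect=ConnectionError("simulated outage"))
        self.assertIsNone(load_status(fake))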
There are two "bad" states of a test: Failure (when one of the assertions fails) and Error (when the test itself fails - your case).
First of all, it goes without saying that it's better to build your test in such a way that it reaches its assertions.
If you need to assert that some tested code raises an exception, you should use with self.assertRaises(ExpectedError):
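A minimal sketch of that, with a made-up parse_positive function standing in for the tested code:

import unittest

def parse_positive(value):
    # Hypothetical function under test that rejects non-positive input.
    n = int(value)
    if n <= 0:
        raise ValueError("expected a positive number")
    return n

class TestParsePositive(unittest.TestCase):
    def test_rejects_negative_numbers(self):
        # The test passes only if ValueError is actually raised.
        with self.assertRaises(ValueError):
            parse_positive("-3")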
If some code inside the test raises an exception - it's better to know it from 'Error' result than seeing 'OK all tests have passed'
If your test logic really assumes that something can fail in the test itself and that this is normal behaviour - probably the test is wrong. Maybe you should use mocks (https://docs.python.org/3/library/unittest.mock.html) to imitate an API call or something else.
In your case, even if the test fails, you catch it with a bare except and say "OK, continue". In any case, the implementation is wrong.
Finally: no, there shouldn't be an except in your test cases.
P.S. It's better to name your test functions test_what_you_want_to_test_name; in this case test_successful_request would probably be OK.

Using exception handler to extend the functionality of default methods (Python taken as example)

So, I'm new to programming and my question is:
Is it considered bad practice to use an exception handler to override the error-message behaviour of a language's default methods with custom functionality? I mean, is it ethically correct to use something like this (Python):
def index(p, val):
    try:
        return p.index(val)
    except ValueError:
        return -1
Maybe I wasn't precise enough. What I meant is: is it a normal or not-recommended practice to consider thrown exceptions (well, I guess it's not applicable everywhere) as legit and valid case-statements?
Like, the idea of the example given above is not to make a custom error message, but to suppress possible errors without warning either users or other program modules that something is going wrong.
I think that doing something like this is OK as long as you use function names which make it clear that the user isn't using a built-in. If the user thinks they're using a builtin and all of a sudden index returns -1, imagine the bugs that could happen ... They do:
a[index(a,'foo')]
and all of a sudden they get the last element in the list (which isn't foo).
As a very important rule, though: only handle exceptions that you know what to do with. Your example above does this nicely. Kudos.
This is perfectly fine, but it depends on what kind of condition you are checking. It is the developer's responsibility to check for these conditions. Some exceptions are fatal for the program and some may not be. It all depends on the context of the method.
With a language like Python, I would argue it is much better to give a custom error message for the function than to raise the generic ValueError exception.
However, for your own applications, having this functionality inside your methods can make code easier to read and maintain.
For other languages, the same is true, but you should try to make sure that you don't mimic another function with a different behaviour whilst hiding the exceptions.
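For instance, a sketch of the index example rewritten to give a clearer, task-specific message instead of silently returning -1 (just one possible variant):

def index(p, val):
    try:
        return p.index(val)
    except ValueError:
        # Re-raise with a message that names the failing lookup explicitly.
        raise ValueError("%r was not found in the given sequence" % (val,))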
If you know exactly where your errors will occur and the cause of the error, then there is nothing wrong with this kind of handling, because you are just taking appropriate action for something going wrong that you know can happen.
So, for example: if you are trying to divide two numbers and you know that you can't divide when the denominator is 0, then in that case you can use a custom message to denote the problem.
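A minimal sketch of that division case (safe_divide is a hypothetical name):

def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # Replace the generic error with a message naming the actual problem.
        raise ValueError("cannot divide %r by zero" % (a,))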

How often are builtin exceptions dealt with in comparison to user defined ones?

I'm doing a bit of research for my final year project. It's mostly about creating a more convenient way of dealing with Exceptions thrown in programs. It does this by creating a custom handler for each type of Exception. I was wondering: how often are builtin/standard library Exceptions dealt with in comparison to Exceptions defined by you or 3rd-party software?
Why I'm asking is two fold:
I would like my demonstration to be more realistic. My project has the chance to be more help than just dealing with Exceptions, so given the chance, I would rather work on giving the tool far more abilities. Given this, I would like my sample handlers to be biased in the "right" direction.
It will influence how detailed I can make the API to help create more detailed Exceptions and Exception Handlers.
Thanks for taking the time to read this crap.
EDIT:
I'll break it down because I don't think I'm explaining it properly.
The nice little stack trace you get when errors are thrown about? I want to try and improve it and see if something earlier could indicate when it all started to go wrong (some errors might need a different strategy than others, and that's where defining handlers comes in). I think I could do this. In order to do this, I need to divide my time accordingly. I want to know whether I should focus on tackling builtin errors or on helping people define handlers for their own Exceptions (maybe the second is pointless, but I can't know until I ask people). I'll do this by asking people about their experiences.
EDIT2:
I'm a dumbass, I mean errors not exceptions. I need sleep.
Regardless of what you're trying to do with the answer, I'll answer your specific question:
how often are builtin/standard library Exceptions dealt with in comparison to Exceptions defined by you or 3rd-party software?
It depends on the domain. Some areas lend themselves to defining specific exceptions (e.g. web programming), and others tend to rely on builtins (e.g. mathematical and scientific computing). The number of exceptions handled probably leans more towards "standard" exceptions like TypeError or NameError, but harder errors usually aren't covered by the builtins (it's easy to fix an incorrect argument, invalid input, or a typo, which are common causes of exceptions like NameError or TypeError, but it's hard to fix an error that doesn't come from something so simple).
So, IMO, standard exceptions are more prevalent but the ones defined by users, frameworks, etc. are more useful/important (or whatever you want to call it) because they represent more complex and relevant issues.
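To illustrate that contrast with a tiny sketch (all names here are hypothetical):

class InvalidOrderError(Exception):
    """Domain-specific exception for problems a builtin cannot express."""

def submit_order(order):
    if not isinstance(order, dict):
        raise TypeError("order must be a dict")        # builtin: simple misuse
    if not order.get("items"):
        raise InvalidOrderError("order has no items")  # domain-level problem
    return True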
You can always look at some popular 3rd party Python modules (Google code is a good place to look) and see how many of them define exceptions.

Python - doctest vs. unittest [closed]

Closed. This question is opinion-based and is not currently accepting answers. Closed 6 years ago.
I'm trying to get started with unit testing in Python and I was wondering if someone could explain the advantages and disadvantages of doctest and unittest.
What conditions would you use each for?
Both are valuable. I use both doctest and nose, the latter taking the place of unittest. I use doctest for cases where the test is giving an example of usage that is actually useful as documentation. Generally I don't make these tests comprehensive, aiming solely to be informative. I'm effectively using doctest in reverse: not to test that my code is correct based on my doctests, but to check that my documentation is correct based on the code.
The reason is that I find comprehensive doctests will clutter your documentation far too much, so you will end up with either unusable docstrings or incomplete testing.
For actually testing the code, the goal is to thoroughly test every case, rather than illustrate what it does by example, which is a different goal that I think is better met by other frameworks.
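As a minimal illustration of a doctest used as documentation in this way (the add function is just an example):

def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    """
    return a + b

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs the example embedded in the docstring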
I use unittest almost exclusively.
Once in a while, I'll put some stuff in a docstring that's usable by doctest.
95% of the test cases are unittest.
Why? I like keeping docstrings somewhat shorter and more to the point. Sometimes test cases help clarify a docstring. Most of the time, the application's test cases are too long for a docstring.
Another advantage of doctesting is that you get to make sure your code does what your documentation says it does. After a while, software changes can make your documentation and code do different things. :-)
I work as a bioinformatician, and most of the code I write is "one time, one task" scripts: code that will be run only once or twice and that executes a single specific task.
In this situation, writing big unit tests may be overkill, and doctests are a useful compromise. They are quicker to write, and since they are usually incorporated in the code, they let you always keep an eye on how the code should behave, without having to have another file open. That's useful when writing small scripts.
Also, doctests are useful when you have to pass your script to a researcher who is not an expert in programming. Some people find it very difficult to understand how unit tests are structured; on the other hand, doctests are simple examples of usage, so people can just copy and paste them to see how to use them.
So, to summarize my answer: doctests are useful when you have to write small scripts, and when you have to pass them or show them to researchers who are not computer scientists.
If you're just getting started with the idea of unit testing, I would start with doctest because it is so simple to use. It also naturally provides some level of documentation. And for more comprehensive testing with doctest, you can place tests in an external file so it doesn't clutter up your documentation.
I would suggest unittest if you're coming from a background of having used JUnit or something similar, where you want to be able to write unit tests in generally the same way as you have been elsewhere.
I don't use doctest as a replacement for unittest. Although they overlap a bit, the two modules don't have the same function:
I use unittest as a unit testing framework, meaning it helps me determine quickly the impact of any modification on the rest of the code.
I use doctest as a guarantee that comments (namely docstrings) are still relevant to the current version of the code.
The widely documented benefits of test driven development I get from unittest. doctest solves the far more subtle danger of having outdated comments misleading the maintenance of the code.
I use unittest exclusively; I think doctest clutters up the main module too much. This probably has to do with writing thorough tests.
Using both is a valid and rather simple option. The doctest module provides the DocTestSuite and DocFileSuite functions, which create a unittest-compatible test suite from a module or file, respectively.
So I use both and typically use doctest for simple tests with functions that require little or no setup (simple types for arguments). I actually think a few doctest tests help document the function, rather than detract from it.
But for more complicated cases, and for a more comprehensive set of test cases, I use unittest which provides more control and flexibility.
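To make the DocTestSuite/DocFileSuite approach above concrete, here is a small sketch using the load_tests protocol; mymodule and usage.txt are placeholders for your own module and doctest file:

import doctest
import unittest

import mymodule  # hypothetical module whose docstrings contain doctests

def load_tests(loader, tests, ignore):
    # Fold the doctests into the unittest run alongside the regular TestCases.
    tests.addTests(doctest.DocTestSuite(mymodule))
    tests.addTests(doctest.DocFileSuite('usage.txt'))  # hypothetical doctest file
    return tests

if __name__ == '__main__':
    unittest.main()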
I almost never use doctests. I want my code to be self documenting, and the docstrings provide the documentation to the user. IMO adding hundreds of lines of tests to a module makes the docstrings far less readable. I also find unit tests easier to modify when needed.
Doctest can sometimes lead to wrong results, especially when the output contains escape sequences. For example:
def convert():
    """
    >>> convert()
    'क'
    """
    a = 'क'
    return a

import doctest
doctest.testmod()
gives
**********************************************************************
File "hindi.py", line 3, in __main__.convert
Failed example:
    convert()
Expected:
    'क'
Got:
    '\xe0\xa4\x95'
**********************************************************************
1 items had failures:
   1 of   1 in __main__.convert
***Test Failed*** 1 failures.
Doctest also doesn't check the type of the output; it just compares output strings. For example, suppose you have defined some Rational type that prints just like an integer when it is a whole number, and you have a function which returns a Rational. A doctest won't be able to tell whether the output is a whole Rational or a plain integer.
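A sketch of that pitfall, with a made-up Rational class whose repr hides the type for whole numbers:

class Rational(object):
    # Hypothetical rational type: whole numbers print exactly like ints.
    def __init__(self, num, den=1):
        self.num, self.den = num, den

    def __repr__(self):
        return str(self.num) if self.den == 1 else "%d/%d" % (self.num, self.den)

def half_of(n):
    """
    >>> half_of(4)
    2
    """
    return Rational(n, 2) if n % 2 else Rational(n // 2)

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # passes, even though the result is a Rational, not an int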
I prefer the discovery-based systems ("nose" and "py.test"; I'm using the former currently).
doctest is nice when the test is also good as documentation; otherwise doctests tend to clutter the code too much.
