python unittest fails because of rounding error

I built a class for geometric transformations. When I run a unit test it fails because of rounding errors coming from the operations inside my methods.
In my test I compare the result from one of the methods, which should return the point (2, 2, 0), but because of rounding errors it returns (1.9999999999999996, 1.9999999999999996, 0.0):
Finding files... done.
Importing test modules ... done.
** DEBUG_45 from the method point=(1.9999999999999996, 1.9999999999999996, 0.0)
======================================================================
FAIL: testPointCoord (vectOper.test.TestNearestPoint.TestNearestPoint)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\Users\src\vectOper\test\TestNearestPoint.py", line 14, in testPointCoord
self.assertEqual(pointCoord, (2,2,0), "nearest point failed")
AssertionError: nearest point failed
----------------------------------------------------------------------
Ran 1 test in 0.001s
FAILED (failures=1)
From a calculation point of view the result is acceptable, but I don't want my code to fail on such a simple unit test.
import unittest
from vectOper.nearestPoint import NearestPoint

class TestNearestPoint(unittest.TestCase):
    def testPointCoord(self):
        nearestPoint = NearestPoint()
        pointCoord = nearestPoint.pointCoord(samplePoint=(2, 2, 2), lineStart=(0, 0, 0), lineVect=(1, 1, 0))
        self.assertEqual(pointCoord, (2, 2, 0), "nearest point failed")
What is the correct way to resolve a problem like this? Obviously I cannot round the output numbers or convert them to integers, as in general that would not be appropriate.
Is there a way to write the unit test so that it ignores the rounding error?
Is there any other way to resolve the problem?
Edit:
The question can be solved by using self.assertAlmostEqual, as rightly suggested in another answer, but the problem is that I need to test each entry of a tuple. After all the suggestions I tried:
def testPointCoord(self):
    nearestPoint = NearestPoint()
    pointCoord = nearestPoint.pointCoord(samplePoint=(2, 2, 2), lineStart=(0, 0, 0), lineVect=(1, 1, 0))
    self.assertAlmostEqual(pointCoord[0], 2, places=7, msg="nearest point x-coord failed")
    self.assertAlmostEqual(pointCoord[1], 2, places=7, msg="nearest point y-coord failed")
    self.assertAlmostEqual(pointCoord[2], 0, places=7, msg="nearest point z-coord failed")
but I need to automate it somehow, as later I need to test a list of tuples holding the sample points' coordinates for a vector field.
The solution suggested as a duplicate is only a half measure, as it would be tedious to write 300 more comparisons if there are 100 tuples in the list.
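One way to automate this is a small helper that loops over the whole list of tuples. This is only a sketch (the sample data is made up for illustration); the tolerance rule mirrors what assertAlmostEqual does, i.e. round(a - e, places) == 0:

```python
def assert_points_almost_equal(actual_points, expected_points, places=7):
    """Compare two equal-length lists of coordinate tuples element-wise."""
    assert len(actual_points) == len(expected_points), "length mismatch"
    for i, (actual, expected) in enumerate(zip(actual_points, expected_points)):
        for axis, (a, e) in enumerate(zip(actual, expected)):
            # same rule assertAlmostEqual uses: round the difference
            # to `places` decimal places and compare with zero
            if round(a - e, places) != 0:
                raise AssertionError(
                    "point %d, axis %d: %r != %r to %d places"
                    % (i, axis, a, e, places))

# passes despite the rounding error in the first coordinate
assert_points_almost_equal(
    [(1.9999999999999996, 1.9999999999999996, 0.0)], [(2, 2, 0)])
```

Inside a TestCase you could call self.assertAlmostEqual in the inner loop instead, optionally wrapping each iteration in self.subTest(point=i, axis=axis) so every failing point is reported, not just the first one.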

Why don't you apply assertAlmostEqual in each dimension, using map?
I don't have access to your class, so I wrote a similar example here:
from unittest import TestCase

class Test_Tuple_Equality(TestCase):
    def test_tuple_equality_True(self):
        p1 = (0.00000001, 0.00000000001, 0)
        p2 = (0, 0, 0)
        # list() forces evaluation: in Python 3, map is lazy and the
        # assertions would never run without it
        list(map(lambda x, y: self.assertAlmostEqual(x, y), p1, p2))

    def test_tuple_equality_False(self):
        p1 = (0.00000001, 0.00000000001, 0)
        p2 = (1, 0, 0)
        list(map(lambda x, y: self.assertAlmostEqual(x, y), p1, p2))
map turns one n-dimensional tuple comparison into n float comparisons.
You can even create a compare_points helper, like:
def compare_points(self, p1, p2):
    list(map(lambda x, y: self.assertAlmostEqual(x, y), p1, p2))
And then use it in your tests
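For instance, a self-contained sketch (map is wrapped in list() because it is lazy in Python 3 and would otherwise skip the assertions; run it with python -m unittest):

```python
import unittest

class TestPoints(unittest.TestCase):
    def compare_points(self, p1, p2):
        # list() forces the lazy map object so the assertions actually run
        list(map(lambda x, y: self.assertAlmostEqual(x, y), p1, p2))

    def test_nearest_point(self):
        self.compare_points(
            (1.9999999999999996, 1.9999999999999996, 0.0), (2, 2, 0))
```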
Another solution is to use numpy's method for that:
import numpy
>>> numpy.testing.assert_almost_equal((2, 2, 0), (1.9999999999, 2, 0), decimal=7, err_msg='', verbose=True)
Numpy can be a pain to install but, if you already use it, it is the best fit.

How to write multi-threaded version of recursive Fibonacci algorithm in Python

I am trying to understand parallelization in Python better. I'm having trouble implementing a parallelized Fibonacci algorithm.
I've been reading the docs on multiprocessing and threading, but haven't had luck.
# TODO Figure out how threads work
# TODO Do a Fibonacci counter
import concurrent.futures

def fib(pos, _tpe):
    """Return the Fibonacci number at position."""
    if pos < 2:
        return pos
    x = fib(pos - 1, None)
    y = fib(pos - 2, None)
    return x + y

def fibp(pos, tpe):
    """Return the Fibonacci number at position."""
    if pos < 2:
        return pos
    x = tpe.submit(fib, (pos - 1), tpe).result()
    y = tpe.submit(fib, (pos - 2), tpe).result()
    return x + y

if __name__ == '__main__':
    import sys
    with concurrent.futures.ThreadPoolExecutor() as cftpe:
        fun = fibp if len(sys.argv) == 3 else fib
        position = int(sys.argv[1])
        print(fun(position, cftpe))
        print(fun.__name__)
Command line output:
$ time python test_fib_parallel.py 35
9227465
fib
real 0m3.778s
user 0m3.746s
sys 0m0.017s
$ time python test_fib_parallel.py 35 dummy-var
9227465
fibp
real 0m3.776s
user 0m3.749s
sys 0m0.018s
I thought the timing should be significantly different, but it is not.
UPDATE
Okay, so the tips helped and I got it to work (much faster than the original version). Ignoring the toy code in the original post, I'm putting a snippet of the actual problem I was trying to solve, which was to grow a decision tree (for homework). This was a CPU bound problem. It was solved using a ProcessPoolExecutor and a Manager.
def induce_tree(data, splitter, out=stdout, print_bool=True):
    """Greedily induce a decision tree from the given training data."""
    from multiprocessing import Manager
    from concurrent.futures import ProcessPoolExecutor
    output_root = Node(depth=1)
    input_root = Node(depth=1, data=data)
    proxy_stack = Manager().list([(input_root, output_root)])
    pool_exec = ProcessPoolExecutor()
    while proxy_stack:
        # while proxy_stack:
        #     pass proxy stack to function
        #     return data tree
        current = proxy_stack.pop()
        if current[0] != ():
            pool_exec.submit(grow_tree, proxy_stack, current, splitter)
    finish_output_tree(input_root, output_root)
    return output_root
UPDATE AGAIN
This is still buggy because I'm not managing the splitter correctly between processes. I'm going to try something else.
LAST UPDATE
I was trying to parallelize at too high of a level in my code. I had to drill down to the part in my splitter object where the bottleneck was. The parallel version is about twice as fast. (The error is high because this is just a test sample without enough data to actually train a good tree.)
The code ended up looking like this:
inputs = [(data, idx, depth) for idx in range(data.shape[1] - 1)]
if self._splittable(data, depth):
    with ProcessPoolExecutor() as executor:
        for output in executor.map(_best_split_help, inputs):
            # ... ...
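Stripped of the tree-growing details, that pattern reduces to the sketch below, where square is a made-up stand-in for the CPU-bound _best_split_help:

```python
from concurrent.futures import ProcessPoolExecutor

def square(n):
    """Stand-in for a CPU-bound worker such as _best_split_help."""
    return n * n

def parallel_map(inputs):
    # executor.map fans the inputs out to worker processes and yields
    # the results back in input order
    with ProcessPoolExecutor() as executor:
        return list(executor.map(square, inputs))

if __name__ == '__main__':
    print(parallel_map([1, 2, 3, 4]))
```

Unlike ThreadPoolExecutor, the process pool sidesteps the GIL for CPU-bound work, which is why the decision-tree version above sped up where the threaded Fibonacci did not.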
SEQUENTIAL
$ time python main.py data/spambase/spam50.data
Test Fold: 1, Error: 0.166667
Test Fold: 2, Error: 0.583333
Test Fold: 3, Error: 0.583333
Test Fold: 4, Error: 0.230769
Average training error: 0.391026
real 0m29.178s
user 0m28.924s
sys 0m0.106s
PARALLEL
$ time python main.py data/spambase/spam50.data
Test Fold: 1, Error: 0.166667
Test Fold: 2, Error: 0.583333
Test Fold: 3, Error: 0.583333
Test Fold: 4, Error: 0.384615
Average training error: 0.429487
real 0m14.419s
user 0m50.748s
sys 0m1.396s

python self calling unit test function not raising an error

I have a background in C and Fortran programming; however, I have been trying to learn Python and object orientation. To help with some of my projects I have been trying to define some additional unit tests.
I have used the assertAlmostEqual unit test, but I found that for large numbers it doesn't work so well, as it compares to 7 decimal places (I think). When testing large exponents this becomes a bit useless, so I tried to define an AssertEqualSigFig test that works to significant figures instead of decimal places. This test was inspired by a Stack Overflow post, although I'm afraid I cannot find the original post.
The test works for integers, floats and booleans; however, I wanted to see if it would also work with complex numbers, by splitting a number into its real and imaginary components and then calling itself. When this happens, no AssertionError is raised and I'm not sure why.
Here is my code:
import unittest
import math

class MyTestClass(unittest.TestCase):
    """
    MyTestClass

    Adds additional tests to the unit test module.

    defines:
        - AssertEqualSigFig
    description:
        - Used in place of the assertAlmostEqual test; this checks that two
          values are the same to 7 significant figures (instead of decimal
          places)
    args:
        - any two integers, booleans, floats or complex numbers
    returns:
        - assertion error if not equal to the defined significant figures
    """

    def AssertEqualSigFig(self, expected, actual, sig_fig=7):
        if sig_fig < 1:
            msg = "sig fig must be more than 1"
            raise ValueError(msg)
        try:
            if isinstance(expected, bool):
                if expected != actual:
                    raise AssertionError
                else:
                    return
            elif isinstance(expected, (int, float)):
                pow_ex = int(math.floor(math.log(expected, 10)))
                pow_ac = int(math.floor(math.log(actual, 10)))
                tolerance = pow_ex - sig_fig + 1
                tolerance = (10 ** tolerance) / 2.0
                if abs(expected - actual) > tolerance:
                    raise AssertionError
                else:
                    return
            elif isinstance(expected, complex):
                # this part doesn't raise an error when it should
                a_real = actual.real
                a_imag = actual.imag
                e_real = expected.real
                e_imag = expected.imag
                self.AssertEqualSigFig(self, a_imag, e_imag)
                self.AssertEqualSigFig(self, a_real, e_real)
        except AssertionError:
            msg = "{0} != {1} to {2} sig fig".format(expected, actual, sig_fig)
            raise AssertionError(msg)
This test fails when complex numbers are involved. Here are the unit tests of the unit test that fail:
import unittest
from MyTestClass import MyTestClass

class TestMyTestClass(MyTestClass):
    def test_complex_imag_NE(self):
        a = complex(10, 123455)
        b = complex(10, 123333)
        self.assertRaises(AssertionError, self.AssertEqualSigFig, a, b)

    def test_complex_real_NE(self):
        a = complex(2222222, 10)
        b = complex(1111111, 10)
        self.assertRaises(AssertionError, self.AssertEqualSigFig, a, b)

if __name__ == "__main__":
    unittest.main()
I think it is because the self.AssertEqualSigFig call does not raise an error. I'm sure there is something silly I have missed, but I am still learning. Can anybody help?
I was being an idiot; I have found the solution.
I should have been using
MyTestClass.AssertEqualSigFig(self, a_imag, e_imag)
and not
self.AssertEqualSigFig(self, a_imag, e_imag)
The bound call was passing self twice, so expected became the test case itself, none of the isinstance branches matched, and the method returned without raising anything.
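Equivalently, you can keep the self. call and just drop the duplicated self argument. A standalone sketch of the corrected recursion (assuming nonzero components, as the original does, since it takes a logarithm of the value):

```python
import math

def assert_equal_sig_fig(expected, actual, sig_fig=7):
    """Standalone sketch of the significant-figure comparison."""
    if isinstance(expected, complex):
        # recurse on the components -- no stray 'self' filling the
        # 'expected' slot, so the numeric branch actually runs
        assert_equal_sig_fig(expected.real, actual.real, sig_fig)
        assert_equal_sig_fig(expected.imag, actual.imag, sig_fig)
        return
    pow_ex = int(math.floor(math.log10(abs(expected))))
    tolerance = (10 ** (pow_ex - sig_fig + 1)) / 2.0
    if abs(expected - actual) > tolerance:
        raise AssertionError(
            "%r != %r to %d sig fig" % (expected, actual, sig_fig))
```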

Error with quantifier in Z3Py

I would like Z3 to check whether there exists an integer t that satisfies my formula. I'm getting the following error:
Traceback (most recent call last):
File "D:/z3-4.6.0-x64-win/bin/python/Expl20180725.py", line 18, in <module>
g = ForAll(t, f1(t) == And(t>=0, t<10, user[t].rights == ["read"] ))
TypeError: list indices must be integers or slices, not ArithRef
Code:
from z3 import *
import random
from random import randrange

class Struct:
    def __init__(self, **entries):
        self.__dict__.update(entries)

user = [Struct() for i in range(10)]
for i in range(10):
    user[i].uid = i
    user[i].rights = random.choice(["create", "execute", "read"])

s = Solver()
f1 = Function('f1', IntSort(), BoolSort())
t = Int('t')
f2 = Exists(t, f1(t))
g = ForAll(t, f1(t) == And(t >= 0, t < 10, user[t].rights == ["read"]))
s.add(g)
s.add(f2)
print(s.check())
print(s.model())
You are mixing and matching Python and Z3 expressions, and while that is the whole point of Z3py, it definitely does not mean that you can mix/match them arbitrarily. In general, you should keep all the "concrete" parts in Python, and relegate the symbolic parts to "z3"; carefully coordinating the interaction in between. In your particular case, you are accessing a Python list (your user) with a symbolic z3 integer (t), and that is certainly not something that is allowed. You have to use a Z3 symbolic Array to access with a symbolic index.
The other issue is the use of strings ("create"/"read" etc.) and expecting them to have meanings in the symbolic world. That is also not how z3py is intended to be used. If you want them to mean something in the symbolic world, you'll have to model them explicitly.
I'd strongly recommend reading through http://ericpony.github.io/z3py-tutorial/guide-examples.htm which is a great introduction to z3py including many of the advanced features.
Having said all that, I'd be inclined to code your example as follows:
from z3 import *
import random

Right, (create, execute, read) = EnumSort('Right', ('create', 'execute', 'read'))

users = Array('Users', IntSort(), Right)
for i in range(10):
    users = Store(users, i, random.choice([create, execute, read]))

s = Solver()
t = Int('t')
s.add(t >= 0)
s.add(t < 10)
s.add(users[t] == read)

r = s.check()
if r == sat:
    print(s.model()[t])
else:
    print(r)
Note how the enumerated type Right in the symbolic land is used to model your "permissions."
When I run this program multiple times, I get:
$ python a.py
5
$ python a.py
9
$ python a.py
unsat
$ python a.py
6
Note how unsat is produced, if it happens that the "random" initialization didn't put any users with a read permission.

fminbound for a simple equation

def profits(q):
    range_price = range_p(q)
    range_profits = [(x - c(q)) * demand(q, x) for x in range_price]
    # recall from above that argmax(V) gives the position of the greatest
    # element in a vector V; further, V[i] is the element in position i of V
    price = range_price[argmax(range_profits)]
    return (price - c(q)) * demand(q, price)

print(profits(0.6))
print(profits(0.8))
print(profits(1))
0.18
0.2
0.208333333333
With q (quality) in [0, 1], we know that the maximizing quality is 1. Now the question is: how can I solve such a problem? I keep getting the error that either q is not defined yet (which is only natural, as we are looking for it) or that some of the arguments are wrong.
q_firm = optimize.fminbound(-profits(q),0,1)
This is what I've tried, but I get this error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-99-b0a80dc20a3d> in <module>()
----> 1 q_firm = optimize.fminbound(-profits(q),0,1)
NameError: name 'q' is not defined
Can someone help me out? If I need to supply you guys with more information to the question let me know, it's my first time using this platform. Thanks in advance!
fminbound needs a callable, while -profits(q) tries to calculate a single value (and fails, since q is not defined at that point). Use
fminbound(lambda q: -profits(q), 0, 1)
Note that the lambda above is only needed to generate a function for the negative profits. Better, define a function for -profits and feed it to fminbound.
Better still, use minimize_scalar instead of fminbound.
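A minimal sketch of the minimize_scalar route, with a made-up concave stand-in for the real profits function (which depends on range_p, c and demand, not shown here):

```python
from scipy.optimize import minimize_scalar

def profits(q):
    # toy stand-in: concave in q, with an interior maximum at q = 0.75
    return q * (1.5 - q)

# minimize the negative of profits over the bounded interval [0, 1]
res = minimize_scalar(lambda q: -profits(q), bounds=(0, 1), method='bounded')
print(res.x)      # profit-maximizing quality, approximately 0.75
print(-res.fun)   # maximal profit, approximately 0.5625
```

With the real profits function, whose maximum sits at the boundary q = 1, the bounded method would converge to (a point very close to) that endpoint instead.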

Dynamic Semantic errors in Python

I came across this as an interview question. The question seemed interesting, so I am posting it here.
Consider an operation which gives a semantic error, like division by zero. By default, the Python interpreter raises an error such as ZeroDivisionError. Can we control the output that the interpreter gives, i.e. print some other error message, skip the division-by-zero operation, and carry on with the rest of the instructions?
Also, how can I evaluate the cost of run-time semantic checks?
There are many Python experts here; I am hoping someone will throw some light on this. Thanks in advance.
Can we control the output that the interpreter gives, i.e. print some other error message, skip the division-by-zero operation, and carry on with the rest of the instructions?
No, you cannot. You can manually wrap every dangerous command with a try...except block, but I'm assuming you're talking about an automatic recovery to specific lines within a try...except block, or even completely automatically.
By the time the error has propagated far enough that sys.excepthook is called, or whatever outer scope you catch it in, the inner scopes are gone. You can change line numbers with sys.settrace in CPython, although that is only an implementation detail, but since the inner scopes are gone there is no reliable recovery mechanism.
If you try to use the humorous goto April Fools' module (which uses the method I just described) to jump between blocks even within a file:
from goto import goto, label

try:
    1 / 0
    label .foo
    print("recovered")
except:
    goto .foo
you get an error:
Traceback (most recent call last):
File "rcv.py", line 9, in <module>
goto .foo
File "rcv.py", line 9, in <module>
goto .foo
File "/home/joshua/src/goto-1.0/goto.py", line 272, in _trace
frame.f_lineno = targetLine
ValueError: can't jump into the middle of a block
so I'm pretty certain it's impossible.
And also, how can I evaluate the cost of run-time semantic checks?
I don't know what that is exactly, but you're probably looking for line_profiler:
import random
from line_profiler import LineProfiler

profiler = LineProfiler()

def profile(function):
    profiler.add_function(function)
    return function

@profile
def foo(a, b, c):
    if not isinstance(a, int):
        raise TypeError("Is this what you mean by a 'run-time semantic check'?")
    d = b * c
    d /= a
    return d**a

profiler.enable()
for _ in range(10000):
    try:
        foo(random.choice([2, 4, 2, 5, 2, 3, "dsd"]), 4, 2)
    except TypeError:
        pass
profiler.print_stats()
output:
Timer unit: 1e-06 s
File: rcv.py
Function: foo at line 11
Total time: 0.095197 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
11 @profile
12 def foo(a, b, c):
13 10000 29767 3.0 31.3 if not isinstance(a, int):
14 1361 4891 3.6 5.1 raise TypeError("Is this what you mean by a 'run-time semantic check'?")
15
16 8639 20192 2.3 21.2 d = b * c
17 8639 20351 2.4 21.4 d /= a
18
19 8639 19996 2.3 21.0 return d**a
So the "run-time semantic check", in this case would be taking 36.4% of the time of running foo.
If you want to time specific blocks manually that are larger than you'd use timeit on but smaller than you'd want for a profiler, instead of using two time.time() calls (which is quite an inaccurate method) I suggest Steven D'Aprano's Stopwatch context manager.
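A rough sketch of such a context manager (not Steven D'Aprano's exact code) can be written with time.perf_counter:

```python
import time
from contextlib import contextmanager

@contextmanager
def stopwatch(label="block"):
    """Print how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print("%s: %.6f s" % (label, elapsed))

# time an arbitrary block of statements
with stopwatch("sum"):
    total = sum(range(100000))
```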
I would just use an exception. This example uses Python 3; for Python 2, simply remove the annotations after the function parameters, so your function signature would look like f(a, b):
def f(a: int, b: int):
    """
    :param a:
    :param b:
    """
    try:
        c = a / b
        print(c)
    except ZeroDivisionError:
        print("You idiot, you can't do that ! :P")

if __name__ == '__main__':
    f(1, 0)
>>> from cheese import f
>>> f(0, 0)
You idiot, you can't do that ! :P
>>> f(0, 1)
0.0
>>> f(1, 0)
You idiot, you can't do that ! :P
>>> f(1, 1)
1.0
This is an example of how you can catch a zero division by handling ZeroDivisionError explicitly.
I won't go into any specific tools for building loggers, but you can indeed measure the cost associated with this kind of checking. Put start = time.time() at the start of the function and end = time.time() at the end; the difference gives the execution time in seconds.
I hope that helps.
