How to enforce relative constraints on hypothesis strategies? - python

Say I have 2 variables a and b where it is given that b > a, how then can I enforce this relative constraint on the hypothesis strategies?
from hypothesis import given, strategies as st
@given(st.integers(), st.integers())
def test_subtraction(a, b):
    # only evaluates to true if b > a
    # hence I'd like to enforce this constraint on the strategy
    assert abs(b - a) == -(a - b)

(I've no idea how to do this in Python, but it's a common enough problem, so I hope you can use these F# FsCheck examples instead. The ideas are universal.)
Filtering
Most property-based frameworks come with an ability to filter values based on a predicate. In FsCheck it's the ==> operator. In QuickCheck the equivalent is called suchThat.
Using the ==> operator in FsCheck, you can write the property like this:
[<Property>]
let property_using_filtering (a : int) (b : int) =
    b > a ==> lazy
        Assert.Equal (abs (b - a), -(a - b))
(It's possible to write the test in a more terse and idiomatic style, but since I'm assuming that you may not be familiar with F#, I chose to be more explicit than usual.)
Notice that the predicate b > a precedes the filtering operator ==>. This means that the rest of the code to the right of, and below, the operator only runs when the predicate is true.
The framework is still going to generate entirely random values, so (assuming a uniform random distribution) it'll be throwing half of the generated values away.
Thus, to generate 100 (the default) valid test cases, it'll have to generate on average 200 test cases (i.e. 400 integers). Generating 400 integers instead of 200 integers probably isn't a big deal, but in general, this kind of filtering can be prohibitively wasteful.
Therefore, it's always useful to be aware of alternatives.
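For reference, here's a rough sketch of what the filtering approach might look like directly in Hypothesis; the strategy's .filter method plays the role of ==> (the strategy and test names are just for illustration):
from hypothesis import given, strategies as st

# keep only pairs where the second element is strictly greater than the first
ordered = st.tuples(st.integers(), st.integers()).filter(lambda t: t[1] > t[0])

@given(ordered)
def test_subtraction_filtered(pair):
    a, b = pair
    assert abs(b - a) == -(a - b)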
Seed and diff
When faced with this sort of problem, it usually helps to take an alternative look at how to generate values. How do you generate two values where one is strictly greater than the other?
You can generate a random value (the seed), which in this case will also serve as the first value itself. Then a second value will indicate the difference between the two.
Some property-based frameworks come with features where you can tell it to generate strictly positive numbers. FsCheck comes with those features, but assuming for the moment that not all frameworks can do this, you can still use an unconstrained random value.
In that case, the difference, being any random number, may be negative, zero, or positive. We can take its absolute value and add one to get a number that's guaranteed to be strictly greater than zero. If you add that to the first number, you're guaranteed to have a number greater than the first one:
[<Property>]
let property_using_seed_and_diff (seed : int) (diff : int) =
    let a = seed
    let b = a + 1 + abs diff
    Assert.Equal (abs (b - a), -(a - b))
Here, (somewhat redundantly) we set a = seed and then b = a + 1 + abs diff according to the above description.
(I only included the redundant seed function parameter to illustrate the general idea. Sometimes, you need one or more values calculated from a seed, but not the seed itself. In the present case, however, the value and seed coincide.)
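A rough Hypothesis translation of the same seed-and-diff idea might look like this (the test name is made up; Python ints don't overflow, so the addition is safe):
from hypothesis import given, strategies as st

@given(st.integers(), st.integers())
def test_subtraction_seed_and_diff(seed, diff):
    a = seed
    b = a + 1 + abs(diff)  # b is guaranteed to be strictly greater than a
    assert abs(b - a) == -(a - b)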

In addition to the filtering and seed-plus-diff approaches shown above, the "fix-it-up" approach can be useful: try to generate something valid, and just patch up the values if the constraint isn't satisfied. In this case, that might look like:
@given(st.integers(), st.integers())
def test_subtraction(a, b):
    a, b = sorted([a, b])
    ...
The advantage here is that the minimal failing example tends to look a bit more natural, and might have a nicer distribution than a seed-and-diff (or "constructive") approach. It also combines well with the other approaches, especially if you're defining your own strategy with @st.composite.
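As a sketch of what such a composite strategy might look like (ordered_pairs is a hypothetical name; min_value is used to keep b strictly above a):
from hypothesis import given, strategies as st

@st.composite
def ordered_pairs(draw):
    a = draw(st.integers())
    b = draw(st.integers(min_value=a + 1))  # always strictly greater than a
    return a, b

@given(ordered_pairs())
def test_subtraction(pair):
    a, b = pair
    assert abs(b - a) == -(a - b)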

You can add these constraints using assume in hypothesis:
from hypothesis import assume, given, strategies as st

@given(st.integers(), st.integers())
def test_subtraction(a, b):
    assume(b > a)
    assert abs(b - a) == -(a - b)
See: https://hypothesis.readthedocs.io/en/latest/details.html#making-assumptions for more details

Related

Matching multiple floats in an IF statement [duplicate]

It's well known that comparing floats for equality is a little fiddly due to rounding and precision issues.
For example: Comparing Floating Point Numbers, 2012 Edition
What is the recommended way to deal with this in Python?
Is a standard library function for this somewhere?
Python 3.5 adds the math.isclose and cmath.isclose functions as described in PEP 485.
If you're using an earlier version of Python, the equivalent function is given in the documentation.
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
rel_tol is a relative tolerance: it is multiplied by the greater of the magnitudes of the two arguments; as the values get larger, so does the allowed difference between them while still considering them equal.
abs_tol is an absolute tolerance that is applied as-is in all cases. If the difference is less than either of those tolerances, the values are considered equal.
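A few illustrative calls (the values here are chosen only to show the effect of each tolerance):
import math

print(math.isclose(1.0, 1.0 + 1e-10))            # True: within the default rel_tol of 1e-09
print(math.isclose(0.0, 1e-10))                  # False: a relative tolerance never matches against 0
print(math.isclose(0.0, 1e-10, abs_tol=1e-09))   # True: the absolute tolerance covers it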
Something as simple as the following may be good enough:
return abs(f1 - f2) <= allowed_error
I would agree that Gareth's answer is probably most appropriate as a lightweight function/solution.
But I thought it would be helpful to note that if you are using NumPy or are considering it, there is a packaged function for this.
numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)
A little disclaimer though: installing NumPy can be a non-trivial experience depending on your platform.
Use Python's decimal module, which provides the Decimal class.
From the comments:
It is worth noting that if you're doing math-heavy work and you don't absolutely need the precision from decimal, this can really bog things down. Floats are way, way faster to deal with, but imprecise. Decimals are extremely precise but slow.
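A small sketch of the difference this makes (constructing Decimals from strings so they don't inherit float rounding):
from decimal import Decimal

print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True: exact decimal arithmetic
print(0.1 + 0.2 == 0.3)                                   # False: binary floats round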
The common wisdom that floating-point numbers cannot be compared for equality is inaccurate. Floating-point numbers are no different from integers: If you evaluate "a == b", you will get true if they are identical numbers and false otherwise (with the understanding that two NaNs are of course not identical numbers).
The actual problem is this: If I have done some calculations and am not sure the two numbers I have to compare are exactly correct, then what? This problem is the same for floating-point as it is for integers. If you evaluate the integer expression "7/3*3", it will not compare equal to "7*3/3".
So suppose we asked "How do I compare integers for equality?" in such a situation. There is no single answer; what you should do depends on the specific situation, notably what sort of errors you have and what you want to achieve.
Here are some possible choices.
If you want to get a "true" result if the mathematically exact numbers would be equal, then you might try to use the properties of the calculations you perform to prove that you get the same errors in the two numbers. If that is feasible, and you compare two numbers that result from expressions that would give equal numbers if computed exactly, then you will get "true" from the comparison. Another approach is that you might analyze the properties of the calculations and prove that the error never exceeds a certain amount, perhaps an absolute amount or an amount relative to one of the inputs or one of the outputs. In that case, you can ask whether the two calculated numbers differ by at most that amount, and return "true" if they are within the interval. If you cannot prove an error bound, you might guess and hope for the best. One way of guessing is to evaluate many random samples and see what sort of distribution you get in the results.
Of course, since we only set the requirement that you get "true" if the mathematically exact results are equal, we left open the possibility that you get "true" even if they are unequal. (In fact, we can satisfy the requirement by always returning "true". This makes the calculation simple but is generally undesirable, so I will discuss improving the situation below.)
If you want to get a "false" result if the mathematically exact numbers would be unequal, you need to prove that your evaluation of the numbers yields different numbers if the mathematically exact numbers would be unequal. This may be impossible for practical purposes in many common situations. So let us consider an alternative.
A useful requirement might be that we get a "false" result if the mathematically exact numbers differ by more than a certain amount. For example, perhaps we are going to calculate where a ball thrown in a computer game traveled, and we want to know whether it struck a bat. In this case, we certainly want to get "true" if the ball strikes the bat, and we want to get "false" if the ball is far from the bat, and we can accept an incorrect "true" answer if the ball in a mathematically exact simulation missed the bat but is within a millimeter of hitting the bat. In that case, we need to prove (or guess/estimate) that our calculation of the ball's position and the bat's position have a combined error of at most one millimeter (for all positions of interest). This would allow us to always return "false" if the ball and bat are more than a millimeter apart, to return "true" if they touch, and to return "true" if they are close enough to be acceptable.
So, how you decide what to return when comparing floating-point numbers depends very much on your specific situation.
As to how you go about proving error bounds for calculations, that can be a complicated subject. Any floating-point implementation using the IEEE 754 standard in round-to-nearest mode returns the floating-point number nearest to the exact result for any basic operation (notably multiplication, division, addition, subtraction, square root). (In case of tie, round so the low bit is even.) (Be particularly careful about square root and division; your language implementation might use methods that do not conform to IEEE 754 for those.) Because of this requirement, we know the error in a single result is at most 1/2 of the value of the least significant bit. (If it were more, the rounding would have gone to a different number that is within 1/2 the value.)
Going on from there gets substantially more complicated; the next step is performing an operation where one of the inputs already has some error. For simple expressions, these errors can be followed through the calculations to reach a bound on the final error. In practice, this is only done in a few situations, such as working on a high-quality mathematics library. And, of course, you need precise control over exactly which operations are performed. High-level languages often give the compiler a lot of slack, so you might not know in which order operations are performed.
There is much more that could be (and is) written about this topic, but I have to stop there. In summary, the answer is: There is no library routine for this comparison because there is no single solution that fits most needs that is worth putting into a library routine. (If comparing with a relative or absolute error interval suffices for you, you can do it simply without a library routine.)
math.isclose() has been added to Python 3.5 for that (source code). Here is a port of it to Python 2. It differs from Mark Ransom's one-liner in that it can handle "inf" and "-inf" properly.
import math

def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    '''
    Python 2 implementation of Python 3.5 math.isclose()
    https://github.com/python/cpython/blob/v3.5.10/Modules/mathmodule.c#L1993
    '''
    # sanity check on the inputs
    if rel_tol < 0 or abs_tol < 0:
        raise ValueError("tolerances must be non-negative")

    # short circuit exact equality -- needed to catch two infinities of
    # the same sign. And perhaps speeds things up a bit sometimes.
    if a == b:
        return True

    # This catches the case of two infinities of opposite sign, or
    # one infinity and one finite number. Two infinities of opposite
    # sign would otherwise have an infinite relative tolerance.
    # Two infinities of the same sign are caught by the equality check
    # above.
    if math.isinf(a) or math.isinf(b):
        return False

    # now do the regular computation
    # this is essentially the "weak" test from the Boost library
    diff = math.fabs(b - a)
    result = (((diff <= math.fabs(rel_tol * b)) or
               (diff <= math.fabs(rel_tol * a))) or
              (diff <= abs_tol))
    return result
I'm not aware of anything in the Python standard library (or elsewhere) that implements Dawson's AlmostEqual2sComplement function. If that's the sort of behaviour you want, you'll have to implement it yourself. (In which case, rather than using Dawson's clever bitwise hacks you'd probably do better to use more conventional tests of the form if abs(a-b) <= eps1*(abs(a)+abs(b)) + eps2 or similar. To get Dawson-like behaviour you might say something like if abs(a-b) <= eps*max(EPS,abs(a),abs(b)) for some small fixed EPS; this isn't exactly the same as Dawson, but it's similar in spirit.)
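Written out as a function, that Dawson-like test might look roughly like this (EPS and the default eps are arbitrary example values, not from any library):
EPS = 1e-12  # floor so the test doesn't collapse to exact equality near zero

def nearly_equal(a, b, eps=1e-9):
    return abs(a - b) <= eps * max(EPS, abs(a), abs(b))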
If you want to use it in a testing/TDD context, I'd say this is a standard way:
from nose.tools import assert_almost_equals
assert_almost_equals(x, y, places=7) # The default is 7
In terms of absolute error, you can just check
if abs(a - b) <= error:
    print("Almost equal")
Some information on why floats act weird in Python:
Python 3 Tutorial 03 - if-else, logical operators and top beginner mistakes
You can also use math.isclose for relative errors.
This is useful for the case where you want to make sure two numbers are the same 'up to precision', and there isn't any need to specify the tolerance:
Find minimum precision of the two numbers
Round both of them to minimum precision and compare
def isclose(a, b):
    astr = str(a)
    aprec = len(astr.split('.')[1]) if '.' in astr else 0
    bstr = str(b)
    bprec = len(bstr.split('.')[1]) if '.' in bstr else 0
    prec = min(aprec, bprec)
    return round(a, prec) == round(b, prec)
As written, it only works for numbers without the 'e' in their string representation (meaning 0.9999999999995e-4 < number <= 0.9999999999995e11)
Example:
>>> isclose(10.0, 10.049)
True
>>> isclose(10.0, 10.05)
False
For some of the cases where you can affect the source number representation, you can represent them as fractions instead of floats, using integer numerator and denominator. That way you can have exact comparisons.
See Fraction from the fractions module for details.
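A short sketch of what this buys you (the example values are arbitrary):
from fractions import Fraction

print(Fraction(1, 10) * 3 == Fraction(3, 10))  # True: exact rational arithmetic
print(0.1 * 3 == 0.3)                          # False: binary float rounding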
I liked Sesquipedal's suggestion, but with a modification (without it, the special case where both values are 0 returns False). In my case, I was on Python 2.7 and just used a simple function:
if f1 == 0 and f2 == 0:
    return True
else:
    return abs(f1 - f2) < tol * max(abs(f1), abs(f2))
If you want to do it in a testing or TDD context using the pytest package, here's how:
import pytest

PRECISION = 1e-3

def assert_almost_equal():
    obtained_value = 99.99
    expected_value = 100.00
    assert obtained_value == pytest.approx(expected_value, PRECISION)
I found the following comparison helpful:
str(f1) == str(f2)
To compare up to a given decimal without atol/rtol:
def almost_equal(a, b, decimal=6):
    return '{0:.{1}f}'.format(a, decimal) == '{0:.{1}f}'.format(b, decimal)

print(almost_equal(0.0, 0.0001, decimal=5))  # False
print(almost_equal(0.0, 0.0001, decimal=3))  # True: both format to '0.000'
This may be a bit of an ugly hack, but it works pretty well when you don't need more than the default float precision (about 11 decimals).
The round_to function uses the format method from the built-in str class to turn the float into a string with the number of decimals needed, going through the eval built-in to evaluate the format expression.
The is_close function just applies a simple conditional to the rounded values.
def round_to(float_num, prec):
    return eval("'{:." + str(int(prec)) + "f}'.format(" + str(float_num) + ")")

def is_close(float_a, float_b, prec):
    if round_to(float_a, prec) == round_to(float_b, prec):
        return True
    return False

>>> a = 10.0
>>> b = 10.0001
>>> print(is_close(a, b, prec=3))
True
>>> print(is_close(a, b, prec=4))
False
Update:
As suggested by @stepehjfox, a cleaner way to build a round_to function avoiding "eval" is to use nested formatting:
def round_to(float_num, prec):
    return '{:.{precision}f}'.format(float_num, precision=prec)
Following the same idea, the code can be even simpler using the great new f-strings (Python 3.6+):
def round_to(float_num, prec):
    return f'{float_num:.{prec}f}'
So, we could even wrap it all up in one simple and clean 'is_close' function:
def is_close(a, b, prec):
    return f'{a:.{prec}f}' == f'{b:.{prec}f}'
If you want to compare floats, the options above are great, but in my case I ended up using Enums, since my use case only accepted a few valid floats.
from enum import Enum

class HolidayMultipliers(Enum):
    EMPLOYED_LESS_THAN_YEAR = 2.0
    EMPLOYED_MORE_THAN_YEAR = 2.5
Then running:
testable_value = 2.0
HolidayMultipliers(testable_value)
If the float is valid, it's fine, but otherwise it will just throw a ValueError.
Using == is simple and good enough if you don't need precise control over the tolerance.
# Python 3.8.5
>>> 1.0000000000001 == 1
False
>>> 1.0000000000000001 == 1
True
But watch out for 0:
>>> 0 == 0.00000000000000000000000000000000000000000001
False
Any nonzero float, however tiny, is never equal to 0.
Use math.isclose if you want to control the tolerance.
The default a == b is roughly equivalent to math.isclose(a, b, rel_tol=1e-16, abs_tol=0).
If you still want to use == with a self-defined tolerance:
>>> import math
>>> class MyFloat(float):
...     def __eq__(self, another):
...         return math.isclose(self, another, rel_tol=0, abs_tol=0.001)
...
>>> a = MyFloat(0)
>>> a
0.0
>>> a == 0.001
True
So far, I haven't found a way to configure this globally for float. Also, mock does not work for float.__eq__.

Z3 BitVec extraction using symbolic high and low

I've been playing around with proving certain SIMD vectorizations using Z3, and I'm running into a problem trying to model SIMD operations that conditionally move around bits or lanes (such as Intel _mm_shuffle_epi8, for example).
The problem occurs when I try to use a symbolic high and low with Extract, which does not seem to be supported:
assert a.sort() == BitVecSort(128)
assert b.sort() == BitVecSort(128)
Extract( Extract(i+3,i,b)*8+7, Extract(i+3,i,b)*8, a)
results in
z3.z3types.Z3Exception: Symbolic expressions cannot be cast to concrete Boolean values.
The problem appears to be two-fold:
Z3 does not appear to support symbolically sized BitVecs
>>> a = Int('a')
>>> b = BitVec('b', a)
ctypes.ArgumentError: argument 2: <class 'TypeError'>: wrong type
Would be neat, but alas. As a result, Extract needs to be able to know the precise BitVec sort of its return value, and demands that both high and low are concrete, even though it appears the only real requirement should be that simplify(high - low) results in a concrete value.
What's the correct way of doing this?
SMTLib bit-vector logic is only defined for concrete bit-sizes. This is not just an oversight: It is a fundamental limitation of the logic: There's no decision procedure that can decide correctness of bit-vector formulae that can involve symbolic sizes, since truth of bit-vector formulae can change depending on the size. The classic example is:
x <= 7
if x is a bitvector of size <= 3, then the above is true, otherwise it isn't. If that looks contrived, consider the following:
x*x <= x+x+x
Again, this is true if x is 2-bits wide, but not true if it is 3-bits wide. Thus, SMTLib requires all bit-vector sizes to be concrete at specification time. Note that you can write higher-level programs that work for arbitrary bit-sizes, but once they are rendered and sent to the solver, all bit-vector sizes must be known concrete constants.
Regarding your question about Extract: you're right, strictly speaking, concreteness of the final length is sufficient. But z3py is a thin layer on top of SMTLib, and it doesn't do such simplifications. The "concreteness" requirement follows from the similar limitation on the corresponding SMTLib function:
All function symbols with declaration of the form
((_ extract i j) (_ BitVec m) (_ BitVec n))
where
- i, j, m, n are numerals
- m > i ≥ j ≥ 0,
- n = i - j + 1
see here: http://smtlib.cs.uiowa.edu/theories-FixedSizeBitVectors.shtml Note that even the logic itself is called "FixedSizeBitVectors" for this very reason, not just "BitVectors".
However, it isn't really hard to extract a fixed-size chunk: simply right-shift by lo, and mask/extract the required number of bits:
((_ extract 7 0) (bvlshr x lo))
If your chunk size is not constant, then again you land in the world of symbolic bit-vector sizes, and SMTLib avoids this for the reasons I mentioned above. (And this is also the reason why extract takes concrete integers as arguments and is written in that funny SMTLib notation, to indicate the arguments are concrete values.)
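For reference, a rough sketch of that shift-then-extract trick in z3py (the variable names are purely illustrative; the index is declared at the same bit-width as a so it can be used as a shift amount):
from z3 import BitVec, Extract, LShR

a = BitVec('a', 128)
i = BitVec('i', 128)   # symbolic byte index, same width as a
# byte at symbolic index i: logical shift right by 8*i bits, then take a fixed 8-bit slice
byte = Extract(7, 0, LShR(a, 8 * i))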
If you do have to work with "symbolic" word sizes, your best bet is to write your program and prove for each "symbolic" size of interest separately, by making sure the sizes are concrete in each case. (Essentially a case split for all the sizes you are interested in.)
Yep, high/low need to be constant so that the resulting type is known statically.
You can use shifts and/or masks to do what you want, though you'll need to fix a maximum size for the output.

Tuples and Ternary and positional parameters

Given:
>>> a,b=2,3
>>> c,d=3,2
>>> def f(x,y): print(x,y)
I have an existing (as in, it cannot be changed) two-positional-parameter function, and I want the arguments to always be passed in ascending order; i.e., f(2,3) no matter which two values I use (f(a,b) is the same as f(c,d) in the example).
I know that I could do:
>>> f(*sorted([c,d]))
2 3
Or I could do:
>>> f(*((a,b) if a<b else (b,a)))
2 3
(Note the need for tuple parentheses in this form because , has lower precedence than the ternary...)
Or,
def my_f(a, b):
    return f(a, b) if a < b else f(b, a)
All these seem kinda kludgy. Is there another syntax that I am missing?
Edit
I missed an 'old school' Python two-member-tuple method: index a two-member tuple using the fact that True == 1 and False == 0:
>>> f(*((a,b),(b,a))[a>b])
2 3
Also:
>>> f(*{True:(a,b), False:(b,a)}[a<b])
2 3
Edit 2
The reason for this silly exercise: numpy.isclose has the following usage note:
For finite values, isclose uses the following equation to test whether two floating point values are equivalent.
absolute(a - b) <= (atol + rtol * absolute(b))
The above equation is not symmetric in a and b, so that isclose(a, b) might be different from isclose(b, a) in some rare cases.
I would prefer that not happen.
I am looking for the fastest way to make sure that arguments to numpy.isclose are in a consistent order. That is why I am shying away from f(*sorted([c,d]))
Implemented my solution in case anyone else is looking.
def sort(f):
    def wrapper(*args):
        return f(*sorted(args))
    return wrapper

@sort
def f(x, y):
    print(x, y)

f(3, 2)
# prints: 2 3
Also, since @Tadhg McDonald-Jensen mentioned that you may not be able to change the function yourself, you could wrap the function like this:
my_func = sort(f)
You mention that your use-case is np.isclose. However, your approach isn't a good way to solve the real issue, though that's understandable given the poor argument naming of that function - it sort of implies that both arguments are interchangeable. If it were numpy.isclose(measured, expected, ...) (or something like it), it would be much clearer.
For example if you expect the value 10 and measure 10.51 and you allow for 5% deviation, then in order to get a useful result you must use np.isclose(10.51, 10, ...), otherwise you would get wrong results:
>>> import numpy as np
>>> measured = 10.51
>>> expected = 10
>>> err_rel = 0.05
>>> err_abs = 0.0
>>> np.isclose(measured, expected, err_rel, err_abs)
False
>>> np.isclose(expected, measured, err_rel, err_abs)
True
It's clear to see that the first one gives the correct result because the actually measured value is not within the tolerance of the expected value. That's because the relative uncertainty is an "attribute" of the expected value, not of the value you compare it with!
So solving this issue by "sorting" the parameters is just wrong. That's a bit like swapping the numerator and denominator of a division because the denominator contains zeros and dividing by zero could give NaN, Inf, a warning or an exception... it definitely avoids the problem, but just by giving an incorrect result (the comparison isn't perfect: with division it will almost always give a wrong result; with isclose it's rare).
This was a somewhat artificial example designed to trigger that behaviour, and most of the time it's not important whether you use measured, expected or expected, measured; but in the few cases where it does matter you can't solve it by swapping the arguments (except when you have no "expected" result, but that rarely happens - at least it shouldn't).
There was some discussion about this topic when math.isclose was added to the python library:
Symmetry (PEP 485)
[...]
Which approach is most appropriate depends on what question is being asked. If the question is: "are these two numbers close to each other?", there is no obvious ordering, and a symmetric test is most appropriate.
However, if the question is: "Is the computed value within x% of this known value?", then it is appropriate to scale the tolerance to the known value, and an asymmetric test is most appropriate.
[...]
This proposal [for math.isclose] uses a symmetric test.
So if your test falls into the first category and you like a symmetric test - then math.isclose could be a viable alternative (at least if you're dealing with scalars):
math.isclose(a, b, *, rel_tol=1e-09, abs_tol=0.0)
[...]
rel_tol is the relative tolerance – it is the maximum allowed difference between a and b, relative to the larger absolute value of a or b. For example, to set a tolerance of 5%, pass rel_tol=0.05. The default tolerance is 1e-09, which assures that the two values are the same within about 9 decimal digits. rel_tol must be greater than zero.
[...]
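To illustrate the symmetry with the same numbers as the np.isclose example above (a sketch, using the 5% relative tolerance):
import math

print(math.isclose(10.51, 10, rel_tol=0.05))  # True
print(math.isclose(10, 10.51, rel_tol=0.05))  # True: swapping the arguments can't change the result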
Just in case this answer couldn't convince you and you still want to use a sorted approach - then you should order by the absolute value of your arguments (i.e. *sorted([a, b], key=abs)). Otherwise you might get surprising results when comparing negative numbers:
>>> np.isclose(-10.51, -10, err_rel, err_abs) # -10.51 is smaller than -10!
False
>>> np.isclose(-10, -10.51, err_rel, err_abs)
True
For only two elements in the tuple, the second one is the preferred idiom -- in my experience. It's fast, readable, etc.
No, there isn't really another syntax. There's also
(min(a,b), max(a,b))
... but this isn't particularly superior to the other methods; merely another way of expressing it.
Note after comment by dawg:
A class with custom comparison operators could return the same object for both min and max.

Testing floating point equality

Is there a function to test floating point approximate equality in python? Something like,
def approx_equal(a, b, tol):
    return abs(a - b) < tol
My use case is similar to how Google's C++ testing library, gtest.h, defines EXPECT_NEAR.
Here is an example:
import math
from math import sqrt, pi

def bernoulli_fraction_to_angle(fraction):
    return math.asin(sqrt(fraction))

def bernoulli_angle_to_fraction(angle):
    return math.sin(angle) ** 2

def test_bernoulli_conversions():
    assert approx_equal(bernoulli_angle_to_fraction(pi / 4), 0.5, 1e-4)
    assert approx_equal(
        bernoulli_fraction_to_angle(bernoulli_angle_to_fraction(0.1)),
        0.1, 1e-4)
For comparing numbers, there is math.isclose.
For comparing numbers or arrays, there is numpy.allclose.
For testing numbers or arrays, there is numpy.testing.assert_allclose
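A quick sketch of each, using default tolerances (the array values are arbitrary):
import math
import numpy as np

print(math.isclose(1.0, 1.0 + 1e-10))              # True
print(np.allclose([1.0, 2.0], [1.0, 2.0 + 1e-9]))  # True
np.testing.assert_allclose([1.0], [1.0 + 1e-9])     # passes; raises AssertionError on mismatch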
Another approach is to compute the relative change (or relative difference) of the two numbers, which is "used to compare two quantities while taking into account the 'sizes' of the things being compared". The two formulas mentioned in the Wikipedia article could be used in comparisons like the following in Python, which also handle cases where one or both of the values being compared are zero:
def approx_equal(a, b, tol):
    return abs(a-b) <= max(abs(a), abs(b)) * tol

def approx_equal(a, b, tol):
    return abs(a-b) <= (abs(a) + abs(b)) / 2 * tol
The calculated value in either case is a unitless fraction. In the first case the baseline value is the maximum absolute value of the two numbers, and in the second it's their mean absolute value. The article discusses each in more detail as well as their pros and cons. The latter can be turned into a percentage difference if multiplied by 100 before the comparison (with tol becoming a percentage value). Note that the article suggests that if the changing value "is a percentage itself, it is better to talk about its change by using percentage points", i.e. absolute change.
Both of these methods (obviously) require a little more computation than simply taking the absolute value of the difference of the two numbers, which might be a consideration.
Is there a function to test floating point approximate equality in python?
There can't be a function, since the definition depends on context.
def eq(a, b, eps=0.0001):
    return abs(a - b) <= eps

Doesn't always work. There are circumstances where

def eq(a, b, eps=0.0001):
    return abs(a - b) / abs(a) <= eps

could be more appropriate.
Plus, there's the always popular:

import math

def eq(a, b, eps=0.0001):
    return abs(math.log(a) - math.log(b)) <= eps

Which might be more appropriate.
I can't see how you could ask for a single function to combine all the mathematical alternatives, since it depends on the application.
If I were you, I'd just use what you wrote, and either put it in a separate module (perhaps with other utilities you need that Python doesn't have an implementation for) or at the top of whatever code requires it.
You can also use a lambda expression (one of my favorite language features, but probably less clear):
approx_equal = lambda a, b, t: abs(a - b) < t
Comparing floats for equality is just usually a bad idea. Even with the tolerance feature you're using, this isn't really what you want to do.
If you want to use floats, a reasonable option is to refactor your algorithm to use inequalities, a < b, because this is more likely to do what you expect, with far fewer false negatives or positives, and most importantly, it means you don't have to guess how equal they must be for them to be equal.
If you can't do that, another option is to use an exact representation. If your algorithm is composed only of arithmetic operations (+, -, * and /) then you can use a rational representation, as provided by fractions.Fraction, or maybe decimal.Decimal is what you want (for instance, with financial calculations).
If your algorithm cannot be expressed easily with an arbitrary precision representation, another choice is to manage the roundoff error explicitly with interval arithmetic, for instance with this module.
According to the tutorial:
... Though the numbers cannot be made closer to their intended exact values, the round() function can be useful for post-rounding so that results with inexact values become comparable to one another...
Therefore, this is the way I define "isclose" functions in Python:
def isclose(a, b, ndigits):
    return round(a - b, ndigits) == 0
I usually use 5 as ndigits; however, it depends on the precision that you expect.
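For example, with the function above:
print(isclose(0.1 + 0.2, 0.3, 5))  # True: the difference rounds to 0 at 5 digits
print(isclose(0.1, 0.2, 5))        # False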
