Numpy where and division by zero - python

I need to compute x in the following way (legacy code):
x = numpy.where(b == 0, a, 1/b)
I suppose it worked in python-2.x (as it was in a python-2.7 code), but it does not work in python-3.x (if b = 0 it returns an error).
How do I make it work in python-3.x?
EDIT: error message (Python 3.6.3):
ZeroDivisionError: division by zero

numpy.where is not conditional execution; it is conditional selection. Python function parameters are always completely evaluated before a function call, so there is no way for a function to conditionally or partially evaluate its parameters.
Your code:
x = numpy.where(b == 0, a, 1/b)
tells Python to invert every element of b and then select elements from a or 1/b based on elements of b == 0. Python never even reaches the point of selecting elements, because computing 1/b fails.
You can avoid this problem by only inverting the nonzero parts of b. Assuming a and b have the same shape, it could look like this:
x = numpy.empty_like(b, dtype=float)  # float output so 1/b is not truncated when b is an integer array
mask = (b == 0)
x[mask] = a[mask]
x[~mask] = 1/b[~mask]
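As a self-contained sketch of the masked assignment above (the sample arrays a and b are made up for illustration, and the float output dtype is an assumption so that reciprocals of an integer b are not truncated):

```python
import numpy as np

# Hypothetical sample data; a and b must have the same shape.
a = np.array([10.0, 20.0, 30.0, 40.0])
b = np.array([0, 1, 2, 4])

mask = (b == 0)
x = np.empty(b.shape, dtype=float)  # float output, independent of b's dtype
x[mask] = a[mask]                   # take a where b is zero
x[~mask] = 1 / b[~mask]             # invert only the nonzero entries

print(x)  # values: 10.0, 1.0, 0.5, 0.25
```

Because 1/b is only ever computed on the nonzero entries, no warning or error is produced.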

An old trick for handling 0 elements in an array division is to add a conditional value:
In [63]: 1/(b+(b==0))
Out[63]: array([1. , 1. , 0.5 , 0.33333333])
(I used this years ago in APL.)
x = numpy.where(b == 0, a, 1/b) is evaluated in the same way as any other Python function. Each function argument is evaluated, and the value passed to the where function. There's no 'short-circuiting' or other method of bypassing bad values of 1/b.
So if 1/b raises an error, you need to either change b so it doesn't, calculate it in a context that traps the ZeroDivisionError, or skip the 1/b entirely.
In [53]: 1/0
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-53-9e1622b385b6> in <module>()
----> 1 1/0
ZeroDivisionError: division by zero
In [54]: 1.0/0
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-54-99b9b9983fe8> in <module>()
----> 1 1.0/0
ZeroDivisionError: float division by zero
In [55]: 1/np.array(0)
/usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in true_divide
#!/usr/bin/python3
Out[55]: inf
What are a and b? Scalars, arrays of some size?
where makes most sense if b (and maybe a) is an array:
In [59]: b = np.array([0,1,2,3])
The bare division gives me a warning, and an inf element:
In [60]: 1/b
/usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in true_divide
#!/usr/bin/python3
Out[60]: array([ inf, 1. , 0.5 , 0.33333333])
I could use where to replace that inf with something else, for example a nan:
In [61]: np.where(b==0, np.nan, 1/b)
/usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in true_divide
#!/usr/bin/python3
Out[61]: array([ nan, 1. , 0.5 , 0.33333333])
The warning can be silenced as @donkopotamus shows.
An alternative to seterr is errstate in a with context:
In [64]: with np.errstate(divide='ignore'):
...: x = np.where(b==0, np.nan, 1/b)
...:
In [65]: x
Out[65]: array([ nan, 1. , 0.5 , 0.33333333])
How to suppress the error message when dividing 0 by 0 using np.divide (alongside other floats)?

If you wish to disable warnings in numpy while you divide by zero, then do something like:
>>> existing = numpy.seterr(divide="ignore")
>>> # now divide by zero in numpy raises no sort of exception
>>> 1 / numpy.zeros( (2, 2) )
array([[ inf, inf],
[ inf, inf]])
>>> numpy.seterr(**existing)  # seterr returns a dict, so unpack it as keyword arguments
Of course this only governs division by zero in an array. It will not prevent an error when doing a simple 1 / 0.
In your particular case, if we wish to ensure that we work whether b is a scalar or a numpy type, do as follows:
# ignore division by zero in numpy
existing = numpy.seterr(divide="ignore")
# upcast `1.0` to be a numpy type so that numpy division will always occur
x = numpy.where(b == 0, a, numpy.float64(1.0) / b)
# restore old error settings for numpy
numpy.seterr(**existing)  # seterr returns a dict, so unpack it as keyword arguments
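The same approach can be sketched with np.errstate, which restores the previous error settings automatically when the block exits. The sample values below are made up for illustration:

```python
import numpy as np

# Hypothetical inputs: one array case and one scalar case.
a_arr = np.array([7.0, 8.0, 9.0])
b_arr = np.array([0.0, 2.0, 4.0])
a_scalar, b_scalar = 5.0, 0

with np.errstate(divide="ignore"):
    # np.float64(1.0) forces NumPy division, so a scalar zero gives inf
    # instead of raising ZeroDivisionError.
    x_arr = np.where(b_arr == 0, a_arr, np.float64(1.0) / b_arr)
    x_scalar = np.where(b_scalar == 0, a_scalar, np.float64(1.0) / b_scalar)
```

Note that for scalar inputs np.where returns a 0-d array, so float(x_scalar) may be needed to get a plain number back.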

I solved it using this:
x = (1/(np.where(b == 0, np.nan, b))).fillna(a)
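np.where returns a plain ndarray, which has no fillna method, so the line above will not run on NumPy arrays as written. A NumPy-only sketch of the same idea, assuming b itself contains no nan values (the sample arrays are made up):

```python
import numpy as np

# Hypothetical sample data.
a = np.array([10.0, 20.0, 30.0])
b = np.array([0.0, 2.0, 4.0])

# Replace zeros with nan first, so the division never sees a zero.
inv = 1 / np.where(b == 0, np.nan, b)

# Then fill the nan positions from a (the role fillna plays in pandas).
x = np.where(np.isnan(inv), a, inv)
```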

The numpy.where documentation states:
If x and y are given and input arrays are 1-D, where is
equivalent to::
[xv if c else yv for (c,xv,yv) in zip(condition,x,y)]
So why do you see the error? Take this trivial example:
c = 0
result = (1 if c==0 else 1/c)
# 1
So far so good. if c==0 is checked first and the result is 1. The code does not attempt to evaluate 1/c. This is because the Python interpreter processes a lazy ternary operator and so only evaluates the appropriate expression.
Now let's translate this into numpy.where approach:
c = 0
result = (xv if c else yv for (c, xv, yv) in zip([c==0], [1], [1/c]))
# ZeroDivisionError
The error occurs while evaluating zip([c==0], [1], [1/c]), before any selection logic is applied: the list [1/c] has to be built first. As a function, numpy.where does not, and indeed cannot, replicate the lazy evaluation of Python's ternary expression.
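The difference between lazy and eager evaluation can be seen without NumPy at all. In this sketch, pick is a hypothetical stand-in for any selection function such as numpy.where:

```python
def pick(condition, if_true, if_false):
    """Hypothetical stand-in for a selection function such as numpy.where."""
    return if_true if condition else if_false

c = 0

# Lazy: the conditional expression never evaluates 1 / c when c == 0.
lazy_result = 1 if c == 0 else 1 / c

# Eager: 1 / c is computed while building the argument list, before pick runs.
try:
    eager_result = pick(c == 0, 1, 1 / c)
except ZeroDivisionError:
    eager_result = None  # pick was never called
```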

Related

Catching 'division by zero' in a function with array parameters

I pass an array of values as arguments to a function. This function divides somewhere by the values given as arguments. I want to bypass the calculation for zero-value values so that I don't have to divide by zero.
import numpy as np
def test(t):
    e = np.where(t==0,0,10/t)
    return e
i = np.arange(0, 5, 1)
print('in: ',i)
o = test(i)
print('out:',o)
Output is
in: [0 1 2 3 4]
out: [ 0. 10. 5. 3.33333333 2.5 ]
<ipython-input-50-321938d419be>:4: RuntimeWarning: divide by zero encountered in true_divide
e = np.where(t==0,0,10/t)
I thought np.where would be the appropriate function for this, but unfortunately I always get a runtime warning 'divide by zero'. So, it does the right thing, but the warning is annoying. I could of course suppress the warning, but I wonder if there is a cleaner solution to the problem?
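Use np.divide(10, t, where=t!=0), together with an out array so the skipped positions have a defined value:
import numpy as np
def test(t):
    # out supplies the value (0) where t == 0; without it those positions are uninitialized
    e = np.divide(10, t, out=np.zeros(t.shape), where=t != 0)
    return e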
i = np.arange(0, 5, 1)
print('in: ',i)
o = test(i)
print('out:',o)
Yes, you can test whether the divisor is zero, as you do, but consider comparing against a floating-point zero (0.0), or otherwise provide floating-point arguments.
Change test:
def test(t):
    e = np.where(t==0.0,0,10.0/t) # Note : t == 0.0 (a float comparison)
    return e
And/or use floating point arguments:
i = np.arange(0., 5., 1.)
This will give you the result without an exception too.

Strange numpy divide behaviour for scalars

I have been trying to upgrade a library which has a bunch of geometric operations for scalars so they will work with numpy arrays as well. While doing this I noticed some strange behaviour with numpy divide.
The original code checks a normalised difference between two variables if neither variable is zero. Swapping across to numpy, this ended up looking something like:
import numpy as np
a = np.array([0, 1, 2, 3, 4])
b = np.array([1, 2, 3, 0, 4])
o = np.zeros(len(a))
o = np.divide(np.subtract(a, b), b, out=o, where=np.logical_and(a != 0, b != 0))
print(f'First implementation: {o}')
where I passed in an output buffer initialised to zero for instances which could not be calculated; this returns:
First implementation: [ 0. -0.5 -0.33333333 0. 0. ]
I had to slightly modify this for scalars as out required an array, but it seemed fine.
a = 0
b = 4
o = None if np.isscalar(a) else np.zeros(len(a))
o = np.divide(np.subtract(a, b), b, out=o, where=np.logical_and(b != 0, a != 0))
print(f'Modified out for scalar: {o}')
returns
Modified out for scalar: 0.0.
Then ran this through some test functions and found a lot of them failed. Digging into this, I found that the first time I call the divide with a scalar with where set to False the function returns zero, but if I called it again, the second time it returns something unpredictable.
a = 0
b = 4
print(f'First divide: {np.divide(b, a, where=False)}')
print(f'Second divide: {np.divide(b, a, where=False)}')
returns
First divide: 0.0
Second divide: 4.0
Looking at the documentation, it says "locations within it where the condition is False will remain uninitialized", so I guess numpy has some internal buffer which is initially set to zero and subsequently ends up carrying over an earlier intermediate value.
I am struggling to see how I can use divide with or without a where clause; if I use where I get an unpredictable output and if I don't I can't protect against divide by zero. Am I missing something or do I just need to have a different code path in these cases? I realise I'm half way to a different code path already with the out variable.
I would be grateful for any advice.
It looks like a bug to me. But I think you'd want to short-circuit the calls to ufuncs in the case of scalars for performance reasons anyway, so it's a question of trying to keep it from being too messy. Since either a or b could be scalars, you need to check them both. Put that check into a function that conditionally returns an output array or None, and you could do
def scalar_test_np_zeros(a, b):
    """Return np.zeros for the length of arguments unless both
    arguments are scalar, then None."""
    # Parentheses matter: without them := would capture the whole `and` expression.
    if (a_is_scalar := np.isscalar(a)) and np.isscalar(b):
        return None
    else:
        return np.zeros(len(b) if a_is_scalar else len(a))
a = 0
b = 4
if (o := scalar_test_np_zeros(a, b)) is None:
    o = (a-b)/b if a and b else 0.0
else:
    np.divide(np.subtract(a, b), b, out=o,
              where=np.logical_and(b != 0, a != 0))
The scalar test would be useful in other code with similar problems.
For what it's worth, in case it helps anyone, I have come to the conclusion that I need to wrap np.divide to use it safely in functions which can take arrays and scalars. This is my wrapping function:
import numpy as np
def divide_where(a, b, where, out=None, fill=0):
    """Wrap numpy divide to safely handle the where clause for both arrays and scalars.
    - a: dividend array or scalar
    - b: divisor array or scalar
    - where: locations where True will be set to a/b
    - out: location where data is written to; if None, an output array is created using the fill value
    - fill: fill value; returned for scalars when where is False, and used to create the output array when out is not set
    """
    if (a_is_scalar := np.isscalar(a)) and np.isscalar(b):
        return fill if not where else a / b
    if out is None:
        # dtype=float: np.divide cannot write its float result into an integer buffer
        out = np.full_like(b if a_is_scalar else a, fill, dtype=float)
    return np.divide(a, b, out=out, where=where)
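An alternative sketch that avoids a separate scalar code path: always hand np.divide an explicit, pre-filled out buffer sized by broadcasting a and b (0-d for two scalars), so no location is ever left uninitialized. The helper name divide_filled is hypothetical, and it assumes where broadcasts to the same shape as a and b:

```python
import numpy as np

def divide_filled(a, b, where, fill=0.0):
    """Hypothetical helper: a / b where `where` is True, `fill` elsewhere."""
    # np.broadcast accepts scalars too; two scalars give the empty shape ().
    out = np.full(np.broadcast(a, b).shape, fill, dtype=float)
    np.divide(a, b, out=out, where=where)
    return out

# Repeated scalar calls now give the same, defined result.
first = divide_filled(4, 0, where=False)
second = divide_filled(4, 0, where=False)

# Array inputs follow the same path.
arr = divide_filled(np.array([1, 2, 3]), np.array([2, 0, 4]),
                    where=np.array([True, False, True]))
```

For scalar inputs the result is a 0-d array rather than a Python float; float(first) converts it if a plain number is needed.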

How to use Raise a value error when calculating derivatives using numpy

I need to write a function which calculates the following math equation and rounds the answer to 2 decimal places: z = π·e^(x²) / (4y).
There are the following constraints: the input variables x and y are single values (that is, not a list/array). If a division by zero occurs, raise a ValueError. Output should be rounded to 2 decimal places. (Hint) You can calculate e^x using the NumPy function np.exp(x).
I have done the following but still fail the ValueError test (test_question_3_ValueError, inputs: [0.5, 0]):
def custom_function(x, y):
    # your code here
    a = (np.pi*np.exp(x**2))
    b = 4*y
    z = a / b
    if b < 0:
        raise ValueError("Div by zero")
    return round(z, 2)
First, the division by zero will only happen if y = 0, since b = 4*y can only be 0 when y = 0; thus, you should change the if condition to b == 0.
Another thing: that if statement should come before calculating z, because you want to raise the error before the division is attempted, stopping everything that follows.
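Putting both suggestions together, a corrected version might look like the following sketch (the error message text is arbitrary):

```python
import numpy as np

def custom_function(x, y):
    b = 4 * y
    # Check the divisor first, so the error is raised before any division happens.
    if b == 0:
        raise ValueError("Division by zero: y must be nonzero")
    z = np.pi * np.exp(x ** 2) / b
    return round(z, 2)

# The failing test input from the question now raises ValueError.
try:
    custom_function(0.5, 0)
    raised = False
except ValueError:
    raised = True
```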

Bitwise input verification with Numpy arrays

I have read this question and understand that Numpy arrays cannot be used in boolean context. Let's say I want to perform an element-wise boolean check on the validity of inputs to a function. Can I realize this behavior while still using Numpy vectorization, and if so, how? (and if not, why?)
In the following example, I compute a value from two inputs while checking that both inputs are valid (both must be greater than 0)
import math
import numpy as np
def calculate(input_1, input_2):
    if input_1 < 0 or input_2 < 0:
        return 0
    return math.sqrt(input_1) + math.sqrt(input_2)
calculate_many = (lambda x: calculate(x, 20 - x))(np.arange(-20, 40))
By itself, this would not work with Numpy arrays because of ValueError. But, it is imperative that math.sqrt is never run on negative inputs because that would result in another error.
One solution using list comprehension is as follows:
calculate_many = [calculate(x, 20 - x) for x in np.arange(-20, 40)]
However, this no longer uses vectorization and would be painfully slow if the size of the arange was increased drastically.
Is there a way to implement this if check while still using vectorization?
I believe the expression below performs vectorized operations and avoids the use of loops/lambda functions:
np.sqrt(((input1>0) & 1)*input1) + np.sqrt(((input2>0) & 1)*input2)
In [121]: x = np.array([1, 10, 21, -1.])
In [122]: y = 20-x
In [123]: np.sqrt(x)
/usr/local/bin/ipython3:1: RuntimeWarning: invalid value encountered in sqrt
#!/usr/bin/python3
Out[123]: array([1. , 3.16227766, 4.58257569, nan])
There are several ways of dealing with 'out-of-range' values.
@Sam's approach is to tweak the inputs so they are valid:
In [129]: ((x>0) & 1)*x
Out[129]: array([ 1., 10., 21., -0.])
Another is to use masking to limit the values calculated.
Your function skips the sqrt if either input is negative; conversely, it does the calculation only where both are valid. That's different from testing each separately.
In [124]: mask = (x>=0) & (y>=0)
In [125]: mask
Out[125]: array([ True, True, False, False])
We can use the mask thus:
In [126]: res = np.zeros_like(x)
In [127]: res[mask] = np.sqrt(x[mask]) + np.sqrt(y[mask])
In [128]: res
Out[128]: array([5.35889894, 6.32455532, 0. , 0. ])
In my comments I suggested using the where parameter of np.sqrt. It does, though, need an out parameter as well.
In [130]: np.sqrt(x, where=mask, out=np.zeros_like(x)) + np.sqrt(y, where=mask, out=np.zeros_like(x))
Out[130]: array([5.35889894, 6.32455532, 0. , 0. ])
Alternatively, if we are happy with the nan in Out[123], we can just suppress the RuntimeWarning.
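A minimal sketch of that last option, using np.errstate to silence the "invalid value" warning only for the sqrt call:

```python
import numpy as np

x = np.array([1, 10, 21, -1.])

# 'invalid' is the error category reported for sqrt of a negative number.
with np.errstate(invalid='ignore'):
    roots = np.sqrt(x)  # the negative entry becomes nan, with no warning
```

Outside the with block the previous warning settings apply again.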

Different behavior of float32/float64 numpy variables

After googling for a while, I'm posting here for help.
I have two float64 variables returned from a function.
Both of them are apparently 1:
>>> x, y = somefunc()
>>> print x,y
>>> if x < 1 : print "x < 1"
>>> if y < 1 : print "y < 1"
1.0 1.0
y < 1
Behavior changes when variables are defined float32, in which case the 'y<1' statement doesn't appear.
I tried setting
np.set_printoptions(precision=10)
expecting to see the differences between variables but even so, both of them appear as 1.0 when printed.
I am a bit confused at this point.
Is there a way to visualize the difference of these float64 numbers?
Can "if/then" be used reliably to check float64 numbers?
Thanks
Trevarez
The printed values are not correct. In your case y is smaller than 1 when using float64 and bigger than or equal to 1 when using float32. This is expected, since rounding errors depend on the size of the float.
To avoid this kind of problem when dealing with floating point numbers, you should always decide on a "minimum error", usually called epsilon, and, instead of comparing for equality, check whether the result is within epsilon of the target value:
In [13]: epsilon = 1e-11
In [14]: number = np.float64(1) - 1e-16
In [15]: target = 1
In [16]: abs(number - target) < epsilon # instead of number == target
Out[16]: True
In particular, numpy already provides np.allclose, which can be useful to compare arrays for equality given a certain tolerance. It works even when the arguments aren't arrays (e.g. np.allclose(1 - 1e-16, 1) -> True).
Note however that numpy.set_printoptions doesn't affect how np.float32/64 are printed. It affects only how arrays are printed:
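As a short sketch of the tolerance-based comparison, here is the same number checked with exact equality and with np.isclose, the element-wise counterpart of np.allclose (default tolerances rtol=1e-05 and atol=1e-08):

```python
import numpy as np

number = np.float64(1) - 1e-16   # the float64 just below 1

exact = bool(number == 1)            # False: the values differ in the last bit
close = bool(np.isclose(number, 1))  # True: within the default tolerances
below = bool(number < 1)             # True, even though str() may show 1.0
```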
In [1]: import numpy as np
In [2]: np.float(1) - 1e-16
Out[2]: 0.9999999999999999
In [3]: np.array([1 - 1e-16])
Out[3]: array([ 1.])
In [4]: np.set_printoptions(precision=16)
In [5]: np.array([1 - 1e-16])
Out[5]: array([ 0.9999999999999999])
In [6]: np.float(1) - 1e-16
Out[6]: 0.9999999999999999
Also note that doing print y or evaluating y in the interactive interpreter gives different results:
In [1]: import numpy as np
In [2]: np.float(1) - 1e-16
Out[2]: 0.9999999999999999
In [3]: print(np.float64(1) - 1e-16)
1.0
The difference is that print calls str while evaluating calls repr:
In [9]: str(np.float64(1) - 1e-16)
Out[9]: '1.0'
In [10]: repr(np.float64(1) - 1e-16)
Out[10]: '0.99999999999999989'
In [26]: x = numpy.float64("1.000000000000001")
In [27]: print x, repr(x)
1.0 1.0000000000000011
In other words, you are plagued by loss of precision in the print statement. The value is very slightly different from 1.
Following the advice provided here, I summarize the answers in this way:
To make comparisons between floats, the programmer has to define a minimum distance (eps) for them to be considered different (eps=1e-12, for example). Doing so, the conditions should be written like this:
Instead of (x>a), use (x-a)>eps
Instead of (x<a), use (a-x)>eps
Instead of (x==a), use abs(x-a)<eps
This doesn't apply to comparison between integer numbers since difference between them is fixed to 1.
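The three rules above can be written as small helpers. This is only a sketch: the function names and the default eps are made up for illustration, and a suitable eps depends on the magnitude of the numbers involved.

```python
EPS = 1e-12  # assumed tolerance; choose one that suits your data

def definitely_greater(x, a, eps=EPS):
    """x > a by more than eps."""
    return (x - a) > eps

def definitely_less(x, a, eps=EPS):
    """x < a by more than eps."""
    return (a - x) > eps

def approximately_equal(x, a, eps=EPS):
    """x and a differ by less than eps."""
    return abs(x - a) < eps

y = 1.0 - 1e-16  # prints as 1.0 with limited precision, but y < 1 is True
```

With these, definitely_less(y, 1.0) is False and approximately_equal(y, 1.0) is True, which matches the intent of the original check.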
Hope it helps others as it helped me.
