Division with numpy matrices that might result in nan - python

How can I divide two numpy matrices A and B in python when sometimes the two matrices will have 0 on the same cell?
Basically A[i,j]>=B[i,j] for all i, j. I need to calculate C=A/B. But sometimes A[i,j]==B[i,j]==0. And when this happens I need A[i,j]/B[i,j] to be defined as 0.
Is there a simple pythonic way other than going through all the indexes?

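You can use the where argument of ufuncs like np.true_divide. Pass an out array as well, because positions where the condition is False are left untouched and would otherwise hold uninitialized values:
np.true_divide(A, B, out=np.zeros(A.shape), where=(A!=0) | (B!=0))
In case you have no negative values (as stated in the comments) and A >= B for each element (as stated in the question) you can simplify this to:
np.true_divide(A, B, out=np.zeros(A.shape), where=(A!=0))
because A[i, j] == 0 implies B[i, j] == 0.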
For example:
import numpy as np
A = np.random.randint(0, 3, (4, 4))
B = np.random.randint(0, 3, (4, 4))
print(A)
print(B)
print(np.true_divide(A, B, out=np.zeros(A.shape), where=(A!=0) | (B!=0)))
[[1 0 2 1]
[1 0 0 0]
[2 1 0 0]
[2 2 0 2]]
[[1 0 1 1]
[2 2 1 2]
[2 1 0 1]
[2 0 1 2]]
[[ 1. 0. 2. 1. ]
[ 0.5 0. 0. 0. ]
[ 1. 1. 0. 0. ]
[ 1. inf 0. 1. ]]
As an alternative, just replace the nans after the division:
C = A / B # may print warnings; suppress them with np.errstate (or np.seterr) if you want
C[np.isnan(C)] = 0
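Both approaches from this answer can be sketched as small helpers. The function names below are hypothetical (they are not part of NumPy), and the sketch assumes A and B have the same shape and that only 0/0 should become 0, so a nonzero value divided by zero still gives inf.

```python
import numpy as np

def divide_zero_safe(a, b):
    """Hypothetical helper: elementwise a / b with 0/0 defined as 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pre-filled output: cells skipped by `where` keep this 0.
    out = np.zeros_like(a)
    # x/0 with x != 0 is still computed, so silence that warning only.
    with np.errstate(divide="ignore"):
        np.true_divide(a, b, out=out, where=(a != 0) | (b != 0))
    return out

def divide_then_replace_nan(a, b):
    """Hypothetical helper: divide everything, then overwrite the nans."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        c = a / b
    c[np.isnan(c)] = 0
    return c

A = np.array([[1, 0], [2, 0]])
B = np.array([[2, 0], [0, 1]])
print(divide_zero_safe(A, B))         # 0/0 becomes 0, while 2/0 is still inf
print(divide_then_replace_nan(A, B))
```

np.errstate is a context manager, so the warning settings are restored as soon as the block ends.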

You could build a mask of the positions where A and B are both zero, then use np.where to output 0 there and the elementwise division everywhere else -
from __future__ import division # For Python 2.x
mask = (A == B) & (A==0)
C = np.where(mask, 0, A/B)
About the mask creation: (A == B) is the mask of all elements that are equal between A and B, and (A == 0) is the mask of all elements that are zero in A. Thus the combined mask (A == B) & (A == 0) marks the places where both A and B are zero. A simpler version that does the same thing, and may be easier to understand, checks for zeros in both A and B directly:
mask = (A==0) & (B==0)
About the use of np.where, its syntax is :
C = np.where(mask, array1, array2)
i.e. we select the elements to assign into C based on the mask. Where the mask element is True, we pick the corresponding element from array1, otherwise from array2. This is done elementwise, which gives the output C.
Sample run -
In [48]: A
Out[48]:
array([[4, 1, 4, 0, 3],
[0, 4, 1, 4, 3],
[1, 0, 0, 4, 0]])
In [49]: B
Out[49]:
array([[4, 2, 2, 1, 4],
[2, 1, 2, 4, 2],
[4, 0, 2, 0, 3]])
In [50]: mask = (A == B) & (A==0)
In [51]: np.where(mask, 0, A/B)
Out[51]:
array([[ 1. , 0.5 , 2. , 0. , 0.75],
[ 0. , 4. , 0.5 , 1. , 1.5 ],
[ 0.25, 0. , 0. , inf, 0. ]])
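One detail worth knowing about the np.where approach: A / B is evaluated in full before np.where sees it, so the 0/0 cells are still computed (and trigger runtime warnings) even though their results are discarded. A minimal sketch, using made-up sample arrays, that silences those warnings locally:

```python
import numpy as np

# Made-up sample data with one 0/0 cell per row and one 1/0 cell.
A = np.array([[4, 0, 3], [0, 1, 0]])
B = np.array([[2, 0, 4], [0, 0, 5]])

mask = (A == 0) & (B == 0)

# A / B is computed for every cell first; np.where only selects afterwards.
with np.errstate(divide="ignore", invalid="ignore"):
    C = np.where(mask, 0, A / B)

print(C)  # 0/0 cells are 0; the 1/0 cell is still inf
```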

Related

Python - Divide each row by a vector

I have a 10x10 matrix and I want to divide each row of the matrix with the elements of a vector.
For example:
Suppose I have a 3x3 matrix
1 1 1
2 2 2
3 3 3
and a vector [1, 2, 3]
Then this is the operation I wish to do:
1/1 1/2 1/3
2/1 2/2 2/3
3/1 3/2 3/3
i.e., divide the elements of each row by the elements of a vector (a Python list).
I can do this using for loops. But, is there a better way to do this operation in python?
You should look into broadcasting in numpy. For your example this is the solution:
import numpy as np
a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
b = np.array([1, 2, 3]).reshape(1, 3)
c = a / b
print(c)
[[1. 0.5 0.33333333]
[2. 1. 0.66666667]
[3. 1.5 1. ]]
The first source array should be created as a Numpy array:
a = np.array([
[ 1, 1, 1 ],
[ 2, 2, 2 ],
[ 3, 3, 3 ]])
You don't need to reshape the divisor array (it can be a 1-D array, as in your source data sample):
v = np.array([1, 2, 3])
Just divide them:
result = a / v
and the result is:
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
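To make the broadcasting rule concrete, here is a short sketch contrasting the two directions. A 1-D vector of length 3 lines up with the last axis of the matrix, so each row is divided element by element. Adding a trailing axis with v[:, None] gives shape (3, 1), which instead divides every element of row i by v[i]. The variable names by_column and by_row are only illustrative.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
v = np.array([1, 2, 3])

# Shape (3,) aligns with the last axis: result[i, j] = a[i, j] / v[j]
by_column = a / v

# Shape (3, 1) aligns with the first axis: result[i, j] = a[i, j] / v[i]
by_row = a / v[:, None]

print(by_column)
print(by_row)  # every entry is 1.0 for this particular matrix
```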

Numpy - Declare a specific nx1 array

I'm using numpy in Python to create an nx1 matrix. I want the 1st element of the matrix to be 3, the 2nd to be -1, the (n-1)th to be -1 again, and the nth to be 3. All the elements in between, i.e. from element 3 to element n-2, should be 0. I've made a drawing of the matrix: [image: drawing of the matrix]
I'm fairly new to python and using numpy but seems like a great tool for managing matrices. What I've tried so far is creating the nx1 array (giving n some value) and initializing it to 0 .
import numpy as np
n = 100
I = np.arange(n)
matrix = np.row_stack(0*I)
print("Matrix is \n", matrix)
Any clues as to how I should proceed, or what routine to use?
Probably the simplest way is to just do the following:
import numpy as np
n = 10
a = np.zeros(n)
a[0] = 3
a[1] = -1
a[-1] = 3
a[-2] = -1
print(a)
Output: [ 3. -1. 0. 0. 0. 0. 0. 0. -1. 3.]
Hope this helps ;)
In [97]: n=10
In [98]: arr = np.zeros(n,int)
In [99]: arr[[0,-1]]=3; arr[[1,-2]]=-1
In [100]: arr
Out[100]: array([ 3, -1, 0, 0, 0, 0, 0, 0, -1, 3])
Easily changed to (n,1):
In [101]: arr[:,None]
Out[101]:
array([[ 3],
[-1],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[ 0],
[-1],
[ 3]])
Something that works is:
import numpy as np
n = 100
I = np.arange(n)
matrix = np.row_stack(0*I)
matrix[0]=3
matrix[1]=-1
matrix[n-2]=-1
matrix[n-1]=3
print("Matrix is \n", matrix)
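The answers above can be combined into one small function that returns the (n, 1) column directly. This is only a sketch: end_pattern is a hypothetical name, and the check for n >= 4 is an added assumption so that the four end entries never overlap.

```python
import numpy as np

def end_pattern(n):
    """Hypothetical helper: (n, 1) column [3, -1, 0, ..., 0, -1, 3]."""
    if n < 4:
        raise ValueError("n must be at least 4 so the end entries do not overlap")
    col = np.zeros((n, 1), dtype=int)
    col[[0, -1]] = 3    # first and last rows
    col[[1, -2]] = -1   # second and second-to-last rows
    return col

matrix = end_pattern(6)
print(matrix.shape)
print(matrix.ravel())
```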

Keep the n highest values of each row of an numpy array and zero everything else [duplicate]

This question already has answers here:
numpy matrix, setting 0 to values by sorting each row
(2 answers)
Closed 5 years ago.
I have a numpy array of data where I need to keep only n highest values, and zero everything else.
My current solution:
import numpy as np
np.random.seed(30)
# keep only the n highest values
n = 3
# Simple 2x5 data field for this example, real life application will be extremely large
data = np.random.random((2,5))
#[[ 0.64414354 0.38074849 0.66304791 0.16365073 0.96260781]
# [ 0.34666184 0.99175099 0.2350579 0.58569427 0.4066901 ]]
# find indices of the n highest values per row
idx = np.argsort(data)[:,-n:]
#[[0 2 4]
# [4 3 1]]
# put those values back in a blank array
data_ = np.zeros(data.shape) # blank slate
for i in range(data.shape[0]):  # use xrange on Python 2
    data_[i,idx[i]] = data[i,idx[i]]
# Each row contains only the 3 highest values per row of the original data
#[[ 0.64414354 0. 0.66304791 0. 0.96260781]
# [ 0. 0.99175099 0. 0.58569427 0.4066901 ]]
In the code above, data_ has the n highest values and everything else is zeroed out. This works out nicely even if data.shape[1] is smaller than n. But the only issue is the for loop, which is slow because my actual use case is on very very large arrays.
Is it possible to get rid of the for loop?
You could apply np.argsort twice in a vectorized fashion (the first call gives the sort order, the second gives the rank of each element within its row), and then use either np.where or simple multiplication to zero everything else:
In [116]: np.argsort(data)
Out[116]:
array([[3, 1, 0, 2, 4],
[2, 0, 4, 3, 1]])
In [117]: np.argsort(np.argsort(data)) # these are the ranks
Out[117]:
array([[2, 1, 3, 0, 4],
[1, 4, 0, 3, 2]])
In [118]: np.argsort(np.argsort(data)) >= data.shape[1] - 3
Out[118]:
array([[ True, False, True, False, True],
[False, True, False, True, True]], dtype=bool)
In [119]: data * (np.argsort(np.argsort(data)) >= data.shape[1] - 3)
Out[119]:
array([[ 0.64414354, 0. , 0.66304791, 0. , 0.96260781],
[ 0. , 0.99175099, 0. , 0.58569427, 0.4066901 ]])
In [120]: np.where(np.argsort(np.argsort(data)) >= data.shape[1]-3, data, 0)
Out[120]:
array([[ 0.64414354, 0. , 0.66304791, 0. , 0.96260781],
[ 0. , 0.99175099, 0. , 0.58569427, 0.4066901 ]])
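For very large arrays, a full sort is more work than strictly needed. The sketch below swaps in a different technique from the double argsort above: np.argpartition finds the n largest entries per row without fully sorting them, and np.put_along_axis writes them into a zeroed array. keep_top_n is a hypothetical name, and the sketch assumes 1 <= n <= data.shape[1]. With ties, which of the tied entries is kept is not specified.

```python
import numpy as np

def keep_top_n(data, n):
    """Hypothetical helper: keep the n largest values in each row, zero the rest."""
    # Column indices of the n largest entries per row (in no particular order).
    idx = np.argpartition(data, -n, axis=1)[:, -n:]
    out = np.zeros_like(data)
    # Copy only the selected entries into the blank array.
    np.put_along_axis(out, idx, np.take_along_axis(data, idx, axis=1), axis=1)
    return out

data = np.array([[0.64, 0.38, 0.66, 0.16, 0.96],
                 [0.35, 0.99, 0.24, 0.59, 0.41]])
print(keep_top_n(data, 3))
```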

Weird behavior when squaring elements in numpy array

I have two numpy arrays of shape (1, 250000):
a = [[ 0 254 1 ..., 255 0 1]]
b = [[ 1 0 252 ..., 0 255 255]]
I want to create a new numpy array whose elements are the square root of the sum of squares of elements in the arrays a and b, but I am not getting the correct result:
>>> c = np.sqrt(np.square(a)+np.square(b))
>>> print(c)
[[ 1. 2. 4.12310553 ..., 1. 1. 1.41421354]]
Am I missing something simple here?
Presumably your arrays a and b are arrays of unsigned 8 bit integers--you can check by inspecting the attribute a.dtype. When you square them, the data type is preserved, and the 8 bit values overflow, which means the values "wrap around" (i.e. the squared values are modulo 256):
In [7]: a = np.array([[0, 254, 1, 255, 0, 1]], dtype=np.uint8)
In [8]: np.square(a)
Out[8]: array([[0, 4, 1, 1, 0, 1]], dtype=uint8)
In [9]: b = np.array([[1, 0, 252, 0, 255, 255]], dtype=np.uint8)
In [10]: np.square(a) + np.square(b)
Out[10]: array([[ 1, 4, 17, 1, 1, 2]], dtype=uint8)
In [11]: np.sqrt(np.square(a) + np.square(b))
Out[11]:
array([[ 1. , 2. , 4.12310553, 1. , 1. ,
1.41421354]], dtype=float32)
To avoid the problem, you can tell np.square to use a floating point data type:
In [15]: np.sqrt(np.square(a, dtype=np.float64) + np.square(b, dtype=np.float64))
Out[15]:
array([[ 1. , 254. , 252.00198412, 255. ,
255. , 255.00196078]])
You could also use the function numpy.hypot, but you might still want to use the dtype argument; otherwise, for uint8 inputs, the result data type is np.float16:
In [16]: np.hypot(a, b)
Out[16]: array([[ 1., 254., 252., 255., 255., 255.]], dtype=float16)
In [17]: np.hypot(a, b, dtype=np.float64)
Out[17]:
array([[ 1. , 254. , 252.00198412, 255. ,
255. , 255.00196078]])
You might wonder why the dtype argument that I used in numpy.square and numpy.hypot is not shown in the functions' docstrings. Both of these functions are numpy "ufuncs", and the authors of numpy decided that it was better to show only the main arguments in the docstring. The optional arguments are documented in the reference manual.
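The wrap-around is easy to reproduce on a tiny array. The sketch below uses made-up three-element inputs and a slightly different fix from the dtype argument shown above: it converts the arrays to float64 with astype before squaring, which has the same effect of doing the arithmetic in a wider type.

```python
import numpy as np

a = np.array([[0, 254, 1]], dtype=np.uint8)
b = np.array([[1, 0, 252]], dtype=np.uint8)

# The result stays uint8, so 254**2 = 64516 wraps around modulo 256 to 4.
wrapped = np.square(a)
print(wrapped, wrapped.dtype)

# Converting to float64 first keeps the squares exact.
a64 = a.astype(np.float64)
b64 = b.astype(np.float64)
c = np.sqrt(a64**2 + b64**2)
print(c, c.dtype)
```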
For this simple case, it works perfectly fine:
In [1]: a = np.array([[ 0, 2, 4, 6, 8]])
In [2]: b = np.array([[ 1, 3, 5, 7, 9]])
In [3]: c = np.sqrt(np.square(a) + np.square(b))
In [4]: print(c)
[[ 1. 3.60555128 6.40312424 9.21954446 12.04159458]]
You must be doing something wrong.

Remove Decimals from Array

I have 2 arrays containing zeros & ones. I want to perform hstack() on them but not getting the desired output.
Python Code..
import numpy as np
zeros = np.zeros(8)
ones = np.ones(8)
zerosThenOnes = np.hstack((zeros, ones)) # A 1 by 16 array
Current Output..
[ 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1.]
Expected Output..
[ 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 ]
I can't understand what silly mistake I'm doing.
You must tell numpy to create the values as integers:
import numpy as np
zeros = np.zeros((8,), dtype=int)  # np.int was removed from recent NumPy versions; the builtin int works
ones = np.ones((8,), dtype=int)
zerosThenOnes = np.hstack((zeros, ones))
To print out zerosThenOnes as a list, like [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1], use:
print(zerosThenOnes.tolist())
See the NumPy documentation for numpy.zeros.
np.hstack((np.zeros(8), np.ones(8))).astype(int)
for np.array output, or
list(map(int, np.hstack((np.zeros(8), np.ones(8)))))
for list output
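If the goal is simply eight integer zeros followed by eight integer ones, the array can also be built without any float intermediate. A small sketch using np.repeat, which is a different route from the hstack-based answers above; the variable name zeros_then_ones is only illustrative:

```python
import numpy as np

# Repeat each element of [0, 1] eight times; the integer dtype comes from the input list.
zeros_then_ones = np.repeat([0, 1], 8)

print(zeros_then_ones)
print(zeros_then_ones.dtype)
```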
