How to count non-zero rows in an N-d tensor? - python

I need to find the number of non-zero rows and put the counts in a 1D tensor (a kind of vector).
For example:
tensor = [
    [
        [1, 2, 3, 4, 0, 0, 0],
        [4, 5, 6, 7, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0]
    ],
    [
        [4, 3, 2, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0]
    ],
    [
        [0, 0, 0, 0, 0, 0, 0],
        [4, 5, 6, 7, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0]
    ]
]
The tensor shape will be [None, 45, 7] in a real application, but here it is [3, 3, 7].
So I need to find the number of non-zero rows along dimension 1 and keep the counts in a 1D tensor.
non_zeros = [2,1,1] #result for the above tensor
I need to do this in TensorFlow; if it were in NumPy, I would have done it already.
Can anyone help me with this?
Thanks in advance.

You can use tf.math.count_nonzero combined with tf.reduce_sum
>>> tf.math.count_nonzero(tf.reduce_sum(tensor,axis=2),axis=1)
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([2, 1, 1])>
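One caveat worth noting: reducing with a sum can cancel to zero for a row like [1, -1, 0, ...], which would then not be counted. Counting non-zeros twice avoids that (in TensorFlow, tf.math.count_nonzero can simply be nested). A NumPy sketch of the same double-count idea, for reference, since the asker mentioned being comfortable with NumPy:

```python
import numpy as np

tensor = np.array([[[1, 2, 3, 4, 0, 0, 0],
                    [4, 5, 6, 7, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0]],
                   [[4, 3, 2, 1, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0]],
                   [[0, 0, 0, 0, 0, 0, 0],
                    [4, 5, 6, 7, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0]]])

# Count non-zero entries per row, then count rows with a non-zero count.
non_zeros = np.count_nonzero(np.count_nonzero(tensor, axis=2), axis=1)
print(non_zeros)  # [2 1 1]
```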

Try this code:
t = tf.math.not_equal(tensor, 0)  # boolean mask of non-zero entries
t = tf.reduce_any(t, -1)          # True for each row with any non-zero entry
t = tf.cast(t, tf.int32)          # booleans to 0/1
t = tf.reduce_sum(t, -1)          # count the non-zero rows per matrix

Related

How to convert diameter of matrix to zero in python?

I have a matrix and I want to convert the diameter values to zero in Python. Can you help me?
Matrix:
array([[1, 0, 0, ..., 1, 0, 0],
[0, 1, 0, ..., 0, 0, 0],
[0, 0, 1, ..., 0, 0, 0],
...,
[1, 0, 0, ..., 1, 0, 0],
[0, 0, 0, ..., 0, 1, 0],
[0, 0, 0, ..., 0, 0, 1]])
Assuming you meant diagonal: iterate over the list with enumerate, and for each row assign zero to the element whose column index equals the row index (that is, the element on the diagonal).
mydata = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
for i, l in enumerate(mydata):
    l[i] = 0
for row in mydata:
    print(row)
[0, 2, 3, 4]
[5, 0, 7, 8]
[9, 10, 0, 12]
[13, 14, 15, 0]
When possible, avoid nested loops.
As you are working with NumPy arrays, you can use
np.fill_diagonal(mydata, 0)
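For reference, a minimal runnable sketch of that call. Note that np.fill_diagonal modifies the array in place and returns None, so don't assign its result:

```python
import numpy as np

mydata = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12],
                   [13, 14, 15, 16]])
np.fill_diagonal(mydata, 0)  # in-place; returns None
print(mydata)
```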

Replacement for numpy.apply_along_axis in CuPy

I have a NumPy-based neural network that I am trying to port to CuPy. I have a function as follows:
import numpy as np
def tensor_diag(x): return np.apply_along_axis(np.diag, -1, x)
# Usage: (x is a matrix, i.e. a 2-tensor)
def sigmoid_prime(x): return tensor_diag(sigmoid(x) * (1 - sigmoid(x)))
This works in NumPy, but CuPy has no analogue for apply_along_axis (it was unsupported as of 8 May 2020). How can I emulate this behaviour in CuPy?
In [284]: arr = np.arange(24).reshape(2,3,4)
np.diag takes a 1d array, and returns a 2d with the values on the diagonal. apply_along_axis just iterates on all dimensions except the last, and passes the last, one array at a time to diag:
In [285]: np.apply_along_axis(np.diag,-1,arr)
Out[285]:
array([[[[ 0, 0, 0, 0],
[ 0, 1, 0, 0],
[ 0, 0, 2, 0],
[ 0, 0, 0, 3]],
[[ 4, 0, 0, 0],
[ 0, 5, 0, 0],
[ 0, 0, 6, 0],
[ 0, 0, 0, 7]],
[[ 8, 0, 0, 0],
[ 0, 9, 0, 0],
[ 0, 0, 10, 0],
[ 0, 0, 0, 11]]],
[[[12, 0, 0, 0],
[ 0, 13, 0, 0],
[ 0, 0, 14, 0],
[ 0, 0, 0, 15]],
[[16, 0, 0, 0],
[ 0, 17, 0, 0],
[ 0, 0, 18, 0],
[ 0, 0, 0, 19]],
[[20, 0, 0, 0],
[ 0, 21, 0, 0],
[ 0, 0, 22, 0],
[ 0, 0, 0, 23]]]])
In [286]: _.shape
Out[286]: (2, 3, 4, 4)
I could do the same mapping with:
In [287]: res = np.zeros((2,3,4,4),int)
In [288]: res[:,:,np.arange(4),np.arange(4)] = arr
check with the apply result:
In [289]: np.allclose(_285, res)
Out[289]: True
Or for a more direct copy of apply, use np.ndindex to generate all the i,j tuple pairs to iterate over the first 2 dimensions of arr:
In [298]: res = np.zeros((2,3,4,4),int)
In [299]: for ij in np.ndindex(2,3):
...: res[ij]=np.diag(arr[ij])
...:
In [300]: np.allclose(_285, res)
Out[300]: True
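Building on the fancy-indexing idea above, a drop-in replacement can be written against an array-module parameter so the same function works with NumPy or CuPy (CuPy mirrors this part of the NumPy API; that `cupy` is installed and can be passed as `xp` is an assumption here, so the sketch is verified against NumPy):

```python
import numpy as np

def tensor_diag(x, xp=np):
    # Allocate an output with one extra trailing axis and scatter x onto
    # the diagonal of the last two axes. Pass xp=cupy to run on the GPU.
    n = x.shape[-1]
    out = xp.zeros(x.shape + (n,), dtype=x.dtype)
    idx = xp.arange(n)
    out[..., idx, idx] = x
    return out

arr = np.arange(24).reshape(2, 3, 4)
res = tensor_diag(arr)
print(res.shape)  # (2, 3, 4, 4)
```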

Numpy: Diff on non-adjacent values, in 2D

I'd like to take the difference of non-adjacent values within 2D numpy array along axis=-1 (per row). An array can consist of a large number of rows.
Each row is a selection of values along a timeline from 1 to N.
For N=12, the array could look like below 3x12 shape:
timeline = np.array([[ 0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
[ 1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
[ 0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
The desired result should look like: (size of array is intact and position is important)
diff = np.array([[ 0, 0, 0, 4, 0, 2, 0, 0, 3, 0, 2, 0],
[ 1, 0, 2, 1, 0, 0, 0, 0, 5, 0, 0, 3],
[ 0, 0, 0, 4, 0, 0, 0, 0, 5, 0, 0, 0]])
I am aware of the solution in 1D, Diff on non-adjacent values
imask = np.flatnonzero(timeline)
diff = np.zeros_like(timeline)
diff[imask] = np.diff(timeline[imask], prepend=0)
within which the last line can be replaced with
diff[imask[0]] = timeline[imask[0]]
diff[imask[1:]] = timeline[imask[1:]] - timeline[imask[:-1]]
and the first line can be replaced with
imask = np.where(timeline != 0)[0]
Attempting to generalise the 1D solution, I can see that imask = np.flatnonzero(timeline) is undesirable because it flattens the array and makes the rows inter-dependent. Thus I am trying the alternative np.nonzero.
imask = np.nonzero(timeline)
diff = np.zeros_like(timeline)
diff[imask] = np.diff(timeline[imask], prepend=0)
However, this solution results in a connection between row's end values (inter-dependent).
array([[ 0, 0, 0, 4, 0, 2, 0, 0, 3, 0, 2, 0],
[-10, 0, 2, 1, 0, 0, 0, 0, 5, 0, 0, 3],
[ 0, 0, 0, -8, 0, 0, 0, 0, 5, 0, 0, 0]])
How can I make the "prepend" to start each row with a zero?
Wow, I did it... (it was an interesting problem for me too).
I made a non_adjacent_diff function to be applied to a single row, and applied it to every row using np.apply_along_axis.
Try this code.
timeline = np.array([[ 0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
[ 1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
[ 0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])
def non_adjacent_diff(row):
    not_zero_index = np.where(row != 0)
    diff = row[not_zero_index][1:] - row[not_zero_index][:-1]
    np.put(row, not_zero_index[0][1:], diff)
    return row

np.apply_along_axis(non_adjacent_diff, 1, timeline)
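If the per-row Python loop hidden inside apply_along_axis matters for performance, the same result can be had fully vectorised: diff all non-zero values at once, then restore the original value wherever the previous non-zero came from a different row (which is exactly the per-row "prepend=0" behaviour). A sketch; variable names are mine:

```python
import numpy as np

timeline = np.array([[0, 0, 0, 4, 0, 6, 0, 0, 9, 0, 11, 0],
                     [1, 0, 3, 4, 0, 0, 0, 0, 9, 0, 0, 12],
                     [0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0]])

rows, cols = np.nonzero(timeline)      # positions of non-zero values, row-major
vals = timeline[rows, cols]
d = np.diff(vals, prepend=0)
# Where the previous non-zero value belongs to a different row, keep the
# value itself instead of the cross-row difference.
same_row = np.r_[False, rows[1:] == rows[:-1]]
d = np.where(same_row, d, vals)

diff = np.zeros_like(timeline)
diff[rows, cols] = d
print(diff)
```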

Numpy remove duplicate columns with values greater than 0

I have the following array.
array([[ 0, 0, 0, 0, 0, 3],
[ 4, 4, 0, 0, 0, 0],
[ 0, 0, 0, 23, 0, 0]])
I am looking to find the unique values column-wise, such that my result is:
array([[ 0, 0, 0, 0, 3],
[ 4, 0, 0, 0, 0],
[ 0, 0, 23, 0, 0]])
Uniqueness should only be applied to columns that contain non-zero values, i.e. all-zero columns should remain as they are. I also have to make sure that the column indices are not changed; the columns stay in place.
I've already tried the following.
np.unique(a,axis=1, return_index=True)
But this gives me
(array([[ 0, 0, 0, 3],
[ 0, 0, 4, 0],
[ 0, 23, 0, 0]]), array([2, 3, 0, 5]))
There are two problems in this result. The column indices are moved and the columns with only 0 values are also merged.
This will accomplish what you want:
import numpy as np
import pandas as pd
x = np.array([[ 0, 0, 0, 0, 0, 3],
[ 4, 4, 0, 0, 0, 0],
[ 0, 0, 0, 23, 0, 0]])
df = pd.DataFrame(x.T)
row_sum = np.sum(df, axis=1)
df1 = df[row_sum != 0].drop_duplicates()
df0 = df[row_sum == 0]
y = pd.concat([df1, df0]).sort_index().values.T
y
array([[ 0, 0, 0, 0, 3],
[ 4, 0, 0, 0, 0],
[ 0, 0, 23, 0, 0]])
By summing the columns (or the rows after transposing) you can identify which ones contain all zeros, and filter them out before dropping the duplicates. Then you can re-combine them and sort by the index to get the desired output.
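The same filtering can also be done without pandas, by deduplicating only the non-zero columns with np.unique (whose return_index gives the first occurrence of each unique column) and building a keep mask. A sketch; the helper names are mine:

```python
import numpy as np

x = np.array([[0, 0, 0, 0, 0, 3],
              [4, 4, 0, 0, 0, 0],
              [0, 0, 0, 23, 0, 0]])

nz = x.any(axis=0)                       # columns containing a non-zero value
# Deduplicate only the non-zero columns; return_index marks first occurrences.
_, first = np.unique(x[:, nz], axis=1, return_index=True)
keep_nz = np.zeros(nz.sum(), dtype=bool)
keep_nz[first] = True
keep = np.ones(x.shape[1], dtype=bool)   # all-zero columns are always kept
keep[np.flatnonzero(nz)] = keep_nz
y = x[:, keep]
print(y)
```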

Finding the Max value in a two dimensional Array

I'm trying to find an elegant way to find the max value in a two-dimensional array.
for example for this array:
[0, 0, 1, 0, 0, 1]
[0, 1, 0, 2, 0, 0]
[0, 0, 2, 0, 0, 1]
[0, 1, 0, 3, 0, 0]
[0, 0, 0, 0, 4, 0]
I would like to extract the value '4'.
I thought of doing a max within max but I'm struggling in executing it.
Another way to solve this problem is with the function numpy.amax():
>>> import numpy as np
>>> arr = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
>>> np.amax(arr)
4
Max of max numbers (map(max, numbers) yields 1, 2, 2, 3, 4):
>>> numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
>>> map(max, numbers)
<map object at 0x0000018E8FA237F0>
>>> list(map(max, numbers)) # max numbers from each sublist
[1, 2, 2, 3, 4]
>>> max(map(max, numbers)) # max of those max-numbers
4
Not quite as short as falsetru's answer but this is probably what you had in mind:
>>> numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
>>> max(max(x) for x in numbers)
4
How about this?
import numpy as np
numbers = np.array([[0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]])
print(numbers.max())
4
>>> numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
You may add a key parameter to max, as below, to find the max value in a 2-D array/list:
>>> max(max(numbers, key=max))
4
One very easy solution to get both the index of your maximum and your maximum could be :
numbers = np.array([[0,0,1,0,0,1],[0,1,0,2,0,0],[0,0,2,0,0,1],[0,1,0,3,0,0],[0,0,0,0,4,0]])
ind = np.argwhere(numbers == numbers.max()) # In this case you can also get the index of your max
numbers[ind[0,0],ind[0,1]]
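A related idiom (not in the original answers, added for reference) that returns the position of the first maximum directly, without scanning for all occurrences the way argwhere does:

```python
import numpy as np

numbers = np.array([[0, 0, 1, 0, 0, 1],
                    [0, 1, 0, 2, 0, 0],
                    [0, 0, 2, 0, 0, 1],
                    [0, 1, 0, 3, 0, 0],
                    [0, 0, 0, 0, 4, 0]])
# argmax works on the flattened array; unravel_index maps the flat
# position back to 2-D coordinates.
ind = np.unravel_index(np.argmax(numbers), numbers.shape)
print(ind, numbers[ind])  # (4, 4) 4
```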
This approach is not as intuitive as the others, but here goes:
numbers = [0, 0, 1, 0, 0, 1], [0, 1, 0, 2, 0, 0], [0, 0, 2, 0, 0, 1], [0, 1, 0, 3, 0, 0], [0, 0, 0, 0, 4, 0]
maximum = float('-inf')
for i in numbers:
    maximum = max(maximum, max(i))
print(maximum)
