I have a numpy array with dim (157,1944).
I want to get indices of columns that have a Nonzero element in any row.
example: [[0,0,3,4], [0,0,1,1]] ----> [2,3]
If you look at each row, there is a nonzero element in columns [2, 3].
So if I have
[[0,1,3,4], [0,0,1,1]]
I should get [1,2,3] because column index 0 has no Nonzero elements in any row.
Not sure if your question is completely defined. However, say we start with
import numpy as np
a = np.array([[0,0,3,4], [0,0,1,1]])
then
>>> np.nonzero(np.all(a != 0, axis=0))[0]
array([2, 3])
are the indices of the columns in which every row is nonzero (no row holds a zero), and
>>> np.nonzero(np.any(a != 0, axis=0))[0]
array([2, 3])
are the indices of the columns for which not all of the rows are zero (it happens to be the same for the example you gave).
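To see where the two expressions differ, here is a small sketch run on the second array from the question. The names `b`, `cols_any`, and `cols_all` are only illustrative and do not appear in the original post.

```python
import numpy as np

b = np.array([[0, 1, 3, 4], [0, 0, 1, 1]])

# Columns with at least one nonzero entry (what the question asks for).
cols_any = np.nonzero(np.any(b != 0, axis=0))[0]

# Columns in which every entry is nonzero.
cols_all = np.nonzero(np.all(b != 0, axis=0))[0]

print(cols_any)  # [1 2 3]
print(cols_all)  # [2 3]
```

Column 1 holds one zero and one nonzero value, so it passes the `any` test but not the `all` test.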
I have this 2D numpy array here:
arr = np.array([[1,2],
[2,2],
[3,2],
[4,2],
[5,3]])
I would like to delete every row whose value at index 1 duplicates that of the previous row, and get an output like so:
np.array([[1,2],
[5,3]])
However, when I try my code it errors.
Here is my code:
for x in range(0, len(arr)):
    if arr[x][1] == arr[x-1][1]:
        arr = np.delete(arr, x, 0)
>>> IndexError: index 3 is out of bounds for axis 0 with size 2
Rather than trying to delete from the array, you can use np.unique with return_index=True to find the index of the first occurrence of each unique value in the second column, and use those indices to pull the rows out:
import numpy as np
arr = np.array([[1,2],
[2,2],
[3,2],
[4,2],
[5,3]])
u, i = np.unique(arr[:,1], return_index=True)
arr[i]
# array([[1, 2],
# [5, 3]])
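Note that np.unique keeps the first occurrence of each distinct value across the whole column, whether or not the repeats are adjacent. If only consecutive repeats should be dropped, a boolean mask is one possible alternative. This is a sketch under that assumption, not part of the original answer, and the names `changed`, `keep`, and `result` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2],
                [2, 2],
                [3, 2],
                [4, 2],
                [5, 3]])

# True where a row's second value differs from the previous row's.
changed = arr[1:, 1] != arr[:-1, 1]

# The first row has no predecessor, so it is always kept.
keep = np.concatenate(([True], changed))

result = arr[keep]
print(result)
```

For this input both approaches select rows 0 and 4; they would differ on a column such as [2, 2, 3, 2], where the mask keeps the final 2 and np.unique does not.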
I want to check how many columns of a numpy array/matrix have only positive values.
I took my matrix and printed A>0 and got True and False and then I tried any and all functions but didn't succeed.
In [55]: a = np.array([[13, 21, 12],
[21, -1, 6],
[ 1, 10, 2],
[41, 1, 4]])
The output should be 2.
I saved the matrix A in B and tried writing:
B.all(axis=1).any()>0
This function counts the number of columns whose elements are all greater than 0:
def count(mat):
    counter = 0
    tmp = mat > 0
    for col in tmp.T:
        if all(col):
            counter += 1
    return counter
How does this function work?
First it assigns to tmp a matrix of boolean values indicating if the corresponding value of the original matrix was greater than 0, then it iterates through the transpose of such matrix and checks if all the values are True, meaning they are all greater than 0.
The transpose contains the columns of the original matrix. When you create a numpy array you pass the rows to the function, so iterating over the array yields its rows; iterating over the transpose yields its columns instead.
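The same count can also be written without an explicit loop. The following is a minimal vectorized sketch using the array from the question; the name `n_positive_cols` is illustrative.

```python
import numpy as np

a = np.array([[13, 21, 12],
              [21, -1,  6],
              [ 1, 10,  2],
              [41,  1,  4]])

# (a > 0) is a boolean array; all(axis=0) reduces down each column,
# and sum() counts the columns that came out True.
n_positive_cols = int((a > 0).all(axis=0).sum())
print(n_positive_cols)  # 2
```

Here axis=0 collapses the rows, giving one boolean per column, which is why no transpose is needed.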
Evaluate the results of the following statements for the following NumPy array A:
A = numpy.array([[8,3,1,0] , [2,2,4,-1] , [3,-2,1,6]])
>> z = A[[0,2],[3,0]]
"Question: What is the output?"
array([0, 3]) "Answer"
>> t = numpy.where(A[1:3,1:]>2)
"Question: What is the output?"
(array([0, 1], dtype=int64), array([1, 2], dtype=int64)) "Answer"
I didn't understand the answer. How was the array processed?
You get elements from the first ([8,3,1,0]) and third ([3,-2,1,6]) rows of A (due to the zero-based [0,2] specification).
Now, from the first row you get element 3, i.e. the fourth number, which is 0.
From the third row you get element 0, i.e. the first number, which is 3.
For your second question, A[1:3,1:] takes rows 1 and 2 (the second and third rows) and, within them, the columns from index 1 onward, i.e. [2,4,-1] and [-2,1,6].
From those rows np.where picks the elements greater than 2. There are only two such elements, 4 and 6: one in row 0 and one in row 1 of the slice, at column positions 1 and 2 respectively (zero-based). The answer is therefore the row indices [0, 1] followed by the column indices [1, 2].
This is integer array indexing (often called fancy indexing), not slicing.
First get the value of A[[0,2]], where 0 selects the first row and 2 the third:
array([[ 8, 3, 1, 0],
[ 3, -2, 1, 6]])
Then A[[0,2],[3,0]] pairs the row and column indices: it takes the fourth item (index 3) of the first selected row and the first item (index 0) of the second. Thus,
array([0, 3])
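Both statements from the question can be checked step by step. The sketch below spells out the index pairing and the slice; the names `sub`, `rows`, and `cols` are illustrative.

```python
import numpy as np

A = np.array([[8, 3, 1, 0],
              [2, 2, 4, -1],
              [3, -2, 1, 6]])

# Integer array indexing pairs the two lists element by element:
# (row 0, col 3) and (row 2, col 0).
z = A[[0, 2], [3, 0]]
print(z)  # [0 3]

# Slicing: rows 1..2 and columns 1..3 of A.
sub = A[1:3, 1:]

# np.where returns one index array per axis of the sliced array.
rows, cols = np.where(sub > 2)
print(rows, cols)       # [0 1] [1 2]
print(sub[rows, cols])  # [4 6]
```

The indices returned by np.where refer to positions inside the slice, not inside the original A.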
I want to find the indices of rows based on criteria over certain columns.
So, something like:
import numpy as np
x = np.random.rand(4, 5)
x[2, 2] = 0
x[2, 3] = 0
x[3, 1] = 0
x[1, 3] = 0
Now, I want to get the indices of the rows where either column 3 or column 4 is zero. How can one do that with numpy? Do I need to make multiple calls to nonzero for each column and combine these indices using a set or something like that?
Use np.where; the first array in the returned tuple holds the row indices:
np.where(x[:,[3,4]]==0)
Out[79]: (array([1, 2], dtype=int64), array([0, 0], dtype=int64))
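One thing to watch with np.where here: a row in which both columns are zero appears twice in the result. A sketch of an alternative that returns each row once is shown below. It assumes a fixed array in place of the random one so the result is reproducible, and it adds one extra zero at x[2, 4] (not in the original question) to show the duplication.

```python
import numpy as np

# Deterministic nonzero stand-in for np.random.rand(4, 5) (assumption).
x = np.arange(1, 21, dtype=float).reshape(4, 5)
x[2, 2] = 0
x[2, 3] = 0
x[3, 1] = 0
x[1, 3] = 0
x[2, 4] = 0  # extra zero, added only for this illustration

# np.where reports one entry per matching element, so row 2 shows up twice.
rows_with_repeats = np.where(x[:, [3, 4]] == 0)[0]
print(rows_with_repeats)  # [1 2 2]

# Reduce across the two columns first, then take the indices of True rows.
row_idx = np.flatnonzero((x[:, [3, 4]] == 0).any(axis=1))
print(row_idx)  # [1 2]
```

Applying np.unique to the np.where output would give the same row list.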
I am looking for the first column containing a nonzero element in a sparse matrix (scipy.sparse.csc_matrix). Actually, the first column starting with the i-th one to contain a nonzero element.
This is part of a certain type of linear equation solver. For dense matrices I had the following: (Relevant line is pcol = ...)
import numpy
D = numpy.matrix([[1,0,0],[2,0,0],[3,0,1]])
i = 1
pcol = i + numpy.argmax(numpy.any(D[:,i:], axis=0))
if pcol != i:
    # Pivot columns i, pcol
    D[:,[i,pcol]] = D[:,[pcol,i]]
print(D)
# Result should be numpy.matrix([[1,0,0],[2,0,0],[3,1,0]])
The above should swap columns 1 and 2. If we set i = 0 instead, D is unchanged since column 0 already contains nonzero entries.
What is an efficient way to do this for scipy.sparse matrices? Are there analogues for the numpy.any() and numpy.argmax() functions?
With a csc matrix it is easy to find the nonzero columns.
In [302]: arr=sparse.csc_matrix([[0,0,1,2],[0,0,0,2]])
In [303]: arr.A
Out[303]:
array([[0, 0, 1, 2],
[0, 0, 0, 2]])
In [304]: arr.indptr
Out[304]: array([0, 0, 0, 1, 3])
In [305]: np.diff(arr.indptr)
Out[305]: array([0, 0, 1, 2])
The last line shows how many nonzero terms there are in each column.
np.nonzero(np.diff(arr.indptr))[0][0] would be the index of the first nonzero value in that diff.
Do the same on a csr matrix to find the first nonzero row.
I can elaborate on indptr if you want.
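Putting the indptr idea together with the starting column i from the question, a minimal sketch might look like this. The helper name `first_nonzero_col` and its `start` parameter are hypothetical, not a scipy API. It also assumes the matrix has no explicitly stored zeros, since those count as stored entries in indptr; calling eliminate_zeros() on the matrix beforehand removes them.

```python
import numpy as np
from scipy import sparse


def first_nonzero_col(mat, start=0):
    """Index of the first column >= start with a stored entry, else None.

    Hypothetical helper; expects a CSC matrix without explicit zeros.
    """
    # indptr[j+1] - indptr[j] is the number of stored entries in column j.
    counts = np.diff(mat.indptr)[start:]
    hits = np.flatnonzero(counts)
    return start + int(hits[0]) if hits.size else None


D = sparse.csc_matrix([[1, 0, 0], [2, 0, 0], [3, 0, 1]])
print(first_nonzero_col(D, 1))  # 2
print(first_nonzero_col(D, 0))  # 0
```

This reads only the indptr array, so the cost depends on the number of columns rather than the number of stored values. Returning None when no column qualifies is a design choice of this sketch; the dense argmax version in the question would return i in that case.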