Numpy Array (What does that answer mean?) - python

Evaluate the results of the following statements for the following NumPy array A:
A = numpy.array([[8,3,1,0] , [2,2,4,-1] , [3,-2,1,6]])
>> z = A[[0,2],[3,0]]
"Question: What is the output?"
array([0, 3]) "Answer"
>> t = numpy.where(A[1:3,1:]>2)
"Question: What is the output?"
(array([0, 1], dtype=int64), array([1, 2], dtype=int64)) "Answer"
I didn't understand the answers. How was the array processed to get them?

A[[0,2],[3,0]] takes elements from the first row ([8,3,1,0]) and the third row ([3,-2,1,6]) of A, because the zero-based row indices are [0,2].
From the first row you get the element at index 3, i.e. the fourth number, which is 0.
From the third row you get the element at index 0, i.e. the first number, which is 3.
For your second question, A[1:3,1:] slices rows 1 and 2 and, from each of them, the columns from index 1 to the end, i.e. [2,4,-1] and [-2,1,6].
numpy.where then reports the positions of the elements greater than 2 in that slice. There are only two such elements: 4, at row 0, column 1 of the slice, and 6, at row 1, column 2. The first array in the answer, [0, 1], holds the row indices and the second, [1, 2], holds the column indices, both zero-based and relative to the slice.
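To make the second result concrete, here is a small sketch that builds the slice and the boolean mask step by step, then maps the positions reported by numpy.where back to the full array A. The names sub, mask, rows and cols are mine, not from the question.

```python
import numpy

A = numpy.array([[8, 3, 1, 0], [2, 2, 4, -1], [3, -2, 1, 6]])

# Rows 1 and 2, columns 1 to the end: [[2, 4, -1], [-2, 1, 6]]
sub = A[1:3, 1:]

# Boolean mask that is True where the sliced element is greater than 2
mask = sub > 2

# One array of row indices and one of column indices, relative to the slice
rows, cols = numpy.where(mask)

# Index the slice directly, or add the slice offsets (1 and 1)
# to find the same elements in the original array A
values_from_slice = sub[rows, cols]
values_from_A = A[rows + 1, cols + 1]

print(rows, cols)
print(values_from_slice, values_from_A)
```

Both lookups return the elements 4 and 6, which is why the answer lists rows [0, 1] and columns [1, 2].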

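This is called integer array indexing (also known as fancy indexing), not slicing.
First look at A[[0,2]], which selects the rows at index 0 (the first row) and index 2 (the third row):
array([[ 8,  3,  1,  0],
       [ 3, -2,  1,  6]])
Then A[[0,2],[3,0]] pairs the two index lists: it takes the item at index 3 (the fourth) of the first selected row and the item at index 0 (the first) of the second selected row. Thus,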
array([0, 3])
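As a rough illustration of how the two index lists are paired, the sketch below compares the indexing expression with explicit single-element lookups. The comparison with numpy.ix_ is my addition, included only to show what a full rows-by-columns block would look like instead.

```python
import numpy

A = numpy.array([[8, 3, 1, 0], [2, 2, 4, -1], [3, -2, 1, 6]])

# The two lists are paired element by element: (0, 3) and (2, 0)
z = A[[0, 2], [3, 0]]

# The same two lookups written out one at a time
z_explicit = numpy.array([A[0, 3], A[2, 0]])

# For contrast: numpy.ix_ selects the 2x2 block formed by
# rows 0 and 2 crossed with columns 3 and 0
block = A[numpy.ix_([0, 2], [3, 0])]

print(z)
print(block)
```

So z holds two elements, while block holds all four combinations of the chosen rows and columns.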

Related

Understanding the slicing of NumPy array

I haven't understood the output of the following program:
import numpy as np
myList = [[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]]
myNumpyArray = np.array(myList)
print(myNumpyArray[0:3, 1:3])
Output
[[ 2 3]
[ 6 7]
[10 11]]
I expected the result to be the intersection of all rows and the 2nd to 4th columns. By that logic, the output should be:
2 3 4
6 7 8
10 11 12
14 15 16
What am I missing here?
The ending indices (the 3's in 0:3 and 1:3) are exclusive, not inclusive, while the starting indices (0 and 1) are in fact inclusive. If the ending indices were inclusive, then the output would be as you expect. But because they're exclusive, you're actually only grabbing rows 0, 1, and 2, and columns 1 and 2. The output is the intersection of those, which is equivalent to the output you're seeing.
If you are trying to get the data you expect, you can do myNumpyArray[:, 1:]. The : simply grabs all the elements of the array (in your case, in the first dimension of the array), and the 1: grabs all the content of the array starting at index 1, ignoring the data in the 0th place.
This is a classic case of just needing to understand slice notation.
Inside the brackets, you have one slice per dimension:
arr[dim1_start:dim1_end, dim2_start:dim2_end]
For the above notation, the slice will include the elements starting at dimX_start, up to, and not including, dimX_end.
So, for what you wrote: myNumpyArray[0:3, 1:3]
you selected rows 0, 1, and 2 (not including 3) and columns 1 and 2 (not including 3)
I hope that helps explain your results.
For the result you were expecting, you would need something more like:
print(myNumpyArray[0:4, 1:4])
For more info on slicing, you might go to the numpy docs or look at a similar question posted a while back.
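A quick way to check the exclusive end index is to compare the slices side by side. This is only a sketch; I build the same 4x4 array with np.arange instead of the nested list, and the names partial, full and same are mine.

```python
import numpy as np

# Same values as myList in the question: 1..16 in a 4x4 grid
myNumpyArray = np.arange(1, 17).reshape(4, 4)

# End indices are exclusive: rows 0, 1, 2 and columns 1, 2
partial = myNumpyArray[0:3, 1:3]

# All rows, and every column from index 1 onward
full = myNumpyArray[:, 1:]

# Spelling out the bounds gives the same result as the line above
same = myNumpyArray[0:4, 1:4]

print(partial)
print(full)
```

Here partial reproduces the 3x2 output from the question, while full and same both give the 4x3 result that was expected.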

numpy get row index where elements in certain columns are zero

I want to find the indices of rows based on a criterion over certain columns.
So, something like:
import numpy as np
x = np.random.rand(4, 5)
x[2, 2] = 0
x[2, 3] = 0
x[3, 1] = 0
x[1, 3] = 0
Now, I want to get the indices of the rows where either column 3 or column 4 is zero. How can one do that with numpy? Do I need to make multiple calls to nonzero for each column and combine these indices using a set or something like that?
Use np.where; the first array in the returned tuple holds the row indices (the second holds the column positions within the selected columns [3, 4]):
np.where(x[:,[3,4]]==0)
Out[79]: (array([1, 2], dtype=int64), array([0, 0], dtype=int64))
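One caveat with the np.where result: it has one entry per matching element, so a row with zeros in both columns would be listed twice. If a single index per row is what you are after, a possible variant (my suggestion, not part of the answer above) is to reduce the comparison with any(axis=1) first. The seeded generator is only there to make the sketch reproducible.

```python
import numpy as np

# Seeded generator so the example is reproducible (assumption: any
# random data works, as in the question's np.random.rand call)
rng = np.random.default_rng(0)
x = rng.random((4, 5))
x[2, 2] = 0
x[2, 3] = 0
x[3, 1] = 0
x[1, 3] = 0

# True for each row that has a zero in column 3 or column 4
row_has_zero = (x[:, [3, 4]] == 0).any(axis=1)

# Indices of the True entries, one per matching row
row_idx = np.flatnonzero(row_has_zero)

print(row_idx)
```

Row 3 is not reported because its only zero is in column 1, which is outside the columns being tested.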

How does the function 'where' generate this array?

>>> x = np.arange(9.).reshape(3, 3)
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
What exactly does x > 5 mean? The resulting arrays seem mysterious.
It's a tuple with row and column indices. x > 5 returns a boolean array of the same shape as x with elements set to True where the condition is fulfilled and False otherwise. According to the documentation, np.where falls back on condition.nonzero() when given no other arguments. For your given example, all elements greater than 5 happen to be in row 2 and every column of that row fulfills the condition, hence the [2, 2, 2] (rows), [0, 1, 2] (columns). Note that you can use this result to index the original array:
>>> x[np.where(x > 5)]
array([6., 7., 8.])
The usual syntax is np.where(condition, res_if_true, res_if_false). With only the first argument, this is a special case described in the docs:
When only condition is provided, this function is a shorthand for
np.asarray(condition).nonzero().
So first calculate x > 5:
arr = x > 5
print(arr)
# array([[False, False, False],
# [False, False, False],
# [ True, True, True]])
Since it's already an array, calculate arr.nonzero():
print(arr.nonzero())
# (array([2, 2, 2], dtype=int64), array([0, 1, 2], dtype=int64))
This returns the indices of the elements that are non-zero (True). The first element of the tuple holds the coordinates along axis=0 and the second element the coordinates along axis=1, i.e. all values in the row at index 2 (the third and final row) are greater than 5.
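To put the two calling styles next to each other, here is a short sketch. The use of np.argwhere and the fill value -1.0 are my own choices for illustration, not something the answers above rely on.

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)

# One argument: a tuple of index arrays, one per axis
rows, cols = np.where(x > 5)

# np.argwhere gives the same positions as (row, column) pairs
pairs = np.argwhere(x > 5)

# Three arguments: pick from x where the condition holds, else -1.0
replaced = np.where(x > 5, x, -1.0)

print(rows, cols)
print(pairs)
print(replaced)
```

The one-argument form answers "where is the condition True?", while the three-argument form builds a new array of the same shape as x.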

deleting rows based on value found in specific column

I am attempting to write code that searches a numpy array for rows where the value in the fifth column is not 50. If it is not, I wish to remove that row.
This is what I have so far:
for rows in range(len(b)):
    if b[:,4].any() != 50:
        b = np.delete(b, b[rows])
However, I keep getting the following error:
too many indices for array
Let's run the calculation with some diagnostic prints. Note where the error occurs. That's important! (We shouldn't just keep trying things without isolating the problem!)
In [2]: b=np.array([[0,1,2],[1,2,3],[2,1,2]])
In [3]: for row in range(len(b)):
   ...:     print(row)
   ...:     if b[:,2].any() !=2:
   ...:         print(b[row])
   ...:         b = np.delete(b, b[row])
   ...:
0
[0 1 2]
1
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-04dc188d9a2b> in <module>()
      1 for row in range(len(b)):
      2     print(row)
----> 3     if b[:,2].any() !=2:
      4         print(b[row])
      5         b = np.delete(b, b[row])
IndexError: too many indices for array
So the error occurs on the 2nd iteration (row 1). Something is wrong with the b after the delete. What is the new value of b?
In [4]: b
Out[4]: array([1, 2, 3, 2, 1, 2])
b is a 1d array, not the 2d one we started with. That explains the error, right? Something must be wrong with the use of delete. Maybe we need to check its documentation.
Look at the axis parameter:
axis : int, optional
The axis along which to delete the subarray defined by `obj`.
If `axis` is None, `obj` is applied to the flattened array.
We didn't specify an axis, so the delete was applied to the flattened array, and the result was flattened too (1d).
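A small sketch of the difference the axis argument makes. Note that the second argument of np.delete is a set of indices, not the contents of a row; the names flat and kept are mine.

```python
import numpy as np

b = np.array([[0, 1, 2], [1, 2, 3], [2, 1, 2]])

# No axis: indices 0, 1, 2 are removed from the flattened array,
# which reproduces the 1d result seen in Out[4]
flat = np.delete(b, [0, 1, 2])

# axis=0 with an integer index: row 0 is removed and the result stays 2d
kept = np.delete(b, 0, axis=0)

print(flat)
print(kept)
```

In the loop, b[row] passes the row's values as indices, so even with an axis the call would not remove "this row" in the intended sense.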
But even if I specify an axis I get an error (I won't get into that), which prompts me to look more carefully at the if condition:
In [10]: b[:,2]
Out[10]: array([2, 3, 2])
In [11]: b[:,2].any()
Out[11]: True
In [12]: b[:,2]!=2
Out[12]: array([False, True, False])
Applying any to the column doesn't make sense: it just checks whether any value in the column is nonzero. Instead we want to test the column against the target, getting a boolean array that matches the column in size.
We can use that boolean array directly as a row selection mask (in IPython, _ holds the previous output, here the boolean array from Out[12]):
In [13]: b[_,:]
Out[13]: array([[1, 2, 3]])
No need to iterate.
There is another problem with your iteration. You iterate over range(3), i.e. [0,1,2]. But inside the loop you try to remove a row from b, changing the size of b. That's going to cause problems when you try to index b[row] by number, right? When iterating, in Python or numpy, be careful about modifying the object that you are iterating over.
Sorry to be long winded about this, but it looks like you need some basic debugging guidance.
Here's a basic list approach:
In [15]: [row for row in b if row[2]!=2]
Out[15]: [array([1, 2, 3])]
I'm iterating on the rows, not their indices, and for each row checking the column value, and keeping that row if the check is True. We could do that with np.delete, but a list comprehension is clearer (and faster).
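For comparison, here is a sketch of the boolean-mask version next to the list comprehension, using the same small b as in the session above. The variable names are mine. One practical difference is the result type: the mask gives back a 2d array, the comprehension a list of 1d arrays.

```python
import numpy as np

b = np.array([[0, 1, 2], [1, 2, 3], [2, 1, 2]])

# Boolean mask with one entry per row
mask = b[:, 2] != 2

# Mask indexing keeps the rows where the mask is True, as a 2d array
as_array = b[mask]

# The list comprehension keeps the same rows, as a list of 1d arrays
as_list = [row for row in b if row[2] != 2]

print(as_array)
print(as_list)
```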
It would be better to provide b and the desired output, but if I understand it correctly, you could use:
import numpy as np
b = np.array([[50, 2, 3, 4, 5, 6],
[4, 50, 6, 7, 8, 9],
[1, 1, 1, 1, 50, 9]])
array([[50, 2, 3, 4, 5, 6],
[ 4, 50, 6, 7, 8, 9],
[ 1, 1, 1, 1, 50, 9]])
Then you can check which rows contain 50 in the 5th column using
b[:, 4] == 50
array([False, False, True])
and feed this Boolean array back to b to select the desired rows:
b[b[:, 4] == 50]
which leaves you with one row in this case
array([[ 1, 1, 1, 1, 50, 9]])
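If you would rather keep np.delete, as in the original attempt, one possible equivalent is to convert the inverted mask to row indices and delete along axis 0. This is just a sketch with names of my choosing (keep, drop_idx); the mask indexing shown above is the shorter route.

```python
import numpy as np

b = np.array([[50, 2, 3, 4, 5, 6],
              [4, 50, 6, 7, 8, 9],
              [1, 1, 1, 1, 50, 9]])

# True for rows whose fifth column (index 4) equals 50
keep = b[:, 4] == 50

# Option 1: select the rows to keep
filtered = b[keep]

# Option 2: compute the row indices to drop and delete them along axis 0
drop_idx = np.flatnonzero(~keep)
deleted = np.delete(b, drop_idx, axis=0)

print(filtered)
print(drop_idx)
```

Both options leave the single row whose fifth column is 50, and neither modifies b while looping over it.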

Find Indices Of Columns Having Some Nonzero Element In A 2d array

I have a numpy array with dim (157,1944).
I want to get indices of columns that have a Nonzero element in any row.
example: [[0,0,3,4], [0,0,1,1]] ----> [2,3]
If you look at each row, there is a nonzero element in columns [2, 3].
So if I have
[[0,1,3,4], [0,0,1,1]]
I should get [1,2,3] because column index 0 has no Nonzero elements in any row.
Not sure if your question is completely defined. However, say we start with
import numpy as np
a = np.array([[0,0,3,4], [0,0,1,1]])
then
>>> np.nonzero(np.all(a != 0, axis=0))[0]
array([2, 3])
are the indices of the columns for which none of the rows are zero (every entry in the column is nonzero), and
>>> np.nonzero(np.any(a != 0, axis=0))[0]
array([2, 3])
are the indices of the columns for which not all of the rows are zero (it happens to be the same for the example you gave).
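The second example from the question is a better test, because there the two expressions give different results. A minimal sketch, with variable names of my choosing:

```python
import numpy as np

a = np.array([[0, 1, 3, 4], [0, 0, 1, 1]])

# Columns with at least one nonzero entry
any_cols = np.nonzero(np.any(a != 0, axis=0))[0]

# Columns in which every entry is nonzero
all_cols = np.nonzero(np.all(a != 0, axis=0))[0]

print(any_cols)
print(all_cols)
```

Column 1 has a nonzero entry in the first row only, so it shows up in any_cols but not in all_cols. The np.any version matches the [1,2,3] asked for in the question.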
