Select "corner" elements of a 2D NumPy array [duplicate] - python

This question already has answers here:
Get corner values in Python numpy ndarray
(4 answers)
Closed 4 years ago.
I am trying to extract the four corner elements of a NumPy 2D array:
import numpy as np
data = np.arange(16).reshape((4, -1))
#array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11],
# [12, 13, 14, 15]])
The expected output is either [[0,3],[12,15]] or [0,3,12,15] (anything goes). True 2D fancy indexing delivers only the ends of the main diagonal:
data[[0,-1],[0,-1]]
#array([ 0, 15])
Pseudo-2D fancy indexing (first row-wise, then column-wise) delivers the right answer, but looks awkward:
data[[0,-1]][:,[0,-1]]
#array([[ 0, 3],
# [12, 15]])
Is there a way to use true fancy indexing, such as data[XXX,YYY], where XXX and YYY are lists/arrays/slices, to extract all four corners?

You can do:
data[[0, 0, -1, -1], [0, -1, 0, -1]]
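A minimal sketch of why this works, reusing the data array from the question (the names rows, cols and corners are only for illustration): the two index lists are paired position by position, so each (row, column) pair names one corner. The reshape at the end is optional and only restores the 2x2 layout.

```python
import numpy as np

data = np.arange(16).reshape((4, -1))

# Index lists are paired positionally: (0, 0), (0, -1), (-1, 0), (-1, -1)
rows = [0, 0, -1, -1]
cols = [0, -1, 0, -1]
corners = data[rows, cols]
print(corners)                 # [ 0  3 12 15]

# Optional: arrange the flat result as a 2x2 block
corners_2d = corners.reshape(2, 2)
print(corners_2d)
```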

Here are two possibilities. (Ok, first one isn't actually fancy):
>>> a = np.arange(9).reshape(3, 3)
>>>
>>> m, n = a.shape
>>> a[::m-1, ::n-1]
array([[0, 2],
[6, 8]])
>>>
>>> a[np.ix_((0,-1), (0,-1))]
array([[0, 2],
[6, 8]])
More explicitly:
>>> idx = np.ix_((0,-1), (0,-1))
>>> idx
(array([[ 0],
[-1]]), array([[ 0, -1]]))
>>> a[idx]
array([[0, 2],
[6, 8]])
The trick is to leverage broadcasting on the indices. np.ix_ knows the details of how to do it.
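To illustrate the broadcasting that np.ix_ sets up, here is a small sketch (the names rows, cols, manual and via_ix are made up for this example): a row index of shape (2, 1) and a column index of shape (1, 2) broadcast to a (2, 2) grid of index pairs.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

rows = np.array([[0], [-1]])   # shape (2, 1)
cols = np.array([[0, -1]])     # shape (1, 2)

manual = a[rows, cols]                  # index arrays broadcast to shape (2, 2)
via_ix = a[np.ix_((0, -1), (0, -1))]    # np.ix_ builds the same two index arrays

print(manual)
print(np.array_equal(manual, via_ix))   # True
```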


Is there an array method for testing multiple equality values?

I want to know where array a is equal to any of the values in array b.
For example,
a = np.random.randint(0,16, size=(3,4))
b = np.array([2,3,9])
# like this, but for any size b:
locations = np.nonzero((a==b[0]) | (a==b[1]) | (a==b[2]))
The reason is so I can change the values in a from (any of b) to another value:
a[locations] = 99
Or-ing the equality checks is not a great solution, because I would like to do this without knowing the size of b ahead of time. Is there an array solution?
[edit]
There are now 2 good answers to this question, one using broadcasting with extra dimensions, and another using np.in1d. Both work for the specific case in this question. I ended up using np.isin instead, since it seems like it is more agnostic to the shapes of both a and b.
I accepted the answer that taught me about in1d since that led me to my preferred solution.
You can use np.in1d then reshape back to a's shape so you can set the values in a to your special flag.
import numpy as np
np.random.seed(410012)
a = np.random.randint(0, 16, size=(3, 4))
#array([[ 8, 5, 5, 15],
# [ 3, 13, 8, 10],
# [ 3, 11, 0, 10]])
b = np.array([[2,3,9], [4,5,6]])
a[np.in1d(a, b).reshape(a.shape)] = 999
#array([[ 8, 999, 999, 15],
# [999, 13, 8, 10],
# [999, 11, 0, 10]])
Or-ing the equality checks is not a great solution, because I would like to do this without knowing the size of b ahead of time.
EDIT:
Vectorized equivalent to the code you have written above -
a = np.random.randint(0,16, size=(3,4))
b = np.array([2,3,9])
locations = np.nonzero((a==b[0]) | (a==b[1]) | (a==b[2]))
locations2 = np.nonzero((a[None,:,:]==b[:,None,None]).any(0))
np.allclose(locations, locations2)
True
This shows that your output is exactly the same as this output, without the need of explicitly mentioning b[0], b[1]... or using a for loop.
Explanation -
Broadcasting can help here. You want to compare each element of the (3,4) array a with each value in b, which has shape (3,). The resulting boolean array therefore consists of three (3,4) arrays stacked together, i.e. it has shape (3,3,4).
Once you have that, take an element-wise ANY (logical OR) across the three (3,4) arrays. That reduces the (3,3,4) array to (3,4).
Finally, use np.nonzero to find the locations where the result is True.
The above 3 steps can be done as follows -
Broadcasting comparison operation:
a[None,:,:]==b[:,None,None] #(1,3,4) == (3,1,1) -> (3,3,4)
Reduction using OR logic:
(a[None,:,:]==b[:,None,None]).any(0) #(3,3,4) -> (3,4)
Get non-zero locations:
np.nonzero((a[None,:,:]==b[:,None,None]).any(0))
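The same idea can be written so that it does not depend on the number of dimensions of a. This is only a sketch, assuming b can be flattened to one dimension; the helper name locate_any and the sample values are made up for this example. Note that the temporary comparison array holds a.size * b.size booleans.

```python
import numpy as np

def locate_any(a, b):
    """Return np.nonzero-style indices where a equals any value in b."""
    a = np.asarray(a)
    b = np.asarray(b).ravel()
    # a[..., None] has shape a.shape + (1,); comparing it with b
    # broadcasts to a.shape + (b.size,), then any() collapses the last axis.
    mask = (a[..., None] == b).any(axis=-1)
    return np.nonzero(mask)

a = np.array([[2, 7, 9, 1],
              [3, 3, 0, 5],
              [8, 2, 4, 9]])
b = np.array([2, 3, 9])

locations = locate_any(a, b)
a[locations] = 99
print(a)
```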
numpy.isin works on multi-dimensional a and b.
In [1]: import numpy as np
In [2]: a = np.random.randint(0, 16, size=(3, 4)); a
Out[2]:
array([[12, 2, 15, 11],
[12, 15, 5, 10],
[ 4, 2, 14, 7]])
In [3]: b = [2, 4, 5, 12]
In [4]: c = [[2, 4], [5, 12]]
In [5]: np.isin(a, b).astype(int)
Out[5]:
array([[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 0, 0]])
In [6]: np.isin(a, c).astype(int)
Out[6]:
array([[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 0, 0]])
In [7]: a[np.isin(a, b)] = 99; a
Out[7]:
array([[99, 99, 15, 11],
[99, 15, 99, 10],
[99, 99, 14, 7]])
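If the original array should stay untouched, the mask from np.isin can be combined with np.where to build a new array instead of assigning in place. This is a small sketch with the same sample values as above; the names mask and replaced are only illustrative. The NumPy documentation also recommends np.isin over np.in1d for new code.

```python
import numpy as np

a = np.array([[12, 2, 15, 11],
              [12, 15, 5, 10],
              [4, 2, 14, 7]])
b = [[2, 4], [5, 12]]   # the test values may have any shape

mask = np.isin(a, b)              # boolean array with the same shape as a
replaced = np.where(mask, 99, a)  # new array; a itself is left unchanged
print(replaced)
```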

Choose indices in numpy arrays on particular dimensions [duplicate]

This question already has answers here:
Index n dimensional array with (n-1) d array
(3 answers)
Closed 4 years ago.
It is hard to find a clear title but an example will put it clearly.
For example, my inputs are:
c = np.full((4, 3, 2), 5)
c[:,:,1] *= 2
ix = np.random.randint(0, 2, (4, 3))
If ix is:
array([[1, 0, 1],
[0, 0, 1],
[0, 0, 1],
[1, 1, 0]])
I want as a result:
array([[10, 5, 10],
[ 5, 5, 10],
[ 5, 5, 10],
[10, 10, 5]])
My c array can be of arbitrary dimensions, as well as the dimension I want to sample in.
It sounds like interpolation, but I'm reluctant to construct a big array of indices each time I want to apply this. Is there a way of doing this using some kind of indexing on numpy arrays? Or do I have to use some interpolation method?
Speed and memory are a concern here because I have to do this many times, and the arrays can be really large.
Thanks for any insight !
Create the x, y indices with numpy.ogrid, and then use advanced indexing:
idx, idy = np.ogrid[:c.shape[0], :c.shape[1]]
c[idx, idy, ix]
#array([[10, 5, 10],
# [ 5, 5, 10],
# [ 5, 5, 10],
# [10, 10, 5]])
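An alternative that avoids building the ogrid index arrays by hand is np.take_along_axis (available since NumPy 1.15). The sketch below assumes the sampled dimension is the last one and uses a fixed ix so the result is reproducible; picked is just an illustrative name.

```python
import numpy as np

c = np.full((4, 3, 2), 5)
c[:, :, 1] *= 2
ix = np.array([[1, 0, 1],
               [0, 0, 1],
               [0, 0, 1],
               [1, 1, 0]])

# ix needs the same number of dimensions as c, so add a trailing axis,
# select along the last axis, then drop the length-1 axis again.
picked = np.take_along_axis(c, ix[..., None], axis=-1)[..., 0]
print(picked)

# Same result as the ogrid approach
idx, idy = np.ogrid[:c.shape[0], :c.shape[1]]
print(np.array_equal(picked, c[idx, idy, ix]))  # True
```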

Numpy.where used with list of values

I have a 2D array and a 1D array. For each value in the 1D array, I want to find the rows of the 2D array that contain it at least once (two rows per value in this example), as follows:
import numpy as np
A = np.array([[0, 3, 1],
[9, 4, 6],
[2, 7, 3],
[1, 8, 9],
[6, 2, 7],
[4, 8, 0]])
B = np.array([0,1,2,3])
results = []
for elem in B:
    results.append(np.where(A==elem)[0])
This works and results in the following array:
[array([0, 5], dtype=int64),
array([0, 3], dtype=int64),
array([2, 4], dtype=int64),
array([0, 2], dtype=int64)]
But this is probably not the best way of proceeding. Following the answers given in this question (Search Numpy array with multiple values) I tried the following solutions:
out1 = np.where(np.in1d(A, B))
num_arr = np.sort(B)
idx = np.searchsorted(B, A)
idx[idx==len(num_arr)] = 0
out2 = A[A == num_arr[idx]]
But these give me incorrect values:
In [36]: out1
Out[36]: (array([ 0, 1, 2, 6, 8, 9, 13, 17], dtype=int64),)
In [37]: out2
Out[37]: array([0, 3, 1, 2, 3, 1, 2, 0])
Thanks for your help
If you need to know whether each row of A contains ANY element of array B without interest in which particular element of B it is, the following script can be used:
input:
np.isin(A,B).sum(axis=1)>0
output:
array([ True, False, True, True, True, True])
Since you're dealing with a 2D array*, you can use broadcasting to compare B with a raveled version of A. This gives you the matching indices in the raveled array. You can then convert them back to indices in the original array using np.unravel_index.
In [50]: d = np.where(B[:, None] == A.ravel())[1]
In [51]: np.unravel_index(d, A.shape)
Out[51]: (array([0, 5, 0, 3, 2, 4, 0, 2]), array([0, 2, 2, 0, 0, 1, 1, 2]))
# ^ the first array (row indices) is the expected result
* From documentation: For 3-dimensional arrays this is certainly efficient in terms of lines of code, and, for small data sets, it can also be computationally efficient. For large data sets, however, the creation of the large 3-d array may result in sluggish performance.
Also, Broadcasting is a powerful tool for writing short and usually intuitive code that does its computations very efficiently in C. However, there are cases when broadcasting uses unnecessarily large amounts of memory for a particular algorithm. In these cases, it is better to write the algorithm's outer loop in Python. This may also produce more readable code, as algorithms that use broadcasting tend to become more difficult to interpret as the number of dimensions in the broadcast increases.
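Following the documentation's suggestion of keeping the outer loop in Python, here is a sketch that loops over the values of B and only vectorizes the comparison against A. The name rows_per_value is made up for this example. Unlike np.where(A==elem)[0], it lists a row once even if the value occurs several times in that row.

```python
import numpy as np

A = np.array([[0, 3, 1],
              [9, 4, 6],
              [2, 7, 3],
              [1, 8, 9],
              [6, 2, 7],
              [4, 8, 0]])
B = np.array([0, 1, 2, 3])

# One comparison of shape A.shape per value of B, instead of a single
# (len(B), A.size) comparison array built by broadcasting.
rows_per_value = {int(v): np.flatnonzero((A == v).any(axis=1)) for v in B}
for value, rows in rows_per_value.items():
    print(value, rows)
```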
Is something like this what you are looking for?
import numpy as np
from itertools import combinations
A = np.array([[0, 3, 1],
[9, 4, 6],
[2, 7, 3],
[1, 8, 9],
[6, 2, 7],
[4, 8, 0]])
B = np.array([0,1,2,3])
for i in combinations(A, 2):
    if np.all(np.isin(B, np.hstack(i))):
        print(i[0], ' ', i[1])
which prints the following:
[0 3 1] [2 7 3]
[0 3 1] [6 2 7]
note: this solution does NOT require the rows be consecutive. Please let me know if that is required.

Flip or reverse columns in numpy array

I want to flip the first and second values of arrays in an array. A naive solution is to loop through the array. What is the right way of doing this?
import numpy as np
contour = np.array([[1, 4],
[3, 2]])
flipped_contour = np.empty((0,2))
for point in contour:
    x_y_flipped = np.array([point[1], point[0]])
    flipped_contour = np.vstack((flipped_contour, x_y_flipped))
print(flipped_contour)
[[4. 1.]
[2. 3.]]
Use the aptly named np.flip:
np.flip(contour, axis=1)
Or,
np.fliplr(contour)
array([[4, 1],
[2, 3]])
You can use numpy indexing:
contour[:, ::-1]
In addition to COLDSPEED's answer, if we only want to swap the first and second columns rather than flip the entire array:
contour[:, :2] = contour[:, 1::-1]
Here contour[:, 1::-1] is the array formed by the first two columns of contour, in reverse order. It is then assigned to the first two columns (contour[:, :2]), so the first two columns are swapped.
In general, to swap the ith and jth columns, do the following:
contour[:, [i, j]] = contour[:, [j, i]]
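A short sketch of the general swap, using a three-column array and the indices i = 0 and j = 2 chosen only for illustration:

```python
import numpy as np

contour = np.array([[1, 4, 7],
                    [3, 2, 8]])
i, j = 0, 2

# The right-hand side is a fancy-indexed copy, so it is fully evaluated
# before anything is written back into contour.
contour[:, [i, j]] = contour[:, [j, i]]
print(contour)
```

Because the swap happens in place, keep a copy (contour.copy()) first if the original order is still needed.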
Here are two non-inplace ways of swapping the first two columns:
>>> a = np.arange(15).reshape(3, 5)
>>> a[:, np.r_[1:-1:-1, 2:5]]
array([[ 1, 0, 2, 3, 4],
[ 6, 5, 7, 8, 9],
[11, 10, 12, 13, 14]])
or
>>> np.c_[a[:, 1::-1], a[:, 2:]]
array([[ 1, 0, 2, 3, 4],
[ 6, 5, 7, 8, 9],
[11, 10, 12, 13, 14]])
>>> your_array[indices_to_flip] = np.flip(your_array[indices_to_flip], axis=1)

Numpy array slicing using colons

I am trying to learn numpy array slicing.
But this is a syntax I cannot seem to understand.
What does
a[:,1] do?
I ran it in Python.
a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
a = a.reshape(2,2,2,2)
a[:,1]
Output:
array([[[ 5, 6],
[ 7, 8]],
[[13, 14],
[15, 16]]])
Can someone explain the slicing to me and how it works? The documentation doesn't seem to answer this question.
Another question: is there a way to generate the array a using something like
np.array(1:16), or something like plain Python, where
x = [x for x in range(16)]
The commas in slicing are to separate the various dimensions you may have. In your first example you are reshaping the data to have 4 dimensions each of length 2. This may be a little difficult to visualize so if you start with a 2D structure it might make more sense:
>>> a = np.arange(16).reshape((4, 4))
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> a[0] # access the first "row" of data
array([0, 1, 2, 3])
>>> a[0, 2] # access the 3rd column (index 2) in the first row of the data
2
If you want to access multiple values using slicing you can use the colon to express a range:
>>> a[:, 1] # get the entire 2nd (index 1) column
array([ 1, 5, 9, 13])
>>> a[1:3, -1] # get the second and third elements from the last column
array([ 7, 11])
>>> a[1:3, 1:3] # get the data in the second and third rows and columns
array([[ 5, 6],
[ 9, 10]])
You can do steps too:
>>> a[::2, ::2] # get every other element (column-wise and row-wise)
array([[ 0, 2],
[ 8, 10]])
Hope that helps. Once that makes more sense you can look in to stuff like adding dimensions by using None or np.newaxis or using the ... ellipsis:
>>> a[:, None].shape
(4, 1, 4)
You can find more here: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
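As a quick sketch of the None / np.newaxis and ellipsis forms mentioned above, using the same 4x4 array:

```python
import numpy as np

a = np.arange(16).reshape((4, 4))

shape_none = a[:, None].shape           # new axis inserted in the middle
shape_newaxis = a[:, np.newaxis].shape  # np.newaxis is an alias for None
shape_last = a[..., None].shape         # new axis appended at the end
first_col = a[..., 0]                   # "..." stands for all leading axes

print(shape_none)      # (4, 1, 4)
print(shape_newaxis)   # (4, 1, 4)
print(shape_last)      # (4, 4, 1)
print(first_col)       # [ 0  4  8 12]
```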
It might pay to explore the shape and individual entries as we go along.
Let's start with
>>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
>>> a.shape
(16,)
This is a one-dimensional array of length 16.
Now let's try
>>> a = a.reshape(2,2,2,2)
>>> a.shape
(2, 2, 2, 2)
It's a multi-dimensional array with 4 dimensions.
Let's see the 0, 1 element:
>>> a[0, 1]
array([[5, 6],
[7, 8]])
Since two dimensions are left, the result is a two-dimensional array.
Now a[:, 1] says: take a[i, 1] for all possible values of i:
>>> a[:, 1]
array([[[ 5, 6],
[ 7, 8]],
[[13, 14],
[15, 16]]])
It gives you an array where the first item is a[0, 1], and the second item is a[1, 1].
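This can be checked directly. The sketch below rebuilds the 4-dimensional array and compares the pieces; sub is just an illustrative name. It also shows that a[:1], without the comma, is a different operation that slices only the first axis.

```python
import numpy as np

a = np.arange(1, 17).reshape(2, 2, 2, 2)

sub = a[:, 1]
print(sub.shape)                          # (2, 2, 2)
print(np.array_equal(sub[0], a[0, 1]))    # True
print(np.array_equal(sub[1], a[1, 1]))    # True

# Without the comma, the slice applies to the first axis only
print(a[:1].shape)                        # (1, 2, 2, 2)
```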
To answer the second part of your question (generating arrays of sequential values) you can use np.arange(start, stop, step) or np.linspace(start, stop, num_elements). Both of these return a numpy array with the corresponding range of values.
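A brief sketch of both functions (the variable names are only for illustration). Note that np.arange excludes the stop value, while np.linspace includes it by default.

```python
import numpy as np

a = np.arange(1, 17)           # integers 1..16; the stop value is excluded
print(a.reshape(2, 2, 2, 2).shape)   # (2, 2, 2, 2)

b = np.arange(0, 16, 4)        # start, stop, step
print(b)                       # [ 0  4  8 12]

c = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values, endpoint included
print(c)
```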
