Index n dimensional array with (n-1) d array - python

What is the most elegant way to index an n-dimensional array with an (n-1)-dimensional array along a given dimension, as in the dummy example below?
a = np.random.random_sample((3,4,4))
b = np.random.random_sample((3,4,4))
idx = np.argmax(a, axis=0)
How can I now use idx to index a so that I get the maxima in a, as if I had used a.max(axis=0)? And how can I retrieve the values specified by idx from b?
I thought about using np.meshgrid, but I think it is overkill. Note that the axis can be any valid axis (0, 1, 2) and is not known in advance. Is there an elegant way to do this?

Make use of advanced-indexing -
m,n = a.shape[1:]
I,J = np.ogrid[:m,:n]
a_max_values = a[idx, I, J]
b_max_values = b[idx, I, J]
For the general case:
def argmax_to_max(arr, argmax, axis):
    """argmax_to_max(arr, arr.argmax(axis), axis) == arr.max(axis)"""
    new_shape = list(arr.shape)
    del new_shape[axis]
    # list(...) so that insert() works whether np.ogrid returns a list or a tuple
    grid = list(np.ogrid[tuple(map(slice, new_shape))])
    grid.insert(axis, argmax)
    return arr[tuple(grid)]
Quite a bit more awkward than such a natural operation should be, unfortunately.
To index an n-dim array with an (n-1)-dim array, we could simplify this a bit by building the grid of indices for all axes, like so -
def all_idx(idx, axis):
    # list(...) so that insert() works whether np.ogrid returns a list or a tuple
    grid = list(np.ogrid[tuple(map(slice, idx.shape))])
    grid.insert(axis, idx)
    return tuple(grid)
Hence, use it to index into input arrays -
axis = 0
a_max_values = a[all_idx(idx, axis=axis)]
b_max_values = b[all_idx(idx, axis=axis)]
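As a quick sanity check, here is a minimal sketch that compares the helper with arr.max(axis) for every axis. The array shapes and the seeded generator are made up for illustration, and the list() wrapper is an assumption added so that insert() works regardless of whether np.ogrid hands back a list or a tuple.

```python
import numpy as np

def all_idx(idx, axis):
    # Open (broadcastable) index grids for every axis of idx.
    grid = list(np.ogrid[tuple(map(slice, idx.shape))])
    # Put the index array itself at the position of the reduced axis.
    grid.insert(axis, idx)
    return tuple(grid)

rng = np.random.default_rng(0)   # seeded only to make the check repeatable
a = rng.random((3, 4, 5))
b = rng.random((3, 4, 5))

results = {}
for axis in range(a.ndim):
    idx = a.argmax(axis=axis)
    # Indexing with the grid should reproduce the maximum along that axis.
    results[axis] = np.array_equal(a[all_idx(idx, axis)], a.max(axis=axis))
    # The same index tuple picks the matching entries out of b.
    b_at_max = b[all_idx(idx, axis)]

print(results)
```

The same index tuple can be reused for any array that has the shape of a, which is what makes it convenient for pulling values out of b.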

Using indexing in NumPy (https://docs.scipy.org/doc/numpy-1.10.1/reference/arrays.indexing.html#advanced-indexing):
a = np.array([[1, 2], [3, 4], [5, 6]])
a
> array([[1, 2],
       [3, 4],
       [5, 6]])
idx = a.argmax(axis=1)
idx
> array([1, 1, 1], dtype=int64)
Since you want all rows, but only the column given by idx in each row, you can use [0, 1, 2] or np.arange(a.shape[0]) for the row indexes:
rows = np.arange(a.shape[0])
a[rows, idx]
> array([2, 4, 6])
which is the same as a.max(axis=1):
a.max(axis=1)
> array([2, 4, 6])
If you have 3 dimensions, add an index array for the third dimension as well, shaped so that it broadcasts against idx (here a is 3-D and idx = a.argmax(axis=1)):
rows = np.arange(a.shape[0])[:, None]
index2 = np.arange(a.shape[2])[None, :]
a[rows, idx, index2]
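For a three-dimensional array the helper index arrays have to broadcast against idx, otherwise the shapes do not line up. A minimal sketch, using a made-up array a3 (not part of the answer above) and argmax along axis 1:

```python
import numpy as np

# Hypothetical 3-D example array, shape (2, 3, 4).
a3 = np.arange(24).reshape(2, 3, 4) % 7
idx = a3.argmax(axis=1)                  # shape (2, 4)

# Index arrays for the two remaining axes, shaped to broadcast with idx.
rows = np.arange(a3.shape[0])[:, None]   # shape (2, 1)
cols = np.arange(a3.shape[2])[None, :]   # shape (1, 4)

picked = a3[rows, idx, cols]             # shape (2, 4)
print(np.array_equal(picked, a3.max(axis=1)))
```

Plain 1-D np.arange arrays only work in the 2-D case, where rows and idx already have the same shape.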

I suggest the following:
a = np.array([[1, 3], [2, -2], [1, -1]])
a
>array([[ 1,  3],
       [ 2, -2],
       [ 1, -1]])
idx = a.argmax(axis=1)
idx
> array([1, 0, 0], dtype=int64)
np.take_along_axis(a, idx[:, None], axis=1).squeeze()
>array([3, 2, 1])
a.max(axis=1)
>array([3, 2, 1])
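The same idea should extend to any number of dimensions and any axis if the reduced axis is put back with np.expand_dims first. A small sketch; take_at is a hypothetical helper name, and np.take_along_axis needs NumPy 1.15 or newer:

```python
import numpy as np

def take_at(arr, idx, axis):
    # Re-insert the reduced axis so idx has the same number of dims as arr.
    idx_expanded = np.expand_dims(idx, axis=axis)
    picked = np.take_along_axis(arr, idx_expanded, axis=axis)
    # Drop the length-1 axis again to match the shape of arr.max(axis).
    return np.squeeze(picked, axis=axis)

a = np.random.default_rng(1).random((3, 4, 4))
checks = [
    np.array_equal(take_at(a, a.argmax(axis=ax), ax), a.max(axis=ax))
    for ax in range(a.ndim)
]
print(checks)
```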

Related

Search elements of one array in another, row-wise - Python / NumPy

For example, I have a matrix of unique elements,
a=[
[1,2,3,4],
[7,5,8,6]
]
and another matrix of unique elements, all of which appear in the first matrix.
b=[
[4,1],
[5,6]
]
And I expect the result of
[
[3,0],
[1,3]
].
That is to say, for each element of b I want to find the index of the equal element in the same row of a.
How can I do that? Thanks.
Here's a vectorized approach -
# https://stackoverflow.com/a/40588862/ #Divakar
def searchsorted2d(a, b):
    m, n = a.shape
    max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
    r = max_num*np.arange(a.shape[0])[:, None]
    p = np.searchsorted((a+r).ravel(), (b+r).ravel()).reshape(m, -1)
    return p - n*(np.arange(m)[:, None])
def search_indices(a, b):
    sidx = a.argsort(1)
    a_s = np.take_along_axis(a, sidx, axis=1)
    return np.take_along_axis(sidx, searchsorted2d(a_s, b), axis=1)
Sample run -
In [54]: a
Out[54]:
array([[1, 2, 3, 4],
[7, 5, 8, 6]])
In [55]: b
Out[55]:
array([[4, 1],
[5, 6]])
In [56]: search_indices(a, b)
Out[56]:
array([[3, 0],
[1, 3]])
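To see why the row offset makes a single flat searchsorted call behave like a row-wise one, here is the same arithmetic spelled out step by step. This is only a sketch on a small, already row-sorted array a_sorted, which is made up for illustration:

```python
import numpy as np

a_sorted = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8]])   # each row already sorted
b = np.array([[4, 1],
              [5, 6]])

m, n = a_sorted.shape
# An offset larger than the value range keeps the rows apart after flattening.
max_num = max(a_sorted.max() - a_sorted.min(), b.max() - b.min()) + 1
r = max_num * np.arange(m)[:, None]

flat_sorted = (a_sorted + r).ravel()      # one long ascending sequence
flat_pos = np.searchsorted(flat_sorted, (b + r).ravel())

# Subtract the start of each row to get positions within the row.
per_row = flat_pos.reshape(m, -1) - n * np.arange(m)[:, None]
print(per_row.tolist())  # [[3, 0], [0, 1]]
```

search_indices then maps these positions in the sorted rows back to positions in the original rows through the argsort indices.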
Another vectorized one leveraging broadcasting -
In [65]: (a[:,None,:]==b[:,:,None]).argmax(2)
Out[65]:
array([[3, 0],
[1, 3]])
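The shapes involved in the broadcasting version can be made explicit as follows. This is a sketch on the question's data; the found mask is an addition for illustration, since argmax also returns 0 when nothing in the row matches:

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [7, 5, 8, 6]])
b = np.array([[4, 1],
              [5, 6]])

# a[:, None, :] has shape (m, 1, n_a); b[:, :, None] has shape (m, n_b, 1).
# The comparison broadcasts to a boolean array of shape (m, n_b, n_a).
matches = a[:, None, :] == b[:, :, None]

# Position of the first True along the last axis, per element of b.
result = matches.argmax(axis=2)
# True where the element of b really occurs in the same row of a.
found = matches.any(axis=2)

print(result.tolist())  # [[3, 0], [1, 3]]
```

Note that the intermediate boolean array grows with m * n_b * n_a, so for wide rows the searchsorted approach is likely to use less memory.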
If you don't mind using loops, here's a quick solution using np.where:
import numpy as np
a=[[1,2,3,4],
[7,5,8,6]]
b=[[4,1],
[5,6]]
a = np.array(a)
b = np.array(b)
c = np.zeros_like(b)
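for i in range(c.shape[0]):
    for j in range(c.shape[1]):
        # np.where returns (row_indices, col_indices); a has unique
        # elements, so pos holds exactly one column index
        _, pos = np.where(a == b[i, j])
        c[i, j] = pos[0]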
print(c.tolist())
You can do it this way:
np.split(pd.DataFrame(a).where(pd.DataFrame(np.isin(a,b))).T.sort_values(by=[0,1])[::-1].unstack().dropna().reset_index().iloc[:,1].to_numpy(),len(a))
# [array([3, 0]), array([1, 3])]

Vectorizing this for-loop in numpy

I was wondering how I would vectorize this for loop. Given a 2x2x2 array x and an array y in which each row holds the indices i, j and k, I want to get x[i, j, k] for every row.
Given the arrays x and y:
x = np.arange(8).reshape((2, 2, 2))
y = [[0, 1, 1], [1, 1, 0]]
I want to get:
x[0, 1, 1] = 3 and x[1, 1, 0] = 6
I tried:
print(x[y])
But it prints:
array([[2, 3],
[6, 7],
[4, 5]])
So I ended up doing:
for y_ in y:
    print(x[y_[0], y_[1], y_[2]])
Which works, but I can't help but think there is a better way.
Use the transposed y, i.e. zip(*y), as the index. Advanced indexing needs one sequence of indices per dimension:
x[tuple(zip(*y))]
# array([3, 6])
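What zip(*y) produces, and two equivalent spellings based on a NumPy array, can be sketched like this (the variable names other than x and y are made up for illustration):

```python
import numpy as np

x = np.arange(8).reshape((2, 2, 2))
y = [[0, 1, 1], [1, 1, 0]]

# Transposing turns "one row per point" into "one sequence per axis".
per_axis = tuple(zip(*y))        # ((0, 1), (1, 1), (1, 0))
via_zip = x[per_axis]

# The same with an index array: one column per axis.
y_arr = np.asarray(y)
via_columns = x[y_arr[:, 0], y_arr[:, 1], y_arr[:, 2]]
via_transpose = x[tuple(y_arr.T)]

print(via_zip.tolist())  # [3, 6]
```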

Create Numpy 2D Array with data from triplets of (x,y,value)

I have a lot of data in database in (x, y, value) triplet form.
I would like to be able to create dynamically a 2D numpy array from this data by setting value at the coords (x,y) of the array.
For instance if I have :
(0,0,8)
(0,1,5)
(0,2,3)
(1,0,4)
(1,1,0)
(1,2,0)
(2,0,1)
(2,1,2)
(2,2,5)
The resulting array should be :
Array([[8,5,3],[4,0,0],[1,2,5]])
I'm new to numpy. Is there any method in numpy to do this? If not, what approach would you advise?
Extending the answer from @MaxU: in case the coordinates are not ordered in a grid fashion (or in case some coordinates are missing), you can create your array as follows:
import numpy as np
a = np.array([(0,0,8),(0,1,5),(0,2,3),
(1,0,4),(1,1,0),(1,2,0),
(2,0,1),(2,1,2),(2,2,5)])
Here a represents your coordinates. It is an (N, 3) array, where N is the number of coordinates (it doesn't have to contain ALL the coordinates). The first column of a (a[:, 0]) contains the Y positions, while the second column (a[:, 1]) contains the X positions. Similarly, the last column (a[:, 2]) contains your values.
Then you can extract the maximum dimensions of your target array:
# Maximum Y and X coordinates
ymax = a[:, 0].max()
xmax = a[:, 1].max()
# Target array
target = np.zeros((ymax+1, xmax+1), a.dtype)
And finally, fill the array with data from your coordinates:
target[a[:, 0], a[:, 1]] = a[:, 2]
The line above sets values in target at a[:, 0] (all Y) and a[:, 1] (all X) locations to their corresponding a[:, 2] value (your value).
>>> target
array([[8, 5, 3],
[4, 0, 0],
[1, 2, 5]])
Additionally, if you have missing coordinates, and you want to replace those missing values by some number, you can initialize the array as:
default_value = -1
target = np.full((ymax+1, xmax+1), default_value, a.dtype)
This way, the coordinates not present in your list will be filled with -1 in the target array.
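Putting the steps together for a case with missing coordinates gives roughly the following. This is only a sketch; the three triplets are made up for illustration:

```python
import numpy as np

# Hypothetical, incomplete set of (y, x, value) triplets.
a = np.array([(0, 0, 8), (2, 1, 2), (1, 2, 7)])

default_value = -1
ymax = a[:, 0].max()
xmax = a[:, 1].max()

# Start from an array filled with the default, then scatter the known values.
target = np.full((ymax + 1, xmax + 1), default_value, dtype=a.dtype)
target[a[:, 0], a[:, 1]] = a[:, 2]

print(target.tolist())  # [[8, -1, -1], [-1, -1, 7], [-1, 2, -1]]
```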
Why not use sparse matrices? (That is pretty much the format of your triplets.)
First split the triplets (an (N, 3) array, called triplets here) into rows, columns, and data using numpy.hsplit(). (Use numpy.squeeze() to convert the resulting 2d arrays to 1d arrays.)
>>> row, col, data = [np.squeeze(splt) for splt
...                   in np.hsplit(triplets, triplets.shape[-1])]
Use the sparse matrix in coordinate format, and convert it to an array.
>>> from scipy.sparse import coo_matrix
>>> coo_matrix((data, (row, col))).toarray()
array([[8, 5, 3],
[4, 0, 0],
[1, 2, 5]])
Is that what you want?
In [37]: a = np.array([(0,0,8)
....: ,(0,1,5)
....: ,(0,2,3)
....: ,(1,0,4)
....: ,(1,1,0)
....: ,(1,2,0)
....: ,(2,0,1)
....: ,(2,1,2)
....: ,(2,2,5)])
In [38]: a
Out[38]:
array([[0, 0, 8],
[0, 1, 5],
[0, 2, 3],
[1, 0, 4],
[1, 1, 0],
[1, 2, 0],
[2, 0, 1],
[2, 1, 2],
[2, 2, 5]])
In [39]: a[:, 2].reshape(3,len(a)//3)
Out[39]:
array([[8, 5, 3],
[4, 0, 0],
[1, 2, 5]])
or a bit more flexible (after your comment):
In [48]: a[:, 2].reshape([int(len(a) ** .5)] * 2)
Out[48]:
array([[8, 5, 3],
[4, 0, 0],
[1, 2, 5]])
Explanation:
this gives you the 3rd column (value):
In [42]: a[:, 2]
Out[42]: array([8, 5, 3, 4, 0, 0, 1, 2, 5])
In [49]: [int(len(a) ** .5)]
Out[49]: [3]
In [50]: [int(len(a) ** .5)] * 2
Out[50]: [3, 3]
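Keep in mind that the reshape only works when the triplets form a complete square grid and are already sorted by the first and then the second coordinate. If the order is not guaranteed, one option is to sort first, for instance with np.lexsort. A sketch, where the shuffling is only there to simulate unordered input:

```python
import numpy as np

a = np.array([(0, 0, 8), (0, 1, 5), (0, 2, 3),
              (1, 0, 4), (1, 1, 0), (1, 2, 0),
              (2, 0, 1), (2, 1, 2), (2, 2, 5)])

# Simulate triplets arriving in arbitrary order.
shuffled = a[np.random.default_rng(0).permutation(len(a))]

# lexsort uses the LAST key as the primary one:
# sort by column 0 first, then by column 1 within equal column 0.
order = np.lexsort((shuffled[:, 1], shuffled[:, 0]))

side = int(round(len(a) ** 0.5))
grid = shuffled[order][:, 2].reshape(side, side)

print(grid.tolist())  # [[8, 5, 3], [4, 0, 0], [1, 2, 5]]
```

With missing coordinates the reshape fails or misplaces values, and the index-assignment approach is the safer choice.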
