Delete one element from each row of a NumPy array - python

import numpy as np
a=np.array([[1,2,3], [4,5,6], [7,8,9]])
k = [0, 1, 2]
print(np.delete(a, k, 1))
This returns
[]
But, the result I really want is
[[2,3],
[4,6],
[7,8]]
I want to delete the first element (indexed as 0) from a[0], the second (indexed as 1) from a[1], and the third (indexed as 2) from a[2].
Any thoughts?

Here's an approach using boolean indexing -
m,n = a.shape
out = a[np.arange(n) != np.array(k)[:,None]].reshape(m,-1)
If you would like to persist with np.delete, you could calculate the linear indices and then delete those after flattening the input array, like so -
m,n = a.shape
del_idx = np.arange(m)*n + k   # flat (row-major) index of element (i, k[i]) is i*n + k[i]
out = np.delete(a.ravel(),del_idx,axis=0).reshape(m,-1)
Sample run -
In [94]: a
Out[94]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [95]: k = [0, 2, 1]
In [96]: m,n = a.shape
In [97]: a[np.arange(n) != np.array(k)[:,None]].reshape(m,-1)
Out[97]:
array([[2, 3],
[4, 5],
[7, 9]])
In [98]: del_idx = np.arange(m)*n + k
In [99]: np.delete(a.ravel(),del_idx,axis=0).reshape(m,-1)
Out[99]:
array([[2, 3],
[4, 5],
[7, 9]])
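The sample above uses a square array, where mixing up m and n goes unnoticed. Here is a minimal sketch on a 3x4 array that checks the two approaches against each other. The helper names delete_per_row_mask and delete_per_row_flat are made up for illustration, and the code assumes every k[i] is a valid column index -

```python
import numpy as np

def delete_per_row_mask(a, k):
    """Drop column k[i] from row i using a boolean mask."""
    a = np.asarray(a)
    m, n = a.shape
    keep = np.arange(n) != np.asarray(k)[:, None]   # shape (m, n), one False per row
    return a[keep].reshape(m, n - 1)

def delete_per_row_flat(a, k):
    """Same result via np.delete on the flattened array."""
    a = np.asarray(a)
    m, n = a.shape
    flat_idx = np.arange(m) * n + np.asarray(k)     # row-major position of (i, k[i])
    return np.delete(a.ravel(), flat_idx).reshape(m, n - 1)

a = np.arange(1, 13).reshape(3, 4)   # 3 rows, 4 columns
k = [0, 1, 3]
out_mask = delete_per_row_mask(a, k)
out_flat = delete_per_row_flat(a, k)
print(out_mask)
# [[ 2  3  4]
#  [ 5  7  8]
#  [ 9 10 11]]
```

Both helpers remove exactly one element per row, so the result always has shape (m, n - 1).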

Related

Search elements of one array in another, row-wise - Python / NumPy

For example, I have a matrix of unique elements,
a=[
[1,2,3,4],
[7,5,8,6]
]
and another matrix of unique elements, all of which appear in the first matrix.
b=[
[4,1],
[5,6]
]
And I expect the result of
[
[3,0],
[1,3]
].
That is, for each element of b, I want the index of the equal element in the same row of a.
How can I do that? Thanks.
Here's a vectorized approach -
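# https://stackoverflow.com/a/40588862/ #Divakar
def searchsorted2d(a,b):
    # Row-wise searchsorted; every row of a must already be sorted
    m,n = a.shape
    # Shift row i by i*max_num so that a single flat searchsorted can serve all rows
    max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
    r = max_num*np.arange(a.shape[0])[:,None]
    p = np.searchsorted( (a+r).ravel(), (b+r).ravel() ).reshape(m,-1)
    # Convert flat positions back to positions within each row
    return p - n*(np.arange(m)[:,None])

def search_indices(a, b):
    # Sort each row of a, locate b in the sorted rows, then map back to the original columns
    sidx = a.argsort(1)
    a_s = np.take_along_axis(a,sidx,axis=1)
    return np.take_along_axis(sidx,searchsorted2d(a_s,b),axis=1)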
Sample run -
In [54]: a
Out[54]:
array([[1, 2, 3, 4],
[7, 5, 8, 6]])
In [55]: b
Out[55]:
array([[4, 1],
[5, 6]])
In [56]: search_indices(a, b)
Out[56]:
array([[3, 0],
[1, 3]])
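To see why adding a per-row offset lets a single np.searchsorted call handle all rows, here is a small worked sketch on rows that are already sorted. The offset formula used here (overall max minus overall min, plus one) is my own conservative choice for illustration, not the exact expression from the answer above -

```python
import numpy as np

a_s = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])      # each row already sorted
b = np.array([[4, 1],
              [5, 6]])

m, n = a_s.shape
# Offset large enough that shifted row i+1 lies entirely above shifted row i.
offset = max(a_s.max(), b.max()) - min(a_s.min(), b.min()) + 1
r = offset * np.arange(m)[:, None]  # column of per-row shifts: [[0], [8]]

# After shifting, the flattened array is globally sorted, so one call suffices.
flat_pos = np.searchsorted((a_s + r).ravel(), (b + r).ravel()).reshape(m, -1)
pos = flat_pos - n * np.arange(m)[:, None]   # back to positions within each row
print(pos)
# [[3 0]
#  [0 1]]
```

These are positions in the sorted rows; search_indices then maps them through the argsort indices to get positions in the original a.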
Another vectorized one leveraging broadcasting -
In [65]: (a[:,None,:]==b[:,:,None]).argmax(2)
Out[65]:
array([[3, 0],
[1, 3]])
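One caveat with the broadcasting version: argmax returns 0 when nothing matches, which looks the same as a match at index 0. A hedged sketch that marks missing values explicitly (the function name search_rows_broadcast and the missing=-1 convention are assumptions for illustration) -

```python
import numpy as np

def search_rows_broadcast(a, b, missing=-1):
    """For each b[i, j], index of the first equal element in a[i], or `missing`."""
    a = np.asarray(a)
    b = np.asarray(b)
    # eq[i, j, c] is True where b[i, j] == a[i, c]; shape (m, b_cols, a_cols)
    eq = b[:, :, None] == a[:, None, :]
    idx = eq.argmax(axis=2)          # first True along a's columns (0 if none)
    found = eq.any(axis=2)           # distinguishes "no match" from "match at 0"
    return np.where(found, idx, missing)

a = np.array([[1, 2, 3, 4],
              [7, 5, 8, 6]])
b = np.array([[4, 1],
              [5, 9]])               # 9 does not occur in row 1 of a
result = search_rows_broadcast(a, b)
print(result)
# [[ 3  0]
#  [ 1 -1]]
```

The intermediate boolean array has m * b_cols * a_cols elements, so this trades memory for simplicity on large inputs.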
If you don't mind using loops, here's a quick solution using np.where:
import numpy as np
a=[[1,2,3,4],
[7,5,8,6]]
b=[[4,1],
[5,6]]
a = np.array(a)
b = np.array(b)
c = np.zeros_like(b)
for i in range(c.shape[0]):
    for j in range(c.shape[1]):
        # search only row i of a and take the first matching column
        pos = np.where(a[i] == b[i,j])[0]
        c[i,j] = pos[0]
print(c.tolist())
You can do it this way:
np.split(pd.DataFrame(a).where(pd.DataFrame(np.isin(a,b))).T.sort_values(by=[0,1])[::-1].unstack().dropna().reset_index().iloc[:,1].to_numpy(),len(a))
# [array([3, 0]), array([1, 3])]

argsort for a multidimensional ndarray

I'm trying to get the indices to sort a multidimensional array by the last axis, e.g.
>>> a = np.array([[3,1,2],[8,9,2]])
And I'd like indices i such that,
>>> a[i]
array([[1, 2, 3],
[2, 8, 9]])
Based on the documentation of numpy.argsort I thought it should do this, but I'm getting the error:
>>> a[np.argsort(a)]
IndexError: index 2 is out of bounds for axis 0 with size 2
Edit: I need to rearrange other arrays of the same shape (e.g. an array b such that a.shape == b.shape) in the same way... so that
>>> b = np.array([[0,5,4],[3,9,1]])
>>> b[i]
array([[5,4,0],
[1,3,9]])
Solution:
>>> a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
array([[1, 2, 3],
[2, 8, 9]])
You got it right, though I wouldn't describe it as cheating the indexing.
Maybe this will help make it clearer:
In [544]: i=np.argsort(a,axis=1)
In [545]: i
Out[545]:
array([[1, 2, 0],
[2, 0, 1]])
i is the order that we want, for each row. That is:
In [546]: a[0, i[0,:]]
Out[546]: array([1, 2, 3])
In [547]: a[1, i[1,:]]
Out[547]: array([2, 8, 9])
To do both indexing steps at once, we have to use a 'column' index for the 1st dimension.
In [548]: a[[[0],[1]],i]
Out[548]:
array([[1, 2, 3],
[2, 8, 9]])
Another array that could be paired with i is:
In [560]: j=np.array([[0,0,0],[1,1,1]])
In [561]: j
Out[561]:
array([[0, 0, 0],
[1, 1, 1]])
In [562]: a[j,i]
Out[562]:
array([[1, 2, 3],
[2, 8, 9]])
If i identifies the column for each element, then j specifies the row for each element. The [[0],[1]] column array works just as well because it can be broadcast against i.
I think of
np.array([[0],
[1]])
as 'short hand' for j. Together they define the source row and column of each element of the new array. They work together, not sequentially.
The full mapping from a to the new array is:
[a[0,1] a[0,2] a[0,0]
a[1,2] a[1,0] a[1,1]]
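The 'short hand' claim can be checked directly: broadcasting the column array against i reproduces j. A minimal sketch (the variable names rows and j_full are mine) -

```python
import numpy as np

a = np.array([[3, 1, 2],
              [8, 9, 2]])
i = np.argsort(a, axis=1)                   # column index for each output element
rows = np.arange(a.shape[0])[:, None]       # [[0], [1]], shape (2, 1)

# Expanding `rows` to the shape of `i` gives the explicit row-index array j.
j_full, i_full = np.broadcast_arrays(rows, i)
print(j_full)
# [[0 0 0]
#  [1 1 1]]

# Indexing with the short column array or the expanded one is equivalent.
same = np.array_equal(a[rows, i], a[j_full, i_full])
print(same)
# True
```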
def foo(a):
    i = np.argsort(a, axis=1)
    return (np.arange(a.shape[0])[:,None], i)
In [61]: foo(a)
Out[61]:
(array([[0],
[1]]), array([[1, 2, 0],
[2, 0, 1]], dtype=int32))
In [62]: a[foo(a)]
Out[62]:
array([[1, 2, 3],
[2, 8, 9]])
The above answers are now a bit outdated, since new functionality was added in numpy 1.15 to make it simpler; take_along_axis (https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.take_along_axis.html) allows you to do:
>>> a = np.array([[3,1,2],[8,9,2]])
>>> np.take_along_axis(a, a.argsort(axis=-1), axis=-1)
array([[1, 2, 3],
       [2, 8, 9]])
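The same call also covers the edit in the question, reordering a second array b by the sort order of a. A short sketch, assuming NumPy 1.15 or newer -

```python
import numpy as np

a = np.array([[3, 1, 2],
              [8, 9, 2]])
b = np.array([[0, 5, 4],
              [3, 9, 1]])

order = np.argsort(a, axis=-1)                      # per-row sort order of a
a_sorted = np.take_along_axis(a, order, axis=-1)
b_sorted = np.take_along_axis(b, order, axis=-1)    # b rearranged the same way

print(a_sorted)
# [[1 2 3]
#  [2 8 9]]
print(b_sorted)
# [[5 4 0]
#  [1 3 9]]
```

Computing order once and reusing it avoids sorting a again for every array that needs the same rearrangement.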
I found the answer here, with someone having the same problem. The key is just cheating the indexing to work properly...
>>> a[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
array([[1, 2, 3],
[2, 8, 9]])
You can also use linear indexing, which might perform better, like so -
M,N = a.shape
out = b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
So, a.argsort(1)+(np.arange(M)[:,None]*N) basically are the linear indices that are used to map b to get the desired sorted output for b. The same linear indices could also be used on a for getting the sorted output for a.
Sample run -
In [23]: a = np.array([[3,1,2],[8,9,2]])
In [24]: b = np.array([[0,5,4],[3,9,1]])
In [25]: M,N = a.shape
In [26]: b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
Out[26]:
array([[5, 4, 0],
[1, 3, 9]])
Runtime tests -
In [27]: a = np.random.rand(1000,1000)
In [28]: b = np.random.rand(1000,1000)
In [29]: M,N = a.shape
In [30]: %timeit b[np.arange(np.shape(a)[0])[:,np.newaxis], np.argsort(a)]
10 loops, best of 3: 133 ms per loop
In [31]: %timeit b.ravel()[a.argsort(1)+(np.arange(M)[:,None]*N)]
10 loops, best of 3: 96.7 ms per loop

Row wise element search in an array

I have a vector ( say v = (1, 5, 7) ) and an array.
a = [ [1, 2, 3],
[4, 5, 6],
[7, 8, 9] ]
What would be the most efficient way to find the index of each element of v in the corresponding row of a? For example, the output here would be
b = (0, 1, 0), since 1 is at index 0 in the first row, and so on.
You can convert v to a column vector with [:,None] and then compare with a to bring in broadcasting and finally use np.where to get the final output as indices -
np.where(a == v[:,None])[1]
Sample run -
In [34]: a
Out[34]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [35]: v
Out[35]: array([1, 5, 7])
In [36]: np.where(a == v[:,None])[1]
Out[36]: array([0, 1, 0])
In case there are multiple elements in a row of a that match the corresponding element from v, you can use np.argmax to get the index of the first match in each row, like so -
np.argmax(a == v[:,None],axis=1)
Sample run -
In [57]: a
Out[57]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 7]])
In [58]: v
Out[58]: array([1, 5, 7])
In [59]: np.argmax(a == v[:,None],axis=1)
Out[59]: array([0, 1, 0])
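The two variants behave differently at the edges: np.where returns one entry per match, so duplicates make the result longer than v, while np.argmax returns 0 for a row with no match at all. A hedged sketch showing both, with a guard for the no-match case (first_match_per_row and the missing=-1 convention are assumptions for illustration) -

```python
import numpy as np

def first_match_per_row(a, v, missing=-1):
    """Index of the first element in a[i] equal to v[i], or `missing` if absent."""
    eq = np.asarray(a) == np.asarray(v)[:, None]
    return np.where(eq.any(axis=1), eq.argmax(axis=1), missing)

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 7]])
v = np.array([1, 5, 7])

all_cols = np.where(a == v[:, None])[1]
print(all_cols)                                   # one entry per match, row 2 matches twice
# [0 1 0 2]
print(first_match_per_row(a, v))
# [0 1 0]
print(first_match_per_row(a, [1, 5, 0]))          # 0 is not in row 2
# [ 0  1 -1]
```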
>>> a = [ [1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> v = (1, 5, 7)
>>> b = tuple([a[id].index(val) for id, val in enumerate(v)])
>>> b
(0, 1, 0)
You can use list comprehension:
[a[idx].index(val) for idx, val in enumerate(v)]
Here enumerate yields (index, value) pairs, and index returns the position of the first appearance of val in the corresponding row.
If you must get a tuple as the return value convert it in the end:
b = tuple([a[idx].index(val) for idx, val in enumerate(v)])
Just note that index may raise ValueError if val wasn't found in the correct row of a.
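If a missing value should not abort the whole lookup, the ValueError can be caught per row. A minimal pure-Python sketch (the name index_in_rows and the missing=None placeholder are my own choices) -

```python
def index_in_rows(a, v, missing=None):
    """For each row/value pair, the first index of the value in the row, or `missing`."""
    result = []
    for row, val in zip(a, v):
        try:
            result.append(row.index(val))
        except ValueError:
            # val does not occur in this row
            result.append(missing)
    return tuple(result)

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
found = index_in_rows(a, (1, 5, 7))
partial = index_in_rows(a, (1, 5, 0))
print(found)
# (0, 1, 0)
print(partial)
# (0, 1, None)
```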

Slicing a 3-D array using a 2-D array

Assume we have two matrices:
x = np.random.randint(10, size=(2, 3, 3))
idx = np.random.randint(3, size=(2, 3))
The goal is to pick elements of x using idx, like this:
dim1 = x[0, range(0,3), idx[0]] # slicing x[0] using idx[0]
dim2 = x[1, range(0,3), idx[1]]
res = np.vstack((dim1, dim2))
Is there a neat way to do this?
You can index it directly with integer arrays, as long as the index arrays broadcast to a common shape. That's what the .reshape calls are for:
x[np.array([0,1]).reshape(idx.shape[0], -1),
np.array([0,1,2]).reshape(-1,idx.shape[1]),
idx]
Out[29]:
array([[ 0.10786251, 0.2527514 , 0.11305823],
[ 0.67264076, 0.80958292, 0.07703623]])
Here's another way to do it with reshaping -
x.reshape(-1,x.shape[2])[np.arange(idx.size),idx.ravel()].reshape(idx.shape)
Sample run -
In [2]: x
Out[2]:
array([[[5, 0, 9],
[3, 0, 7],
[7, 1, 2]],
[[5, 3, 5],
[8, 6, 1],
[7, 0, 9]]])
In [3]: idx
Out[3]:
array([[2, 1, 2],
[1, 2, 0]])
In [4]: x.reshape(-1,x.shape[2])[np.arange(idx.size),idx.ravel()].reshape(idx.shape)
Out[4]:
array([[9, 0, 2],
[3, 1, 7]])
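Two more ways to write the same lookup, checked against the sample above. This is a sketch that assumes idx.shape == x.shape[:2]. The first uses np.arange index arrays instead of hard-coded lists, the second uses np.take_along_axis (NumPy 1.15 or newer) -

```python
import numpy as np

x = np.array([[[5, 0, 9],
               [3, 0, 7],
               [7, 1, 2]],
              [[5, 3, 5],
               [8, 6, 1],
               [7, 0, 9]]])
idx = np.array([[2, 1, 2],
                [1, 2, 0]])

# Broadcasting index arrays: shapes (2, 1), (1, 3) and (2, 3) broadcast to (2, 3).
i0 = np.arange(x.shape[0])[:, None]
i1 = np.arange(x.shape[1])[None, :]
res_fancy = x[i0, i1, idx]

# take_along_axis needs idx to have the same number of dimensions as x.
res_take = np.take_along_axis(x, idx[:, :, None], axis=2)[:, :, 0]

print(res_fancy)
# [[9 0 2]
#  [3 1 7]]
```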

How to expand a 2D NumPy array by copying the bottom row and right column?

I have a 2D NumPy array and I hope to expand its size on both dimensions by copying the bottom row and right column.
For example, from 2x2:
[[0,1],
[2,3]]
to 4x4:
[[0,1,1,1],
[2,3,3,3],
[2,3,3,3],
[2,3,3,3]]
What's the best way to do it?
Thanks.
Here, the hstack and vstack functions can come in handy. For example (the session below assumes array and vstack were imported with from numpy import *),
In [16]: p = array(([0,1], [2,3]))
In [20]: vstack((p, p[-1], p[-1]))
Out[20]:
array([[0, 1],
[2, 3],
[2, 3],
[2, 3]])
Remembering that p.T is the transpose, you can apply the same idea to the columns:
In [16]: p = array(([0,1], [2,3]))
In [22]: p = vstack((p, p[-1], p[-1]))
In [25]: p = vstack((p.T, p.T[-1], p.T[-1])).T
In [26]: p
Out[26]:
array([[0, 1, 1, 1],
[2, 3, 3, 3],
[2, 3, 3, 3],
[2, 3, 3, 3]])
So the 2 lines of code should do it...
Make an empty array and copy whatever rows, columns you want into it.
def expand(a, new_shape):
    x, y = a.shape
    r = np.empty(new_shape, a.dtype)
    r[:x, :y] = a
    r[x:, :y] = a[-1:, :]
    r[:x, y:] = a[:, -1:]
    r[x:, y:] = a[-1, -1]
    return r
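NumPy also ships this as padding with edge replication. A short sketch using np.pad with mode="edge". The wrapper expand_with_edge is a made-up name, and it assumes new_shape is at least as large as a.shape in both dimensions -

```python
import numpy as np

def expand_with_edge(a, new_shape):
    """Grow a 2D array to new_shape by repeating its last row and last column."""
    a = np.asarray(a)
    extra_rows = new_shape[0] - a.shape[0]
    extra_cols = new_shape[1] - a.shape[1]
    # Pad only after the existing data: (before, after) per axis.
    return np.pad(a, ((0, extra_rows), (0, extra_cols)), mode="edge")

p = np.array([[0, 1],
              [2, 3]])
expanded = expand_with_edge(p, (4, 4))
print(expanded)
# [[0 1 1 1]
#  [2 3 3 3]
#  [2 3 3 3]
#  [2 3 3 3]]
```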
