Select different columns for each row - python

Suppose I have the following array:
>>> a = np.arange(25).reshape((5, 5))
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
Now I want to select different columns for each row based on the following index array:
>>> i = np.array([0, 1, 2, 1, 0])
This index array gives the start column for each row, and every selection should have the same width, e.g. 3. Thus I want to obtain the following result:
>>> ???
array([[ 0, 1, 2],
[ 6, 7, 8],
[12, 13, 14],
[16, 17, 18],
[20, 21, 22]])
I know that I can select a single column per row via
>>> a[np.arange(a.shape[0]), i]
but how about multiple columns?

Use advanced indexing with a row index array and a column index array that broadcast to the shape of the result:
a[np.arange(a.shape[0])[:,None], i[:,None] + np.arange(3)]
#array([[ 0, 1, 2],
# [ 6, 7, 8],
# [12, 13, 14],
# [16, 17, 18],
# [20, 21, 22]])
idx_row = np.arange(a.shape[0])[:,None]
idx_col = i[:,None] + np.arange(3)
idx_row
#array([[0],
# [1],
# [2],
# [3],
# [4]])
idx_col
#array([[0, 1, 2],
# [1, 2, 3],
# [2, 3, 4],
# [1, 2, 3],
# [0, 1, 2]])
a[idx_row, idx_col]
#array([[ 0, 1, 2],
# [ 6, 7, 8],
# [12, 13, 14],
# [16, 17, 18],
# [20, 21, 22]])
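As a hedged alternative to building the row index by hand, the same selection can be sketched with np.take_along_axis, which only needs the column indices. The name width is an assumption introduced here for the window length, and the sketch assumes every start column plus width stays inside the array.

```python
import numpy as np

a = np.arange(25).reshape((5, 5))
i = np.array([0, 1, 2, 1, 0])   # start column for each row
width = 3                        # hypothetical name: number of columns per row

# Column indices for every row, shape (5, 3)
idx_col = i[:, None] + np.arange(width)

# For each row, pick the columns listed in the matching row of idx_col
result = np.take_along_axis(a, idx_col, axis=1)
print(result)
```

If a start column plus width can run past the last column, the indices would need clipping or validation first; that case is not handled here.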

Related

Select non-consecutive row and column indices from 2d numpy array

I have an array a
a = np.arange(5*5).reshape(5,5)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
and want to select the last two columns from row one and two, and the first two columns of row three and four.
The result should look like this
array([[3, 4, 10, 11],
[8, 9, 15, 16]])
How to do that in one go without indexing twice and concatenation?
I tried using take
a.take([[0,1,2,3], [3,4,0,1]])
array([[0, 1, 2, 3],
[3, 4, 0, 1]])
ix_
a[np.ix_([0,1,2,3], [3,4,0,1])]
array([[ 3, 4, 0, 1],
[ 8, 9, 5, 6],
[13, 14, 10, 11],
[18, 19, 15, 16]])
and r_
a[np.r_[0:2, 2:4], np.r_[3:5, 0:2]]
array([ 3, 9, 10, 16])
and a combination of ix_ and r_
a[np.ix_([0,1,2,3], np.r_[3:4, 0:1])]
array([[ 3, 0],
[ 8, 5],
[13, 10],
[18, 15]])
Using integer advanced indexing, you can do something like this
index_rows = np.array([
[0, 0, 2, 2],
[1, 1, 3, 3],
])
index_cols = np.array([
[-2, -1, 0, 1],
[-2, -1, 0, 1],
])
a[index_rows, index_cols]
where you just select directly what elements you want.
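A small sketch of the same idea, checking the result against the expected output from the question. It also relies on broadcasting so the column indices only need to be written once; row_pairs is a hypothetical helper name, not part of the original answer.

```python
import numpy as np

a = np.arange(5 * 5).reshape(5, 5)

# Hypothetical helper array: the two source rows used by each output row
row_pairs = np.array([[0, 2], [1, 3]])

# Repeat each source row twice along the columns: [[0, 0, 2, 2], [1, 1, 3, 3]]
index_rows = np.repeat(row_pairs, 2, axis=1)

# A 1-D column index broadcasts against both rows of index_rows
index_cols = np.array([-2, -1, 0, 1])

result = a[index_rows, index_cols]
print(result)
```

The index arrays are broadcast against each other, so the result has the shape of index_rows, here (2, 4).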

How do I put data in a two-dimensional array sequentially without duplication?

I want to put data consisting of a one-dimensional array into a two-dimensional array. I will assume that the number of rows and columns is 5.
The code I tried is as follows.
data = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
a = []
for i in range(5):
    a.append([])
    for j in range(5):
        a[i].append(j)
print(a)
# result : [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
# I want this : [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
You don't have to worry about the last [20].
The important thing is that the row must change without duplicating the data.
I want to solve it, but I can't think of any way. I ask for your help.
There are two issues with the current code.
It doesn't actually use any of the values from the variable data.
The data does not contain enough items to populate a 5x5 array.
After adding 0 to the beginning of the variable data and using the values from the variable, the code becomes
data = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
a = []
for i in range(5):
    a.append([])
    for j in range(5):
        if i*5+j >= len(data):
            break
        a[i].append(data[i*5+j])
print(a)
The output of the new code will be
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
This should deliver the desired output (note that it generates the values from the loop indices rather than reading them from data):
data = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
a = []
x_i = 5
x_j = 5
for i in range(x_i):
    a.append([])
    for j in range(x_j):
        a[i].append(i*x_j+j)
print(a)
Output:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]
By using list comprehension...
data = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
columns = 5
rows = 5
result = [data[i * columns: (i + 1) * columns] for i in range(rows)]
print(result)
# [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
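You could use itertools.groupby with an integer division to create the groups. Note that this groups by value, not by position, so it works here only because the data are consecutive integers starting at 0.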
from itertools import groupby
data = [0, 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
grouped_data = [list(v) for k, v in groupby(data, key=lambda x: x//5)]
print(grouped_data)
Output
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
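For data that is not a run of consecutive integers, a position-based split may be safer. The following is a minimal sketch; chunk is a hypothetical helper name, and the sample data is made up for illustration.

```python
def chunk(data, size):
    """Split data into consecutive lists of at most `size` items."""
    # Step through the start positions 0, size, 2*size, ... and slice each piece
    return [data[start:start + size] for start in range(0, len(data), size)]


data = ["a", "b", "c", "d", "e", "f", "g"]
chunks = chunk(data, 3)
print(chunks)
```

Because slicing past the end of a list is allowed, the last chunk simply ends up shorter when the length is not a multiple of size.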

Broadcasting 2D array in specific columns in Python

I have an array like this:
A = np.array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20]])
What I want to do is add 1 to each value in the first and last columns. I want to understand broadcasting (and avoid loops) by using it with an appropriate vector, but what I have tried doesn't work. Expected result:
A = np.array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
You can use numpy indexing to do this. Try this:
# 0 is the first and -1 is the last column
A[:,[0,-1]] = A[:,[0,-1]]+1
Or
A[:,(0,-1)] = A[:,(0,-1)]+1
Or
A[:,[0,-1]]+=1
Or
A[:,(0,-1)]+=1
Output in every case:
array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
You can use the vector [1,0,0,0,1], and NumPy will do the broadcasting for you.
b = np.array([1,0,0,0,1])
A + b
array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
If you would like to see how broadcasting works, you can build the broadcast array by hand once:
b = np.array([1,0,0,0,1])
B = np.tile(b,(A.shape[0],1))
array([[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1]])
A + B
Same result.
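As a hedged illustration of the shape rule behind this: shapes are compared from the trailing axis, so (4, 5) and (5,) are compatible and the vector is stretched along axis 0. The sketch below uses np.broadcast_to, which returns a read-only view instead of the copy made by np.tile.

```python
import numpy as np

A = np.arange(1, 21).reshape(4, 5)
b = np.array([1, 0, 0, 0, 1])

# b has shape (5,); it is treated like shape (1, 5) and repeated along axis 0
B = np.broadcast_to(b, A.shape)   # read-only view, no data is copied
print(B.shape)

result = A + b                    # NumPy broadcasts b implicitly
print(result)
```

A vector of length 4 would not broadcast against A in this form, because the trailing axes (5 and 4) differ; it would need shape (4, 1) to be applied per row.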

Numpy: how to return a view on a matrix A based on submatrix B

Given a matrix A with dimensions a x a and a matrix B with dimensions b x b, where a modulo b == 0. B is a submatrix of A, starting at (0,0) and tiled until the dimensions a x a are met.
A = array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
An example of a submatrix might be:
B = array([[10, 11],
[14, 15]])
Where the number 15 is in position (1, 1) with respect to B's coordinates.
How could I return a view on the array A, for a particular position in B? For example for position (1,1) in B, I want to get all such values from A:
C = array([[5, 7],
[13, 15]])
The reason I want a view, is that I wish to update multiple positions in A:
C = array([[5, 7],[13, 15]]) = 20
results in
A = array([[ 0, 1, 2, 3],
[ 4, 20, 6, 20],
[ 8, 9, 10, 11],
[12, 20, 14, 20]])
You can obtain this as follows (this assigns through advanced indexing rather than creating a view):
>>> A = np.array([[ 0, 1, 2, 3],
...               [ 4, 5, 6, 7],
...               [ 8, 9, 10, 11],
...               [12, 13, 14, 15]])
>>> A[np.ix_([1,3],[1,3])] = 20
>>> A
array([[ 0, 1, 2, 3],
[ 4, 20, 6, 20],
[ 8, 9, 10, 11],
[12, 20, 14, 20]])
For more information about np.ix_, see the NumPy documentation.
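If an actual view is needed, a strided slice may work, since basic slicing returns a view while advanced indexing returns a copy when read. This is a sketch under the assumption that the wanted positions are regularly spaced (B tiled across A); the names b, r and c are introduced here for the block size and the position inside B.

```python
import numpy as np

A = np.arange(16).reshape(4, 4)
b = 2          # assumed block size: B is b x b
r, c = 1, 1    # position inside B

# Every b-th row starting at r, every b-th column starting at c
C = A[r::b, c::b]
print(C)                        # the values 5, 7, 13, 15
print(np.shares_memory(A, C))   # C is a view on A's data

C[...] = 20                     # writing to the view updates A in place
print(A)
```

This only covers regularly strided positions; an arbitrary set of rows and columns still needs advanced indexing, which cannot be returned as a view.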

Using NumPy arrays as indices to NumPy arrays

I have a 3x3x3 NumPy array:
>>> x = np.arange(27).reshape((3, 3, 3))
>>> x
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
Now I create an ordinary list of indices:
>>> i = [[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 0, 1]]
As expected, I get four values using this list as the index:
>>> x[i]
array([ 7, 14, 18, 13])
But if I now convert i into a NumPy array, I won't get the same answer.
>>> j = np.asarray(i)
>>> x[j]
array([[[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]],
...,
[[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]],
[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]]]])
Why is this so? Why can't I use NumPy arrays as indices to NumPy array?
x[j] is the equivalent of x[j,:,:]
In [163]: j.shape
Out[163]: (3, 4)
In [164]: x[j].shape
Out[164]: (3, 4, 3, 3)
The resulting shape is the shape of j joined with the last 2 dimensions of x. j just selects from the 1st dimension of x.
x[i], on the other hand, is treated as x[tuple(i)] in older NumPy versions (since NumPy 1.23 a list of lists is interpreted like np.asarray(i) instead), that is:
In [168]: x[[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 0, 1]]
Out[168]: array([ 7, 14, 18, 13])
In fact, x[tuple(j)] produces the same 4-item array.
The different ways of indexing numpy arrays can be confusing.
Another example of how the shape of the index array or lists affects the output:
In [170]: x[[[0, 1], [2, 1]], [[2, 1], [0, 1]], [[1, 2], [0, 1]]]
Out[170]:
array([[ 7, 14],
[18, 13]])
Same items, but in a 2d array.
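The two behaviours described above can be compared side by side in a short sketch. It passes an explicit tuple, which should behave the same across NumPy versions; the variable names whole and coords are introduced here for illustration.

```python
import numpy as np

x = np.arange(27).reshape((3, 3, 3))
i = [[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 0, 1]]
j = np.asarray(i)

# A single array index applies to axis 0 only; the other two axes are kept whole
whole = x[j]
print(whole.shape)

# A tuple of three arrays supplies one coordinate array per axis
coords = x[tuple(j)]
print(coords)

# Equivalent spelling with the coordinate arrays written out
same = x[j[0], j[1], j[2]]
```

Here whole has the shape of j followed by the remaining two axes of x, while coords has one element per column of j.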
Check out the NumPy docs: what you are doing is "Integer Array Indexing", so you need to pass each coordinate as a separate array, packed in a tuple:
j = tuple(np.array(c) for c in i)
x[j]
Out[191]: array([ 7, 14, 18, 13])
