How to select row or column from a matrix? - python

I have a matrix a = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])
I want to select all rows, but only the first through third columns.
The result should be [[1,2,3],[6,7,8],[11,12,13]]
I tried a[:,[0,2]], but it gives
array([[ 1, 3],
[ 6, 8],
[11, 13]])
That is not correct, so I also tried a[:][0:2], which gives a wrong result too.
Is there a function or method that can do this?

Sounds like you are looking for a[:, 0:3]:
In [4]: a[:, 0:3]
Out[4]:
array([[ 1, 2, 3],
[ 6, 7, 8],
[11, 12, 13]])

I think you need the slice 0:3:
print (a[:,0:3])
[[ 1 2 3]
[ 6 7 8]
[11 12 13]]

Try the following
a=np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15]])
a = a[:,0:3]
print(a)
#Output
#[[ 1 2 3]
# [ 6 7 8]
# [11 12 13]]
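To see why the two attempts in the question go wrong, here is a small sketch that puts them next to the slice. The variable names (picked, rows, wanted) are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10],
              [11, 12, 13, 14, 15]])

# A list index picks exactly the listed columns: 0 and 2, nothing in between.
picked = a[:, [0, 2]]

# a[:] is the whole array, so a[:][0:2] slices ROWS 0 and 1, not columns.
rows = a[:][0:2]

# A slice 0:3 covers columns 0, 1 and 2 (the stop index is exclusive).
wanted = a[:, 0:3]

print(picked.shape, rows.shape, wanted.shape)  # (3, 2) (2, 5) (3, 3)
```

The slice also returns a view rather than a copy, so writing into wanted would modify a.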

Related

indexing rows and columns in numpy

a = np.arange(16).reshape((4,4))
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
Say I want the middle square. It'd seem reasonable to do this:
a[[1,2],[1,2]]
but I get this:
array([5, 10])
This works, but seems inelegant:
a[[1,2],:][:,[1,2]]
array([[5, 6],
[9, 10]])
So my questions are:
Why is it this way? What premises are required to make the implemented way sensible?
Is there a canonical way to select along more than one index at once?
You can read more about this under advanced indexing. Basically, when you index an array with lists/arrays, the index arrays are broadcast against each other and iterated together, element by element.
In your case, you can do:
idx = np.array([1,3])
a[idx,idx[:,None]]
Or as in the doc above:
a[np.ix_(idx, idx)]
Output of the first expression (the np.ix_ version returns its transpose, [[5, 7], [13, 15]]):
array([[ 5, 13],
[ 7, 15]])
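As a rough illustration of the pairing behaviour, applied to the middle square from the question (indices 1 and 2), the sketch below compares plain list indexing with np.ix_. The names rows, cols, paired and square are made up for the example.

```python
import numpy as np

a = np.arange(16).reshape((4, 4))
rows = [1, 2]
cols = [1, 2]

# Two index lists of equal length are paired element by element,
# so the result is [a[1, 1], a[2, 2]].
paired = a[rows, cols]

# np.ix_ reshapes the lists to (2, 1) and (1, 2) so that they broadcast
# into every (row, col) combination, i.e. the middle square.
square = a[np.ix_(rows, cols)]

print(paired)  # [ 5 10]
print(square)
```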
You can do both indexing operations at once instead of creating an intermediate array and indexing that again:
import numpy as np
a = np.arange(16).reshape((4, 4))
# preferred if possible
print(a[1:3, 1:3])
# [[ 5 6]
# [ 9 10]]
# otherwise add a second dimension to the first index to make it broadcastable
index1 = np.asarray([1, 2])
index2 = np.asarray([1, 2])
print(a[index1[:, None], index2])
# [[ 5 6]
# [ 9 10]]
You could use multiple np.take to select indices from multiple axes
a = np.arange(16).reshape((4, 4))
idx = np.array([1,2])
np.take(np.take(a, idx, axis=1), idx, axis=0)
Or (slightly more readable)
a.take(idx, axis=1).take(idx, axis=0)
Output:
array([[ 5, 6],
[ 9, 10]])
np.take also lets you wrap around or clip out-of-bound indices through its mode argument.
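A minimal sketch of the out-of-bound handling mentioned above, assuming the same 4x4 array. The index list [3, 4, 5] is chosen only to trigger the wrap.

```python
import numpy as np

a = np.arange(16).reshape((4, 4))

# Row indices 4 and 5 are out of bounds for 4 rows; mode="wrap"
# maps them to 4 % 4 = 0 and 5 % 4 = 1 instead of raising IndexError.
wrapped = np.take(a, [3, 4, 5], axis=0, mode="wrap")

# mode="clip" clamps out-of-bound indices to the last valid row instead.
clipped = np.take(a, [3, 4, 5], axis=0, mode="clip")

print(wrapped)
print(clipped)
```

The default is mode="raise", which behaves like ordinary indexing and raises an error for these indices.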

slice matrix by different index array

I have an array a, e.g. a 5*3 array such as
[1,2,3]
[4,5,6]
[7,8,9]
[10,11,12]
[13,14,15]
and I have 3 lists of row indices to select from it, e.g.
a1 = [0,1,2]
a2 = [0,1,3]
a3 = [0,2,4]
Now I want to get 3 arrays, each taken from a using a1, a2 and a3 respectively.
Also, a1 selects from the 1st column only, a2 from the 2nd column only, and so on.
For the given example, I want
[1,4,7], [2,5,11], [9,12,15]
What's the best way to do it?
Thanks.
In [913]: arr = np.arange(1,16).reshape(5,3)
In [914]: arr
Out[914]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15]])
In [915]: idx = np.array([[0,1,2],[0,1,3],[0,2,4]])
In [916]: idx.shape
Out[916]: (3, 3)
We want to select a (3,3) array of values, where idx identifies rows. So we need a column index that broadcasts with it; [0,1,2] will do.
In [917]: arr[idx, np.arange(3)]
Out[917]:
array([[ 1, 5, 9],
[ 1, 5, 12],
[ 1, 8, 15]])
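Oops, wrong selection: each index list was spread across all three columns. Let's try the transpose, so that each column of the result comes from one index list: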
In [918]: arr[idx.T, np.arange(3)]
Out[918]:
array([[ 1, 2, 3],
[ 4, 5, 9],
[ 7, 11, 15]])
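If one array per index list is more convenient than columns, a possible variant (a sketch, not part of the answer above) is to give the column index shape (3, 1) so that each row of the result belongs to one list. The names by_row, a1_vals, a2_vals and a3_vals are illustrative.

```python
import numpy as np

arr = np.arange(1, 16).reshape(5, 3)
idx = np.array([[0, 1, 2],
                [0, 1, 3],
                [0, 2, 4]])

# Column index of shape (3, 1): row i of idx is applied to column i,
# so by_row[i, j] == arr[idx[i, j], i].
by_row = arr[idx, np.arange(3)[:, None]]

# Unpack one 1-D array per index list.
a1_vals, a2_vals, a3_vals = by_row

print(by_row)
```

This is the transpose of Out[918]. Note that with a3 = [0, 2, 4] the third column yields [3, 9, 15].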

How to split numpy array vertically from any column index

I want to split a numpy array into two subarrays at a given column index, i.e., a vertical split. For instance, if I generate a numpy array of shape [10,16] and I want to create two subarrays by splitting it from the column's index 11, then I should get one subarray of size [10,10] and the other one is from [10,15]. I am following numpy.hsplit here, but it seems to only do an even split (the subarrays need to be equal). I want to be able to:
Split any numpy array vertically, no matter what is the size of subarrays.
Extract both subarrays.
To simulate my request, the following is my code:
import numpy as np
C = [[1,2,3,4],[5,6,7,8],[9,10,11,12], [13,14,15,16]]
C = np.asarray(C)
C = np.hsplit(C, 3)
print(C)
As you can see, np.hsplit(C, 3) doesn't work unless the split produces equal-sized subarrays. Even if I did np.hsplit(C, 2), I don't know how to extract both subarrays into separate numpy arrays.
To achieve my goals, how can I modify this code?
Use array indexing.
C[:,:3] # All rows, columns 0 to 2
Out[29]:
array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]])
C[:,3:] # All rows, columns 3 to the end (here that is just column 3)
Out[30]:
array([[ 4],
[ 8],
[12],
[16]])
You need to specify the split indices as a list:
import numpy as np
C = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
C = np.asarray(C)
C = np.hsplit(C, [3])
print(C)
Output
[array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11],
[13, 14, 15]]), array([[ 4],
[ 8],
[12],
[16]])]
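Since np.hsplit returns a list, both pieces can be unpacked directly into separate arrays. The sketch below also covers the (10, 16) case from the question, assuming the intent is to split before column index 11. The names left, right, big, first and second are illustrative.

```python
import numpy as np

C = np.arange(1, 17).reshape(4, 4)

# A list argument gives split positions, so the pieces may differ in width.
left, right = np.hsplit(C, [3])
print(left.shape, right.shape)  # (4, 3) (4, 1)

# Same idea for a (10, 16) array split before column index 11.
big = np.zeros((10, 16))
first, second = np.hsplit(big, [11])
print(first.shape, second.shape)  # (10, 11) (10, 5)
```

Plain slicing (C[:, :3] and C[:, 3:]) gives the same two arrays.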

pick TxK numpy array from TxN numpy array using TxK column index array

This is an indirect indexing problem.
It can be solved with a list comprehension.
The question is whether, and how, it can be solved within numpy.
When
data.shape is (T,N)
and
c.shape is (T,K)
and each element of c is an int between 0 and N-1 inclusive, that is,
each element of c is intended to refer to a column number from data.
The goal is to obtain out where
out.shape = (T,K)
And for each i in 0..(T-1)
the row out[i] = [ data[i, c[i,0]] , ... , data[i, c[i,K-1]] ]
Concrete example:
data = np.array([
[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
c = np.array([
[0, 2],
[1, 2],
[0, 0],
[1, 1],
[2, 2]])
out should be out = [[0, 2], [4, 5], [6, 6], [10, 10], [14, 14]]
The first row of out is [0,2] because the columns chosen are given by c's row 0, they are 0 and 2, and data[0] at columns 0 and 2 are 0 and 2.
The second row of out is [4,5] because the columns chosen are given by c's row 1, they are 1 and 2, and data[1] at columns 1 and 2 is 4 and 5.
Numpy fancy indexing doesn't seem to solve this in an obvious way because indexing data with c (e.g. data[c], np.take(data,c,axis=1) ) always produces a 3 dimensional array.
A list comprehension can solve it:
out = [ [data[rowidx,i1],data[rowidx,i2]] for (rowidx, (i1,i2)) in enumerate(c) ]
If K is 2, I suppose this is marginally OK. If K is variable, this is not so good.
The list comprehension has to be rewritten for each value of K, because it unrolls the columns picked out of data by each row of c. It also violates DRY.
Is there a solution based entirely in numpy?
You can avoid loops with np.choose:
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
data = np.array([\
[ 0, 1, 2],\
[ 3, 4, 5],\
[ 6, 7, 8],\
[ 9, 10, 11],\
[12, 13, 14]])
c = np.array([
[0, 2],\
[1, 2],\
[0, 0],\
[1, 1],\
[2, 2]])
--
In [2]: np.choose(c, data.T[:,:,np.newaxis])
Out[2]:
array([[ 0, 2],
[ 4, 5],
[ 6, 6],
[10, 10],
[14, 14]])
Here's one possible route to a general solution...
Create masks for data to select the values for each column of out. For example, the first mask could be achieved by writing:
>>> np.arange(3) == np.vstack(c[:,0])
array([[ True, False, False],
[False, True, False],
[ True, False, False],
[False, True, False],
[False, False, True]], dtype=bool)
>>> data[_]
array([ 0, 4, 6, 10, 14])
The mask to get the values for the second column of out: np.arange(3) == np.vstack(c[:,1]).
So, to get the out array...
>>> mask0 = np.arange(3) == np.vstack(c[:,0])
>>> mask1 = np.arange(3) == np.vstack(c[:,1])
>>> np.vstack((data[mask0], data[mask1])).T
array([[ 0, 2],
[ 4, 5],
[ 6, 6],
[10, 10],
[14, 14]])
Edit: Given arbitrary array widths K and N you could use a loop to create the masks, so the general construction of the out array might simply look like this:
np.vstack([data[np.arange(N) == np.vstack(c[:,i])] for i in range(K)]).T
Edit 2: A slightly neater solution (though still relying on a loop) is:
np.vstack([data[i][c[i]] for i in range(T)])
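For a loop-free version that works for any K, two options are sketched below using the question's data and c. This is an alternative to the answers above, not their method. It assumes NumPy 1.15 or later for np.take_along_axis, and the names rows, out_fancy and out_take are illustrative.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])

# Option 1: pair every row number with that row's column indices.
# rows has shape (T, 1) and broadcasts against c of shape (T, K).
rows = np.arange(data.shape[0])[:, None]
out_fancy = data[rows, c]

# Option 2: np.take_along_axis does the same lookup along axis 1.
out_take = np.take_along_axis(data, c, axis=1)

print(out_fancy)
```

Both results have shape (T, K) and satisfy out[i, k] == data[i, c[i, k]].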

Adding a dimension to every element of a numpy.array

I'm trying to transform each element of a numpy array into an array itself (say, to interpret a greyscale image as a color image). In other words:
>>> my_ar = numpy.array((0,5,10))
[0, 5, 10]
>>> transformed = my_fun(my_ar) # In reality, my_fun() would do something more useful
array([
[ 0, 0, 0],
[ 5, 10, 15],
[10, 20, 30]])
>>> transformed.shape
(3, 3)
I've tried:
def my_fun_e(val):
return numpy.array((val, val*2, val*3))
my_fun = numpy.frompyfunc(my_fun_e, 1, 3)
but get:
my_fun(my_ar)
(array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object), array([None, None, None], dtype=object), array([None, None, None], dtype=object))
and I've tried:
my_fun = numpy.frompyfunc(my_fun_e, 1, 1)
but get:
>>> my_fun(my_ar)
array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object)
This is close, but not quite right -- I get an array of objects, not an array of ints.
Update 3! OK. I've realized that my example was too simple beforehand -- I don't just want to replicate my data in a third dimension, I'd like to transform it at the same time. Maybe this is clearer?
Does numpy.dstack do what you want? The first two indexes are the same as the original array, and the new third index is "depth".
>>> import numpy as N
>>> a = N.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> b = N.dstack((a,a,a))
>>> b
array([[[1, 1, 1],
[2, 2, 2],
[3, 3, 3]],
[[4, 4, 4],
[5, 5, 5],
[6, 6, 6]],
[[7, 7, 7],
[8, 8, 8],
[9, 9, 9]]])
>>> b[1,1]
array([5, 5, 5])
Use map to apply your transformation function to each element in my_ar (on Python 3, wrap it in list() because map returns an iterator):
import numpy
my_ar = numpy.array((0,5,10))
print(my_ar)
transformed = numpy.array(list(map(lambda x: numpy.array((x,x*2,x*3)), my_ar)))
print(transformed)
print(transformed.shape)
I propose:
numpy.resize(my_ar, (3,3)).transpose()
You can of course adapt the shape (my_ar.shape[0],)*2 or whatever
Does this do what you want:
numpy.tile(my_ar, (1,1,3))
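For the specific example in the question (each value x becoming x, 2x, 3x), broadcasting may be enough and keeps an integer dtype. This is a sketch under that assumption; the factors array is introduced here for illustration, and an arbitrary my_fun_e would need a different approach.

```python
import numpy as np

my_ar = np.array((0, 5, 10))

# my_ar[:, None] has shape (3, 1); multiplying by a (3,) array
# broadcasts to (3, 3), so row i is my_ar[i] * [1, 2, 3].
factors = np.array([1, 2, 3])
transformed = my_ar[:, None] * factors

print(transformed)
print(transformed.shape)  # (3, 3)
```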
