I have data in a numpy array:
a = np.arange(100)
a = a.reshape((20,5))
When I type
a[:10]
it returns
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39],
[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49]])
Now I decided to reshape the array into a 3D array.
b = a.reshape((5,4,5))
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]],
[[60, 61, 62, 63, 64],
[65, 66, 67, 68, 69],
[70, 71, 72, 73, 74],
[75, 76, 77, 78, 79]],
[[80, 81, 82, 83, 84],
[85, 86, 87, 88, 89],
[90, 91, 92, 93, 94],
[95, 96, 97, 98, 99]]])
How do I slice b so that I obtain the same values as a[:10]?
I tried
b[:10,0,:5]
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44],
[50, 51, 52, 53, 54],
[60, 61, 62, 63, 64],
[70, 71, 72, 73, 74],
[80, 81, 82, 83, 84],
[90, 91, 92, 93, 94]])
But it's not correct.
Thank you in advance!
When you use b = a.reshape((5,4,5)), you just create a different view on the same data used by the array a (i.e. changes to the elements of a will appear in b). reshape() does not copy data in this case, so it is a very fast operation. Slicing b and slicing a access the same memory, so there is no need for a different syntax for the b array: just use a[:10]. If you have created a copy of the data, perhaps with np.resize(), and discarded a, just reshape b back: b.reshape((20,5))[:10].
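To make the view relationship concrete, here is a minimal sketch using the arrays from the question. The call to np.shares_memory is only a check added for illustration and is not part of the original answer.

```python
import numpy as np

a = np.arange(100).reshape((20, 5))
b = a.reshape((5, 4, 5))          # a view on a contiguous array, no copy

# Both names refer to the same underlying buffer.
shares = np.shares_memory(a, b)

# A write through a is visible through b.
a[0, 0] = -1
seen_in_b = b[0, 0, 0]
a[0, 0] = 0                       # restore the original value

# If only b were available, reshaping it back recovers a[:10].
first_ten = b.reshape((20, 5))[:10]
```

The same reshape-then-slice pattern should apply to any contiguous array whose last axis is unchanged by the reshape.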
After reshaping (20,5) to (5,4,5), no single slice of b can pull out the first half of the values, because the 5 blocks along the first axis can't be split into 2 even groups:
In [9]: b[:2]
Out[9]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]]])
In [10]: b[:3]
Out[10]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
The last row of a[:10] is in the middle of b[2,:,:].
Note that b[:2] is (2,4,5), 8 rows of a, grouped into 2 sets of 4.
Now if you'd done c=a.reshape(4,5,5), then c[:2] would have those same 10 rows - in 2 sets of 5. And c[:2].reshape(10,-1) will look just like a[:10].
There could be a programmatic way to get what you want, but not a python slice.
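A short sketch of both ideas above: the alternative (4,5,5) grouping, and one possible programmatic route for b that merges the first two axes before slicing. The variable names from_c and from_b are illustrative only.

```python
import numpy as np

a = np.arange(100).reshape(20, 5)
b = a.reshape(5, 4, 5)   # groups of 4 rows: 10 rows do not fit evenly
c = a.reshape(4, 5, 5)   # groups of 5 rows: 10 rows are exactly 2 groups

# Two whole groups of c, flattened back to 10 rows.
from_c = c[:2].reshape(10, -1)

# Programmatic route for b: merge the first two axes, then slice rows.
from_b = b.reshape(-1, b.shape[-1])[:10]
```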
It is important to understand what every component in the shape tells us about the arrangement. I like to think in terms of vectors.
Let's talk about the shape (20, 5) - this would mean, I have 20 vectors where every vector has 5 elements.
For the shape (5, 4, 5) - this would mean, I have 5 vectors, where each vector again has 4 vectors where every vector within has 5 elements.
This might sound complicated but with some deliberation, this could be understood.
Coming to your question, by a[:10] you want to retrieve the first 10 rows where each row should be a vector containing 5 elements but using a shape of (5, 4, 5).
This is only possible if you retrieve the first 4 vectors from 1st vector of the leftmost dimension (5), next 4 vectors from the next vector and next 2 from the 3rd.
Python slicing might not be the best tool to achieve this.
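The 4 + 4 + 2 retrieval described above can be written out explicitly. This is only a sketch using np.concatenate; the names pieces and rows are made up for the example.

```python
import numpy as np

a = np.arange(100).reshape(20, 5)
b = a.reshape(5, 4, 5)

# 4 rows from b[0], 4 rows from b[1], and the first 2 rows of b[2].
pieces = [b[0], b[1], b[2, :2]]
rows = np.concatenate(pieces, axis=0)
```

Unlike a plain slice, np.concatenate builds a new array, so rows is a copy rather than a view of b.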
I have a 40 x 40 x 40 array in Python and would like to extract all i values at index k=10. I understand how to do this with an n x n array but not with an n x n x n array.
Thanks
As per the NumPy indexing documentation:
import numpy as np
arr = np.arange(3*4*5).reshape( 3,4,5 )
arr
# array([[[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19]],
# [[20, 21, 22, 23, 24],
# [25, 26, 27, 28, 29],
# [30, 31, 32, 33, 34],
# [35, 36, 37, 38, 39]],
# [[40, 41, 42, 43, 44],
# [45, 46, 47, 48, 49],
# [50, 51, 52, 53, 54],
# [55, 56, 57, 58, 59]]])
arr[:,:,1] # For k = 1
# or arr[..., 1], as it's the last index
This returns a 2d array.
array([[ 1, 6, 11, 16],
[21, 26, 31, 36],
[41, 46, 51, 56]])
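Applied to the sizes in the question, the same indexing looks like the sketch below. The array contents are a hypothetical stand-in (np.arange), and treating k as the last axis is an assumption, since the question does not say which axis k refers to.

```python
import numpy as np

# Hypothetical stand-in for the 40 x 40 x 40 array in the question.
arr = np.arange(40 * 40 * 40).reshape(40, 40, 40)

plane = arr[:, :, 10]      # every (i, j) pair at k = 10, shape (40, 40)
same_plane = arr[..., 10]  # Ellipsis form, equivalent for the last axis
column = arr[:, 0, 10]     # only the i axis, at j = 0 and k = 10
```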
Suppose I have an array with shape (3, 4, 5) and want to slice along the second axis with an index array [2, 1, 0].
I find it hard to explain what I want in words, so please refer to the code and figure below:
>>> src = np.arange(3*4*5).reshape(3,4,5)
>>> index = [2,1,0]
>>> src
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
>>> # what I need is:
array([[[10, 11, 12, 13, 14]], # row 2 of block 0 (index[0] == 2)
[[25, 26, 27, 28, 29]], # row 1 of block 1 (index[1] == 1)
[[40, 41, 42, 43, 44]]]) # row 0 of block 2 (index[2] == 0)
src[np.arange(src.shape[0]), [2, 1, 0]]
# src[np.arange(src.shape[0]), [2, 1, 0], :]
array([[10, 11, 12, 13, 14],
[25, 26, 27, 28, 29],
[40, 41, 42, 43, 44]])
We need to compute the indices for axis=0:
>>> np.arange(src.shape[0])
array([0, 1, 2])
And we already have the indices for axis=1. We then take a full slice along axis=2 (the last axis) to extract our cross-section.
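The result above has shape (3, 5), while the question sketches a (3, 1, 5) result. Assuming that extra axis is actually wanted, two ways to keep it are shown below; np.take_along_axis is an alternative technique not used in the original answer.

```python
import numpy as np

src = np.arange(3 * 4 * 5).reshape(3, 4, 5)
index = np.array([2, 1, 0])

# Integer-array indexing drops axis 1, giving shape (3, 5).
flat = src[np.arange(src.shape[0]), index]

# Re-insert the axis to get the (3, 1, 5) layout shown in the question.
kept = flat[:, np.newaxis, :]

# Alternative: take_along_axis keeps the axis; the index array must have
# the same number of dimensions as src, hence the two trailing length-1 axes.
taken = np.take_along_axis(src, index[:, None, None], axis=1)
```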
You could do:
import numpy as np
arr = np.array([[[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
first, second = zip(*enumerate([2, 1, 0]))
result = arr[first, second, :]
print(result)
Output
[[10 11 12 13 14]
[25 26 27 28 29]
[40 41 42 43 44]]
I have a 3D tensor A x B x C. For each matrix B x C, I want to extract the leading diagonal.
Is there a vectorized way of doing this in numpy or pytorch instead of looping over A?
You can use numpy.diagonal()
np.diagonal(a, axis1=1, axis2=2)
Example:
In [10]: a = np.arange(3*4*5).reshape(3,4,5)
In [11]: a
Out[11]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
In [12]: np.diagonal(a, axis1=1, axis2=2)
Out[12]:
array([[ 0, 6, 12, 18],
[20, 26, 32, 38],
[40, 46, 52, 58]])
Assuming that the leading diagonal of a generic non-square (B x C) slice starts from the top-left corner, we can reshape and slice with a step of C+1 -
a.reshape(a.shape[0],-1)[:,::a.shape[-1]+1]
Sample run -
In [193]: np.random.seed(0)
In [194]: a = np.random.randint(11,99,(3,4,5))
In [195]: a
Out[195]:
array([[[55, 58, 75, 78, 78],
[20, 94, 32, 47, 98],
[81, 23, 69, 76, 50],
[98, 57, 92, 48, 36]],
[[88, 83, 20, 31, 91],
[80, 90, 58, 75, 93],
[60, 40, 30, 30, 25],
[50, 43, 76, 20, 68]],
[[43, 42, 85, 34, 46],
[86, 66, 39, 45, 11],
[11, 47, 64, 16, 49],
[28, 90, 15, 53, 69]]])
In [196]: a.reshape(a.shape[0],-1)[:,::a.shape[-1]+1]
Out[196]:
array([[55, 94, 69, 48],
[88, 90, 30, 20],
[43, 66, 64, 53]])
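One caveat that can be checked directly: when a slice has more than C + 1 rows (B > C + 1), the stride of C + 1 keeps going past the end of the diagonal and picks up extra elements. The sketch below adds a length guard and compares against np.diagonal; strided_diagonal is a hypothetical helper name, not something from the answer.

```python
import numpy as np

def strided_diagonal(a):
    # Hypothetical helper: the reshape-and-stride trick with a length guard.
    n_diag = min(a.shape[1], a.shape[2])
    flat = a.reshape(a.shape[0], -1)
    return flat[:, ::a.shape[2] + 1][:, :n_diag]

wide = np.arange(3 * 4 * 5).reshape(3, 4, 5)   # B <= C, as in the sample run
tall = np.arange(2 * 6 * 4).reshape(2, 6, 4)   # B > C + 1

# Without the guard, the tall case yields one element too many per slice.
unguarded_tall = tall.reshape(tall.shape[0], -1)[:, ::tall.shape[-1] + 1]
```

For shapes like the (3,4,5) sample, the guard changes nothing, so the original one-liner is fine there.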
In PyTorch, use torch.diagonal() (here t is the A x B x C tensor; the keyword arguments are named dim1 and dim2):
t.diagonal(dim1=-2, dim2=-1)
Sorry if this question is too basic.
A=np.arange(64).reshape(2,32)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
[32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]])
A.reshape(4,4,4)
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]],
[[32, 33, 34, 35],
[36, 37, 38, 39],
[40, 41, 42, 43],
[44, 45, 46, 47]],
[[48, 49, 50, 51],
[52, 53, 54, 55],
[56, 57, 58, 59],
[60, 61, 62, 63]]])
Now, I would have liked something like A[2] or A[2,:] or A[2,:,:] to return the matrix
[[32, 33, 34, 35],
[36, 37, 38, 39],
[40, 41, 42, 43],
[44, 45, 46, 47]]
and A[2,2,2] to return 42, for example,
but I got this error:
IndexError: too many indices for array
You have to do
A = A.reshape(4,4,4)
instead of
A.reshape(4,4,4)
reshape does not work in place: it returns a new array object, so you need to assign the result. Then you can do
A[2,2,2]
Out[301]: 42
After a bare A.reshape(4,4,4), A itself does not change.
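A minimal sketch of the difference, using the array from the question. The names shape_before, block, and value are illustrative only.

```python
import numpy as np

A = np.arange(64).reshape(2, 32)

A.reshape(4, 4, 4)           # result is discarded; A is still (2, 32)
shape_before = A.shape

A = A.reshape(4, 4, 4)       # rebind the name to the reshaped array
block = A[2]                 # the 4 x 4 matrix starting at 32
value = A[2, 2, 2]
```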
I am using nested lists as a matrix representation. I created the following function for splitting square matrices of size 2^k into four equal parts (used for the Strassen algorithm):
import itertools
def splitmat(mat):
    # Python 2 syntax: tuple-unpacking lambda and integer division with "/"
    n = len(mat)
    return map( \
        lambda (x,y):map(lambda z:z[y[0]:y[1]],mat[x[0]:x[1]]), \
        itertools.product([(0,n/2),(n/2,n)],repeat=2)
    )
Now I'm trying to find an inverse function that joins the four parts back into a full matrix. I could use two nested loops, but is there a more pythonic way to achieve this? I would prefer not to use numpy, only builtin modules. Do you have any idea or hint how to achieve this?
Your inverse operation can be split into 2 simpler operations:
stacking matrices vertically (like numpy.vstack)
stacking matrices horizontally (like numpy.hstack)
So, if you have matrix divided into 4 submatrix:
M = |m1|m2|
    |m3|m4|
then M = vstack(hstack(m1, m2), hstack(m3, m4)).
These operations can be coded with builtin modules only, like this:
import itertools
import math
# iterators
def ihstack(*matrixes):
    # join matrices side by side: chain the k-th row of every matrix
    return map(lambda rows: itertools.chain(*rows), zip(*matrixes))
def ivstack(*matrixes):
    # stack matrices on top of each other: chain their rows
    return itertools.chain(*matrixes)
# main function
def squarejoin(*matrixes):
    size = int(math.sqrt(len(matrixes)))
    assert size ** 2 == len(matrixes), 'Incorrect number of matrices'
    return _matrixjoin(matrixes, size, size)
def _matrixjoin(matrixes, hsize, vsize):
    # hstack each band of hsize matrices, then vstack the vsize bands
    return ivstack(*(ihstack(*itertools.islice(matrixes, i*hsize, (i+1)*hsize)) for i in range(vsize)))
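A usage sketch under Python 3, with the helpers restated compactly so the block is self-contained. The quadrant values m1 to m4 are made up for the example, and the list comprehension at the end is an addition: the helpers return lazy iterators, so the rows have to be materialised to get a list of lists.

```python
import itertools
import math

def ihstack(*matrices):
    # Join matrices side by side: chain the k-th row of each matrix.
    return (itertools.chain(*rows) for rows in zip(*matrices))

def ivstack(*matrices):
    # Stack matrices on top of each other: chain their rows.
    return itertools.chain(*matrices)

def squarejoin(*matrices):
    size = int(math.sqrt(len(matrices)))
    assert size ** 2 == len(matrices), 'Incorrect number of matrices'
    bands = (ihstack(*matrices[i * size:(i + 1) * size]) for i in range(size))
    return ivstack(*bands)

# Four 2x2 quadrants of a 4x4 matrix, in row-major quadrant order.
m1 = [[0, 1], [4, 5]]
m2 = [[2, 3], [6, 7]]
m3 = [[8, 9], [12, 13]]
m4 = [[10, 11], [14, 15]]

# The iterators are lazy, so materialise them as a list of lists.
joined = [list(row) for row in squarejoin(m1, m2, m3, m4)]
```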
Here is an example program with three variants: a two-loop implementation that works and is crystal clear in its intent, a one-loop implementation that works but is, imho, slightly less clear, and finally a zero-loop (zero explicit loops, that is) implementation that, alas, is buggy.
My vote goes to the two loops. Further, I'd like to be shown what's wrong with my zero-loop attempt.
Code
# Python 2 code (print statement, tuple-unpacking lambda, list-returning map)
import itertools
def pm(m):
    for row in m: print row
mat = []
n = 8
for i in range(n):
    mat.append(range(i*n, i*n+n))
# this is shorthand for your splitmat function
res = map(lambda (x,y):
          map(lambda z:z[y[0]:y[1]],mat[x[0]:x[1]]),
          itertools.product([(0,n/2),(n/2,n)],repeat=2))
pm(res)
print "\n2 cycles"
mat = []
for i, j in ((0,1),(2,3)):
    for a, b in zip(res[i],res[j]):
        mat.append(a+b)
pm(mat)
print "\n1 cycle"
mat = []
for i, j in ((0,1),(2,3)):
    map(lambda x: mat.append(x[0]+x[1]), zip(res[i],res[j]))
pm(mat)
print "\n0 cycles"
mat = map(lambda i_j:
          map(lambda x: x[0]+x[1], zip(res[i_j[0]],res[i_j[1]])), ((0,1),(2,3)))
pm(mat)
Output
[[0, 1, 2, 3], [8, 9, 10, 11], [16, 17, 18, 19], [24, 25, 26, 27]]
[[4, 5, 6, 7], [12, 13, 14, 15], [20, 21, 22, 23], [28, 29, 30, 31]]
[[32, 33, 34, 35], [40, 41, 42, 43], [48, 49, 50, 51], [56, 57, 58, 59]]
[[36, 37, 38, 39], [44, 45, 46, 47], [52, 53, 54, 55], [60, 61, 62, 63]]
2 cycles
[0, 1, 2, 3, 4, 5, 6, 7]
[8, 9, 10, 11, 12, 13, 14, 15]
[16, 17, 18, 19, 20, 21, 22, 23]
[24, 25, 26, 27, 28, 29, 30, 31]
[32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47]
[48, 49, 50, 51, 52, 53, 54, 55]
[56, 57, 58, 59, 60, 61, 62, 63]
1 cycle
[0, 1, 2, 3, 4, 5, 6, 7]
[8, 9, 10, 11, 12, 13, 14, 15]
[16, 17, 18, 19, 20, 21, 22, 23]
[24, 25, 26, 27, 28, 29, 30, 31]
[32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47]
[48, 49, 50, 51, 52, 53, 54, 55]
[56, 57, 58, 59, 60, 61, 62, 63]
0 cycles
[[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15], [16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31]]
[[32, 33, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 44, 45, 46, 47], [48, 49, 50, 51, 52, 53, 54, 55], [56, 57, 58, 59, 60, 61, 62, 63]]
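The zero-loop output above suggests what goes wrong: the outer map produces one list per (i, j) pair, so the result is two bands of four rows each rather than eight rows. A possible fix, sketched here in Python 3 with list comprehensions instead of map and lambda, is to flatten one level with itertools.chain.from_iterable. The names spans, bands, and joined are illustrative only.

```python
import itertools

n = 8
mat = [list(range(i * n, i * n + n)) for i in range(n)]
half = n // 2
spans = [(0, half), (half, n)]

# Quadrants in the same order as splitmat: TL, TR, BL, BR.
res = [[row[c0:c1] for row in mat[r0:r1]]
       for (r0, r1), (c0, c1) in itertools.product(spans, repeat=2)]

# Each (i, j) pair yields a band of joined rows, so the outer result is
# a list of two bands rather than a flat list of eight rows.
bands = [[left + right for left, right in zip(res[i], res[j])]
         for i, j in ((0, 1), (2, 3))]

# Flattening one level recovers the original matrix.
joined = list(itertools.chain.from_iterable(bands))
```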