Can someone tell me how to print a nested list containing the numbers 1-100, with 10 numbers per row?
Like this:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
new row
...
...and so on.
You can make a list comprehension that calls range for the outside list and again for the inside list:
>>> [list(range(i, i+10)) for i in range(1,100,10)]
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
[31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
[41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60],
[61, 62, 63, 64, 65, 66, 67, 68, 69, 70],
[71, 72, 73, 74, 75, 76, 77, 78, 79, 80],
[81, 82, 83, 84, 85, 86, 87, 88, 89, 90],
[91, 92, 93, 94, 95, 96, 97, 98, 99, 100]]
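Since the question asks for printed rows rather than the list itself, here is a minimal sketch of one way to print it. The names rows and lines are just illustrative.

```python
# Build the nested list: ten rows of ten consecutive numbers each.
rows = [list(range(i, i + 10)) for i in range(1, 100, 10)]

# Turn every row into one space-separated string.
lines = [" ".join(str(n) for n in row) for row in rows]

# Print one row per line.
print("\n".join(lines))
```

Building the strings first keeps the formatting separate from the printing, which also makes the result easy to check.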
I have a NumPy array with the x and y values of points, and another array that contains pairs of start and end indices. Originally this data was in a pandas DataFrame, but since it had over 60 million items, indexing with loc was very slow. Is there a fast NumPy method to split this?
import numpy as np
xy_array = np.arange(100).reshape(2,-1)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
split_pairs = [[0, 10], [10, 13], [13, 17], [20, 22]]
expected_result = [
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [50, 51, 52, 53, 54, 55, 56, 57, 58, 59]],
[[10, 11, 12], [60, 61, 62]],
[[13, 14, 15, 16], [63, 64, 65, 66]],
[[20, 21], [70, 71]]
]
Update:
The next pair does not always start where the previous one ends.
This will do it:
import numpy as np
xy_array = np.array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
split_pairs = [[0, 10], [10, 13], [13, 17]]
expected_result = [xy_array[:, x:y] for x, y in split_pairs]
expected_result
#[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [50, 51, 52, 53, 54, 55, 56, 57, 58, 59]]), array([[10, 11, 12],
# [60, 61, 62]]), array([[13, 14, 15, 16],
# [63, 64, 65, 66]])]
This uses plain index slicing of the form array[rows, columns]: the : takes all rows, and x:y takes the columns from x up to (but not including) y.
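As a small sketch of the same slicing with pairs that are not contiguous (the case from the question's update): the names pairs and chunks are made up for illustration. Basic slices like these are views, so the column data should not be copied.

```python
import numpy as np

xy_array = np.arange(100).reshape(2, -1)

# Pairs need not be contiguous: columns 17..19 are skipped here.
pairs = [(0, 10), (10, 13), (13, 17), (20, 22)]

# Each xy_array[:, start:stop] is a basic slice, i.e. a view on xy_array.
chunks = [xy_array[:, start:stop] for start, stop in pairs]

# Confirm that no chunk owns a separate copy of the data.
all_views = all(np.shares_memory(chunk, xy_array) for chunk in chunks)
```

The Python-level loop runs once per pair, not once per point, so its cost depends on the number of pairs and not on the 60 million items.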
You can also use the np.array_split function provided by NumPy and pass it either a number of sections or the split points you want:
>>> x = np.arange(8.0)
>>> np.array_split(x, 3)
[array([ 0., 1., 2.]), array([ 3., 4., 5.]), array([ 6., 7.])]
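For the xy_array from the question, a sketch of np.array_split with explicit split points along the column axis might look like this. It assumes the pairs are contiguous. Gaps between pairs, as in the question's update, would still need plain slicing.

```python
import numpy as np

xy_array = np.arange(100).reshape(2, -1)

# Split points along axis 1 give the pieces [:10], [10:13], [13:17]
# and the remainder [17:], which can simply be ignored.
pieces = np.array_split(xy_array, [10, 13, 17], axis=1)
```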
I have a 3D tensor A x B x C. For each matrix B x C, I want to extract the leading diagonal.
Is there a vectorized way of doing this in numpy or pytorch instead of looping over A?
You can use numpy.diagonal()
np.diagonal(a, axis1=1, axis2=2)
Example:
In [10]: a = np.arange(3*4*5).reshape(3,4,5)
In [11]: a
Out[11]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
In [12]: np.diagonal(a, axis1=1, axis2=2)
Out[12]:
array([[ 0, 6, 12, 18],
[20, 26, 32, 38],
[40, 46, 52, 58]])
Assuming that the leading diagonal for a generic non-square (B x C) slice starts from the top-left corner, we can reshape and slice -
a.reshape(a.shape[0],-1)[:,::a.shape[-1]+1]
Sample run -
In [193]: np.random.seed(0)
In [194]: a = np.random.randint(11,99,(3,4,5))
In [195]: a
Out[195]:
array([[[55, 58, 75, 78, 78],
[20, 94, 32, 47, 98],
[81, 23, 69, 76, 50],
[98, 57, 92, 48, 36]],
[[88, 83, 20, 31, 91],
[80, 90, 58, 75, 93],
[60, 40, 30, 30, 25],
[50, 43, 76, 20, 68]],
[[43, 42, 85, 34, 46],
[86, 66, 39, 45, 11],
[11, 47, 64, 16, 49],
[28, 90, 15, 53, 69]]])
In [196]: a.reshape(a.shape[0],-1)[:,::a.shape[-1]+1]
Out[196]:
array([[55, 94, 69, 48],
[88, 90, 30, 20],
[43, 66, 64, 53]])
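One caveat with the strided slice: when a slice has more than C + 1 rows, the stride runs past the diagonal and picks up extra elements. A hedged sketch of a variant that trims the result to min(B, C) entries and checks it against np.diagonal (the names n_diag, strided, tall and tall_diag are illustrative):

```python
import numpy as np

a = np.arange(3 * 4 * 5).reshape(3, 4, 5)
n_diag = min(a.shape[1], a.shape[2])  # diagonal length of a B x C slice

# Strided slice over the flattened B*C axis, trimmed to the diagonal length.
strided = a.reshape(a.shape[0], -1)[:, ::a.shape[-1] + 1][:, :n_diag]
reference = np.diagonal(a, axis1=1, axis2=2)

# A tall case (B > C + 1), where the untrimmed slice would have 5 entries.
tall = np.arange(2 * 6 * 4).reshape(2, 6, 4)
tall_diag = tall.reshape(2, -1)[:, ::4 + 1][:, :min(6, 4)]
```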
In PyTorch, use torch.diagonal(), whose dimension arguments are named dim1 and dim2:
t.diagonal(dim1=-2, dim2=-1)
I have data in a numpy array:
a = np.arange(100)
a = a.reshape((20,5))
When I type
a[:10]
it returns
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39],
[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49]])
Now I decided to reshape the array into a 3D array.
b = a.reshape((5,4,5))
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]],
[[60, 61, 62, 63, 64],
[65, 66, 67, 68, 69],
[70, 71, 72, 73, 74],
[75, 76, 77, 78, 79]],
[[80, 81, 82, 83, 84],
[85, 86, 87, 88, 89],
[90, 91, 92, 93, 94],
[95, 96, 97, 98, 99]]])
How do I slice b so that I obtain the same values as a[:10]?
I tried
b[:10,0,:5]
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44],
[50, 51, 52, 53, 54],
[60, 61, 62, 63, 64],
[70, 71, 72, 73, 74],
[80, 81, 82, 83, 84],
[90, 91, 92, 93, 94]])
But it's not correct.
Thank you in advance!
When you use b = a.reshape((5,4,5)) you just create a different view on the same data used by the array a (i.e., changes to the elements of a will appear in b). reshape() does not copy data in this case, so it is a very fast operation. Slicing b and slicing a access the same memory, so there is no need for a different syntax for the b array (just use a[:10]). If you have created a copy of the data, perhaps with np.resize(), and discarded a, just reshape b back: b.reshape((20,5))[:10].
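A quick way to check both claims is the sketch below. The names shares and first_ten are illustrative, and np.shares_memory is used only to confirm that b is a view.

```python
import numpy as np

a = np.arange(100).reshape((20, 5))
b = a.reshape((5, 4, 5))

# b is a view on a's buffer, not a copy.
shares = np.shares_memory(a, b)

# Merge the two leading axes back together, then slice as with a.
first_ten = b.reshape((20, 5))[:10]
```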
By reshaping (20,5) to (5,4,5), there's no way you can pull out the 1st half of the values. You can't split those 5 rows into 2 even groups:
In [9]: b[:2]
Out[9]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]]])
In [10]: b[:3]
Out[10]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
The last row of a[:10] is in the middle of b[2,:,:] (the third block).
Note that b[:2] is (2,4,5), 8 rows of a, grouped into 2 sets of 4.
Now if you'd done c=a.reshape(4,5,5), then c[:2] would have those same 10 rows - in 2 sets of 5. And c[:2].reshape(10,-1) will look just like a[:10].
There could be a programmatic way to get what you want, but not a python slice.
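To see where a given row of a ends up in b, a small sketch with divmod can help (block and row_in_block are illustrative names). It also checks the c reshape described above.

```python
import numpy as np

a = np.arange(100).reshape(20, 5)
b = a.reshape(5, 4, 5)

# Row r of a lands at b[r // 4, r % 4], because each block of b holds 4 rows.
block, row_in_block = divmod(9, b.shape[1])  # row 9 is the last row of a[:10]

# With blocks of 5 rows, the first 10 rows are exactly the first 2 blocks.
c = a.reshape(4, 5, 5)
first_ten = c[:2].reshape(10, -1)
```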
It is important to understand what every component in the shape tells us about the arrangement. I like to think in terms of vectors.
Let's talk about the shape (20, 5) - this would mean, I have 20 vectors where every vector has 5 elements.
For the shape (5, 4, 5) - this would mean, I have 5 vectors, each of which holds 4 vectors, each of which in turn has 5 elements.
This might sound complicated, but it becomes clear with a little thought.
Coming to your question, by a[:10] you want to retrieve the first 10 rows where each row should be a vector containing 5 elements but using a shape of (5, 4, 5).
This is only possible if you retrieve the first 4 vectors from 1st vector of the leftmost dimension (5), next 4 vectors from the next vector and next 2 from the 3rd.
Python slicing might not be the best tool to achieve this.
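If b really is all you have, one possible sketch of that retrieval joins the pieces with np.concatenate. Note that this copies the data, unlike a slice. The name first_ten is illustrative.

```python
import numpy as np

a = np.arange(100).reshape(20, 5)
b = a.reshape(5, 4, 5)

# 4 rows from b[0], 4 rows from b[1], and the first 2 rows from b[2].
first_ten = np.concatenate([b[0], b[1], b[2, :2]])
```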
I am using nested arrays as a matrix representation. I created the following function for splitting square matrices of size 2^k into four equal parts (used for the Strassen algorithm):
import itertools

def splitmat(mat):
    # returns [top-left, top-right, bottom-left, bottom-right]
    n = len(mat)
    return [
        [row[c0:c1] for row in mat[r0:r1]]
        for (r0, r1), (c0, c1) in itertools.product([(0, n//2), (n//2, n)], repeat=2)
    ]
Now I'm trying to find an inverse function that joins the four parts back into a full matrix. I could use two nested loops, but is there a more Pythonic way to achieve this? I would prefer not to use numpy, only built-in modules. Do you have any idea or hint how to achieve this?
Your inverse operation can be split into 2 simpler operations:
concatenate rows (numpy.vstack)
concatenate columns (numpy.hstack)
So, if you have a matrix divided into 4 submatrices:
M = |m1|m2|
    |m3|m4|
then M = vstack(hstack(m1, m2), hstack(m3, m4)).
These operations can be coded like this:
import itertools
import math

# iterators
def ihstack(*matrixes):
    return map(lambda rows: itertools.chain(*rows), zip(*matrixes))

def ivstack(*matrixes):
    return itertools.chain(*matrixes)

# main function
def squarejoin(*matrixes):
    size = int(math.sqrt(len(matrixes)))
    assert size ** 2 == len(matrixes), 'Incorrect number of matrices'
    return _matrixjoin(matrixes, size, size)

def _matrixjoin(matrixes, hsize, vsize):
    print(matrixes, hsize, vsize)
    return ivstack(*(ihstack(*itertools.islice(matrixes, i*hsize, (i+1)*hsize)) for i in range(vsize)))
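For the four-quadrant case only, a shorter list-based sketch could look like the following. The function name joinmat is hypothetical, and it assumes the parts arrive in the order top-left, top-right, bottom-left, bottom-right (the order splitmat produces).

```python
def joinmat(parts):
    # parts is assumed to be [top_left, top_right, bottom_left, bottom_right],
    # each given as a list of rows.
    top_left, top_right, bottom_left, bottom_right = parts
    # Join matching rows side by side, then stack the two halves.
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom

quadrants = [
    [[0, 1], [4, 5]],
    [[2, 3], [6, 7]],
    [[8, 9], [12, 13]],
    [[10, 11], [14, 15]],
]
joined = joinmat(quadrants)
```

Unlike the iterator version, this returns plain lists, which is convenient when the result gets indexed or split again in the next Strassen step.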
Here I have an example program where a two-loop implementation works and is crystal clear in its intent, a one-loop implementation works and is, imho, slightly less clear, and finally a zero-loop (zero explicit loops, btw) implementation that, alas, is buggy.
My vote goes to the two loops... further, I'd like to be shown what's wrong with my 0 loops attempt
Code
import itertools

def pm(m):
    # print a list of rows, one row per line
    for row in m:
        print(row)

mat = []
n = 8
for i in range(n):
    mat.append(list(range(i*n, i*n+n)))

# this is shorthand for your splitmat function
res = list(map(lambda xy:
    list(map(lambda z: z[xy[1][0]:xy[1][1]], mat[xy[0][0]:xy[0][1]])),
    itertools.product([(0,n//2),(n//2,n)],repeat=2)))
pm(res)

print("\n2 cycles")
mat = []
for i, j in ((0,1),(2,3)):
    for a, b in zip(res[i],res[j]):
        mat.append(a+b)
pm(mat)

print("\n1 cycle")
mat = []
for i, j in ((0,1),(2,3)):
    # extend() consumes the lazy map, so every joined row is appended
    mat.extend(map(lambda x: x[0]+x[1], zip(res[i],res[j])))
pm(mat)

print("\n0 cycles")
mat = list(map(lambda i_j:
    list(map(lambda x: x[0]+x[1], zip(res[i_j[0]],res[i_j[1]]))), ((0,1),(2,3))))
pm(mat)
Output
[[0, 1, 2, 3], [8, 9, 10, 11], [16, 17, 18, 19], [24, 25, 26, 27]]
[[4, 5, 6, 7], [12, 13, 14, 15], [20, 21, 22, 23], [28, 29, 30, 31]]
[[32, 33, 34, 35], [40, 41, 42, 43], [48, 49, 50, 51], [56, 57, 58, 59]]
[[36, 37, 38, 39], [44, 45, 46, 47], [52, 53, 54, 55], [60, 61, 62, 63]]
2 cycles
[0, 1, 2, 3, 4, 5, 6, 7]
[8, 9, 10, 11, 12, 13, 14, 15]
[16, 17, 18, 19, 20, 21, 22, 23]
[24, 25, 26, 27, 28, 29, 30, 31]
[32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47]
[48, 49, 50, 51, 52, 53, 54, 55]
[56, 57, 58, 59, 60, 61, 62, 63]
1 cycle
[0, 1, 2, 3, 4, 5, 6, 7]
[8, 9, 10, 11, 12, 13, 14, 15]
[16, 17, 18, 19, 20, 21, 22, 23]
[24, 25, 26, 27, 28, 29, 30, 31]
[32, 33, 34, 35, 36, 37, 38, 39]
[40, 41, 42, 43, 44, 45, 46, 47]
[48, 49, 50, 51, 52, 53, 54, 55]
[56, 57, 58, 59, 60, 61, 62, 63]
0 cycles
[[0, 1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14, 15], [16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31]]
[[32, 33, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 44, 45, 46, 47], [48, 49, 50, 51, 52, 53, 54, 55], [56, 57, 58, 59, 60, 61, 62, 63]]
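As the output shows, the zero-loop version returns one list of rows per pair of quadrants, so the result is nested one level too deep. A possible fix, sketched here on a smaller 4x4 example with hand-written quadrants, is to chain those lists together with itertools.chain.from_iterable:

```python
import itertools

# Four quadrants of a 4x4 matrix, in the order the split above yields them.
res = [
    [[0, 1], [4, 5]],
    [[2, 3], [6, 7]],
    [[8, 9], [12, 13]],
    [[10, 11], [14, 15]],
]

# The map-only version yields one group of rows per pair of quadrants...
nested = map(lambda i_j: map(lambda x: x[0] + x[1],
                             zip(res[i_j[0]], res[i_j[1]])),
             ((0, 1), (2, 3)))

# ...so chaining the groups together gives the flat list of rows.
mat = list(itertools.chain.from_iterable(nested))
```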