I have a functional evaluation which has many parameters, and I want to vectorize the evaluation. Something like this:
I = 100
J = 34
K = 6
i, j, k = array(range(I)), array(range(J)), array(range(K))
i, j, k = meshgrid(i, j, k)
f = myfun(i, j, k)
This works well. However, I now also have a parameter to pass to myfun that is generated by another function and is invariant over some of the indices above:
p = my_param_gen()
and let's say
p.shape
will output
(100, 6)
This would correspond to p being invariant over the index J. Now, I would like to expand the shape of p to be
(100, 34, 6)
in a meshgrid-like fashion, so that the new dimension is filled with copies of the values along the old dimensions. What is the best way to do this? The approach should also work when adding several new dimensions. I have seen numpy.expand_dims, but it does not do this.
Your meshgrid call uses the default indexing='xy', which swaps the first two axes:
In [116]: i.shape
Out[116]: (34, 100, 6)
If p.shape is (100,6), then p will broadcast with i, j, k without further change. That is, the p[None,:,:] expansion is automatic.
If you'd used i, j, k = np.meshgrid(i, j, k, indexing='ij'),
In [121]: i.shape
Out[121]: (100, 34, 6)
And p[:,None,:] would be needed for broadcasting (equivalently, np.expand_dims(p,1)).
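To make the broadcasting concrete, here is a minimal sketch assuming indexing='ij'. The random array stands in for my_param_gen(), and the sum stands in for myfun; both are placeholders, not the original functions.

```python
import numpy as np

I, J, K = 100, 34, 6
i, j, k = np.meshgrid(np.arange(I), np.arange(J), np.arange(K), indexing='ij')

# Placeholder for my_param_gen(): any array of shape (I, K) behaves the same way.
p = np.random.rand(I, K)

# Insert a length-1 axis where J belongs; broadcasting stretches it on demand.
p_expanded = p[:, None, :]              # shape (100, 1, 6)

# Placeholder for myfun(i, j, k, p): any elementwise expression broadcasts like this.
result = i + j + k + p_expanded         # shape (100, 34, 6)

# Only materialise the full shape if something downstream really needs it.
p_full = np.broadcast_to(p_expanded, (I, J, K))   # read-only view, no copy
```

Each extra invariant dimension is just one more None in the index, so the same pattern extends to several new axes.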
I am attempting a numpy.matmul call using as variables
Matrix A of dimensions (p, t, q)
Matrix B of dimensions (r, t).
A categories vector of shape (r,) holding p distinct categories, used to take slices of B and to select which index of A to use.
The multiplications are done iteratively using the indices of each category. For each category p_i, I extract from A a submatrix (t, q). Then I multiply it with a subset of B of shape (x, t), where x is the number of rows selected by the mask categories == p_i. Finally, the matrix multiplication of (x, t) and (t, q) produces the output (x, q), which is stored in the masked rows of S.
I cannot figure out a non-iterative version of this algorithm. The first snippet shows an iterative solution. The second is an attempt at what I would like: everything computed in a single step, which would presumably be faster. However, it is incorrect because matrix A has three dimensions instead of two. Maybe there is no way to do this in NumPy with a single call, so in general I am looking for advice or ideas to try out.
Thanks!
import numpy as np
p, q, r, t = 2, 9, 512, 4
# data initialization (random)
np.random.seed(500)
S = np.random.rand(r, q)
A = np.random.randint(0, 3, size=(p, t, q))
B = np.random.rand(r, t)
categories = np.random.randint(0, p, r)
print('iterative') # iterative
for i in range(p):
    # print(i)
    a = A[i, :, :]
    mask = categories == i
    b = B[mask]
    print(b.shape, a.shape, S[mask].shape,
          np.matmul(b, a).shape)
    S[mask] = np.matmul(b, a)
print(S.shape)
A simple way to write it down:
S = np.random.rand(r, q)
print(A[:p,:,:].shape)
result = np.matmul(B, A[:p,:,:])
# iterative assignment
i = 0
S[categories == i] = result[i, categories == i, :]
i = 1
S[categories == i] = result[i, categories == i, :]
The next snippet will produce an error during the multiplication step.
# attempt to multiply once, indexing all categories only once (not possible)
np.random.seed(500)
S = np.random.rand(r, q)
# attempt to use the categories vector
a = A[categories, :, :]
b = B[categories]
# due to the shapes of the arrays, this multiplication is not possible
print('\nsingle step (error due to shapes of the matrix a')
print(b.shape, a.shape, S[categories].shape)
S[categories] = np.matmul(b, a)
print(S.shape)
iterative
(250, 4) (4, 9) (250, 9) (250, 9)
(262, 4) (4, 9) (262, 9) (262, 9)
(512, 9)
single step (error due to shapes of the 2nd matrix a).
(512, 4) (512, 4, 9) (512, 9)
In [63]: (np.ones((512,4))@np.ones((512,4,9))).shape
Out[63]: (512, 512, 9)
This is because the first array is broadcast to (1,512,4). I think you want to do this instead:
In [64]: (np.ones((512,1,4))@np.ones((512,4,9))).shape
Out[64]: (512, 1, 9)
Then remove the middle dimension to get (512,9).
Another way:
In [72]: np.einsum('ij,ijk->ik', np.ones((512,4)), np.ones((512,4,9))).shape
Out[72]: (512, 9)
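Applied to the data from the question, the two suggestions above might look like the sketch below. It assumes that B should be used as is (each row of B is already one sample) and that only A is gathered with categories; the names S_loop, A_rows, S_einsum and S_matmul are made up for illustration.

```python
import numpy as np

p, q, r, t = 2, 9, 512, 4
rng = np.random.default_rng(500)
A = rng.integers(0, 3, size=(p, t, q))
B = rng.random((r, t))
categories = rng.integers(0, p, size=r)

# Reference: the per-category loop from the question.
S_loop = np.empty((r, q))
for c in range(p):
    mask = categories == c
    S_loop[mask] = B[mask] @ A[c]

# Gather one (t, q) matrix per row of B, then contract over t row by row.
A_rows = A[categories]                              # (r, t, q)
S_einsum = np.einsum('ij,ijk->ik', B, A_rows)       # (r, q)
S_matmul = (B[:, None, :] @ A_rows)[:, 0, :]        # (r, 1, q) -> (r, q)
```

Note that A[categories] copies a (t, q) block for every row, so memory use grows with r; whether this beats the loop depends on the sizes involved.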
To remove the loop altogether, you can try this
bigmask = np.arange(p)[:, np.newaxis] == categories
C = np.matmul(B, A)
res = C[np.broadcast_to(bigmask[..., np.newaxis], C.shape)].reshape(r, q)
# `res` has the same rows as the iterative `S` but in the wrong order
# so we need to reorder the rows
sort_index = np.argsort(np.broadcast_to(np.arange(r), bigmask.shape)[bigmask])
assert np.allclose(S, res[sort_index])
Though I'm not sure it's much faster than the iterative version.
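A possibly simpler variant of the same idea, offered as a sketch rather than a benchmarked solution: compute all p products as above, then pick the wanted row of each with integer indexing. This keeps the rows in their original order, so no reordering step is needed. The name S_pick is made up for illustration.

```python
import numpy as np

p, q, r, t = 2, 9, 512, 4
rng = np.random.default_rng(500)
A = rng.integers(0, 3, size=(p, t, q))
B = rng.random((r, t))
categories = rng.integers(0, p, size=r)

# Reference: the per-category loop from the question.
S_loop = np.empty((r, q))
for c in range(p):
    mask = categories == c
    S_loop[mask] = B[mask] @ A[c]

# All p products at once: C[c, n] is row n of B multiplied by A[c].
C = np.matmul(B, A)                       # (p, r, q)

# For each row n, keep only the product that belongs to its own category.
S_pick = C[categories, np.arange(r)]      # (r, q)
```

Like the masked version, this still computes p times more products than are kept.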
I have a 2D numpy array arr of shape (m,n) with nonnegative values. I would like to find a pair (k,l) such that
the difference between sum(arr[:k, :]) and sum(arr[k:, :]) is minimal
similarly, the difference between sum(arr[:, :l]) and sum(arr[:, l:]) is minimal
If you can come up with an algorithm only for k, the rest is actually easy. We simply transpose the matrix to find l.
A note for the skeptical: We may assume that sum(arr[:k, :]) and sum(arr[:,:l]) are strictly increasing functions of k and l, respectively.
This works:
sum_to_k = np.pad(np.cumsum(np.sum(arr, axis=1)), (1, 0))
sum_to_l = np.pad(np.cumsum(np.sum(arr, axis=0)), (1, 0))
k = np.argmin(np.abs(sum_to_k - (sum_to_k[-1] - sum_to_k)))
l = np.argmin(np.abs(sum_to_l - (sum_to_l[-1] - sum_to_l)))
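A small worked example of the prefix-sum idea, wrapped in a helper. The function name best_split is made up for illustration and assumes a 2D array.

```python
import numpy as np

def best_split(arr, axis):
    """Cut position along `axis` that best balances the sums on both sides."""
    # Collapse the other axis of the 2D array, leaving one total per row/column.
    totals = arr.sum(axis=1 - axis)
    # prefix[c] is the sum of everything before cut position c (prefix[0] == 0).
    prefix = np.concatenate(([0], np.cumsum(totals)))
    # |before - after| = |prefix - (total - prefix)| = |2 * prefix - total|
    return int(np.argmin(np.abs(2 * prefix - prefix[-1])))

arr = np.array([[1, 2],
                [3, 4],
                [5, 6]])

k = best_split(arr, axis=0)   # 2: rows split into sums 10 and 11
l = best_split(arr, axis=1)   # 1: columns split into sums 9 and 12
```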
I have some code here (used for gradient calculation) - Example values are commented:
dE_dx_strided = np.einsum('wxyd,ijkd->wxyijk', dE_dy, f)
# dE_dx_strided.shape = (64, 25, 25, 4, 4, 3)
imax, jmax, di, dj = dE_dx_strided.shape[1:5]
# imax, jmax, di, dj = (25, 25, 4, 4)
dE_dx = np.zeros_like(x)
# dE_dx.shape = (64, 28, 28, 3)
for i in range(imax):
    for j in range(jmax):
        dE_dx[:, i:i+di, j:j+dj, :] += dE_dx_strided[:, i, j, ...]
where dE_dx is the object of interest and dE_dx_strided is a 6-tensor which is being summed over 'piecewise', effectively, and it looks reminiscent of a convolution operation along axes 1 and 2:
# Verbose convolution operation (not my actual implementation)
for i in range(imax):
    for j in range(jmax):
        # Vaguely similar, but with filter multiplication, and = instead of +=
        y[i, j] = np.sum(x[i:i+di, j:j+dj] * f)
My original idea was to make all elements of dE_dx_strided that are to be added to a single dE_dx[:, i:i+di, j:j+dj, :] lie along one axis, and then sum over it; but I couldn't get this to work.
Now I know that for loops aren't inherently slow, but is there a numpy-esque way to optimise this further, perhaps by reshaping, summing, strides, etc.?
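One possible restructuring, sketched under the assumption that the filter is much smaller than the image: loop over the di * dj filter offsets instead of the imax * jmax output positions. Each iteration then adds one large slab, and the loop count drops from 625 to 16 for the shapes above. The small sizes here are stand-ins chosen only to keep the check fast.

```python
import numpy as np

rng = np.random.default_rng(0)
N, H, W, C = 2, 7, 7, 3                  # small stand-ins for (64, 28, 28, 3)
di, dj = 4, 4
imax, jmax = H - di + 1, W - dj + 1      # 28 - 4 + 1 == 25 in the original shapes
dE_dx_strided = rng.random((N, imax, jmax, di, dj, C))

# Reference: the original loop over output positions (imax * jmax iterations).
ref = np.zeros((N, H, W, C))
for i in range(imax):
    for j in range(jmax):
        ref[:, i:i+di, j:j+dj, :] += dE_dx_strided[:, i, j, ...]

# Same sum, looping over filter offsets instead (di * dj iterations).
# Element [n, i, j, a, b, c] always lands on dE_dx[n, i + a, j + b, c].
dE_dx = np.zeros((N, H, W, C))
for a in range(di):
    for b in range(dj):
        dE_dx[:, a:a+imax, b:b+jmax, :] += dE_dx_strided[:, :, :, a, b, :]
```

Within one (a, b) iteration no two source elements hit the same target, so a plain += on the slice is safe.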
I'm trying to figure out a way to use numpy to perform the following algebra in the most time-efficient way possible:
Given a 3D matrix/tensor, A, with shape (n, m, p) and a 2D matrix/tensor, B, with shape (n, p), calculate C_ij = sum_over_k (A_ijk * B_ik), where the resulting matrix C would have dimension (n, m).
I've tried two ways to do this. One is to loop through the first dimension, and calculate a regular dot product each time.
The other method is to use np.tensordot(A, B.T) to calculate a result with shape (n, m, n), and then take the diagonal elements along 1st and 3rd dimension. Both methods are shown below.
First method:
C = np.zeros((n,m))
for i in range(n):
    C[i] = np.dot(A[i], B[i])
Second method:
C = np.diagonal(np.tensordot(A, B.T, axes = 1), axis1=0, axis2=2).T
However, because n is very large, the loop over n in the first method takes a lot of time. The second method computes many unnecessary entries to build the huge (n, m, n) matrix and is also too slow. Is there an efficient way to do this?
Define 2 arrays:
In [168]: A = np.arange(2*3*4).reshape(2,3,4); B = np.arange(2*4).reshape(2,4)
Your iterative approach:
In [169]: [np.dot(a,b) for a,b in zip(A,B)]
Out[169]: [array([14, 38, 62]), array([302, 390, 478])]
The einsum practically writes itself from your C_ij = sum_over_k (A_ijk * B_ik):
In [170]: np.einsum('ijk,ik->ij', A, B)
Out[170]:
array([[ 14,  38,  62],
       [302, 390, 478]])
@ (matmul) was added to perform batch dot products; here the i dimension is the batch one. Since it uses the last axis of A and the second-to-last axis of B for the dot summation, we have to temporarily expand B to (2,4,1):
In [171]: A@B[...,None]
Out[171]:
array([[[ 14],
        [ 38],
        [ 62]],

       [[302],
        [390],
        [478]]])
In [172]: (A@B[...,None])[...,0]
Out[172]:
array([[ 14,  38,  62],
       [302, 390, 478]])
Typically matmul is fastest, since it passes the task to BLAS-like code.
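As a quick sanity check, the three versions above can be compared on the same small arrays. This sketch only confirms that they agree; it makes no claim about timing, and the C_* names are made up for illustration.

```python
import numpy as np

A = np.arange(2 * 3 * 4).reshape(2, 3, 4)
B = np.arange(2 * 4).reshape(2, 4)

# Loop over the batch axis, one matrix-vector product per i.
C_loop = np.array([np.dot(a, b) for a, b in zip(A, B)])

# einsum: i is kept as the batch index, k is summed out.
C_einsum = np.einsum('ijk,ik->ij', A, B)

# matmul: treat each row of B as a (4, 1) column, then drop the trailing axis.
C_matmul = (A @ B[..., None])[..., 0]
```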
Here is my implementation:
B = np.expand_dims(B, axis=1)
E = A * B
E = np.sum(E, axis=-1)
Check:
import numpy as np
n, m, p = 2, 2, 2
np.random.seed(0)
A = np.random.randint(1, 10, (n, m, p))
B = np.random.randint(1, 10, (n, p))
C = np.diagonal(np.tensordot(A, B.T, axes = 1), axis1=0, axis2=2).T
# from here is my implementation
B = np.expand_dims(B, axis=1)
E = A * B
E = np.sum(E, axis=-1)
print(np.array_equal(C, E))
True
Use np.expand_dims() to add a new dimension, then use a broadcast multiply. Finally, sum along the third dimension.
Thanks to user3483203 for the check code.
I am trying to index a large array so that eventually I can have a 4-D array with values at each of the points. I can do this in MATLAB using sub2ind, but I can't figure out how to do it in Python. Any help would be appreciated. (I am also not sure if my indexing is right; I know MATLAB counts from 1 and Python from 0.)
#Create the array
[Nx, Ny, Nz] = (60, 220, 85)
[I, J, K] = (np.arange(1,Nx+1,1),np.arange(1,Ny+1,1),np.arange(1,Nz+1,1))
[I, J, K] = np.meshgrid(I, J, K)
print([I])
ix=np.ravel_multi_index((Nx,Ny,Nz), (I[:], J[:], K[:]), order='F')
Thanks in advance
This is a 3d array
This seems to be working. I think the problem was the indexing, and that I didn't structure the arguments correctly:
#Create the array
[Nx, Ny, Nz] = (60, 220, 85)
[I, J, K] = (np.arange(0,Nx,1),np.arange(0,Ny,1),np.arange(0,Nz,1))
[I, J, K] = np.meshgrid(I, J, K)
#Create the 1-d indexed array
ix = np.ravel_multi_index((I,J,K),(Nx,Ny,Nz),order='F')
print(ix)
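For comparison, here is a sketch of the same computation with indexing='ij', which is my assumption about the layout that is wanted: it keeps the grids in (Nx, Ny, Nz) order, whereas the default 'xy' swaps the first two axes. The manual formula and the round trip through np.unravel_index are only there to check the result.

```python
import numpy as np

Nx, Ny, Nz = 60, 220, 85

# indexing='ij' keeps the axes in (x, y, z) order: each grid has shape (Nx, Ny, Nz).
I, J, K = np.meshgrid(np.arange(Nx), np.arange(Ny), np.arange(Nz), indexing='ij')

# Arguments are (tuple of index arrays, shape of the target array).
ix = np.ravel_multi_index((I, J, K), (Nx, Ny, Nz), order='F')

# Zero-based column-major linear index: the first subscript varies fastest.
manual = I + Nx * J + Nx * Ny * K

# np.unravel_index goes the other way, from linear index back to subscripts.
i_back, j_back, k_back = np.unravel_index(ix, (Nx, Ny, Nz), order='F')
```

The values are zero-based, so they are one less than what MATLAB's sub2ind returns for the corresponding one-based subscripts.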