I have an L x L matrix A, which I currently fill in using the following code:
A = np.zeros((L, L))
for J in range(X):
    for a in range(L):
        for b in range(L):
            A[a][b] += alpha[J, a] * O[b, J] * A_old[a, b] * betas[J+2, b]
Here X is an integer defined elsewhere, alpha and betas are of shape (X, L), O is of shape (L, X), and A_old is of shape (L, L). I'm concerned about the speed of this code and am trying to find a more numpythonic way to fill in this matrix. My instinct is to do something like:
for J in range(X):
    A += alpha[J, :] * O[:, J] * A_old[:, :] * betas[J+2, :]
But this doesn't broadcast the operations correctly because of the A_old matrix (the resulting shape is right, but the values are not). What's a good way to condense this loop using numpy?
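One possible way to see what goes wrong, and a sketch of a fix: alpha[J, :] has shape (L,), so it broadcasts along the last axis (b) of A_old, while in the loop alpha is indexed by a. Giving it a trailing axis restores the intended alignment. Since A_old[a, b] does not depend on J, it can also be factored out, leaving a plain matrix product. The sizes below are made up, and betas is given X + 2 rows so that betas[J+2] stays in bounds, which is an assumption rather than something stated in the question.

```python
import numpy as np

# Hypothetical sizes. betas gets X + 2 rows here (an assumption) so that
# betas[J + 2] is a valid index for every J in range(X).
X, L = 4, 3
rng = np.random.default_rng(0)
alpha = rng.random((X, L))
betas = rng.random((X + 2, L))
O = rng.random((L, X))
A_old = rng.random((L, L))

# Reference: the original triple loop.
A_loop = np.zeros((L, L))
for J in range(X):
    for a in range(L):
        for b in range(L):
            A_loop[a, b] += alpha[J, a] * O[b, J] * A_old[a, b] * betas[J + 2, b]

# Fix for the attempted one-liner: alpha varies along axis 0 (a), so it
# needs a trailing axis; O and betas vary along axis 1 (b) already.
A_fix = np.zeros((L, L))
for J in range(X):
    A_fix += alpha[J, :, None] * O[:, J] * A_old * betas[J + 2, :]

# No loop at all: A_old factors out of the sum over J, and the rest is
# the matrix product of alpha.T (L, X) with O.T * betas[2:X + 2] (X, L).
A_vec = A_old * (alpha.T @ (O.T * betas[2:X + 2]))

# The same contraction spelled with einsum.
A_ein = A_old * np.einsum('ja,bj,jb->ab', alpha, O, betas[2:X + 2])

print(np.allclose(A_loop, A_vec))
```

The matrix-product form hands the sum over J to BLAS, which is usually where most of the speedup comes from.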
Related
I have a matrix M of shape (N, L) and a 3D tensor P of shape (N, L, K). I want to get a matrix V of shape (N, K) where V[i] = M[i] @ P[i]. I can do it with a for loop, but that's inefficient; I want to do it with a single operation, or a few, so that it runs in parallel on CUDA.
I tried just multiplying it like so:
V = M @ P
but that results in a 3D tensor where V[i, j] = M[j] @ P[i].
np.diagonal(M @ P).T is basically what I want, but calculating it like that wastes a lot of computation.
You could use np.einsum:
>>> M = np.random.rand(5, 2)
>>> P = np.random.rand(5, 2, 3)
>>> V = np.einsum('nl,nlk->nk', M, P)
>>> V.shape
(5, 3)
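As a sanity check, the einsum result can be compared against the explicit loop from the question. The sketch below also shows a batched-matmul spelling as one alternative. The variable names V_loop, V_einsum, and V_matmul are illustrative only, and the same subscript string should carry over to torch.einsum if the data lives on a GPU.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((5, 2))       # shape (N, L)
P = rng.random((5, 2, 3))    # shape (N, L, K)

# Reference loop: V[i] = M[i] @ P[i]
V_loop = np.stack([M[i] @ P[i] for i in range(M.shape[0])])

# einsum: keep n as a batch index, contract over l.
V_einsum = np.einsum('nl,nlk->nk', M, P)

# Batched matmul: view each M[i] as a (1, L) row vector, multiply by the
# matching (L, K) slice of P, then drop the singleton axis.
V_matmul = (M[:, None, :] @ P)[:, 0, :]

print(np.allclose(V_loop, V_einsum), np.allclose(V_loop, V_matmul))
```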
Roughly I want to convert this (non-numpy) for-loop:
N = len(left)
M = len(right)
matrix = np.zeros((N, M))  # the shape must be passed as a tuple
for i in range(N):
    for j in range(M):
        matrix[i][j] = scipy.stats.binom.pmf(left[i], C, right[j])
It's sort of like a dot product but of course mathematically not a dot product. How would I normally vectorize or make something like this pythonic/numpythonic?
scipy.stats.binom.pmf already is vectorized. However, you have to broadcast your inputs in order to get your desired result.
broadcast_out = scipy.stats.binom.pmf(left[:, None], C, right)
Validation
import numpy as np
import scipy.stats

np.random.seed(314)
left = np.arange(5, dtype=float)
right = np.random.rand(5)
C = 5
# left[:, None] has shape (5, 1), so it broadcasts against right (5,) to (5, 5)
broadcast_out = scipy.stats.binom.pmf(left[:, None], C, right)
N = len(left)
M = len(right)
matrix = np.zeros((N, M))
for i in range(N):
    for j in range(M):
        matrix[i][j] = scipy.stats.binom.pmf(left[i], C, right[j])
print(np.array_equal(matrix, broadcast_out))
True
I have an m by n matrix A, implemented as a numpy array.
import numpy as np
m = 10
n = 7
A = np.random.rand(m, n)
I want to compute the m by m matrix B whose entries are
B[i, j] = sum_{k=1,...,n} sum_{l=1,...,n} A[i, k] * A[j, l]
What is the easiest way to do this without making explicit for loops?
Notice that the sum over k in your expression only affects the first factor, while the sum over l only involves the second:
sum_{k=1,...,n} sum_{l=1,...,n} A[i, k] * A[j, l] =
(sum_{k=1,...,n} A[i, k]) * (sum_{l=1,...,n} A[j, l])
The expressions in parentheses are, except for the names of the indices, the same, so define
sA = np.sum(A, axis=1)
Then your B is the so-called outer product of sA with itself:
B = np.outer(sA, sA)
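A small check of the factorization above against the literal quadruple sum, using the same m and n as in the question. B_loop is a name introduced here for the brute-force reference.

```python
import numpy as np

m, n = 10, 7
rng = np.random.default_rng(0)
A = rng.random((m, n))

# Brute-force reference: the double sum over k and l for every (i, j).
B_loop = np.zeros((m, m))
for i in range(m):
    for j in range(m):
        for k in range(n):
            for l in range(n):
                B_loop[i, j] += A[i, k] * A[j, l]

# Factored form: row sums first, then their outer product.
sA = A.sum(axis=1)
B = np.outer(sA, sA)

print(np.allclose(B_loop, B))
```

The loop does m * m * n * n multiplications, while the factored form needs only m * n additions plus m * m multiplications.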
I'd like to transform a tensor T of size (n x n x m x m) into a tensor U of size (n x m x m) while only retrieving the diagonal elements of T over the (n x n) chunks (i.e. U[i, k, l] = T[i, i, k, l]). torch.diag() only works with 2-D tensors, and I really fail to see how to do this without looping over the indexes of the elements (which I'd like to avoid, since I think it is computationally inefficient). To be clear, I'd like to vectorize the following code:
U = torch.zeros(n, m, m)
for i in range(n):
    for k in range(m):
        for l in range(m):
            U[i][k][l] = T[i][i][k][l]
I'm totally new to PyTorch, and I have tried many combinations of functions, but none of them gives me a satisfying result. Does anyone have an idea?
You can generate the indexes using np.meshgrid
i, k, l = np.meshgrid(range(n), range(m), range(m))
U[i, k, l] = T[i, i, k, l]
For completeness, I did:
import numpy as np
import torch

n = 3
m = 5
T = torch.arange(n * n * m * m).view(n, n, m, m)
U = torch.zeros(n, m, m)
U_ = torch.zeros(n, m, m)
i, k, l = np.meshgrid(range(n), range(m), range(m))
U_[i, k, l] = T[i, i, k, l]
for i in range(n):
    for k in range(m):
        for l in range(m):
            U[i][k][l] = T[i][i][k][l]
U = U.view(-1)
U_ = U_.view(-1)
print((U == U_).all())
The output is True so I assume it is correct.
When applied to 2d matrices, torch.diag() is an alias for torch.diagonal().
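torch.diagonal() itself lets you specify which two dimensions of a tensor of arbitrary rank the diagonal is taken from; by default these are 0 and 1. The diagonal is appended as the last dimension of the result, so T.diagonal() alone has shape (m, m, n) and needs a permute to match the (n, m, m) layout of U:
U = T.diagonal(dim1=0, dim2=1).permute(2, 0, 1)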
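A quick shape check may help here. The sketch below uses NumPy rather than PyTorch, on the assumption that the axis convention is the same: np.diagonal, like torch.diagonal, places the diagonal on the last axis of the result, so that axis has to be moved to the front to match the (n, m, m) layout of the loop. An einsum with a repeated index is shown as another option. The names U_loop, U_diag, and U_einsum are illustrative.

```python
import numpy as np

n, m = 3, 5
T = np.arange(n * n * m * m).reshape(n, n, m, m)

# Reference: U[i, k, l] = T[i, i, k, l]
U_loop = np.zeros((n, m, m), dtype=T.dtype)
for i in range(n):
    U_loop[i] = T[i, i]

# Diagonal over the first two axes; the diagonal axis comes out last.
D = np.diagonal(T, axis1=0, axis2=1)    # shape (m, m, n)
U_diag = np.moveaxis(D, -1, 0)          # shape (n, m, m)

# Repeating an index in the einsum input also takes the diagonal.
U_einsum = np.einsum('iikl->ikl', T)

print(D.shape, U_diag.shape)
```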
Given an n x m matrix with entries x[i, j], the compositional variance is an m x m matrix whose (i, j) entry includes the expression
sum_{k=1,...,n} ln^2(x[k, i] / x[k, j])
(it includes other, easily calculated, expressions).
This is very easy to calculate in a loop, but how can it be calculated using vectorization?
Here is the crappy loop code:
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)  # n = 2 rows, m = 3 columns
v = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        for k in range(2):
            v[i, j] += np.log(x[k, i] / x[k, j])**2
Assuming you meant np.log(x[k, i] / x[k, j])**2 in NumPy terms, summed over k = 1, ..., n, one vectorized approach uses broadcasting:
((np.log(x[:,:,None]/x[:,None])**2)).sum(0)
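To make the broadcasting explicit, here is a sketch that compares the one-liner with the loop from the question. x[:, :, None] has shape (n, m, 1) and x[:, None] has shape (n, 1, m), so their ratio has shape (n, m, m) with entry [k, i, j] equal to x[k, i] / x[k, j]. The variant v_alt, which takes the logarithm once and subtracts, is an addition of mine rather than part of the answer.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)
n, m = x.shape

# Reference: the triple loop from the question.
v_loop = np.zeros((m, m))
for i in range(m):
    for j in range(m):
        for k in range(n):
            v_loop[i, j] += np.log(x[k, i] / x[k, j]) ** 2

# Broadcasting: ratios[k, i, j] = x[k, i] / x[k, j], then sum over k.
ratios = x[:, :, None] / x[:, None, :]
v_vec = (np.log(ratios) ** 2).sum(axis=0)

# Same quantity via log(a / b) = log(a) - log(b), which evaluates the
# logarithm n * m times instead of n * m * m times.
logx = np.log(x)
v_alt = ((logx[:, :, None] - logx[:, None, :]) ** 2).sum(axis=0)

print(np.allclose(v_loop, v_vec), np.allclose(v_loop, v_alt))
```

Both vectorized forms build an intermediate array of shape (n, m, m), so memory use grows quadratically in m.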