Given an n-by-n matrix A, where each row of A is a permutation of [n], e.g.,
import torch
n = 100
AA = torch.rand(n, n)
A = torch.argsort(AA, dim=1)
Also given another n-by-n matrix P, we want to construct a 3D tensor Q s.t.
Q[i, j, k] = P[A[i, j], k]
Is there any efficient way in pytorch?
I am aware of torch.gather, but it seems hard to apply it directly here.
You can directly use:
Q = P[A]
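To see that this gives the requested tensor, here is a quick sanity check (small n just for illustration; names match the question):
import torch
n = 4
A = torch.argsort(torch.rand(n, n), dim=1)  # each row is a permutation of [n]
P = torch.rand(n, n)
Q = P[A]  # shape (n, n, n)
i, j, k = 1, 2, 3
assert Q[i, j, k] == P[A[i, j], k]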
Why not simply use A as an index:
Q = P[A, :]
Related
I have one matrix and one vector, of dimensions (N, d) and (N,) respectively. For each row, I want to divide each element by the corresponding value in the vector. I was wondering if there was a vectorized implementation (to save computation time). (I'm trying to create points on the surface of a d-dimensional sphere.) Right now I'm doing this:
import numpy as np
x = np.random.randn(N, d)
norm = np.linalg.norm(x, axis=1)
for i in range(N):
    for j in range(d):
        x[i][j] = x[i][j] / norm[i]
np.linalg.norm has a keepdims argument just for this:
x /= np.linalg.norm(x, axis=1, keepdims=True)
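A minimal sketch of why this works: keepdims=True leaves the norm as an (N, 1) column, which broadcasts across each row of the (N, d) array.
import numpy as np
N, d = 4, 3  # small sizes just for illustration
x = np.random.randn(N, d)
x /= np.linalg.norm(x, axis=1, keepdims=True)  # (N, d) / (N, 1) broadcasts row-wise
assert np.allclose(np.linalg.norm(x, axis=1), 1.0)  # every row now has unit norm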
I'm doing the online Computer Vision course by UMich and am new to PyTorch. One of the assignment questions is on batch matrix multiplication, where we have to find the batch matrix product with and without the bmm function. Here is the code.
import torch

def batched_matrix_multiply(x, y, use_loop=True):
    """
    Perform batched matrix multiplication between the tensor x of shape (B, N, M)
    and the tensor y of shape (B, M, P).

    If use_loop=True, then you should use an explicit loop over the batch
    dimension B. If use_loop=False, then you should instead compute the batched
    matrix multiply without an explicit loop, using a single PyTorch operator.

    Inputs:
    - x: Tensor of shape (B, N, M)
    - y: Tensor of shape (B, M, P)
    - use_loop: Whether to use an explicit Python loop.

    Hint: torch.stack, bmm

    Returns:
    - z: Tensor of shape (B, N, P) where z[i] of shape (N, P) is the result of
      matrix multiplication between x[i] of shape (N, M) and y[i] of shape
      (M, P). It should have the same dtype as x.
    """
    z = None
    #############################################################################
    # TODO: Implement this function                                             #
    #############################################################################
    # Replace "pass" statement with your code
    if use_loop:
        # Preallocate the output with the same dtype as x, as required above.
        z = torch.zeros(x.shape[0], x.shape[1], y.shape[2], dtype=x.dtype)
        for i in range(x.shape[0]):
            z[i] = torch.mm(x[i], y[i])
    else:
        z = torch.bmm(x, y)
    #############################################################################
    #                             END OF YOUR CODE                              #
    #############################################################################
    return z
I've managed to do it without bmm, but also without using the torch.stack hint. I initialized a zeros tensor z with the dimensions of the output and performed an ordinary matrix multiplication for each batch inside the for loop.
I'd like to know what the more efficient answer using torch.stack is.
Great question. I just spent two hours trying to solve this myself. Here's my solution, and it really speeds up the computation, as needed.
if not use_loop:
    z = torch.bmm(x, y)
else:
    z = torch.zeros(x.shape[0], x.shape[1], y.shape[2], dtype=x.dtype)
    for i in range(0, x.shape[0], 2):
        # Multiply two batch elements per iteration and stack the pair
        # (this assumes the batch size B is even).
        z[i:i + 2] = torch.stack([x[i] @ y[i], x[i + 1] @ y[i + 1]])
Hope this helped!
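For reference, the way the torch.stack hint is usually meant is to stack the per-batch products directly; a minimal sketch (not the official course solution):
import torch

def batched_matrix_multiply_stack(x, y):
    # One torch.mm per batch element, stacked along a new batch dimension.
    return torch.stack([torch.mm(x[i], y[i]) for i in range(x.shape[0])])
This keeps the explicit loop but avoids preallocating z, and the result automatically inherits the dtype of x.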
I'm trying to decompose a tensor of shape (m, n, o) into matrices A (m, r), B (n, r) and C (o, r). This is known as the PARAFAC decomposition. Tensorly already does this kind of decomposition.
An important step is to multiply A, B, and C to get a tensor of shape (m, n, o).
Tensorly does this as follows:
import numpy as np
from functools import reduce

def kt_to_tensor(A, B, C):
    factors = [A, B, C]
    for r in range(factors[0].shape[1]):
        vecs = np.ix_(*[u[:, r] for u in factors])
        if r:
            res += reduce(np.multiply, vecs)
        else:
            res = reduce(np.multiply, vecs)
    return res
However, the package I'm using (Autograd) does not support np.ix_ operations. I thus wrote a simpler definition as follows:
def new_kt_to_tensor(A, B, C):
    m, n, o = A.shape[0], B.shape[0], C.shape[0]
    out = np.zeros((m, n, o))
    k_max = A.shape[1]
    for alpha in range(m):
        for beta in range(n):
            for delta in range(o):
                for k in range(k_max):
                    out[alpha, beta, delta] += A[alpha, k] * B[beta, k] * C[delta, k]
    return out
It turns out that this implementation also relies on operations that autograd does not support. However, autograd does support np.tensordot.
I was wondering how to use np.tensordot to obtain this multiplication. I think TensorFlow's tf.tensordot has similar functionality.
The intended solution should look something like:
def tensordot_multiplication(A, B, C):
    """
    use np.tensordot
    """
I don't think np.tensordot would help here, since it needs to spread out the axes that don't participate in the sum-reduction, whereas we need to keep the last axis aligned between the three inputs during the multiplication. So with tensordot you would need extra processing, and it would have higher memory requirements.
I would suggest two methods - One with broadcasting and another with np.einsum.
Approach #1 : With broadcasting -
(A[:,None,None,:]*B[:,None,:]*C).sum(-1)
Explanation :
Extend A to 4D, by introducing new axes at axis=(1,2) with None/np.newaxis.
Similarly extend B to 3D, by introducing new axis at axis=(1).
Keep C as it is and perform elementwise multiplications resulting in a 4D array.
Finally, the sum-reduction comes in along the last axis of the 4D array.
Schematically put -
A : m r
B : n r
C : k r
=> A*B*C : m n k r
=> out : m n k # (sum-reduction along last axis)
Approach #2 : With np.einsum -
np.einsum('il,jl,kl->ijk',A,B,C)
The idea is the same here as with the previous broadcasting one, but with string notations helping us out in conveying the axes info in a more concise manner.
Broadcasting is certainly available in TensorFlow, as it has tools to expand dimensions; I'm less sure about a direct np.einsum equivalent there.
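A quick consistency check of the two approaches against each other (small sizes chosen arbitrarily):
import numpy as np
m, n, k, r = 4, 5, 6, 3  # arbitrary small sizes
A, B, C = np.random.randn(m, r), np.random.randn(n, r), np.random.randn(k, r)
out_bc = (A[:, None, None, :] * B[:, None, :] * C).sum(-1)  # broadcasting approach
out_es = np.einsum('il,jl,kl->ijk', A, B, C)                # einsum approach
assert np.allclose(out_bc, out_es)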
The code you refer to isn't actually how TensorLy implements it but simply an alternative implementation given in the doc.
The actual code used in TensorLy is:
def kruskal_to_tensor(factors):
    shape = [factor.shape[0] for factor in factors]
    full_tensor = np.dot(factors[0], khatri_rao(factors[1:]).T)
    return fold(full_tensor, 0, shape)
where the khatri_rao is implemented using numpy.einsum in a way that generalizes what Divakar suggested.
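For illustration, here is a hypothetical two-factor Khatri-Rao written with einsum (the column-wise Kronecker product; khatri_rao_2 is a name introduced here, not TensorLy's API):
import numpy as np

def khatri_rao_2(B, C):
    # Column r of the result is the Kronecker product of column r of B
    # with column r of C, giving shape (n * o, r).
    n, r = B.shape
    o, _ = C.shape
    return np.einsum('ir,jr->ijr', B, C).reshape(n * o, r)
With that, np.dot(A, khatri_rao_2(B, C).T) gives the (m, n*o) unfolding, which fold reshapes back to (m, n, o).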
I have two matrices A and B, each of shape (N, K, D), and I want to get a tensor C of shape (N, K, D, D), where C[n, k] = A[n, k] x B[n, k].T (here "x" means the product of matrices of shapes (D, 1) and (1, D), so the result is D-by-D). My code currently looks like this (here A = B = X):
def square(X):
    out = np.zeros((N, K, D, D))
    for n in range(N):
        for k in range(K):
            out[n, k] = np.dot(X[n, k, :, np.newaxis], X[n, k, np.newaxis, :])
    return out
This may be slow for big N and K because of Python's for loops. Is there a way to do this multiplication with a single NumPy operation?
It seems you are not using np.dot for sum-reduction, but just for expansion that results in broadcasting. So, you can simply extend the array to have one more dimension with the use of np.newaxis/None and let the implicit broadcasting help out.
Thus, an implementation would be -
X[...,None]*X[...,None,:]
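A quick check that the broadcast expression really produces a per-(n, k) outer product:
import numpy as np
N, K, D = 3, 4, 5  # small sizes just for illustration
X = np.random.randn(N, K, D)
out = X[..., None] * X[..., None, :]  # (N, K, D, 1) * (N, K, 1, D) -> (N, K, D, D)
assert np.allclose(out[0, 0], np.outer(X[0, 0], X[0, 0]))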
More info on broadcasting, specifically how to add new axes, can be found in this other post.
This is probably obvious on reflection, but it's not clear to me right now.
For a pair of numpy arrays of shapes (K, N, M) and (K, M, N) denoted by a and b respectively, is there a way to compute the following as a single vectorized operation:
import numpy as np
K = 5
N = 2
M = 3
a = np.random.randn(K, N, M)
b = np.random.randn(K, M, N)
output = np.empty((K, N, N))
for each_a, each_b, each_out in zip(a, b, output):
    each_out[:] = each_a.dot(each_b)
A simple a.dot(b) returns the dot product for every pair along the first axis (so it returns an array of shape (K, N, K, N)).
edit: fleshed out the code a bit for those who couldn't understand the question.
I answered a similar question a while back: Element-wise matrix multiplication in NumPy.
I think what you're looking for is:
output = np.einsum('ijk,ikl->ijl', a, b)
Good luck!
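For what it's worth, np.matmul also broadcasts over the leading batch axis, so the @ operator gives the same result; a quick check:
import numpy as np
K, N, M = 5, 2, 3
a = np.random.randn(K, N, M)
b = np.random.randn(K, M, N)
assert np.allclose(np.einsum('ijk,ikl->ijl', a, b), a @ b)  # matmul batches over axis 0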