I want to multiply two big matrices and then take only the diagonal elements of the resulting matrix:
m1[i, j] = sum_k A[i, k] * B[k, j]
m2[i] = m1[i, i]
This works, but it computes every off-diagonal element of m1 only to discard it. A better way would be:
m2[i] = sum_k A[i, k] * B[k, i]
Is there a way to "force" the computation to be done the second way?
The solution:
T.sum(a * b.dimshuffle(1,0), axis = 1)
The explanation:
To "pair" indices of the first tensor with indices of the second tensor we can use element-wise multiplication. In that case each dimension of the first tensor is paired with the corresponding dimension of the second tensor. Because the i-th dimension of the first tensor is always paired with the i-th dimension of the second tensor, we may need to reorder the dimensions first. In the case described here, that reordering is a transpose of the second matrix.
After the element-wise multiplication you can sum over the dimensions you want to contract.
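For readers not using Theano, the same idea can be sketched in NumPy. NumPy is a substitution on my part, and the arrays and their shapes below are made up purely for illustration; the point is only that the pairwise form and the full product agree on the diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))   # hypothetical shapes, chosen only for the demo
B = rng.standard_normal((6, 4))

# Wasteful: builds the full 4x4 product, then keeps only the diagonal.
diag_full = np.diag(A @ B)

# Pair A[i, k] with B[k, i] by transposing B, multiply element-wise, sum over k.
diag_pairwise = np.sum(A * B.T, axis=1)

# The same contraction written as an explicit index expression.
diag_einsum = np.einsum("ik,ki->i", A, B)
```

The einsum form states the index pairing directly, which may be easier to read when more than two dimensions are involved.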
Related
I'm trying to assemble a tensor based on the contents of two other tensors, like so:
I have a 2D tensor called A, with shape I * J, and another 2D tensor called B, with shape M * N, whose elements are indices into the 1st dimension of A.
I want to obtain a 3D tensor C with shape M * N * J such that C[m,n,:] == A[B[m,n],:] for all m in [0, M) and n in [0, N).
I could do this using nested for-loops to iterate over all indices in M and N, assigning the right values to C at each one, but M and N are large so this is quite slow. I suspect there's some nicer, faster way of doing this using clever slicing or a built-in pytorch function, but I don't know what it would be. It looks a bit like somewhere one would use torch.gather(), but that requires all tensors to have the same number of dimensions. Does anyone know how this ought to be done?
EDIT: torch.index_select(input, dim, index) is almost what I want, but it won't work here because it requires that index be a 1D tensor, while my tensor of indices is 2D.
You could achieve this by flattening B, which lets you index the first dimension of A with a 1D tensor. A reshape is then required to recover the final shape:
>>> A[B.flatten(),:].reshape(*B.shape, A.size(-1))
Indexing with A[B.flatten(),:] is equivalent to torch.index_select(A, 0, B.flatten()).
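As far as I know, indexing A directly with the 2D integer tensor gives the same result without the flatten/reshape round trip, since integer-tensor indexing replaces the indexed dimension with the shape of the index. A small sketch, with made-up values for A and B, that compares both forms:

```python
import torch

A = torch.arange(12.0).reshape(4, 3)          # I = 4, J = 3 (demo values)
B = torch.tensor([[0, 3], [2, 2], [1, 0]])    # M = 3, N = 2, entries index rows of A

# Direct indexing with the 2D index tensor: result has shape (M, N, J).
C_direct = A[B]

# The flatten-then-reshape form from the answer above.
C_flat = A[B.flatten(), :].reshape(*B.shape, A.size(-1))
```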
I'm currently trying to fill a matrix K where each entry in the matrix is just a function applied to two entries of an array x.
At the moment I'm using the most obvious method of running through rows and columns one at a time using a double for-loop:
K = np.zeros((x.shape[0],x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    for j in range(x.shape[0]):
        K[i,j] = f(x[i],x[j])
While this works fine the resulting matrix is a 10,000 by 10,000 matrix and takes very long to calculate. I was wondering if there is a more efficient way to do this built into NumPy?
EDIT: The function in question here is a gaussian kernel:
def gaussian(a,b,sigma):
    vec = a-b
    return np.exp(- np.dot(vec,vec)/(2*sigma**2))
where I set sigma in advance before calculating the matrix.
The array x is an array of shape (10000, 8). So the scalar product in the gaussian is between two vectors of dimension 8.
You can use a single for loop together with broadcasting. This requires changing the implementation of the gaussian function so that it accepts 2D inputs:
def gaussian(a,b,sigma):
    vec = a-b
    # sum over the last axis so that a and b may be broadcast against each other
    return np.exp(- np.sum(vec**2, axis=-1)/(2*sigma**2))
K = np.zeros((x.shape[0],x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    # x[i:i+1] has shape (1, 8) and broadcasts against x of shape (10000, 8)
    K[i] = gaussian(x[i:i+1], x, sigma)
Theoretically you could accomplish this even without any for loop, again by using broadcasting, but here an intermediary array of size len(x)**2 * x.shape[1] will be created which might run out of memory for your array sizes:
K = gaussian(x[None, :, :], x[:, None, :], sigma)
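One way to avoid both the Python loop and the large three-dimensional intermediate is to expand the squared distance as |a|^2 + |b|^2 - 2 a.b, so that only arrays of size len(x)**2 are created. This is a different technique from the broadcasting shown above, offered only as a sketch; the function name gaussian_kernel_matrix and the small demo array are my own.

```python
import numpy as np

def gaussian_kernel_matrix(x, sigma):
    """Gaussian kernel between all pairs of rows of x (hypothetical helper)."""
    sq_norms = np.sum(x**2, axis=1)                       # |x_i|^2 for each row
    # |x_i - x_j|^2 = |x_i|^2 + |x_j|^2 - 2 x_i . x_j
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Rounding can leave tiny negative values; clip them to zero.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.exp(-sq_dists / (2.0 * sigma**2))

rng = np.random.default_rng(0)
x_demo = rng.standard_normal((20, 8))   # small stand-in for the (10000, 8) array
sigma_demo = 1.5
K_demo = gaussian_kernel_matrix(x_demo, sigma_demo)
```

The expansion can lose some precision when points are very close together relative to their norms, so it is worth comparing against the loop version on a small sample before relying on it.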
I'm trying to calculate the summation of each pair of rows in a matrix. Suppose I have an m x n matrix, say one like
[[1,2,3],
[4,5,6],
[7,8,9]]
and I want to create a matrix of the summations of all pairs of rows. So, for the above matrix, we would want
[[5,7,9],
[8,10,12],
[11,13,15]]
In general, I think the new matrix will be (m choose 2) x n. For the above example in pytorch, I ran
import torch
x = torch.tensor([[1,2,3], [4,5,6], [7,8,9]])
y = x[None] + x[:, None]
torch.cat((y[0, 1:3, :], y[1, 2:3, :]))
which manually creates the matrix I am looking for. However, I am struggling to think of a way to create the output without manually specifying indices and without using a for-loop. Is there even a way to create such a matrix for an arbitrary matrix without the use of a for-loop?
You can try using this function:
def sum_rows(x):
    y = x[None] + x[:, None]
    ind = torch.tril_indices(x.shape[0], x.shape[0], offset=-1)
    return y[ind[0], ind[1]]
Because you want the pairs sum_matrix[i,j] with i<j (i>j would also work), you can just ask for the lower/upper triangle indices of the 3D tensor y. There is no explicit Python loop here, although the indexing still iterates internally, AFAIK, and it should do the job for variable-sized inputs.
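A variation on the same idea, sketched under the assumption that memory matters: index the rows with the triangle indices first and add afterwards, so the full m x m x n tensor is never built. The helper name sum_row_pairs is hypothetical, and I use the upper triangle (i < j) here.

```python
import torch

def sum_row_pairs(x):
    # Hypothetical helper: sums of all row pairs (i, j) with i < j.
    m = x.shape[0]
    i, j = torch.triu_indices(m, m, offset=1)   # two 1D index tensors
    return x[i] + x[j]                          # shape (m choose 2, n)

x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
pairs = sum_row_pairs(x)
```

For the 3 x 3 example from the question this reproduces the three rows listed there.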
Suppose that we are given a two dimensional matrix A of dtype=uint8 with N rows and M columns and a uint8 vector of size N called x. We need to bit-wise XOR each row of A, e.g. A[i], with the corresponding element in x, i.e. x[i].
Currently, I am doing this as follows, but I think there are more efficient ways of doing it with NumPy's vectorization capabilities.
for i in range(A.shape[0]):
    A[i,:] = np.bitwise_xor(A[i,:], x[i])
This is the row-wise XOR. Besides this, the XOR also needs to be applied column-wise.
Thanks in advance.
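One possible approach is to let broadcasting do the pairing: give x a trailing axis of length one so that x[i] lines up with row i of A. The sketch below uses small made-up arrays, and the column vector y (one value per column) is my own assumption about what the column-wise case would look like.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint8)   # N = 2 rows, M = 3 columns
x = np.array([1, 255], dtype=np.uint8)                 # one value per row
y = np.array([1, 2, 4], dtype=np.uint8)                # hypothetical: one value per column

# x[:, None] has shape (N, 1), so x[i] is XORed with every element of row i.
row_xor = A ^ x[:, None]

# y[None, :] has shape (1, M), so y[j] is XORed with every element of column j.
col_xor = A ^ y[None, :]
```

If the result should overwrite A as in the loop above, `np.bitwise_xor(A, x[:, None], out=A)` performs the same operation in place.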
I have a matrix that I initialized with numpy.random.uniform like so:
W = np.random.uniform(-1, 1, (V,N))
In my case, V = 10000 and N = 50, x is a positive integer
When I multiply W by a one hot vector x_vec of dimension V X 1, like W.T.dot(x_vec), I get a column vector with a shape of (50,1). When I try to get the same vector by indexing W, as in W[x].T or W[x,:].T I get shape (50,).
Can anyone explain to me why these two expressions return different shapes, and whether it's possible to return a (50,1) matrix (vector) with the indexing method? The vector of shape (50,) is problematic because it doesn't behave the same way as the (50,1) vector when I multiply it with other matrices, but I'd like to use indexing to speed things up a little.
*Sorry in advance if this question should be in a place like Cross Validated instead of Stack Exchange
They are different operations. A matrix (in the maths sense) times a matrix gives a matrix; some of your matrices just happen to have width 1.
Indexing with an integer scalar eats the dimension you are indexing into. Once you are down to a single dimension, .T does nothing because it doesn't have enough axes to shuffle.
If you want to go from (50,) to (50, 1) shape-wise, the recipe is indexing with None like so v[:, None]. In your case you have at least two one-line options:
W[x, :][:, None] # or W[x][:, None] or
W[x:x+1, :].T # or W[x:x+1].T
The option on the second line preserves the first dimension of W by requesting a subrange of length one. The first option can be contracted into a single indexing operation - thanks to @hpaulj for pointing this out - which gives the arguably most readable option:
W[x, :, None]
The first index (scalar integer x) consumes the first dimension of W, the second dimension (unaffected by :) becomes the first and None creates a new dimension on the right.
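The shapes involved can be checked on a small example. The sizes and the index below are stand-ins I picked for the demo (the question uses V = 10000 and N = 50); the variable names other than W, x and x_vec are my own.

```python
import numpy as np

V, N = 10, 5                          # small stand-ins for V = 10000, N = 50
W = np.random.uniform(-1, 1, (V, N))
x = 3                                 # hypothetical row index

x_vec = np.zeros((V, 1))
x_vec[x, 0] = 1.0                     # one-hot column vector of shape (V, 1)

by_product = W.T.dot(x_vec)           # matrix product, shape (N, 1)
by_scalar_index = W[x]                # scalar index drops the axis, shape (N,)
by_none = W[x, :, None]               # None adds an axis back, shape (N, 1)
by_slice = W[x:x+1, :].T              # length-one slice keeps the axis, shape (N, 1)
```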