How to perform element-wise product in PyTorch?

I have two torch tensors a and b. Tensor a has shape [batch_size, emb_size] and tensor b has shape [num_of_words, emb_size]. I want to compute the element-wise product of these two tensors rather than a dot product.
I noticed that "*" performs an element-wise product, but it doesn't fit my case directly.
For example, batch_size = 3, emb_size = 2, num_of_words = 5.
a = torch.rand((3,2))
b = torch.rand((5,2))
I want to get something like:
torch.cat([a[0]*b, a[1]*b, a[2]*b]).view(3, 5, 2)
but I want to do this in an efficient and elegant way.

You can use
a.unsqueeze(1) * b
PyTorch supports broadcasting semantics, but you need to make sure the singleton dimensions are in the right places: a.unsqueeze(1) has shape [batch_size, 1, emb_size], which broadcasts against b's [num_of_words, emb_size] to give [batch_size, num_of_words, emb_size].
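For the concrete shapes in the question, a quick check (a sketch, not from the original answer) that the broadcasted product matches the torch.cat construction:

import torch

a = torch.rand(3, 2)  # [batch_size, emb_size]
b = torch.rand(5, 2)  # [num_of_words, emb_size]

out = a.unsqueeze(1) * b  # [3, 1, 2] * [5, 2] -> [3, 5, 2]
expected = torch.cat([a[0] * b, a[1] * b, a[2] * b]).view(3, 5, 2)
print(out.shape)                      # torch.Size([3, 5, 2])
print(torch.allclose(out, expected))  # True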

Related

Batch Kronecker product of tensors

I have two tensors that are batches of matrices:
x = torch.randn(100,10,10)
y = torch.randn(100,2,2)
I want to compute the Kronecker product of each pair of matrices in parallel, not the Kronecker product of the whole tensors. torch.kron(x, y) gives me a tensor of size (10000, 20, 20), but I want an output of size (100, 20, 20) that holds the Kronecker product of each matrix pair. Is there any way to do so?
Something like torch.kron(x, y, start_dim=1) is what I tried, but it does not seem to be implemented. (I want to do this in torch for R, but something that works in Python would already be okay.)
Thanks
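One way to vectorize this (a sketch using torch.einsum, not from the original thread) is to form the per-batch outer product over the matrix indices and then merge the index pairs, using kron(A, B)[i*p + k, j*q + l] = A[i, j] * B[k, l]:

import torch

x = torch.randn(100, 10, 10)
y = torch.randn(100, 2, 2)

B, m, n = x.shape
_, p, q = y.shape
# Outer product per batch entry, then collapse (i, k) and (j, l).
res = torch.einsum('bij,bkl->bikjl', x, y).reshape(B, m * p, n * q)

print(res.shape)  # torch.Size([100, 20, 20])
print(torch.allclose(res[0], torch.kron(x[0], y[0])))  # True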

Vectorized computation of NumPy's tensor dot

I have two vectors containing tensors of shape (3,3) and shape (3,3,3,3) respectively. The vectors have the same length, and I am computing the element-wise tensor dot of these two vectors. For example, I want to vectorize the following computation to improve performance:
a = np.arange(9.).reshape(3,3)
b = np.arange(81.).reshape(3,3,3,3)
c = np.tensordot(a,b)
a_vec = np.asanyarray([a,a])
b_vec = np.asanyarray([b,b])
c_vec = np.empty(a_vec.shape)
for i in range(c_vec.shape[0]):
    c_vec[i, :, :] = np.tensordot(a_vec[i, :, :], b_vec[i, :, :, :, :])
print(np.allclose(c_vec[0], c))
# True
I thought about using numpy.einsum but can't figure out the correct subscripts. I have tried a lot of different approaches, but all of them have failed so far:
# I am trying something like this
c_vec = np.einsum("ijk, ilmno -> ijo", a_vec, b_vec)
print(np.allclose(c_vec[0], c))
# False
But this does not reproduce the iterative computation above. If this can't be done using einsum, or there is a more performant way to do it, I am open to any kind of solution.
A vectorized way with np.einsum would be -
c_vec = np.einsum('ijk,ijklm->ilm',a_vec,b_vec)
np.tensordot has an axes argument you can use too:
c_vec = np.tensordot(a_vec, b_vec, axes=([1, 2], [1, 2]))
Note, though, that this contracts every pairing of the leading axes, giving shape (2, 2, 3, 3) with c_vec[i, j] = np.tensordot(a_vec[i], b_vec[j]); it only appears to match here because a_vec and b_vec repeat the same arrays, so for the general batched case the einsum above is the direct fit.
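To see why the subscripts 'ijk,ijklm->ilm' are right, a minimal self-contained check (same arrays as in the question):

import numpy as np

a = np.arange(9.).reshape(3, 3)
b = np.arange(81.).reshape(3, 3, 3, 3)
c = np.tensordot(a, b)  # contracts a's two axes with b's first two

a_vec = np.asanyarray([a, a])
b_vec = np.asanyarray([b, b])

# i: batch axis; j, k: contracted axes; l, m: b's trailing axes.
c_vec = np.einsum('ijk,ijklm->ilm', a_vec, b_vec)
print(np.allclose(c_vec[0], c))  # True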

Use tf.gather to extract tensors row-wise based on another tensor (first dimension)

I have two tensors with shapes A: [B, 3000, 3] and C: [B, 4000]. I want to use tf.gather() with each row of tensor C as indices and the corresponding row of tensor A as params, to get a result of shape [B, 4000, 3].
Here is an example to make this more understandable: Say I have tensors as
A = [[1,2,3],[4,5,6],[7,8,9]],
C = [0,2,1,2,1],
result = [[1,2,3],[7,8,9],[4,5,6],[7,8,9],[4,5,6]],
obtained with tf.gather(A, C). This works fine for tensors with fewer than 3 dimensions.
But in the case described at the beginning, applying tf.gather(A, C, axis=1) gives a result of shape
[B, B, 4000, 3]
It seems that tf.gather() uses every element of tensor C as indices into every batch entry of tensor A. The only solution I can think of is a for loop, calling tf.gather(A[i, ...], C[i, ...]) to get the correct shape
[B, 4000, 3]
but that would badly hurt performance. Is there a function that can do this task directly?
You need to use tf.gather_nd:
import tensorflow as tf
A = ... # B x 3000 x 3
C = ... # B x 4000
s = tf.shape(C)
B, cols = s[0], s[1]
# Make indices for first dimension
idx = tf.tile(tf.expand_dims(tf.range(B, dtype=C.dtype), 1), [1, cols])
# Complete index for gather_nd
gather_idx = tf.stack([idx, C], axis=-1)
# Gather result
result = tf.gather_nd(A, gather_idx)
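As a side note (not part of the original answer): recent TensorFlow versions also accept a batch_dims argument on tf.gather itself, which covers this case directly:

import tensorflow as tf

A = tf.random.normal([8, 3000, 3])                             # B x 3000 x 3
C = tf.random.uniform([8, 4000], maxval=3000, dtype=tf.int32)  # B x 4000

result = tf.gather(A, C, batch_dims=1)  # gathers row i of C from batch entry i of A
print(result.shape)  # (8, 4000, 3)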

Broadcasting np.dot vs tf.matmul for tensor-matrix multiplication (Shape must be rank 2 but is rank 3 error)

Let's say I have the following tensors:
X = np.zeros((3,201, 340))
Y = np.zeros((340, 28))
Taking the dot product of X and Y succeeds with NumPy and yields a tensor of shape (3, 201, 28).
However, with TensorFlow I get the following error: Shape must be rank 2 but is rank 3 ...
minimal code example:
X = np.zeros((3,201, 340))
Y = np.zeros((340, 28))
print(np.dot(X,Y).shape) # successful (3, 201, 28)
tf.matmul(X, Y) # erroneous
Any idea how to achieve the same result with tensorflow?
Since you are working with tensors, it would be better (for performance) to use tensordot there than np.dot. NumPy allows numpy.dot to work on tensors at lowered performance, and it seems TensorFlow simply doesn't allow it.
So, for NumPy, we would use np.tensordot -
np.tensordot(X, Y, axes=((2,),(0,)))
For tensorflow, it would be with tf.tensordot -
tf.tensordot(X, Y, axes=((2,),(0,)))
Related post to understand tensordot.
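A quick shape check of the tensordot route (a sketch with the arrays from the question):

import numpy as np
import tensorflow as tf

X = np.zeros((3, 201, 340))
Y = np.zeros((340, 28))

print(np.tensordot(X, Y, axes=((2,), (0,))).shape)  # (3, 201, 28)
print(tf.tensordot(X, Y, axes=((2,), (0,))).shape)  # (3, 201, 28)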
TensorFlow doesn't allow multiplication of matrices with different ranks the way NumPy does.
To cope with this, you can reshape the tensor. This essentially casts a tensor of, say, rank 3 to one of rank 2 by "stacking the matrices" one on top of the other.
You can use this:
tf.reshape(tf.matmul(tf.reshape(Aijk,[i*j,k]),Bkl),[i,j,l])
where i, j, and k are the dimensions of the first tensor and k and l are the dimensions of the second.
Taken from here.
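Applied to the shapes in the question, a minimal sketch of that reshape trick:

import numpy as np
import tensorflow as tf

X = tf.constant(np.zeros((3, 201, 340)))
Y = tf.constant(np.zeros((340, 28)))

i, j, k = X.shape
l = Y.shape[1]
# Flatten the leading batch dimension into the rows, do a rank-2 matmul,
# then restore the original leading dimensions.
res = tf.reshape(tf.matmul(tf.reshape(X, [i * j, k]), Y), [i, j, l])
print(res.shape)  # (3, 201, 28)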

Tensorflow: searching for an op supporting vector-tensor broadcasting

I am trying to achieve something like:
inputs:
x: a vector with length n, [x1,x2,...,xn], elements (xi, i=1,2,...n) are scalars.
T: a tensor with length n in its first dimension, [t1,t2,...tn], elements (ti, i=1,2,..,n) are tensors with rank 3.
return: a tensor, [x1*t1, x2*t2, ... xn*tn].
I know this can be achieved by tf.stack([x[i]*T[i] for i in range(n)]), but I wonder if there is an elegant approach without iteration.
Just bring the two tensors to the same number of dimensions:
T = tf.constant([[[[1,1]]],[[[2,2]]]])
x = tf.constant([3,4])
xr = tf.reshape(x, [-1,1,1,1])
res = T*xr
Running res will print:
[[[[3, 3]]],[[[8, 8]]]]
which is exactly what you're asking for.
Once the two tensors have the same number of dimensions, TensorFlow will take care of broadcasting the op (the reshape is needed so the singleton dimensions line up for broadcasting).
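For the general case in the question (a sketch, with T holding n rank-3 tensors, i.e. T of shape [n, d1, d2, d3]):

import tensorflow as tf

n = 4
T = tf.random.normal([n, 2, 3, 5])  # n rank-3 tensors
x = tf.random.normal([n])

res = tf.reshape(x, [-1, 1, 1, 1]) * T         # broadcast x[i] over T[i]
ref = tf.stack([x[i] * T[i] for i in range(n)])
print(bool(tf.reduce_all(tf.abs(res - ref) < 1e-6)))  # True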
