Batch Kronecker product of tensors - python

I have two tensors that are batches of matrices:
x = torch.randn(100,10,10)
y = torch.randn(100,2,2)
I want to parallelize the Kronecker product over the batch, computing it for each pair of matrices rather than taking the Kronecker product of the whole tensors. torch.kron(x, y) gives me a tensor of size (10000, 20, 20), but I want an output of size (100, 20, 20) that holds the Kronecker product of each pair of matrices. Is there any way to do so?
Something like torch.kron(x, y, start_dim=1) is what I tried, but it does not seem to be implemented. (I want to do this in torch for R, but something that works in Python would already be okay.)
Thanks
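One way to get this batch-wise behaviour in Python is to form the per-pair products with einsum and then collapse the extra axes. This is a minimal sketch; batched_kron is a hypothetical helper name, not a torch API:
import torch

def batched_kron(x, y):
    # x: (B, n, m), y: (B, p, q) -> (B, n*p, m*q), one Kronecker product per batch entry
    B, n, m = x.shape
    _, p, q = y.shape
    # out[b, i, k, j, l] = x[b, i, j] * y[b, k, l]
    out = torch.einsum('bij,bkl->bikjl', x, y)
    return out.reshape(B, n * p, m * q)

x = torch.randn(100, 10, 10)
y = torch.randn(100, 2, 2)
print(batched_kron(x, y).shape)  # torch.Size([100, 20, 20])
# sanity check against torch.kron on a single pair of matrices
assert torch.allclose(batched_kron(x, y)[0], torch.kron(x[0], y[0]))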

Related

How to perform element-wise product in PyTorch?

I have two torch tensors a and b. Tensor a has the shape [batch_size, emb_size] and tensor b has the shape [num_of_words, emb_size]. I want to take the element-wise product of these two tensors instead of a dot product.
I noticed that "*" can perform an element-wise product, but it doesn't fit my case.
For example, batch_size = 3, emb_size = 2, num_of_words = 5.
a = torch.rand((3,2))
b = torch.rand((5,2))
I want to get something like:
torch.cat([a[0]*b, a[1]*b, a[2]*b]).view(3, 5, 2)
but I want to do this in an efficient and elegant way.
You can use
a.unsqueeze(1) * b
PyTorch supports broadcasting semantics, but you need to make sure the singleton dimensions are in the correct locations.
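A quick sketch of the shapes involved, using the numbers from the question:
import torch

a = torch.rand((3, 2))    # (batch_size, emb_size)
b = torch.rand((5, 2))    # (num_of_words, emb_size)
out = a.unsqueeze(1) * b  # (3, 1, 2) * (5, 2) broadcasts to (3, 5, 2)
# matches the cat/view construction from the question
assert torch.equal(out, torch.cat([a[0] * b, a[1] * b, a[2] * b]).view(3, 5, 2))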

How to do a scalar product along the right axes with numpy and vectorize the process

I have numpy array 'test' of dimension (100, 100, 16, 16) which gives me a different 16x16 array for points on a 100x100 grid.
I also have some eigenvalues and vectors where vals has the dimension (100, 100, 16) and vecs (100, 100, 16, 16) where vecs[x, y, :, i] would be the ith eigenvector of the matrix at the point (x, y) corresponding to the ith eigenvalue vals[x, y, i].
Now I want to take the first eigenvector of the array at ALL points on the grid, do a matrix product with the test matrix and then do a scalar product of the resulting vector with all the other eigenvectors of the array at all points on the grid and sum them.
The resulting array should have the dimension (100, 100). After this I would like to take the 2nd eigenvector, matrix-multiply it with test, and then take the scalar product of the result with all eigenvectors other than the 2nd, and so on, so that in the end I have 16 (100, 100) arrays, or rather a (100, 100, 16) array. So far I have only succeeded with a lot of for loops, which I would like to avoid; using tensordot gives me the wrong dimensions, and I don't see how to pick the axis to vectorize along for the np.dot function.
I heard that einsum might be suitable for this task, but anything that doesn't rely on Python loops is fine by me.
import numpy as np
from numpy import linalg as la
test = np.arange(16*16*100*100).reshape((100, 100, 16, 16))
vals, vecs = la.eig(test + 1)
np.tensordot(vecs, test, axes=[2, 3]).shape
>>> (100, 100, 16, 100, 100, 16)
EDIT: Ok, so I used np.einsum to get a correct intermediate result.
np.einsum('ijkl, ijkm -> ijlm', vecs, test)
But in the next step I want to do the scalar product only with all the other entries of vecs. Can I maybe implement some inverse Kronecker delta in this einsum formalism? Or should I switch back to plain NumPy now?
OK, I played around and with np.einsum I found a way to do what is described above. A nice feature of einsum is that if you keep doubly occurring indices in the 'output' (to the right of the '->'), you get element-wise multiplication along those axes and contraction along the others (something you don't have in handwritten tensor algebra notation).
result = np.einsum('ijkl, ijlm -> ijkm', np.einsum('ijkl, ijkm -> ijlm', vecs, test), vecs)
This nearly does the trick. Now only the diagonal terms have to be taken out. We can do this by subtracting the diagonal terms like this:
result = result - result * np.eye(np.shape(test)[-1])[None, None, ...]
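If the final sum over the "other" eigenvectors is still wanted, to end up with the (100, 100, 16) array described in the question, one possible consolidation (a sketch, assuming that reading of the question) is to fold an off-diagonal mask into a single einsum:
n = test.shape[-1]
off_diag = 1.0 - np.eye(n)  # zero on the diagonal, one elsewhere
# final[i, j, k] = sum over m != k of (v_k^T @ test @ v_m) at grid point (i, j)
final = np.einsum('ijak,ijal,ijlm,km->ijk', vecs, test, vecs, off_diag)
print(final.shape)  # (100, 100, 16)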

Broadcasting np.dot vs tf.matmul for tensor-matrix multiplication (Shape must be rank 2 but is rank 3 error)

Let's say I have the following tensors:
X = np.zeros((3,201, 340))
Y = np.zeros((340, 28))
Taking the dot product of X and Y succeeds with NumPy and yields a tensor of shape (3, 201, 28).
However, with TensorFlow I get the following error: Shape must be rank 2 but is rank 3 ...
minimal code example:
X = np.zeros((3,201, 340))
Y = np.zeros((340, 28))
print(np.dot(X,Y).shape) # successful (3, 201, 28)
tf.matmul(X, Y) # erroneous
Any idea how to achieve the same result with tensorflow?
Since you are working with tensors, it would be better (for performance) to use tensordot there than np.dot. NumPy allows numpy.dot to work on tensors at the cost of performance, and it seems TensorFlow simply doesn't allow it.
So, for NumPy, we would use np.tensordot -
np.tensordot(X, Y, axes=((2,),(0,)))
For tensorflow, it would be with tf.tensordot -
tf.tensordot(X, Y, axes=((2,),(0,)))
Related post to understand tensordot.
TensorFlow doesn't allow multiplication of matrices with different ranks the way NumPy does. To cope with this, you can reshape the matrix. This essentially casts a matrix of, say, rank 3 to one of rank 2 by "stacking the matrices" one on top of the other.
You can use this:
tf.reshape(tf.matmul(tf.reshape(Aijk,[i*j,k]),Bkl),[i,j,l])
where i, j and k are the dimensions of the first tensor and k and l are the dimensions of the second.
Taken from here.
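Applied to the X and Y from the question, a minimal sketch (assuming TensorFlow with eager execution) looks like this:
import numpy as np
import tensorflow as tf

X = np.zeros((3, 201, 340))
Y = np.zeros((340, 28))

# collapse the batch into the row dimension, matmul at rank 2, then restore the batch
Z = tf.reshape(tf.matmul(tf.reshape(X, [3 * 201, 340]), Y), [3, 201, 28])
print(Z.shape)  # (3, 201, 28)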

Tensorflow: searching for an op supporting vector- tensor broadcasting

I am trying to achieve something like:
inputs:
x: a vector with length n, [x1,x2,...,xn], elements (xi, i=1,2,...n) are scalars.
T: a tensor with length n in its first dimension, [t1,t2,...tn], elements (ti, i=1,2,..,n) are tensors with rank 3.
return: a tensor, [x1*t1, x2*t2, ... xn*tn].
I know this can be achieved by tf.stack([x[i]*T[i] for i in range(n)]); I wonder whether there is an elegant approach without iteration.
Just bring the two vectors to the same dimensions:
T = tf.constant([[[[1,1]]],[[[2,2]]]])
x = tf.constant([3,4])
xr = tf.reshape(x, [-1,1,1,1])
res = T*xr
Running res will print:
[[[[3, 3]]],[[[8, 8]]]]
which is exactly what you're asking for.
Once the two tensors have the same rank, TF will take care of broadcasting the op (the reshaping is needed for the broadcasting to line up correctly).
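The same pattern works for elements ti of any rank-3 shape; a minimal sketch (TF 2.x style assumed):
import tensorflow as tf

n = 4
x = tf.random.normal([n])               # n scalars
T = tf.random.normal([n, 2, 3, 5])      # n rank-3 tensors
res = tf.reshape(x, [-1, 1, 1, 1]) * T  # broadcasts each x[i] over T[i]
print(res.shape)  # (4, 2, 3, 5)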

Theano function that can take input arrays of different shapes in python

In theano, I want to make a function that can take several different inputs, such as both matrices and vectors.
Normally I would do something like this:
import theano
import numpy
x = theano.tensor.matrix(dtype=theano.config.floatX)
y = 3*x
f = theano.function([x],y)
However, then when I enter a vector instead of a matrix, for example:
f(numpy.array([1,2,3]))
Then I get an error of dimension mismatch: 'Wrong number of dimensions: expected 2, got 1 with shape (3,).'
Is there any way to define a more general input symbol in theano that can take matrices but also different shaped arrays such as vectors or 3-dimensional arrays and still works?
Thanks.
The number of dimensions must be fixed at the time the Theano function is compiled. Part of the compilation process is to select operation variants that depend on the number of dimensions.
You could always compile the function for a high-dimensional tensor and just stack your inputs such that they have the required shape.
So
x = theano.tensor.tensor3()
y = 3*x
f = theano.function([x],y)
will accept any of these:
f(numpy.array([[[1,2,3]]])) # the (3,) vector wrapped as a (1,1,3) tensor3
f(numpy.array([[[1,2],[3,4]]])) # a (2,2) matrix wrapped as a (1,2,2) tensor3
f(numpy.array([[[1,2],[3,4]],[[5,6],[7,8]]])) # a (2,2,2) tensor3
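If the inputs really do vary in rank, you can wrap them yourself before calling the compiled function; as_tensor3 below is a hypothetical helper, not part of Theano:
def as_tensor3(a):
    # pad with leading singleton axes until the array is 3-dimensional
    a = numpy.asarray(a, dtype=theano.config.floatX)
    return a.reshape((1,) * (3 - a.ndim) + a.shape)

print(f(as_tensor3(numpy.array([1, 2, 3]))))         # the (3,) vector as a (1,1,3) tensor3
print(f(as_tensor3(numpy.array([[1, 2], [3, 4]]))))  # a (2,2) matrix as a (1,2,2) tensor3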
