Theano function that can take input arrays of different shapes in Python

In theano, I want to make a function that can take several different inputs, such as both matrices and vectors.
Normally I would do something like this:
import theano
import numpy
x = theano.tensor.matrix(dtype=theano.config.floatX)
y = 3*x
f = theano.function([x],y)
However, then when I enter a vector instead of a matrix, for example:
f(numpy.array([1,2,3]))
Then I get an error of dimension mismatch: 'Wrong number of dimensions: expected 2, got 1 with shape (3,).'
Is there any way to define a more general input symbol in Theano that can take matrices but also differently shaped arrays, such as vectors or 3-dimensional arrays, and still work?
Thanks.

The number of dimensions must be fixed at the time the Theano function is compiled. Part of the compilation process is to select operation variants that depend on the number of dimensions.
You could always compile the function for a higher-dimensional tensor and just wrap your inputs so that they have the required number of dimensions.
So
x = theano.tensor.tensor3()
y = 3*x
f = theano.function([x],y)
will accept any of these:
f(numpy.array([[[1,2,3]]]))                    # shape (1,1,3): a vector wrapped as a tensor3
f(numpy.array([[[1,2],[3,4]]]))                # shape (1,2,2): a matrix wrapped as a tensor3
f(numpy.array([[[1,2],[3,4]],[[5,6],[7,8]]]))  # shape (2,2,2): a genuine tensor3
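If you do this wrapping often, it can be automated. Here is a minimal sketch (the helper as_tensor3 is ours, not part of Theano) that pads any array with leading singleton dimensions before the call:
import numpy
import theano

def as_tensor3(a):
    # Pad with leading singleton dimensions until the array is 3-D
    a = numpy.asarray(a, dtype=theano.config.floatX)
    if a.ndim > 3:
        raise ValueError("expected at most 3 dimensions, got %d" % a.ndim)
    return a.reshape((1,) * (3 - a.ndim) + a.shape)

f(as_tensor3(numpy.array([1, 2, 3])))  # vector, wrapped to shape (1, 1, 3)
f(as_tensor3(numpy.eye(2)))            # matrix, wrapped to shape (1, 2, 2)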

Related

Dynamically broadcast a numpy array

I currently have a 1D numpy array, epsilons, that needs to perform element-wise multiplication on array x. However, the dimensionality of x is dynamic and changes with each iteration of the following for loop:
for x in grads:
    x = x * epsilons
    print(x)
epsilons always has the shape (M,). However, for the first iteration, x takes the shape (M,4,2) while it takes the shape (M,4) for the second iteration (the shape of x changes as the code iterates over grads). Is there a way I can automatically broadcast epsilons to the shape of x so that I can perform this element-wise multiplication for any shape of x?
You can just reshape epsilons to a compatible shape; NumPy then broadcasts it automatically (as an explicit broadcast_to call would). Two shapes are compatible when, aligned from the right, each pair of dimensions either matches or contains a 1. Because epsilons runs along the first (left-most) dimension of x, you have to append the trailing singleton dimensions yourself.
Thanks to @hpaulj for the improved solution.
# Reshape epsilons so its values run along the first (least contiguous) dimension
reshapedEpsilons = epsilons.reshape((M,) + (1,)*(x.ndim - 1))
# Element-wise multiplication; broadcasting expands reshapedEpsilons
# across the remaining dimensions so it matches the shape of x
x *= reshapedEpsilons
PS: note that a = a * b creates a new array and is less efficient than a *= b, which modifies the values in place.
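Putting this together with the loop from the question, a small runnable sketch (the arrays here are made-up stand-ins for the real grads):
import numpy as np

M = 3
epsilons = np.array([1.0, 2.0, 3.0])           # shape (M,)
grads = [np.ones((M, 4, 2)), np.ones((M, 4))]  # stand-ins for the real gradients

for x in grads:
    # Append as many singleton dimensions as x has beyond the first
    x *= epsilons.reshape((M,) + (1,) * (x.ndim - 1))
    print(x.shape)  # (3, 4, 2) then (3, 4); each sample scaled by its epsilon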

Why is numpy.dot() throwing a ValueError: shapes not aligned?

I want to write a program that finds the eigenvectors and eigenvalues of a Hermitian matrix by iterating over a guess (Rayleigh quotient iteration). I have a test matrix that I know the eigenvectors and eigenvalues of, however when I run my code I receive
ValueError: shapes (3,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)
By splitting each numerator and denominator into separate variables I've traced the problem to the line:
nm=np.dot(np.conj(b1),np.dot(A,b1))
My code:
import numpy as np
import numpy.linalg as npl

def eigen(A, mu, b, err):
    mu0 = mu
    mu1 = mu + 10*err
    while mu1 - mu > err:
        n = np.dot(npl.inv(A - mu*np.identity(np.shape(A)[0])), b)
        d = npl.norm(np.dot(npl.inv(A - mu*np.identity(np.shape(A)[0])), b))
        b1 = n/d
        b = b1
        nm = np.dot(np.conj(b1), np.dot(A, b1))
        dm = np.dot(np.conj(b1), b1)
        mu1 = nm/dm
        mu = mu1
    return (mu, b)

A = np.array([[1,2,3],[1,2,1],[3,2,1]])
mu = 4
b = np.array([[1],[2],[1]])
err = 0.1
eigen(A, mu, b, err)
I believe the dimensions of the variables being input into the np.dot() function are wrong, but I cannot find where. Everything is split up and renamed as part of my debugging; I know it looks very difficult to read.
The mathematical issue is the matrix multiplication of shapes (3,1) and (3,1): that's essentially two column vectors. You probably wanted to use the transpose of the first factor:
nm = np.dot(np.conj(b1).T, np.dot(A, b1))
dm = np.dot(np.conj(b1).T, b1)
Have a look at the documentation of np.dot to see what arguments are acceptable.
If both a and b are 1-D arrays, it is inner product of vectors (...)
If both a and b are 2-D arrays, it is matrix multiplication (...)
The variables you're using are of shape (3, 1) and therefore 2-D arrays.
Alternatively, this means that instead of transposing the first matrix, you could use a flattened view of the arrays. Each then has shape (3,), i.e. is a 1-D array, so np.dot computes the inner product:
nm = np.dot(np.conj(b1).ravel(), np.dot(A, b1).ravel())
dm = np.dot(np.conj(b1).ravel(), b1.ravel())
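The two fixes are not quite identical further down the code: the transposed version yields a (1, 1) array, while the raveled version yields a plain scalar, which is what mu1 should be. A quick check with the test matrix from the question:
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 1], [3, 2, 1]])
b1 = np.array([[1], [2], [1]])

nm_2d = np.dot(np.conj(b1).T, np.dot(A, b1))                # 2-D result
nm_1d = np.dot(np.conj(b1).ravel(), np.dot(A, b1).ravel())  # scalar result

print(nm_2d.shape, nm_1d)  # (1, 1) 28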

Matlab to Python numpy indexing and multiplication issue

I have the following line of code in MATLAB which I am trying to convert to Python numpy:
pred = traindata(:,2:257)*beta;
In Python, I have:
pred = traindata[:, 1:257]*beta
beta is a 256 x 1 array.
In MATLAB,
size(pred) = 1389 x 1
But in Python,
pred.shape = (1389L, 256L)
So, I found out that multiplying by the beta array is producing the difference between the two arrays.
How do I write the original Python line, so that the size of pred is 1389 x 1, like it is in MATLAB when I multiply by my beta array?
I suspect that beta is in fact a 1D numpy array. In numpy, 1D arrays are not row or column vectors, whereas MATLAB clearly makes this distinction; they are simply 1D arrays, agnostic of any orientation. If so, you need to manually introduce a new singleton dimension to the beta vector to facilitate the multiplication. On top of this, the * operator performs element-wise multiplication. To perform matrix-vector or matrix-matrix multiplication, you must use numpy's dot function.
Therefore, you must do something like this:
import numpy as np # Just in case
pred = np.dot(traindata[:, 1:257], beta[:,None])
beta[:,None] will create a 2D numpy array where the elements from the 1D array are populated along the rows, effectively making a column vector (i.e. 256 x 1). However, if you have already done this on beta, then you don't need to introduce the new singleton dimension. Just use dot normally:
pred = np.dot(traindata[:, 1:257], beta)
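A quick sketch with random placeholder data of the sizes from the question shows both outcomes:
import numpy as np

traindata = np.random.rand(1389, 257)  # placeholder data with the question's sizes
beta = np.random.rand(256)             # a 1-D beta

print(np.dot(traindata[:, 1:257], beta[:, None]).shape)  # (1389, 1), as in MATLAB
print(np.dot(traindata[:, 1:257], beta).shape)           # (1389,), a 1-D result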

Decomposing 3rd Order Tensor in Python

I have a tensor in the shape (n_samples, n_steps, n_features). I want to decompose this into a tensor of shape (n_samples, n_components).
I need a method of decomposition that has a .fit(...) so that I can apply the same decomposition to a new batch of samples. I have been looking at Tucker decomposition and PARAFAC decomposition, but neither has that crucial .fit(...) and .transform(...) functionality. (Or at least I think they don't?)
I could use PCA and train it on a representative sample and then call .transform(...) on the remaining samples, but I would rather have some sort of tensor decomposition that can handle all of the samples at once, so as to get a better idea of the differences between each sample.
This is what I mean by "tensor":
In fact tensors are merely a generalisation of scalars and vectors; a scalar is a zero rank tensor, and a vector is a first rank tensor. The rank (or order) of a tensor is defined by the number of directions (and hence the dimensionality of the array) required to describe it.
If you have any questions, please ask, I'll try to clarify my problem if needed.
EDIT: The best solution would be some type of kernel, but I have yet to find a kernel that can deal with n-rank tensors and not just 2D data.
You can do this using the development (master) version of TensorLy. Specifically, you can use the new partial_tucker function (it is not yet updated in the documentation...).
Note that the following solution preserves the structure of the tensor, i.e. a tensor of shape (n_samples, n_steps, n_features) is decomposed into a (smaller) tensor of shape (n_samples, n_components_1, n_components_2).
Code
Short answer: this is a very basic class that does what you want (and it would work on tensors of arbitrary order).
import tensorly as tl
from tensorly.decomposition._tucker import partial_tucker

class TensorPCA:
    def __init__(self, ranks, modes):
        self.ranks = ranks
        self.modes = modes

    def fit(self, tensor):
        self.core, self.factors = partial_tucker(tensor, modes=self.modes, ranks=self.ranks)
        return self

    def transform(self, tensor):
        return tl.tenalg.multi_mode_dot(tensor, self.factors, modes=self.modes, transpose=True)
Usage
Given an input tensor, you can use the previous class by first instantiating it with the desired ranks (size of the core tensor) and modes on which to perform the decomposition (in your 3D case, 1 and 2 since indexing starts at zero):
tpca = TensorPCA(ranks=[4, 5], modes=[1, 2])
tpca.fit(tensor)
Given a new tensor originally called new_tensor, you can project it using the transform method:
tpca.transform(new_tensor)
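For instance, assuming the fitted tensor and new_tensor share the mode sizes of the example below, the projection keeps the sample dimension and reduces the other two modes to the requested ranks:
import numpy as np

new_tensor = np.random.random((10, 11, 12))  # hypothetical new batch
print(tpca.transform(new_tensor).shape)      # (10, 4, 5)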
Explanation
Let's go through the code with an example: first let's import the necessary bits:
import numpy as np
import tensorly as tl
from tensorly.decomposition._tucker import partial_tucker
We then generate a random tensor:
tensor = np.random.random((10, 11, 12))
The next step is to decompose it along its second and third dimensions, or modes (as the first dimension corresponds to the samples):
core, factors = partial_tucker(tensor, modes=[1, 2], ranks=[4, 5])
The core corresponds to the transformed input tensor while factors is a list of two projection matrices, one for the second mode and one for the third mode. Given a new tensor, you can project it to the same subspace (the transform method) by projecting each of its last two dimensions:
tl.tenalg.multi_mode_dot(tensor, factors, modes=[1, 2], transpose=True)
The transposition here is equivalent to an inverse since the factors are orthogonal.
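As a sanity check, projecting the training tensor itself should recover the core (up to numerical precision), since that is exactly how the core was computed:
projected = tl.tenalg.multi_mode_dot(tensor, factors, modes=[1, 2], transpose=True)
print(projected.shape)               # (10, 4, 5)
print(np.allclose(projected, core))  # expected: True, up to floating-point error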
Finally, a note on terminology: even though it is sometimes done, it is probably best not to use order and rank of a tensor interchangeably. The order of a tensor is simply its number of dimensions, while the rank of a tensor is usually a much more complicated notion, which you could think of as a generalization of the notion of matrix rank.

Numpy function to get shape of added arrays

tl;dr: How do I predict the shape returned by numpy broadcasting across several arrays without having to actually add the arrays?
I have a lot of scripts that make use of numpy (Python) broadcasting rules so that essentially 1D inputs result in a multiple-dimension output. For a basic example, the ideal gas law (pressure = rho * R_d * temperature) might look like
import numpy as np

def rhoIdeal(pressure, temperature):
    rho = np.zeros_like(pressure + temperature)
    rho += pressure / (287.05 * temperature)
    return rho
It's not necessary here, but in more complicated functions it's very useful to initialize the array with the right shape. If pressure and temperature have the same shape, then rho also has that shape. If pressure has shape (n,) and temperature has shape (m,), I can call
rhoIdeal(pressure[:,np.newaxis], temperature[np.newaxis,:])
to get rho with shape (n,m). This lets me make plots with multiple values of temperature without having to loop over rhoIdeal, while still allowing the script to accept arrays of the same shape and compute the result element-by-element.
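For example, with made-up values:
import numpy as np

pressure = np.linspace(8.0e4, 1.0e5, 4)     # shape (4,)
temperature = np.linspace(250.0, 300.0, 3)  # shape (3,)
rho = rhoIdeal(pressure[:, np.newaxis], temperature[np.newaxis, :])
print(rho.shape)  # (4, 3): one row per pressure, one column per temperature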
My question is: Is there a built-in function to return the shape compatible with several inputs? Something that behaves like
def returnShape(list_of_arrays):
    return np.zeros_like(sum(list_of_arrays)).shape
without actually having to sum the arrays? If there's no built-in function, what would a good implementation look like?
You could use np.broadcast. This function returns an object encapsulating the result of broadcasting two or more arrays together. No actual operation (e.g. addition) is performed; the object simply has some of the same attributes that an array produced by actually carrying out the operation would have (shape, ndim, etc.).
For example:
x = np.array([1,2,3]) # shape (3,)
y = x.reshape(3,1) # shape (3, 1)
z = np.ones((5,1,1)) # shape (5, 1, 1)
Then you can check what the shape of the array returned by broadcasting x, y and z would be by inspecting the shape attribute:
>>> np.broadcast(x, y, z).shape
(5, 3, 3)
This means that you could implement your function simply as follows:
def returnShape(*args):
    return np.broadcast(*args).shape
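Usage mirrors np.broadcast itself; if the arrays are already collected in a list, unpack them with *:
>>> returnShape(x, y, z)
(5, 3, 3)
>>> arrays = [x, y, z]
>>> returnShape(*arrays)
(5, 3, 3)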
