Use of unsqueeze():
input = torch.Tensor(2, 4, 3) # input: 2 x 4 x 3
print(input.unsqueeze(0).size()) # prints - torch.Size([1, 2, 4, 3])
Use of view():
input = torch.Tensor(2, 4, 3) # input: 2 x 4 x 3
print(input.view(1, -1, -1, -1).size()) # RuntimeError: only one dimension can be inferred
According to the documentation, unsqueeze() inserts a singleton dimension at the position given as a parameter, while view() creates a view of the tensor's underlying storage with a different shape.
What view() does is clear to me, but I am unable to distinguish it from unsqueeze(). Moreover, I don't understand when to use view() and when to use unsqueeze().
Any help with good explanation would be appreciated!
view() can only take a single -1 argument.
So, if you want to add a singleton dimension, you would need to provide all the dimensions as arguments. For example, if A is a 2x3x4 tensor, to add a singleton dimension you would need to do A.view(2, 1, 3, 4).
However, sometimes the dimensionality of the input is unknown when the operation is used. Thus, we don't know that A is 2x3x4, but we would still like to insert a singleton dimension. This happens a lot when using minibatches of tensors, where the last dimension is usually unknown. In these cases, unsqueeze() is useful and lets us insert the dimension without explicitly being aware of the other dimensions when writing the code.
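A minimal sketch of the difference (the shapes here are purely illustrative):

import torch

A = torch.rand(2, 3, 4)

# view() needs the full target shape (at most one dimension may be -1):
B = A.view(2, 1, 3, 4)

# unsqueeze() only needs the position of the new singleton dimension,
# so it works even when the other dimensions are unknown:
C = A.unsqueeze(1)

print(B.size(), C.size())  # torch.Size([2, 1, 3, 4]) torch.Size([2, 1, 3, 4])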
unsqueeze() is a special case of view()
For convenience, many python libraries have short-hand aliases for common uses of more general functions.
view() reshapes a tensor to the specified shape
unsqueeze() reshapes a tensor by adding a new dimension of depth 1
(i.e. turning an n-dimensional tensor into an (n+1)-dimensional tensor)
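For example (a small sketch with a random tensor), the two calls produce identical results:

import torch

x = torch.rand(2, 3, 4)
print(torch.equal(x.unsqueeze(1), x.view(2, 1, 3, 4)))  # True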
When to use unsqueeze()?
Some example use cases:
You have a model designed to take RGB image tensors (3d: CxHxW), but your data consists of 2d greyscale images (HxW)
Your model is designed to take batches of data (batch_size x dim1 x dim2 x ...), and you want to feed it a single sample (i.e. a batch of size 1); both cases are sketched below.
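A minimal sketch of both cases (the 28x28 image size and the NCHW layout are assumptions for illustration):

import torch

grey = torch.rand(28, 28)     # a 2d greyscale image (H x W)
chw = grey.unsqueeze(0)       # -> (1, 28, 28): add a channel dimension
batch = chw.unsqueeze(0)      # -> (1, 1, 28, 28): add a batch dimension
print(batch.size())           # torch.Size([1, 1, 28, 28])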
My apologies if this has been answered elsewhere. I have searched for two days without luck. It seems like a pretty straightforward problem, though I cannot solve it.
I have a 4-dimensional numpy array of shape (40, 320, 320, 8). This array is the output of a convolutional layer of a CNN, where dimension 1 represents outputs for the 40 inputs to the model, dimensions 2 and 3 represent the feature map outputs of a given filter, and dimension 4 represents the 8 filters utilized in the convolutional layer. My Python experience is still quite limited.
What I am trying to do is split this 4-dimensional array into 8 separate 3-dimensional arrays, where each new array corresponds to one of the 8 filters represented in dimension 4.
Currently, I am able to do this one at a time with the following:
filter_out = np.squeeze(intermediate_output[:, :, :, 1])
How can I do this for all 8 (or n) at once?
Since you need a list of arrays, there's no harm in using a straightforward comprehension.
alist = [arr[...,i] for i in range(8)]
Transposing and wrapping with list won't be any faster, since list(...) just iterates over the first dimension. np.split also just iterates, taking slices.
No need to squeeze - unless using np.split.
But do you really need separate arrays?
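A quick check of the comprehension approach (using a random stand-in for the real conv output):

import numpy as np

intermediate_output = np.random.rand(40, 320, 320, 8)  # stand-in data
alist = [intermediate_output[..., i] for i in range(8)]
print(len(alist), alist[0].shape)  # 8 (40, 320, 320)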
This will produce a list of the 3-dimensional arrays:
list(intermediate_output.transpose(3, 0, 1, 2))
Another possibility:
split_array_list = [x.squeeze(axis=-1) for x in np.split(my_array, my_array.shape[-1], axis=-1)]
In the following,
x_6 = torch.cat((x_1, x_2_1, x_3_1, x_5_1), dim=-3)
Sizes of tensors x_1, x_2_1, x_3_1, x_5_1 are
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7])
torch.Size([1, 256, 7, 7]) respectively.
The size of x_6 turns out to be torch.Size([1, 1024, 7, 7])
I couldn't understand and visualise this concatenation along a negative dimension (-3 in this case).
What exactly is happening here?
What happens if dim = 3 instead?
Is there any constraint on dim for a given set of tensors?
The answer by danin is not completely correct; from the perspective of tensor algebra it is actually wrong, since it suggests that the problem has to do with accessing or indexing a Python list. It doesn't.
The -3 means that we concatenate the tensors along the 2nd dimension (you could just as well have used 1 instead of the more confusing -3, since for a 4-d tensor dim=-3 and dim=1 refer to the same axis).
From taking a closer look at the tensor shapes, it seems that they represent (b, c, h, w) where b stands for batch_size, c stands for number of channels, h stands for height and w stands for width.
This is usually the case at the final stages of encoding (possibly) images in a deep neural network, where we arrive at such feature maps.
The torch.cat() operation with dim=-3 is meant to say that we concatenate these 4 tensors along the dimension of channels c (see above).
4 * 256 = 1024
Hence, the resultant tensor ends up with a shape torch.Size([1, 1024, 7, 7]).
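A quick check (a minimal sketch with random tensors), which also answers the dim = 3 case:

import torch

# constraint: all tensors must have the same shape except along the cat dim
x = torch.rand(1, 256, 7, 7)
out = torch.cat((x, x, x, x), dim=-3)  # same as dim=1 for a 4-d tensor
print(out.size())  # torch.Size([1, 1024, 7, 7])
print(torch.equal(out, torch.cat((x, x, x, x), dim=1)))  # True
print(torch.cat((x, x, x, x), dim=3).size())  # torch.Size([1, 256, 7, 28])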
Notes: It is hard to visualize a 4-dimensional space since we humans live in an inherently 3D world. Nevertheless, here are some answers that I wrote a while ago which may help you get a mental picture.
How to understand the term tensor in TensorFlow?
Very Basic Numpy array dimension visualization
Python provides negative indexing, so you can access elements starting from the end of a list, e.g., -1 is the last element.
In this case the tensor has 4 dimensions, so dim=-3 is actually the 2nd dimension (index 1).
I have a tensor T of the shape (8, 5, 300), where 8 is the batch size, 5 is the number of documents in each batch, and 300 is the encoding of each of the document. If I reshape the Tensor as follows, does the properties of my Tensor remain the same?
T = T.reshape(5, 300, 8)
T.shape
>> torch.Size([5, 300, 8])
So, does this new Tensor indicate the same properties as the original one? By the properties, I mean, can I say that this is also a Tensor of batch size 8, with 5 documents for each batch, and a 300 dimensional encoding for each document?
Does this affect the training of the model? If reshaping the tensor messes up the datapoints, then there is no point in training. For example, if reshaping as above gives an output that is a batch of 5 samples, with 300 documents of size 8 each, then it's useless, since I have neither 300 documents nor a batch of 5 samples.
I need to reshape it like this because my model in between produces output of the shape [8, 5, 300], and the next layer accepts input as [5, 300, 8].
NO
You need to understand the difference between reshape/view and permute.
reshape and view only change the "shape" of the tensor, without re-ordering the elements. Therefore
import torch

orig = torch.rand((8, 5, 300))
resh = orig.reshape(5, 300, 8)
# the flat element order is unchanged, so corresponding slices differ:
assert not torch.equal(orig[0, 0, :], resh[0, :, 0])
If you want to change the order of the elements as well, you need to permute it:
perm = orig.permute(1, 2, 0)
# now the slices match element for element:
assert torch.equal(orig[0, 0, :], perm[0, :, 0])
NOOO!
I made a similar mistake.
Imagine converting a 2-D tensor (a matrix) into a 1-D tensor (an array) and applying a transform to it. This would create serious issues in your code, as the new tensor has the characteristics of an array.
Hope you got my point.
I have a tensor in the shape (n_samples, n_steps, n_features). I want to decompose this into a tensor of shape (n_samples, n_components).
I need a method of decomposition that has a .fit(...) so that I can apply the same decomposition to a new batch of samples. I have been looking at Tucker decomposition and PARAFAC decomposition, but neither has that crucial .fit(...) and .transform(...) functionality. (Or at least I think they don't?)
I could use PCA and train it on a representative sample and then call .transform(...) on the remaining samples, but I would rather have some sort of tensor decomposition that can handle all of the samples at once, so as to get a better idea of the differences between each sample.
This is what I mean by "tensor":
In fact tensors are merely a generalisation of scalars and vectors; a scalar is a zero rank tensor, and a vector is a first rank tensor. The rank (or order) of a tensor is defined by the number of directions (and hence the dimensionality of the array) required to describe it.
If you have any questions, please ask, I'll try to clarify my problem if needed.
EDIT: The best solution would be some type of kernel, but I have yet to find a kernel that can deal with n-rank tensors and not just 2D data.
You can do this using the development (master) version of TensorLy. Specifically, you can use the new partial_tucker function (it is not yet updated in the documentation...).
Note that the following solution preserves the structure of the tensor, i.e. a tensor of shape (n_samples, n_steps, n_features) is decomposed into a (smaller) tensor of shape (n_samples, n_components_1, n_components_2).
Code
Short answer: this is a very basic class that does what you want (and it would work on tensors of arbitrary order).
import tensorly as tl
from tensorly.decomposition._tucker import partial_tucker

class TensorPCA:
    def __init__(self, ranks, modes):
        self.ranks = ranks
        self.modes = modes

    def fit(self, tensor):
        # decompose the training tensor along the requested modes
        self.core, self.factors = partial_tucker(tensor, modes=self.modes, ranks=self.ranks)
        return self

    def transform(self, tensor):
        # project a new tensor onto the subspaces spanned by the fitted factors
        return tl.tenalg.multi_mode_dot(tensor, self.factors, modes=self.modes, transpose=True)
Usage
Given an input tensor, you can use the previous class by first instantiating it with the desired ranks (size of the core tensor) and modes on which to perform the decomposition (in your 3D case, 1 and 2 since indexing starts at zero):
tpca = TensorPCA(ranks=[4, 5], modes=[1, 2])
tpca.fit(tensor)
Given a new tensor originally called new_tensor, you can project it using the transform method:
tpca.transform(new_tensor)
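Assuming new_tensor has the same last two dimension sizes as the tensor used for fitting, the projection shrinks those dimensions down to the chosen ranks:

projected = tpca.transform(new_tensor)
print(projected.shape)  # e.g. with the (10, 11, 12) example below, an (m, 11, 12) input comes out as (m, 4, 5)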
Explanation
Let's go through the code with an example: first let's import the necessary bits:
import numpy as np
import tensorly as tl
from tensorly.decomposition._tucker import partial_tucker
We then generate a random tensor:
tensor = np.random.random((10, 11, 12))
The next step is to decompose it along its second and third dimensions, or modes (as the first dimension corresponds to the samples):
core, factors = partial_tucker(tensor, modes=[1, 2], ranks=[4, 5])
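With these sizes, a quick sanity check of the resulting shapes:

print(core.shape)        # (10, 4, 5)
print(factors[0].shape)  # (11, 4), the projection matrix for mode 1
print(factors[1].shape)  # (12, 5), the projection matrix for mode 2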
The core corresponds to the transformed input tensor while factors is a list of two projection matrices, one for the second mode and one for the third mode. Given a new tensor, you can project it to the same subspace (the transform method) by projecting each of its last two dimensions:
tl.tenalg.multi_mode_dot(tensor, factors, modes=[1, 2], transpose=True)
The transposition here is equivalent to an inverse since the factors are orthogonal.
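To see this concretely, a short sketch: multiplying the core back with the (orthogonal) factors, without transposition, reconstructs a low-rank approximation of the original tensor:

approx = tl.tenalg.multi_mode_dot(core, factors, modes=[1, 2])
print(approx.shape)  # (10, 11, 12), an approximation of the original tensor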
Finally, a note on terminology: even though it is sometimes done, it is probably best not to use the order and the rank of a tensor interchangeably. The order of a tensor is simply its number of dimensions, while the rank of a tensor is usually a much more complicated notion, which you could think of as a generalization of the notion of matrix rank.
I generally use MATLAB and Octave, and I have recently switched to Python and NumPy.
In numpy when I define an array like this
>>> a = np.array([[2,3],[4,5]])
it works great and size of the array is
>>> a.shape
(2, 2)
which is also the same as MATLAB.
But when I extract the entire first column and check its size
>>> b = a[:,0]
>>> b.shape
(2,)
I get size (2,). What is this? I expected the size to be (2,1). Perhaps I have misunderstood a basic concept. Can anyone clarify this for me?
A 1D numpy array* is literally 1D - it has no size in any second dimension, whereas in MATLAB, a '1D' array is actually 2D, with a size of 1 in its second dimension.
If you want your array to have size 1 in its second dimension you can use its .reshape() method:
a = np.zeros(5,)
print(a.shape)
# (5,)
# explicitly reshape to (5, 1)
print(a.reshape(5, 1).shape)
# (5, 1)
# or use -1 in the first dimension, so that its size in that dimension is
# inferred from its total length
print(a.reshape(-1, 1).shape)
# (5, 1)
Edit
As Akavall pointed out, I should also mention np.newaxis as another method for adding a new axis to an array. Although I personally find it a bit less intuitive, one advantage of np.newaxis over .reshape() is that it allows you to add multiple new axes in an arbitrary order without explicitly specifying the shape of the output array, which is not possible with the .reshape(-1, ...) trick:
a = np.zeros((3, 4, 5))
print(a[np.newaxis, :, np.newaxis, ..., np.newaxis].shape)
# (1, 3, 1, 4, 5, 1)
np.newaxis is just an alias of None, so you could do the same thing a bit more compactly using a[None, :, None, ..., None].
* An np.matrix, on the other hand, is always 2D, and will give you the indexing behavior you are familiar with from MATLAB:
a = np.matrix([[2, 3], [4, 5]])
print(a[:, 0].shape)
# (2, 1)
For more info on the differences between arrays and matrices, see here.
Typing help(np.shape) gives some insight into what is going on here. For starters, you can get the output you expect by typing:
b = np.array([a[:, 0]]).T  # shape (2, 1)
Basically numpy defines things a little differently than MATLAB. In the numpy environment, a vector only has one dimension, and an array is a vector of vectors, so it can have more. In your first example, your array is a vector of two vectors, i.e.:
a = np.array([vec1, vec2])
So a has two dimensions, and in your example the number of elements in both dimensions is the same, 2. Your array is therefore 2 by 2. When you take a slice out of this, you are reducing the number of dimensions that you have by one. In other words, you are taking a vector out of your array, and that vector only has one dimension, which also has 2 elements, but that's it. Your vector is now 2 by _. There is nothing in the second spot because the vector is not defined there.
You could think of it in terms of spaces too. Your first array is in the space R^(2x2) and your second vector is in the space R^(2). This means that the array is defined on a different (and bigger) space than the vector.
That was a lot to basically say that you took a slice out of your array, and unlike MATLAB, numpy does not represent vectors (1 dimensional) in the same way as it does arrays (2 or more dimensions).
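As an aside (a small addition beyond the answers above): if you want the MATLAB-style (2, 1) column directly, index with a slice or a list instead of a single integer, since integer indexing drops the dimension:

import numpy as np

a = np.array([[2, 3], [4, 5]])
print(a[:, 0].shape)    # (2,)   an integer index drops the dimension
print(a[:, 0:1].shape)  # (2, 1) a slice keeps it
print(a[:, [0]].shape)  # (2, 1) a list index also keeps it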