Input dimension error on PyTorch's forward check

I am creating an RNN with PyTorch; it looks like this:
class MyRNN(nn.Module):
    def __init__(self, batch_size, n_inputs, n_neurons, n_outputs):
        super(MyRNN, self).__init__()
        self.n_neurons = n_neurons
        self.batch_size = batch_size
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs
        self.basic_rnn = nn.RNN(self.n_inputs, self.n_neurons)
        self.FC = nn.Linear(self.n_neurons, self.n_outputs)

    def init_hidden(self):
        # (num_layers, batch_size, n_neurons)
        return torch.zeros(1, self.batch_size, self.n_neurons)

    def forward(self, X):
        self.batch_size = X.size(0)
        self.hidden = self.init_hidden()
        lstm_out, self.hidden = self.basic_rnn(X, self.hidden)
        out = self.FC(self.hidden)
        return out.view(-1, self.n_outputs)
My input x looks like this:
tensor([[-1.0173e-04, -1.5003e-04, -1.0218e-04, -7.4541e-05, -2.2869e-05,
-7.7171e-02, -4.4630e-03, -5.0750e-05, -1.7911e-04, -2.8082e-04,
-9.2992e-06, -1.5608e-05, -3.5471e-05, -4.9127e-05, -3.2883e-01],
[-1.1193e-04, -1.6928e-04, -1.0218e-04, -7.4541e-05, -2.2869e-05,
-7.7171e-02, -4.4630e-03, -5.0750e-05, -1.7911e-04, -2.8082e-04,
-9.2992e-06, -1.5608e-05, -3.5471e-05, -4.9127e-05, -3.2883e-01],
...
[-6.9490e-05, -8.9197e-05, -1.0218e-04, -7.4541e-05, -2.2869e-05,
-7.7171e-02, -4.4630e-03, -5.0750e-05, -1.7911e-04, -2.8082e-04,
-9.2992e-06, -1.5608e-05, -3.5471e-05, -4.9127e-05, -3.2883e-01]],
dtype=torch.float64)
and is a batch of 64 vectors, each of size 15.
When trying to test this model by doing:
BATCH_SIZE = 64
N_INPUTS = 15
N_NEURONS = 150
N_OUTPUTS = 1
model = MyRNN(BATCH_SIZE, N_INPUTS, N_NEURONS, N_OUTPUTS)
model(x)
I get the following error:
File "/home/tt/anaconda3/envs/venv/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 126, in check_forward_args
expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 2
How can I fix it?

You are missing one of the required dimensions for the RNN layer.
Per the documentation, your input needs to be of shape (sequence length, batch, input size).
So, with the example above, you are missing one of these. Based on your variable names, it appears you are trying to pass 64 examples of 15 inputs each... if that's true, you are missing sequence length.
With an RNN, the sequence length is the number of times you want the layer to recur. For example, in NLP your sequence length might be equal to the number of words in a sentence, while batch size would be the number of sentences you are passing, and input size would be the vector size of each word.
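To make that concrete, here is a minimal sketch of that NLP shape convention (the sizes are illustrative, not taken from your model):

import torch
import torch.nn as nn

# 12 words per sentence, 32 sentences per batch, 300-dim word vectors
rnn = nn.RNN(input_size=300, hidden_size=150)
sentences = torch.rand(12, 32, 300)  # (seq_len, batch, input_size)
output, hidden = rnn(sentences)
print(output.shape)  # torch.Size([12, 32, 150])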
You might not need an RNN here if you are just trying to use 64 samples of size 15.

See the documentation: the RNN layer expects
input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence.
In your case it seems that each of the 15 values is a feature and each sample is a single timestep (rather than a 15-step sequence with one feature per step), so the example below uses 15 features and a sequence length of 1:
# 15 features, 150 neurons
rnn = nn.RNN(15, 150)
# sequence of length 1, batch size 64, 15 features
x = torch.rand(1, 64, 15)
res, _ = rnn(x)
print(res.shape)
# => torch.Size([1, 64, 150])
Also note that you don't need to prespecify batch size.
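Putting this together with the model from the question: note too that x in the question is float64 while nn.RNN weights default to float32, and that MyRNN.forward reads the batch size from X.size(0), which is the sequence dimension in the default layout. A minimal sketch of the combined fix (MyRNNFixed is just an illustrative name, assuming the MyRNN class above is in scope):

class MyRNNFixed(MyRNN):
    def forward(self, X):
        # with the default (seq_len, batch, input_size) layout, batch is dim 1, not dim 0
        self.batch_size = X.size(1)
        self.hidden = self.init_hidden()
        rnn_out, self.hidden = self.basic_rnn(X, self.hidden)
        out = self.FC(self.hidden)
        return out.view(-1, self.n_outputs)

model = MyRNNFixed(BATCH_SIZE, N_INPUTS, N_NEURONS, N_OUTPUTS)
out = model(x.float().unsqueeze(0))  # (64, 15) float64 -> (1, 64, 15) float32
print(out.shape)                     # torch.Size([64, 1])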

Related

Input and output shape to GRU layer in PyTorch

I am getting confused about the input shape to the GRU layer.
I have a batch of 128 images and I extracted 9 features from each image.
So now my shape is (1,128,9).
This is the GRU layer
gru=torch.nn.GRU(input_size=128,hidden_size=8,batch_first=True)
Question 1: Is the input_size=128 correctly defined?
Here is the code of the forward function:
def forward(self, features):
    features = features.permute(0, 2, 1)  # [1, 9, 128]
    x2, _ = self.gru(features)
Question 2: Is the code in the forward function correctly defined?
Thanks
No, input_size is not correctly defined. Here, input_size means the number of features in a single input vector of the sequence. The input to the GRU is a sequence of vectors, each input being a 1-D tensor of length input_size. In the case of batched input, the input to the GRU is a batch of sequences of vectors, so the shape should be (batch_size, sequence_length, input_size) when batch_first=True, and (sequence_length, batch_size, input_size) when batch_first=False.
import torch

batch_size = 128
input_size = 9    # features in the input
seq_len = 5       # sequence length - how many input vectors in one sequence
hidden_size = 20  # the number of features in the output of the GRU

gru = torch.nn.GRU(input_size=input_size, hidden_size=hidden_size, batch_first=True)
X = torch.rand((batch_size, seq_len, input_size), dtype=torch.float32)
print(f'{X.shape=}')
Y, _ = gru(X)
print(f'{Y.shape=}')
output
"""
X.shape=torch.Size([128, 5, 9])
Y.shape=torch.Size([128, 5, 20])
"""
Using batch_first=False
gru = torch.nn.GRU(input_size=input_size, hidden_size=hidden_size, batch_first=False)
X = torch.rand((seq_len, batch_size, input_size), dtype=torch.float32)
print(f'{X.shape=}')
Y, _ = gru(X)
print(f'{Y.shape=}')
output
"""
X.shape=torch.Size([5, 128, 9])
Y.shape=torch.Size([5, 128, 20])
"""

PyTorch CNN Different Input Size

Hello guys, I have a question about different input sizes.
My training and validation datasets have an input size of 256, while for my prediction (on an unseen test dataset) I have an input size of 496.
class Net(nn.Module):
    def __init__(self, shape):
        super(Net, self).__init__()
        self.conv1 = nn.Conv1d(shape, 1, 1)
        self.batch1 = nn.BatchNorm1d(1)
        self.avgpl1 = nn.AvgPool1d(1, stride=1)
        self.fc1 = nn.Linear(1, 3)

    # forward method
    def forward(self, x):
        x = self.conv1(x)
        x = self.batch1(x)
        x = F.relu(x)
        x = self.avgpl1(x)
        x = torch.flatten(x, 1)
        x = F.log_softmax(self.fc1(x))
        return x
I saved the model and want to use it for my prediction as well.
Error Message is:
Input In [244], in predict_data(prediction_data, model_path, data_config, context)
25 new_model = Net(shape_preprocessed_data)
26 # load the previously saved state_dict
---> 27 new_model.load_state_dict(torch.load("NetModel.pth"))
29 # check if predictions of models are equal
30
31 # generate random input of size (N,C,H,W)
32
33 # switch to eval mode for both models
34 model = model.eval()
RuntimeError: Error(s) in loading state_dict for Net:
size mismatch for conv1.weight: copying a param with shape
torch.Size([1, 256, 1]) from checkpoint, the shape in current model is torch.Size([1, 494, 1]).
How can I solve this?
You could reshape/downsample the input as the first step of the forward pass in your model. This can be done using the torch.nn.functional.interpolate function.
For example:
class Net(nn.Module):
    def __init__(self, shape):
        super(Net, self).__init__()
        self.input_shape = shape
        self.conv1 = nn.Conv1d(shape, 1, 1)
        self.batch1 = nn.BatchNorm1d(1)
        self.avgpl1 = nn.AvgPool1d(1, stride=1)
        self.fc1 = nn.Linear(1, 3)

    # forward method
    def forward(self, x):
        x = torch.nn.functional.interpolate(x, size=self.input_shape)
        x = self.conv1(x)
        x = self.batch1(x)
        x = F.relu(x)
        x = self.avgpl1(x)
        x = torch.flatten(x, 1)
        x = F.log_softmax(self.fc1(x))
        return x
Your test images would then be downsampled to size 256 in order to be compatible with the model.
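For reference, a minimal sketch of what torch.nn.functional.interpolate does to the last (length) dimension of a 1-D feature map (batch and channel sizes here are illustrative):

import torch
import torch.nn.functional as F

x = torch.rand(4, 1, 496)       # (batch, channels, length)
y = F.interpolate(x, size=256)  # resample the length dimension to 256
print(y.shape)                  # torch.Size([4, 1, 256])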
It seems that the saved model was initialized with shape (the number of input channels) equal to 256, while new_model, the model you are trying to load the weights onto, was initialized with 494.
But this value refers to the feature size, not the sequence length; I believe you might have mixed up the two. The feature size should remain constant. In your case, though, it's hard to say what you are trying to do, since you don't provide information about the kind of dataset used.
Try using nn.AdaptiveAvgPool1d(output_size) instead of nn.AvgPool1d, and specify the desired output size. Refer to this for a detailed explanation of how adaptive average pooling works in PyTorch.
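A minimal sketch of why adaptive pooling helps: it produces a fixed-size output regardless of the input length (the sizes here are illustrative, not the asker's exact model):

import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool1d(output_size=256)

short = torch.rand(8, 1, 256)  # (batch, channels, length)
long_ = torch.rand(8, 1, 496)

print(pool(short).shape)  # torch.Size([8, 1, 256])
print(pool(long_).shape)  # torch.Size([8, 1, 256])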

PyTorch: Sizes of tensors must match on 2 input neural network

I am attempting to recreate a two-input neural network from this article: https://towardsdatascience.com/moving-from-keras-to-pytorch-f0d4fff4ce79
I have copied the network described in the post and adjusted it so that it fits my data. The first input is GloVe word embeddings, while the other is numerical features about the text data.
class Net(nn.Module):
    def __init__(self, hidden_size, lin_size, embedding_matrix=embedding_weights):
        super(Net, self).__init__()

        # Initialize some parameters for your model
        self.hidden_size = hidden_size
        drp = 0.1

        # Layer 1: Embeddings.
        self.embedding = nn.Embedding(size_of_vocabulary, pretrained_embedding_dim)
        self.embedding.weight = nn.Parameter(torch.tensor(embedding_matrix, dtype=torch.float32))
        self.embedding.weight.requires_grad = False

        # Layer 2: Dropout1D(0.1)
        self.embedding_dropout = nn.Dropout2d(0.1)

        # Layer 3: Bidirectional CuDNNLSTM
        self.lstm = nn.LSTM(pretrained_embedding_dim, hidden_size, bidirectional=True, batch_first=True)

        # Layer 4: Bidirectional CuDNNGRU
        self.gru = nn.GRU(hidden_size * 2, hidden_size, bidirectional=True, batch_first=True)

        # Layer 7: A dense layer
        self.linear = nn.Linear(hidden_size * 6 + X2_train.shape[1], lin_size)
        self.relu = nn.ReLU()

        # Layer 8: A dropout layer
        self.dropout = nn.Dropout(drp)

        # Layer 9: Output dense layer with one output for our binary classification problem.
        self.out = nn.Linear(lin_size, 1)

    def forward(self, x):
        '''
        Here x[0] represents the first element of the input that is going to be passed.
        We are going to pass a tuple where the first one contains the sequences (x[0])
        and the second one is an additional feature vector (x[1]).
        '''
        h_embedding = self.embedding(x[0].long())
        h_embedding = torch.squeeze(self.embedding_dropout(torch.unsqueeze(h_embedding, 0)))
        # print("emb", h_embedding.size())

        h_lstm, _ = self.lstm(h_embedding)
        # print("lst", h_lstm.size())

        h_gru, hh_gru = self.gru(h_lstm)
        hh_gru = hh_gru.view(-1, 2 * self.hidden_size)
        print("gru", h_gru.size())
        print("h_gru", hh_gru.size())

        # Layer 5: is defined dynamically as an operation on tensors.
        avg_pool = torch.mean(h_gru, 1)
        max_pool, _ = torch.max(h_gru, 1)
        print("avg_pool", avg_pool.size())
        print("max_pool", max_pool.size())

        # the extra features you want to give to the model
        f = torch.tensor(x[1], dtype=torch.float).cuda()
        print("f", f.size())

        # Layer 6: A concatenation of the last state, maximum pool, average pool and
        # additional features
        conc = torch.cat((hh_gru, avg_pool, max_pool, f), 1)
        # print("conc", conc.size())

        # passing conc through linear and relu ops
        conc = self.relu(self.linear(conc))
        conc = self.dropout(conc)
        out = self.out(conc)

        # return the final output
        return out
And during runtime I get an error on the concatenation line:
RuntimeError: Sizes of tensors must match except in dimension 0. Got 33164 and 20 (The offending index is 0)
From the dimensions of the outputs, I can see where the problem lies, but I am not sure how I can fix it.
The data inputs to the network are:
torch.Size([20, 150])
torch.Size([33164, 40])
The sizes of each layer output is:
gru torch.Size([20, 150, 80])
h_gru torch.Size([20, 80])
avg_pool torch.Size([20, 80])
max_pool torch.Size([20, 80])
f torch.Size([33164, 40])
For the example above, the batch size is 20, hidden_size is 40, the number of rows in the numerical data features is 33164, and their feature size is 40.
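The mismatch can be reproduced in isolation: torch.cat along dim 1 requires all tensors to agree on every other dimension, so presumably x[1] should contain only the rows belonging to the current batch. A minimal sketch of the constraint (with random stand-in data, not the actual features):

import torch

pooled = torch.rand(20, 80)    # like hh_gru / avg_pool / max_pool: batch of 20
extra = torch.rand(33164, 40)  # the full table of numerical features

# torch.cat((pooled, extra), 1)  # RuntimeError: sizes of tensors must match

batch_extra = extra[:20]                    # only the rows for the current batch
conc = torch.cat((pooled, batch_extra), 1)  # torch.Size([20, 120])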
Thanks for any help in advance

Which output should I use for prediction with LSTM for sequenced data?

I'm still new to machine learning and deep learning. I am currently trying to predict time series data with LSTM in PyTorch. The problem I am having is that I don't understand which output should I use for my final prediction.
My code is given below:
class Model(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, seq_len, dropout):
        super(Model, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.dropout = dropout
        self.seq_len = seq_len
        self.lstm = nn.LSTM(
            input_size=self.input_size,
            hidden_size=self.hidden_size,
            dropout=self.dropout
        )
        self.linear = nn.Linear(self.hidden_size, self.output_size)

    def reset_hidden_state(self):
        self.hidden = (
            torch.zeros(1, self.seq_len, self.hidden_size),
            torch.zeros(1, self.seq_len, self.hidden_size)
        )

    def forward(self, sequences):
        lstm_out, self.hidden = self.lstm(sequences, self.hidden)
        y_pred = self.linear(lstm_out[-1, :, :])
        return y_pred
mymodel = Model(5, 10, 1, 3, 0.0)
inps = torch.randn(10, 3, 5) #input
#print(inps)
mymodel.reset_hidden_state()
out = mymodel.forward(inps)
print(out.shape)
print(out)
output:
torch.Size([3, 1])
tensor([[-0.0996],
[-0.0587],
[-0.0421]], grad_fn=<AddmmBackward>)
As you can see, this gives me three outputs, but my output size is 1 as I am trying to predict only 1 variable. So, in this case, which value should I use for my final prediction? Or is it even possible to predict only 1 value for sequential data like this?
NB: My Python version is 3.7.4 and my PyTorch version is 1.4.0.
And sorry if I have made any mistakes while asking the question; this is my first time asking a question here.
You are already using the correct output of the LSTM, which is the last hidden state. Conveniently that is also the last element in lstm_out, which you are using as lstm_out[-1, :, :].
The input of the model inps are multiple sequences, because their size is [seq_len, batch_size, num_featuers] = [10, 3, 5]. That means you have 3 independent sequences, which have 10 time steps each with 5 features per time step.
Therefore, the out (size: [3, 1]) contains the predictions for each of the 3 sequences. out[0][0] is the prediction of the first sequence, out[1][0] of the second, and out[2][0] of the third. You can also get rid of the singleton second dimension with out.squeeze(1), so that you have a 1-D tensor with the 3 predictions.
If you want to predict only a single sequence, you would use a batch size of 1, which means the input would have size [10, 1, 5] instead, then you get a single value back, even though it's in a tensor of size [1, 1].
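A minimal sketch of that single-sequence case, reusing the Model class from the question (note that the constructor's seq_len argument is effectively used as the hidden state's batch dimension):

mymodel = Model(5, 10, 1, 1, 0.0)  # batch of 1, so the hidden state is (1, 1, 10)
single = torch.randn(10, 1, 5)     # one sequence: 10 timesteps, 5 features each
mymodel.reset_hidden_state()
out = mymodel(single)
print(out.shape)  # torch.Size([1, 1])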

Pytorch inconsistent size with pad_packed_sequence, seq2seq

I'm having some inconsistencies with the output of an encoder I got from this GitHub repo.
The encoder looks as follows:
class Encoder(nn.Module):
    r"""Applies a multi-layer LSTM to a variable length input sequence."""

    def __init__(self, input_size, hidden_size, num_layers,
                 dropout=0.0, bidirectional=True, rnn_type='lstm'):
        super(Encoder, self).__init__()
        self.input_size = 40
        self.hidden_size = 512
        self.num_layers = 8
        self.bidirectional = True
        self.rnn_type = 'lstm'
        self.dropout = 0.0
        if self.rnn_type == 'lstm':
            self.rnn = nn.LSTM(input_size, hidden_size, num_layers,
                               batch_first=True,
                               dropout=dropout,
                               bidirectional=bidirectional)

    def forward(self, padded_input, input_lengths):
        """
        Args:
            padded_input: N x T x D
            input_lengths: N
        Returns: output, hidden
            - **output**: N x T x H
            - **hidden**: (num_layers * num_directions) x N x H
        """
        total_length = padded_input.size(1)  # get the max sequence length
        packed_input = pack_padded_sequence(padded_input, input_lengths,
                                            batch_first=True, enforce_sorted=False)
        packed_output, hidden = self.rnn(packed_input)
        pdb.set_trace()
        output, _ = pad_packed_sequence(packed_output, batch_first=True,
                                        total_length=total_length)
        return output, hidden
So it only consists of an LSTM RNN cell; if I print the encoder, this is the output:
LSTM(40, 512, num_layers=8, batch_first=True, bidirectional=True)
So it should have a 512-sized output, right? But when I feed it a tensor of size torch.Size([16, 1025, 40]) (16 samples of 1025 vectors of size 40, which gets packed to fit the RNN), the output I get from the RNN has an encoded size of 1024, torch.Size([16, 1025, 1024]), when it should have been encoded to 512, right?
Is there something I'm missing?
Setting bidirectional=True makes the LSTM bidirectional, which means there will be two LSTMs, one that goes from left to right and the other that goes from right to left.
From the nn.LSTM documentation - Outputs:
output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the LSTM, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
For the unpacked case, the directions can be separated using output.view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.
Your output has the size [batch, seq_len, 2 * hidden_size] (batch and seq_len are swapped in your case due to setting batch_first=True) because of using a bidirectional LSTM. The outputs of the two are concatenated in order to have the information of both, which you could easily separate if you wanted to treat them differently.
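A minimal sketch of that separation, using the batch_first=True shapes from the question (the output tensor here is random stand-in data):

import torch

batch, seq_len, hidden_size = 16, 1025, 512
output = torch.rand(batch, seq_len, 2 * hidden_size)  # both directions concatenated

directions = output.view(batch, seq_len, 2, hidden_size)
forward_out = directions[:, :, 0, :]   # left-to-right LSTM
backward_out = directions[:, :, 1, :]  # right-to-left LSTM
print(forward_out.shape)   # torch.Size([16, 1025, 512])
print(backward_out.shape)  # torch.Size([16, 1025, 512])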