Dimensionality problem with PyTorch Conv layers - python

I'm trying to train a neural network in PyTorch with some input signals. The layers are conv1d. The shape of my input is [100, 10], meaning 100 signals of a length of 10.
But when I execute the training, I have this error:
Given groups=1, weight of size [100, 10, 1], expected input[1, 1, 10] to have 10 channels, but got 1 channels instead
config = [10, 100, 100, 100, 100, 100, 100, 100]
batch_size = 1
epochs = 10
learning_rate = 0.001
kernel_size = 1
class NeuralNet(nn.Module):
def __init__(self, config, kernel_size=1):
super().__init__()
self.config = config
self.layers = nn.ModuleList([nn.Sequential(
nn.Conv1d(self.config[i], self.config[i + 1], kernel_size = kernel_size),
nn.ReLU())
for i in range(len(self.config)-1)])
self.last_layer = nn.Linear(self.config[-1], 3)
self.layers.append(nn.Flatten())
self.layers.append(self.last_layer)
def forward(self, x):
for i, l in enumerate(self.layers):
x = l(x)
return x
def loader(train_data, batch_size):
inps = torch.tensor(train_data[0])
tgts = torch.tensor(train_data[1])
inps = torch.unsqueeze(inps, 1)
dataset = TensorDataset(inps, tgts)
train_dataloader = DataLoader(dataset, batch_size = batch_size)
return train_dataloader
At first, my code was without the unsqueez(inps) line and I had the exact same error, but then I added this line thinking that I must have an input of size (num_examples, num_channels, lenght_of_signal) but it didn't resolve the problem at all.
Thank you in advance for your answers

nn.Conv1d expects input with shape of form (batch_size, num_of_channels, seq_length). It's parameters allow to directly set number of ouput channels (out_channels) and change length of output using, for example, stride. For conv1d layer to work correctly it should know number of input channels (in_channels), which is not the case on first convolution: input.shape == (batch_size, 1, 10), therefore num_of_channels = 1, while convolution in self.layers[0] expects this value to be equal 10 (because in_channels set by self.config[0] and self.config[0] == 10). Hence to fix this append one more value to config:
config = [10, 100, 100, 100, 100, 100, 100, 100] # as in snippet above
config = [1] + config
At this point convs should be working fine, but there is another obstacle in self.layers -- linear layer at the end. So if kernel_size of 1 was used, then after final convolution batch will have shape (batch_size, 100, 10), and after flatten (batch_size, 100 * 10), while last_layer expects input of shape (batch_size, 100). So, if length of sequence after final conv layer is known (which is certainly the case if you're using kernel_size of 1 with default stride of 1 and default padding of 0 -- length stays same), last_layer should be defined as:
self.last_layer = nn.Linear(final_length * self.config[-1], 3)
and in snippet above final_length can be set to 10 (since conditions in previous brackets satisfied). To catch idea of how shapes in conv1d transformed take look at simple example in gif below (here batch_size is equal to 1):

Related

Input 0 of layer "lstm" is incompatible with the layer: expected shape=(1, None, 1), found shape=(32, 9, 1)

This is the code in question:
batch_size = 1
epochs = 1
begin_from_row = 3
rows_to_train = 1000
data = data.loc[begin_from_row:rows_to_train, :]
data['Close_next'] = data['Close'].shift(-1)
data = data.dropna()
output_data = data['Close_next']
input_data = data.drop(columns=['Close_next'])
input_size = 9
output_size = 1
hidden_size_1 = 9
input_layer = tf.keras.Input(batch_shape=(batch_size, input_size))
input_layer_expanded = tf.expand_dims(input_layer, axis=-1)
hidden_1 = tf.keras.layers.LSTM(hidden_size_1, stateful=True)(input_layer_expanded)
output_layer = tf.keras.layers.Dense(1, activation='relu')(hidden_1)
model = tf.keras.Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='mean_squared_error', optimizer='adam', run_eagerly=True)
model.fit(input_data, output_data, epochs=epochs)
model.save("model_1.h5")
It returns the following error:
Input 0 of layer "lstm" is incompatible with the layer: expected shape=(1, None, 1), found shape=(32, 9, 1)
I can’t quite get where it gets the number 32 from, since it soesn’™ appear anywhere in my code
The code works when I specify the batch_size=32, just for one batch. The number 32 doesn’t appear anywhere in the code, so I would like to know where it’s coming from.
The vector (32, 9, 1) represents the size of your input data, whereas (1, None, 1) is the expected shape that you defined in the InputLayer (batch_size = 1).
Batch_size 32 is the default value in the fit method. The default value is being used since you did not specify the batch_size argument and you are not using tensorflow datasets:
batch_size: Integer or None. Number of samples per gradient update. If
unspecified, batch_size will default to 32. Do not specify the
batch_size if your data is in the form of datasets, generators, or
keras.utils.Sequence instances (since they generate batches).
From: Tensorflow fit function

Issues with the output size of a Many-to-Many CNN-LSTM in PyTorch

I am trying to build a binary temporal image classifier by combining ResNet18 and an LSTM. However, I have never really used RNNs before and have been struggling on getting the correct output shape.
I am using a batch size of 128 and a sequence size of 32. The images are 80x80 grayscale images.
The current model is:
class CNNLSTM(nn.Module):
def __init__(self):
super(CNNLSTM, self).__init__()
self.resnet = models.resnet18(pretrained=False)
self.resnet.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3)
self.resnet.fc = nn.Sequential(nn.Linear(in_features=512, out_features=256, bias=True))
self.lstm = nn.LSTM(input_size=256, hidden_size=256, num_layers=3)
self.fc1 = nn.Linear(256, 128)
self.fc2 = nn.Linear(128, 1)
def forward(self, x_3d):
#x3d: torch.Size([128, 32, 1, 80, 80])
hidden = None
toret = []
for t in range(x_3d.size(1)):
x = self.resnet(x_3d[:, t, :, :, :])
out, hidden = self.lstm(x.unsqueeze(0), hidden)
x = self.fc1(out[-1, :, :])
x = F.relu(x)
x = self.fc2(x)
print("x shape: ", x.shape)
toret.append(x)
return torch.stack(toret)
Which returns a tensor of shape torch.Size([32, 128, 1]) which, according to what I understand, means that every nth row represents the nth time step of each element in the sequence.
How can I get output of shape 128x1x32 instead?
And is there a better way to do this?
You could permute the dimensions:
a = torch.rand(32, 128, 1)
a = a.permute(1, 2, 0) # these are the indices of the original dimensions
print(a.shape)
>> torch.Size([128, 1, 32])
But you could also set batch_first=True in the LSTM module:
self.lstm = nn.LSTM(input_size=256, hidden_size=256, num_layers=3, batch_first=True)
This will expect that the input to the LSTM has the shape batch-size x seq-len x features and will output a tensor in the same way.

Why return self.head(x.view(x.size(0), -1)) in the nn.Module for pyTorch reinforcement learning example

I understand that the balancing the pole example requires 2 outputs. Reinforcement Learning (DQN) Tutorial
Here is the output for self.head
print ('x',self.head)
x = Linear(in_features=512, out_features=2, bias=True)
When I run the epochs below is the outputs:
print (self.head(x.view(x.size(0), -1)))
return self.head(x.view(x.size(0), -1))
tensor([[-0.6945, -0.1930]])
tensor([[-0.0195, -0.1452]])
tensor([[-0.0906, -0.1816]])
tensor([[ 0.0631, -0.9051]])
tensor([[-0.0982, -0.5109]])
...
The size of x is:
x = torch.Size([121, 32, 2, 8])
So I am trying to understand what x.view(x.size(0), -1) is doing?
I understand from the comment in the code that it's returning:
Returns tensor([[left0exp,right0exp]...]).
But how does x which is torch.Size([121, 32, 2, 8]) being reduced to a tensor of size 2?
Is there an alternative way of writing that makes more sense? What if I had 4 outputs. How would I represent that? Why x.size(0). Why -1?
So appears to take self.head with 4 outputs to 2 outputs. Is that correct?
At the bottom is that class I am referring:
class DQN(nn.Module):
def __init__(self, h, w, outputs):
super(DQN, self).__init__()
self.conv1 = nn.Conv2d(3, 16, kernel_size=5, stride=2)
self.bn1 = nn.BatchNorm2d(16)
self.conv2 = nn.Conv2d(16, 32, kernel_size=5, stride=2)
self.bn2 = nn.BatchNorm2d(32)
self.conv3 = nn.Conv2d(32, 32, kernel_size=5, stride=2)
self.bn3 = nn.BatchNorm2d(32)
# Number of Linear input connections depends on output of conv2d layers
# and therefore the input image size, so compute it.
def conv2d_size_out(size, kernel_size = 5, stride = 2):
return (size - (kernel_size - 1) - 1) // stride + 1
convw = conv2d_size_out(conv2d_size_out(conv2d_size_out(w)))
convh = conv2d_size_out(conv2d_size_out(conv2d_size_out(h)))
linear_input_size = convw * convh * 32
self.head = nn.Linear(linear_input_size, outputs)
# Called with either one element to determine next action, or a batch
# during optimization. Returns tensor([[left0exp,right0exp]...]).
def forward(self, x):
x = F.relu(self.bn1(self.conv1(x)))
x = F.relu(self.bn2(self.conv2(x)))
x = F.relu(self.bn3(self.conv3(x)))
return self.head(x.view(x.size(0), -1))
x.view(x.size(0), -1) is flattening the tensor, this is because the Linear layer only accepts a vector (1d array). To break it down, x.view() reshapes the tensor of the specified shape (more info). x.shape(0) returns 1st dimension of the tensor (which is the batch size, this should remain the constant). The -1 in x.view() is a filler, in other words, its dimensions that we don't know, so PyTorch automatically calculates it. For example, if x = torch.tensor([1,2,3,4]), to reshape the tensor to a 2x2, you could do x.view(2,2) or x.view(2,-1) or x.view(-1,2).
The output shape is not a tensor shape of 2, but that of 121,2 (the 121 is the batch size, and the 2 comes from the Linear layers output). So to change the output size from 2, to 4, you would have to change the outputs argument in the __init__ function to 4.

Input dimension error on pytorch's forward check

I am creating an RNN with pytorch, it looks like this:
class MyRNN(nn.Module):
def __init__(self, batch_size, n_inputs, n_neurons, n_outputs):
super(MyRNN, self).__init__()
self.n_neurons = n_neurons
self.batch_size = batch_size
self.n_inputs = n_inputs
self.n_outputs = n_outputs
self.basic_rnn = nn.RNN(self.n_inputs, self.n_neurons)
self.FC = nn.Linear(self.n_neurons, self.n_outputs)
def init_hidden(self, ):
# (num_layers, batch_size, n_neurons)
return torch.zeros(1, self.batch_size, self.n_neurons)
def forward(self, X):
self.batch_size = X.size(0)
self.hidden = self.init_hidden()
lstm_out, self.hidden = self.basic_rnn(X, self.hidden)
out = self.FC(self.hidden)
return out.view(-1, self.n_outputs)
My input x looks like this:
tensor([[-1.0173e-04, -1.5003e-04, -1.0218e-04, -7.4541e-05, -2.2869e-05,
-7.7171e-02, -4.4630e-03, -5.0750e-05, -1.7911e-04, -2.8082e-04,
-9.2992e-06, -1.5608e-05, -3.5471e-05, -4.9127e-05, -3.2883e-01],
[-1.1193e-04, -1.6928e-04, -1.0218e-04, -7.4541e-05, -2.2869e-05,
-7.7171e-02, -4.4630e-03, -5.0750e-05, -1.7911e-04, -2.8082e-04,
-9.2992e-06, -1.5608e-05, -3.5471e-05, -4.9127e-05, -3.2883e-01],
...
[-6.9490e-05, -8.9197e-05, -1.0218e-04, -7.4541e-05, -2.2869e-05,
-7.7171e-02, -4.4630e-03, -5.0750e-05, -1.7911e-04, -2.8082e-04,
-9.2992e-06, -1.5608e-05, -3.5471e-05, -4.9127e-05, -3.2883e-01]],
dtype=torch.float64)
and is a batch of 64 vectors with size 15.
When trying to test this model by doing:
BATCH_SIZE = 64
N_INPUTS = 15
N_NEURONS = 150
N_OUTPUTS = 1
model = MyRNN(BATCH_SIZE, N_INPUTS, N_NEURONS, N_OUTPUTS)
model(x)
I get the following error:
File "/home/tt/anaconda3/envs/venv/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 126, in check_forward_args
expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 2
How can I fix it?
You are missing one of the required dimensions for the RNN layer.
Per the documentation, your input size needs to be of shape (sequence length, batch, input size).
So - with the example above, you are missing one of these. Based on your variable names, it appears you are trying to pass 64 examples of 15 inputs each... if that’s true, you are missing sequence length.
With an RNN, the sequence length is the number of times you want the layer to recur. For example, in NLP your sequence length might be equal to the number of words in a sentence, while batch size would be the number of sentences you are passing, and input size would be the vector size of each word.
You might not need an RNN here if you are just trying to do use 64 samples of size 15.
See the documentation, the RNN layer expects
input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence.
In your case it seems that your "size" is the length of the sequence, and you have one feature at every timestep. Edited for 15 features, one timestep
# 15 features, 150 neurons
rnn = nn.RNN(15, 150)
# sequence of length 1, batch size 64, 15 features
x = torch.rand(1, 64, 15)
res, _ = rnn(x)
print(res.shape)
# => torch.Size([1, 64, 150])
Also note that you don't need to prespecify batch size.

Tensorflow: Recurrent Neural Network Batch Training

I am trying to implement RNN in Tensorflow. I am writing my own functions instead of using RNN cells to practice.
The problem is sequence tagging, input size is [32, 48, 900] where 32 is batch size, 48 is time steps and 900 is vocab size which is one-hot encoded vector. Output is [32, 48, 145] where first two dimensions are same as input, but the last dimension is output vocabulary size (one-hot). Basically this is a NLP tagging problem.
I am getting following error:
InvalidArgumentError (see above for traceback): logits and labels must
be same size: logits_size=[48,145] labels_size=[1536,145]
The actual labels_size is [32, 48, 145] but it merges first two dimensions without my control. FYI 32*48 = 1536
If I run my RNN with batch size 1, it works fine as expected. I could not figure out how to solve the issue. I am getting the problem in the last line of the code.
I pasted the related part of the code:
inputs = tf.placeholder(shape=[None, self.seq_length, self.vocab_size], dtype=tf.float32, name="inputs")
targets = tf.placeholder(shape=[None, self.seq_length, self.output_vocab_size], dtype=tf.float32, name="targets")
init_state = tf.placeholder(shape=[1, self.hidden_size], dtype=tf.float32, name="state")
initializer = tf.random_normal_initializer(stddev=0.1)
with tf.variable_scope("RNN") as scope:
hs_t = init_state
ys = []
for t, xs_t in enumerate(tf.split(inputs[0], self.seq_length, axis=0)):
if t > 0: scope.reuse_variables()
Wxh = tf.get_variable("Wxh", [self.vocab_size, self.hidden_size], initializer=initializer)
Whh = tf.get_variable("Whh", [self.hidden_size, self.hidden_size], initializer=initializer)
Why = tf.get_variable("Why", [self.hidden_size, self.output_vocab_size], initializer=initializer)
bh = tf.get_variable("bh", [self.hidden_size], initializer=initializer)
by = tf.get_variable("by", [self.output_vocab_size], initializer=initializer)
hs_t = tf.tanh(tf.matmul(xs_t, Wxh) + tf.matmul(hs_t, Whh) + bh)
ys_t = tf.matmul(hs_t, Why) + by
ys.append(ys_t)
hprev = hs_t
output_softmax = tf.nn.softmax(ys) # Get softmax for sampling
#outputs = tf.concat(ys, axis=0)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=ys))
The problem may fall in the size of the ys, ys should have the size of [32, 48, 145], but the output ys only have the size of [48,145], so if the batchsize is 1, the taget size is [1, 48, 145], which just have the same size of [48,145] after dimensionality reduction.
To solve the problem you can add a loop to deal with the batchsize ( inputs[0] ) :
such as :
for i in range(inputs.getshape(0)):
for t, xs_t in enumerate(tf.split(inputs[i], self.seq_length, axis=0)):

Categories