I'm a bit confused when it comes to the average pooling layers of Keras. The documentation states the following:
AveragePooling1D: Average pooling for temporal data.
Arguments
pool_size: Integer, size of the average pooling windows.
strides: Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size.
padding: One of "valid" or "same" (case-insensitive).
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
Input shape
If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
Output shape
If data_format='channels_last': 3D tensor with shape: (batch_size, downsampled_steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, downsampled_steps)
and
GlobalAveragePooling1D: Global average pooling operation for temporal data.
Arguments
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
Input shape
If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
Output shape
2D tensor with shape: (batch_size, features)
I (think that I) do get the concept of average pooling but I don't really understand why the GlobalAveragePooling1D layer simply drops the steps parameter. Thank you very much for your answers.
GlobalAveragePooling1D is the same as AveragePooling1D with pool_size=steps: for each feature dimension, it takes the average over all time steps. The pooled output thus has shape (batch_size, 1, features) (if data_format='channels_last'). The layer then flattens out the second dimension (or the third if data_format='channels_first'), which is how you get an output shape of (batch_size, features).
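To see the equivalence concretely, here is a minimal sketch (assuming TensorFlow 2.x Keras; the batch/steps/features sizes are arbitrary):

import numpy as np
import tensorflow as tf

# Toy input: batch of 2 sequences, 5 time steps, 3 features (channels_last).
x = np.random.rand(2, 5, 3).astype("float32")

# Global average pooling collapses the steps axis entirely: shape (2, 3).
gap = tf.keras.layers.GlobalAveragePooling1D()(x)

# Equivalent: average-pool over all 5 steps, then drop the length-1 axis.
avg = tf.keras.layers.AveragePooling1D(pool_size=5)(x)   # shape (2, 1, 3)
avg = tf.squeeze(avg, axis=1)                            # shape (2, 3)

print(np.allclose(gap.numpy(), avg.numpy()))  # True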
Related
I am using Keras Tuner to optimize hyperparameters: hidden layers, neurons, activation function, and learning rate. I have a time series regression problem with 31 inputs and 32 outputs, with N data samples.
My original X_train shape is (N, 31) and my Y_train shape is (N, 32). To match the shape Keras expects, I reshape X_train and Y_train as follows:
X_train.shape: (N,31,1)
Y_train.shape: (N,32).
In the above code, X_train.shape[1] is 31 and Y_train.shape[1] is 32. When I run the hyperparameter tuning, the following error occurs:
ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 20)
What am I missing, and what is the issue?
LSTM layers expect a 3D input tensor with shape [batch, timesteps, feature]. Since the number of layers is a tuning parameter along with the LSTM layers themselves, whenever the number of LSTM layers is 2 or more, each LSTM layer after the first also expects a 3D tensor as input. This means you need to add return_sequences=True to every LSTM layer except the last, so that the output tensor from the previous LSTM layer has ndim=3 (i.e. batch size, timesteps, hidden state) when it is fed into the next LSTM layer, as in the sketch below.
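A minimal sketch of a tuner model-building function (assuming TensorFlow 2.x and the keras_tuner package; the hyperparameter names, ranges, and RandomSearch settings are illustrative, not the asker's exact setup):

import tensorflow as tf
import keras_tuner as kt

def build_model(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(31, 1)))   # (timesteps, features)
    n_layers = hp.Int("n_lstm_layers", 1, 3)
    for i in range(n_layers):
        model.add(tf.keras.layers.LSTM(
            units=hp.Int(f"units_{i}", 32, 128, step=32),
            # every LSTM except the last must return the full sequence so
            # the next LSTM still receives a 3D tensor
            return_sequences=(i < n_layers - 1)))
    model.add(tf.keras.layers.Dense(32))              # 32 regression outputs
    model.compile(optimizer="adam", loss="mse")
    return model

tuner = kt.RandomSearch(build_model, objective="val_loss", max_trials=10)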
I was trying to learn PyTorch and came across a tutorial where a CNN is defined as below:
from torch.nn import (Module, Sequential, Conv2d, BatchNorm2d,
                      ReLU, MaxPool2d, Linear)

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.cnn_layers = Sequential(
            # Defining a 2D convolution layer
            Conv2d(1, 4, kernel_size=3, stride=1, padding=1),
            BatchNorm2d(4),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
            # Defining another 2D convolution layer
            Conv2d(4, 4, kernel_size=3, stride=1, padding=1),
            BatchNorm2d(4),
            ReLU(inplace=True),
            MaxPool2d(kernel_size=2, stride=2),
        )
        self.linear_layers = Sequential(
            Linear(4 * 7 * 7, 10)
        )

    # Defining the forward pass
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x
I understood how the cnn_layers are made. After the cnn_layers, the data should be flattened and given to linear_layers.
I don't understand how the number of features to Linear is 4*7*7. I understand that 4 is the output dimension from the last Conv2d layer.
How does 7*7 come into the picture? Do stride or padding play any role in that?
Input image shape is [1, 28, 28]
The Conv2d layers have a kernel size of 3 with stride and padding of 1, which means they don't change the spatial size of the image. The two MaxPool2d layers each halve the spatial dimensions, from (H, W) to (H/2, W/2). So the output of the last convolution, with its 4 output channels, has shape (batch_size, 4, H/4, W/4). In the forward pass the feature tensor is flattened by x = x.view(x.size(0), -1), giving shape (batch_size, 4 * (H/4) * (W/4)) = (batch_size, H*W/4). With H = W = 28, the linear layer therefore takes inputs of shape (batch_size, 196), and 196 = 4 * 7 * 7.
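One quick way to confirm this is to push a dummy 28x28 input through the convolutional part and inspect the shapes (a sanity-check sketch, using the Net class defined above):

import torch

net = Net()
x = torch.randn(1, 1, 28, 28)                  # (batch, channels, H, W)
feats = net.cnn_layers(x)
print(feats.shape)                             # torch.Size([1, 4, 7, 7])
print(feats.view(feats.size(0), -1).shape)     # torch.Size([1, 196]); 4*7*7 = 196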
Actually, in the 2D convolution layers the features (values) sit in a matrix (a 2D tensor per channel), while, as usual, the network ends with a fully connected layer followed by the logits layer, where the features form a vector (a 1D tensor). Therefore each feature (value) in the last feature map has to be mapped into the fully connected layer that follows. In PyTorch, the fully connected layer is implemented by the Linear class, whose first parameter is the number of input features. In this case:
input_image : (28,28,1)
after_Conv2d_1 : (28,28,4) <- because of the padding; with padding = 0 it would be (26,26,4)
after_maxPool_1 : (14,14,4) <- due to the stride of 2
after_Conv2D_2 : (14,14,4) <- because this is "same" padding
after_maxPool_2 : (7,7,4)
In the end, the total number of features before the fully connected layer is 4*7*7. This also shows why we use an odd number for the kernel size and start from images with an even number of pixels.
When I tried to experiment with CNN pruning I was stopped right in the beginning because I couldn't explain the weight dimensions to myself.
The CNN has the following structure (from model.layers):
conv2d (64 filters with filter dimension 5x5)
max_pooling2d
dropout
conv2d (128 filters with filter dimension 5x5)
max_pooling2d
flatten
dense (128 units)
dense (39 classes)
The corresponding weights have the following dimensions (from .get_weights()):
conv2d: shape(5,5,1,64)
max_pooling2d: shape(64,)
dropout: shape(5,5,64,128)
conv2d: shape(128,)
max_pooling2d: shape(6272,128)
flatten: shape(128,)
dense: shape(128,39)
dense: shape(39,)
Please have a look at the Conv2D layers and their parameters and dimensions. The first Conv2D layer (conv2d: shape(5,5,1,64)) seems to have an explainable number of weights: 5 x 5 (filter size), 1 input channel, and 64 filters.
What is unclear to me is why the second Conv2D layer (conv2d: shape(128,)) only has 128 entries in its weights array. The dropout layer before it (dropout: shape(5,5,64,128)) seems to have the weight dimensions I would expect the Conv2D layer to have.
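A likely explanation, with a sketch of how to check it: model.get_weights() returns a flat list of arrays (kernel, bias, kernel, bias, ...) for the layers that actually have weights, while pooling, dropout, and flatten layers contribute nothing to it. Pairing that flat list one-to-one with the layer names therefore shifts every label; the (5,5,64,128) array is the second conv kernel, not dropout weights. Querying each layer individually keeps the association correct:

# per-layer weights keep names and arrays aligned
for layer in model.layers:
    print(layer.name, [w.shape for w in layer.get_weights()])
# conv2d          [(5, 5, 1, 64), (64,)]
# max_pooling2d   []
# dropout         []
# conv2d_1        [(5, 5, 64, 128), (128,)]
# ...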
I'm solving a regression problem with a Convolutional Neural Network (CNN) using the Keras library. I have gone through many examples but failed to understand the concept of input shape for a 1D convolution.
This is my data set: 1 target variable with 3 raw signals.
For visualization, the 5 segments of the sensor signal are shown here; each segment has its own meaning.
I want to give segment-wise sensor values as input to the 1D convolution layer, but the problem is that the segments are of variable length.
This is my CNN architecture. I tried to build my CNN model but I am confused:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(5, 7, activation='relu', input_shape=input_shape))
model.add(MaxPooling1D(pool_size=4))   # 'pool_length' was the old Keras 1 name
model.add(Conv1D(4, 7, activation='relu'))
model.add(Flatten())                   # flatten before the dense layers
model.add(Dense(100, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
So, how can I give input to the Conv1D layer in Keras? Or should I use a fixed-size input to Conv1D? If so, how?
My understanding is that the input_shape should be (time_steps, n_features), where time_steps would be the length of the segments (sequence of sensor signals) and n_features the number of channels (3 in your case, as you have 3 different sensors).
Therefore, the input to your network should have 3 dimensions (batch, steps, channels), where batch is the different segments.
I've only worked with fixed time_steps. If you really can't use segments of the same length, you might try to pad them with zeros.
The Keras documentation says that you may use (None, 3) as the input_shape for variable-length sequences of 3-dimensional vectors, but I have never used it this way. A rough sketch of both options follows.
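(A sketch assuming 3 sensor channels; pad_sequences and the layer choices are illustrative, not the asker's exact architecture.)

from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, GlobalMaxPooling1D, Dense

# Option 1: pad the variable-length segments to a common length.
# `segments` is assumed to be a list of arrays of shape (segment_len_i, 3).
# X = pad_sequences(segments, padding='post', dtype='float32')  # (n, max_len, 3)

# Option 2: accept variable-length sequences via input_shape=(None, 3);
# each individual batch must still contain sequences of a single length.
model = Sequential([
    Conv1D(5, 7, activation='relu', input_shape=(None, 3)),
    GlobalMaxPooling1D(),             # collapses the variable-length axis
    Dense(100, activation='relu'),
    Dense(1)                          # single regression target
])
model.compile(optimizer='adam', loss='mse')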
I have a question about the 4D tensor on keras about Convolution2D Layers.
The Keras doc says:
4D tensor with shape: (samples, channels, rows, cols) if dim_ordering='th' or 4D tensor with shape: (samples, rows, cols, channels) if dim_ordering='tf'.
I use 'tf'; what should my input look like? When I use (samples, channels, rows, cols) it is OK, but when I use (samples, rows, cols, channels) as input, I run into problems.
Keras does not pick the ordering from the backend alone: it reads the image_dim_ordering setting in ~/.keras/keras.json (or a dim_ordering argument passed to the layer). If (samples, channels, rows, cols) works for you, your configuration is effectively 'th'; switch it to 'tf' (or set dim_ordering='tf' on the layer) to feed (samples, rows, cols, channels).
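If that is the cause, one sketch of a per-layer fix under the old Keras 1 API (where dim_ordering lives; the filter count and input size here are made up):

from keras.models import Sequential
from keras.layers import Convolution2D

model = Sequential()
# dim_ordering='tf' expects inputs shaped (samples, rows, cols, channels)
model.add(Convolution2D(32, 3, 3, dim_ordering='tf',
                        input_shape=(28, 28, 1)))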