How to understand dimension of Keras Conv2D layer weights - python

When I tried to experiment with CNN pruning I was stopped right in the beginning because I couldn't explain the weight dimensions to myself.
The CNN has the following structure (taken from model.layers):
conv2d (64 filters with filter dimension 5x5)
max_pooling2d
dropout
conv2d (128 filters with filter dimension 5x5)
max_pooling2d
flatten
dense (128 units)
dense (39 classes)
The corresponding weights have the following dimensions (from .get_weights()):
conv2d: shape(5,5,1,64)
max_pooling2d: shape(64,)
dropout: shape(5,5,64,128)
conv2d: shape(128,)
max_pooling2d: shape(6272,128)
flatten: shape(128,)
dense: shape(128,39)
dense: shape(39,)
Please have a look at the Conv2D layers and their parameters and dimensions. The first Conv2D layer (conv2d: shape(5,5,1,64)) seems to have an explainable number of weights: 5 x 5 (filter size) and 64 filters.
What is unclear to me is why the second Conv2D layer (conv2d: shape(128,)) only has 128 entries in its weights array. The dropout layer before it (dropout: shape(5,5,64,128)) seems to have the weight dimensions I would expect the Conv2D layer to have.
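One possible reading of the list above, offered as a sketch rather than a confirmed answer: model.get_weights() returns a flat list that only contains the arrays of layers that own parameters (a kernel and a bias for each Conv2D and Dense layer), while pooling, dropout and flatten layers contribute nothing. Pairing that list with the layer names position by position therefore shifts the labels. The layer names and the helper assign_shapes below are hypothetical; the shapes are the ones from the question.

```python
# Layers in order, with the number of weight arrays each one owns.
# Assumption: only Conv2D and Dense own weights (kernel + bias).
layers = [
    ("conv2d", 2), ("max_pooling2d", 0), ("dropout", 0),
    ("conv2d_1", 2), ("max_pooling2d_1", 0), ("flatten", 0),
    ("dense", 2), ("dense_1", 2),
]

# Flat list of shapes as reported by get_weights() in the question.
shapes = [
    (5, 5, 1, 64), (64,),
    (5, 5, 64, 128), (128,),
    (6272, 128), (128,),
    (128, 39), (39,),
]


def assign_shapes(layers, shapes):
    """Give each layer as many consecutive arrays as it owns."""
    result, pos = {}, 0
    for name, n_arrays in layers:
        result[name] = shapes[pos:pos + n_arrays]
        pos += n_arrays
    return result


per_layer = assign_shapes(layers, shapes)
for name, owned in per_layer.items():
    print(name, owned)
```

Read this way, (5, 5, 64, 128) is the kernel of the second Conv2D layer (5x5 filters over 64 input channels, 128 output channels) and (128,) is its bias.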

Related

Hyperparameter Tuning with Keras Tuner RandomSearch Error

I am using Keras Tuner to optimize hyperparameters: number of hidden layers, neurons, activation function, and learning rate. I have a time series regression problem with 31 inputs and 32 outputs, with N data samples.
My original X_train shape is (N,31) and Y_train shape is (N,32). To match the input shape Keras expects, I reshape X_train so that the shapes are as follows:
X_train.shape: (N,31,1)
Y_train.shape: (N,32).
In my code, X_train.shape[1] is 31 and Y_train.shape[1] is 32. When I run the hyperparameter search, it fails with: ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 20).
What am I missing, and what causes this error?
An LSTM layer expects a 3D input tensor with the shape [batch, timesteps, features]. Since the number of layers is one of your tuning parameters, any trial with two or more LSTM layers stacks one LSTM on another, and every LSTM after the first also expects a 3D input. You therefore need to set return_sequences=True on each LSTM layer that is followed by another LSTM layer, so that its output has ndim=3 (batch size, timesteps, hidden state) when it is fed into the next one.
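The shape bookkeeping behind that answer can be sketched in plain Python, without Keras. The helper names lstm_stack_shapes and flags_for are hypothetical, and the 20 units per layer are an assumption taken from the (None, 20) in the error message.

```python
def lstm_stack_shapes(input_shape, units, return_sequences_flags):
    """Trace shapes through stacked LSTM layers.

    input_shape is (batch, timesteps, features); batch may be None.
    Raises ValueError when a layer would receive a non-3D input.
    """
    shape = input_shape
    shapes = []
    for i, (n_units, ret_seq) in enumerate(zip(units, return_sequences_flags)):
        if len(shape) != 3:
            raise ValueError(
                f"LSTM layer {i} expected ndim=3, found ndim={len(shape)}: {shape}"
            )
        batch, timesteps, _ = shape
        # With return_sequences the time axis is kept, otherwise it is dropped.
        shape = (batch, timesteps, n_units) if ret_seq else (batch, n_units)
        shapes.append(shape)
    return shapes


def flags_for(n_layers):
    # Every LSTM except the last one keeps the time axis.
    return [i < n_layers - 1 for i in range(n_layers)]


good = lstm_stack_shapes((None, 31, 1), [20, 20], flags_for(2))
print(good)  # [(None, 31, 20), (None, 20)]

try:
    lstm_stack_shapes((None, 31, 1), [20, 20], [False, False])
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

In a tuner loop over i in range(n_layers), the same rule would translate to passing return_sequences=(i < n_layers - 1) to each LSTM layer.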

Keras: Difference between AveragePooling1D layer and GlobalAveragePooling1D layer

I'm a bit confused when it comes to the average pooling layers of Keras. The documentation states the following:
AveragePooling1D: Average pooling for temporal data.
Arguments
pool_size: Integer, size of the average pooling windows.
strides: Integer, or None. Factor by which to downscale. E.g. 2 will halve the input. If None, it will default to pool_size.
padding: One of "valid" or "same" (case-insensitive).
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
Input shape
If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
Output shape
If data_format='channels_last': 3D tensor with shape: (batch_size, downsampled_steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, downsampled_steps)
and
GlobalAveragePooling1D: Global average pooling operation for temporal data.
Arguments
data_format: A string, one of channels_last (default) or channels_first. The ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, steps, features) while channels_first corresponds to inputs with shape (batch, features, steps).
Input shape
If data_format='channels_last': 3D tensor with shape: (batch_size, steps, features)
If data_format='channels_first': 3D tensor with shape: (batch_size, features, steps)
Output shape
2D tensor with shape: (batch_size, features)
I (think that I) do get the concept of average pooling but I don't really understand why the GlobalAveragePooling1D layer simply drops the steps parameter. Thank you very much for your answers.
GlobalAveragePooling1D is the same as AveragePooling1D with pool_size=steps: for each feature, it takes the average over all time steps. That intermediate result has shape (batch_size, 1, features) (if data_format='channels_last'). The layer then drops the length-1 steps dimension (the second dimension, or the third if data_format='channels_first'), which is how you get an output shape of (batch_size, features).
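A small pure-Python sketch of that relationship, assuming channels_last data held in nested lists and 'valid' padding. The function names are hypothetical stand-ins, not the Keras implementations.

```python
def average_pooling_1d(x, pool_size, strides=None):
    """x has shape (batch, steps, features) as nested lists."""
    strides = pool_size if strides is None else strides
    out = []
    for sample in x:
        steps = len(sample)
        windows = []
        for start in range(0, steps - pool_size + 1, strides):
            window = sample[start:start + pool_size]
            # Average each feature over the steps in this window.
            windows.append([sum(col) / pool_size for col in zip(*window)])
        out.append(windows)
    return out


def global_average_pooling_1d(x):
    """Average over all steps, then drop the length-1 steps axis."""
    steps = len(x[0])
    pooled = average_pooling_1d(x, pool_size=steps)  # (batch, 1, features)
    return [sample[0] for sample in pooled]          # (batch, features)


# One sample, 4 steps, 2 features.
x = [[[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]]
local = average_pooling_1d(x, pool_size=2)
global_ = global_average_pooling_1d(x)
print(local)    # [[[1.5, 15.0], [3.5, 35.0]]]
print(global_)  # [[2.5, 25.0]]
```

The local version keeps a (shorter) steps axis, while the global version has nothing left along that axis to keep, so it is removed.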

What strategy should I use in my CNN to go from a 3D volume to a 2D plane?

What strategy should I use in my CNN to go from a 3D volume to a 2D plane as the output layer. Can I even have a 2D layer as output?
I am trying to develop a network whose input is a 320x320x3 image and whose output should be 68x2.
I know one way to do it would be to start from 320x320x3 and, after a few layers, flatten my 3D layers and then reduce them to a 1D array of 136. But I am trying to understand whether I could somehow go down to the desired 2D dimension at the final layer.
Thanks,
Shubham
Edit: I might have misread your question initially. If your intention is to have 136 output nodes that can be arranged in a 68x2 matrix (and not to have a 68x68x2 image in the output, as I thought at first), then you can use a Reshape layer after your final dense layer with 136 units:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense, Reshape
model = Sequential()
model.add(Conv2D(32, 3, input_shape=(320, 320, 3)))
model.add(Flatten())
model.add(Dense(136))
model.add(Reshape((68, 2)))
model.summary()
This will give you the following model, with the desired shape in the output:
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_2 (Conv2D)            (None, 318, 318, 32)      896
_________________________________________________________________
flatten_2 (Flatten)          (None, 3235968)           0
_________________________________________________________________
dense_2 (Dense)              (None, 136)               440091784
_________________________________________________________________
reshape_1 (Reshape)          (None, 68, 2)             0
=================================================================
Total params: 440,092,680
Trainable params: 440,092,680
Non-trainable params: 0
Make sure to provide your training labels in the same shape when fitting the model.
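The numbers in that summary can be checked by hand. The following sketch assumes 'valid' padding and stride 1 for the convolution (the Conv2D defaults used above); the helper names are hypothetical.

```python
def conv2d_valid(in_shape, filters, kernel):
    """Output shape and parameter count of a Conv2D ('valid' padding, stride 1)."""
    h, w, c = in_shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # kernel weights + biases
    return out_shape, params


def dense(in_units, units):
    """Output size and parameter count of a Dense layer."""
    return units, in_units * units + units  # weight matrix + biases


conv_shape, conv_params = conv2d_valid((320, 320, 3), filters=32, kernel=3)
flat_units = conv_shape[0] * conv_shape[1] * conv_shape[2]
dense_units, dense_params = dense(flat_units, 136)
total = conv_params + dense_params

print(conv_shape, conv_params)  # (318, 318, 32) 896
print(flat_units)               # 3235968
print(dense_params)             # 440091784
print(total)                    # 440092680
```

Nearly all parameters sit in the Dense layer because Flatten directly follows a single convolution; the Reshape layer itself adds none.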
(original answer, might still be relevant)
Yes, this is commonly done in semantic segmentation models, where the inputs are images and the outputs are tensors with the same height and width as the images, and with the number of channels equal to the number of classes in the output. If you want to do this in TensorFlow or Keras, you can look up existing implementations, for instance of U-Net architectures.
A core feature of these models is that the networks are fully convolutional: they consist only of convolutional layers. Typically, the feature maps in these models go from 'wide and shallow' (big feature maps in the spatial dimensions with few channels) at first, to 'small and deep' (small spatial dimensions, high-dimensional channel dimension) and back to the desired output dimension, hence the U-shape.
There are a lot of ways to go from 320x320x3 to an image-like output such as 68x68x2 (how I first read the question) with a fully convolutional network, but the input and output of your model would basically look like this:
import keras
from keras import Sequential
from keras.layers import Conv2D
model = Sequential()
model.add(Conv2D(32, 3, activation='relu', input_shape=(320,320,3)))
# Include more convolutional layers, pooling layers, upsampling layers etc
...
# At the end of the model, add your final Conv2D layer with 2 filters
# and the required activation function
model.add(Conv2D(2, 3, activation='softmax'))

What is the architecture behind the Keras LSTM Layer implementation?

How do the input dimensions get converted to the output dimensions for the LSTM layer in Keras? From reading Colah's blog post, it seems as though the number of "timesteps" (AKA the input_dim or the first value in the input_shape) should equal the number of neurons, which should equal the number of outputs from this LSTM layer (given by the units argument for the LSTM layer).
From reading this post, I understand the input shapes. What I am baffled by is how Keras plugs the inputs into each of the LSTM "smart neurons".
Keras LSTM reference
Example code that baffles me:
model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
model.add(Dense(2))
From this, I would think that the LSTM layer has 10 neurons and each neuron is fed a vector of length 64. However, it seems it has 32 neurons and I have no idea what is being fed into each. I understand that for the LSTM to connect to the Dense layer, we can just plug all 32 outputs to each of the 2 neurons. What confuses me is the InputLayer to the LSTM.
(similar SO post but not quite what I need)
Revisited and updated in 2020: I was partially correct! The architecture is 32 neurons. The 10 represents the timestep value. Each neuron is being fed a 64 length vector (maybe representing a word vector), representing 64 features (perhaps 64 words that help identify a word) over 10 timesteps.
The 32 represents the number of neurons. It represents how many hidden states there are for this layer and also represents the output dimension (since we output a hidden state at the end of each LSTM neuron).
Lastly, the 32-dimensional output vector generated from the 32 neurons at the last timestep is then fed to a Dense layer of 2 neurons, which basically means plug the 32 length vector to both neurons, with weights on the input and activation.
More reading with somewhat helpful answers:
Understanding Keras LSTMs
What exactly am I configuring when I create a stateful LSTM layer with N units
Initializing LSTM hidden states with Keras
I don't think you are right. The number of timesteps does not affect the number of parameters in an LSTM.
from keras.layers import LSTM
from keras.models import Sequential
time_step = 13
feature = 5
hidden_units = 10
# LSTM over 13 timesteps
model = Sequential()
model.add(LSTM(hidden_units, input_shape=(time_step, feature)))
model.summary()
# Same LSTM over 100 timesteps: the parameter count stays the same
time_step = 100
model2 = Sequential()
model2.add(LSTM(hidden_units, input_shape=(time_step, feature)))
model2.summary()
The result:
Using TensorFlow backend.
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_1 (LSTM)                (None, 10)                640
=================================================================
Total params: 640
Trainable params: 640
Non-trainable params: 0
_________________________________________________________________
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_2 (LSTM)                (None, 10)                640
=================================================================
Total params: 640
Trainable params: 640
Non-trainable params: 0
_________________________________________________________________
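The 640 in both summaries follows from the usual LSTM parameter count, which has no timestep term. A minimal sketch, assuming a standard LSTM with one bias vector per gate (the function name is hypothetical):

```python
def lstm_param_count(features, units):
    """Parameters of a standard LSTM layer with biases.

    Four gates, each with an input kernel (features x units),
    a recurrent kernel (units x units) and a bias (units).
    The number of timesteps does not appear anywhere.
    """
    return 4 * (features * units + units * units + units)


print(lstm_param_count(features=5, units=10))   # 640
print(lstm_param_count(features=64, units=32))  # 12416
```

The second call corresponds to the LSTM(32, input_shape=(10, 64)) layer from the question: the 10 timesteps reuse the same weights at every step.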
@Sticky, your interpretation is wrong.
The input has shape (batch_size, sequence_length (timesteps), feature_size). So your input tensor is 10x64 (for example, 10 words with 64 features each, just like a word embedding). The 32 is the number of units, which makes the output vector size 32.
The output will have shape structure:
(batch, arbitrary_steps, units) if return_sequences=True.
(batch, units) if return_sequences=False.
The memory states will have a size of "units".
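To make the wiring concrete, here is a deliberately small pure-Python forward pass for one sample. It is a sketch of the textbook LSTM equations, not the Keras code: the gate ordering, the sigmoid recurrent activation, the random weights and all names are assumptions made for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_forward(sequence, W, U, b, units, return_sequences=False):
    """Run one sample of shape (timesteps, features) through an LSTM layer.

    W: (features, 4*units) input kernel, U: (units, 4*units) recurrent kernel,
    b: (4*units,) bias. Gate order here: input, forget, candidate, output.
    """
    h = [0.0] * units  # hidden state, size `units`
    c = [0.0] * units  # cell (memory) state, size `units`
    outputs = []
    for x_t in sequence:
        # Every unit sees the whole feature vector and the whole previous h.
        z = [
            sum(x_t[k] * W[k][j] for k in range(len(x_t)))
            + sum(h[k] * U[k][j] for k in range(units))
            + b[j]
            for j in range(4 * units)
        ]
        i = [sigmoid(v) for v in z[0:units]]
        f = [sigmoid(v) for v in z[units:2 * units]]
        g = [math.tanh(v) for v in z[2 * units:3 * units]]
        o = [sigmoid(v) for v in z[3 * units:4 * units]]
        c = [f[j] * c[j] + i[j] * g[j] for j in range(units)]
        h = [o[j] * math.tanh(c[j]) for j in range(units)]
        outputs.append(h)
    return outputs if return_sequences else h


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
timesteps, features, units = 10, 64, 32
W = rand_matrix(features, 4 * units)
U = rand_matrix(units, 4 * units)
b = [0.0] * (4 * units)
sample = rand_matrix(timesteps, features)

last = lstm_forward(sample, W, U, b, units)
full = lstm_forward(sample, W, U, b, units, return_sequences=True)
print(len(last))                # 32
print(len(full), len(full[0]))  # 10 32
```

With return_sequences=False only the final hidden state (size units) is returned; with return_sequences=True there is one such vector per timestep.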

How to give the 1D input to Convolutional Neural Network(CNN) using Keras?

I'm solving a regression problem with a Convolutional Neural Network (CNN) using the Keras library. I have gone through many examples but failed to understand the concept of the input shape for a 1D convolution.
This is my data set: 1 target variable with 3 raw signals.
For visualization the 5 segments of sensor signal are shown here, each segment has its own meaning
I want to give the segment-wise sensor values as input to the 1D convolution layer, but the problem is that the segments are of variable length.
This is my CNN architecture
I tried to build my CNN model but got confused
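from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Dense
# input_shape and num_classes are not defined in this snippet
model = Sequential()
model.add(Conv1D(5, 7, activation='relu', input_shape=input_shape))
model.add(MaxPooling1D(pool_size=4))  # pool_length was renamed pool_size in Keras 2
model.add(Conv1D(4, 7, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))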
So, how can I give input to the Conv1D layer of a CNN in Keras? Or should I use a fixed-size input for Conv1D, and if so, how?
My understanding is that the input_shape should be (time_steps, n_features), where time_steps is the length of the segments (sequence of sensor signals) and n_features is the number of channels (3 in your case, as you have 3 different sensors).
Therefore, the input to your network should have 3 dimensions (batch, steps, channels), where the batch dimension indexes the different segments.
I've only worked with fixed time_steps. If you really can't use segments of the same length, you might try to pad them with zeros.
The Keras documentation says that you may use (None, 3) as the input_shape for variable-length sequences of 3-dimensional vectors, but I have never used it this way.
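The zero-padding idea can be sketched in plain Python as follows. The helper pad_segments and the sample values are hypothetical; in practice the padded result would be converted to an array before being passed to the model.

```python
def pad_segments(segments, n_features, max_len=None, pad_value=0.0):
    """Zero-pad variable-length segments to a common length.

    segments: list of segments, each a list of time steps,
    each time step a list of n_features sensor values.
    Returns nested lists with shape (batch, max_len, n_features).
    """
    if max_len is None:
        max_len = max(len(seg) for seg in segments)
    padded = []
    for seg in segments:
        seg = seg[:max_len]  # truncate segments longer than max_len
        filler = [[pad_value] * n_features for _ in range(max_len - len(seg))]
        padded.append([list(step) for step in seg] + filler)
    return padded


# Two made-up segments with 3 sensor channels and different lengths.
segments = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5], [1.6, 1.7, 1.8], [1.9, 2.0, 2.1]],
]
batch = pad_segments(segments, n_features=3)
shape = (len(batch), len(batch[0]), len(batch[0][0]))
print(shape)  # (2, 4, 3)
```

With this batch, the input_shape passed to the first Conv1D layer would be (4, 3), i.e. (time_steps, n_features) without the batch dimension.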
