Keras time series prediction with CNN+LSTM model and TimeDistributed layer wrapper - python

I have several data files of human activity recognition data consisting of time-ordered rows of recorded raw samples. Each row has 8 columns of EMG sensor data and 1 corresponding column of target sensor data. I'm trying to feed the 8 channels of EMG sensor data into a CNN+LSTM deep model in order to predict the 1 channel of target data. I do this by breaking a dataset down into 50-row windows of raw samples and then reshaping these windows into blocks of 4 windows, which act as time steps for the LSTM part of the model.
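For what it's worth, a minimal numpy sketch of this windowing and blocking, assuming one data file is already loaded as a (num_rows, 9) array (the array and sizes below are placeholders, not the actual data):
import numpy as np

num_steps, window_len = 4, 50
raw = np.random.rand(161_600, 9)  # placeholder for one loaded file: 8 EMG columns + 1 target
emg, target = raw[:, :8], raw[:, 8:]

block = num_steps * window_len           # 200 raw rows per training sample
n_samples = emg.shape[0] // block        # trim so the rows divide evenly into blocks
emg, target = emg[:n_samples * block], target[:n_samples * block]

# (samples, time steps, window length, channels), e.g. (808, 4, 50, 8)
x = emg.reshape(n_samples, num_steps, window_len, 8)
# (samples, window length, 1): here, the targets of each block's last window
y = target.reshape(n_samples, num_steps, window_len, 1)[:, -1]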
I've been following the tutorial here as to how to implement my model: https://medium.com/smileinnovation/how-to-work-with-time-distributed-data-in-a-neural-network-b8b39aa4ce00
I have reshaped the data and built the model but keep coming back to the following error that I cannot figure out how to resolve:
"ValueError: Error when checking target: expected FC_out to have 2 dimensions, but got array with shape (808, 50, 1)"
My code follows and is written in Python using Keras and Tensorflow:
from keras.models import Sequential
from keras.layers import CuDNNLSTM
from keras.layers.convolutional import Conv2D
from keras.layers.core import Dense, Dropout
from keras.layers import Flatten
from keras.layers import TimeDistributed
#Code that reads in file data and shapes it into 4-window blocks omitted. That code produces the following arrays:
#x_train - shape of (808, 4, 50, 8) which equates to (samples, time steps, window length, number of channels)
#x_valid - shape of (223, 4, 50, 8) which equates to the same as x_train
#y_train - shape of (808, 50, 1) which equates to (samples, window length, number of target channels)
# Followed Machine Learning Mastery style for ease of reading
numSteps = x_train.shape[1]
windowLength = x_train.shape[2]
numChannels = x_train.shape[3]
numOutputs = 1
activation = 'relu'  # not defined in the original post; assumed here so the snippet runs
# Reshape x data for use with TimeDistributed wrapper, adding extra dimension at the end
x_train = x_train.reshape(x_train.shape[0], numSteps, windowLength, numChannels, 1)
x_valid = x_valid.reshape(x_valid.shape[0], numSteps, windowLength, numChannels, 1)
# Build model
model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3), activation=activation, name="Conv2D_1"),
                          input_shape=(numSteps, windowLength, numChannels, 1)))
model.add(TimeDistributed(Conv2D(64, (3,3), activation=activation, name="Conv2D_2")))
model.add(Dropout(0.4, name="CNN_Drop_01"))
# Flatten for passing to LSTM layer
model.add(TimeDistributed(Flatten(name="Flatten_1")))
# LSTM and Dropout
model.add(CuDNNLSTM(28, return_sequences=True, name="LSTM_01"))
model.add(Dropout(0.4, name="Drop_01"))
# Second LSTM and Dropout
model.add(CuDNNLSTM(28, return_sequences=False, name="LSTM_02"))
model.add(Dropout(0.3, name="Drop_02"))
# Fully Connected layer and further Dropout
model.add(Dense(16, activation=activation, name="FC_1"))
model.add(Dropout(0.4))
# Final fully Connected layer specifying outputs
model.add(Dense(numOutputs, activation=activation, name="FC_out"))
# Compile model, produce summary and save model image to file
# NOTE: coeffDetermination refers to a function for calculating R2 and is not included in this code
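# Assumed implementation (not from the original post): a common Keras-backend
# R2 metric, included only so the snippet runs end to end
from keras import backend as K
def coeffDetermination(y_true, y_pred):
    ss_res = K.sum(K.square(y_true - y_pred))
    ss_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1.0 - ss_res / (ss_tot + K.epsilon())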
model.compile(optimizer='Adam', loss='mse', metrics=[coeffDetermination])
# Now train the model
history_cb = model.fit(x_train, y_train, validation_data=(x_valid, y_valid), epochs=30, batch_size=64)
I'd be grateful if anyone can figure out what I've done wrong. Or am I just going about this the wrong way by trying to use this model configuration for time series prediction?

"ValueError: Error when checking target: expected FC_out to have 2 dimensions, but got array with shape (808, 50, 1)"
Your input is (808, 4, 50, 8, 1) and your target is (808, 50, 1).
However, model.summary() shows that the model's output shape is (None, 4, 1).
Since the number of time steps is 4, y_train should be something like (808, 4, 1).
Or, if you want to keep targets of shape (808, 50, 1), you need to change the model so that its final output shape is (None, 50, 1).
Model: "sequential_10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_18 (TimeDis (None, 4, 48, 6, 64) 640
_________________________________________________________________
time_distributed_19 (TimeDis (None, 4, 46, 4, 64) 36928
_________________________________________________________________
CNN_Drop_01 (Dropout) (None, 4, 46, 4, 64) 0
_________________________________________________________________
time_distributed_20 (TimeDis (None, 4, 11776) 0
_________________________________________________________________
LSTM_01 (LSTM) (None, 4, 28) 1322160
_________________________________________________________________
Drop_01 (Dropout) (None, 4, 28) 0
_________________________________________________________________
Drop_02 (Dropout) (None, 4, 28) 0
_________________________________________________________________
FC_1 (Dense) (None, 4, 16) 464
_________________________________________________________________
dropout_3 (Dropout) (None, 4, 16) 0
_________________________________________________________________
FC_out (Dense) (None, 4, 1) 17
=================================================================
Total params: 1,360,209
Trainable params: 1,360,209
Non-trainable params: 0
For many-to-many sequence prediction with different sequence lengths, check this link: https://github.com/keras-team/keras/issues/6063
dataX or input : (nb_samples, nb_timesteps, nb_features) -> (1000, 50, 1)
dataY or output: (nb_samples, nb_timesteps, nb_features) -> (1000, 10, 1)
model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(50, 1), return_sequences=False))
model.add(RepeatVector(10))
model.add(LSTM(hidden_neurons, return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])
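Adapting that pattern to the shapes in the question, a minimal (untested) sketch would keep the CNN front end and first LSTM block unchanged and replace everything from LSTM_02 onward, so the model ends with an output of shape (None, 50, 1) matching y_train:
from keras.layers import RepeatVector
# ... CNN front end, Flatten, LSTM_01 and Drop_01 as in the question ...
model.add(CuDNNLSTM(28, return_sequences=False, name="LSTM_02"))   # -> (None, 28)
model.add(RepeatVector(50))                                        # -> (None, 50, 28)
model.add(CuDNNLSTM(28, return_sequences=True, name="LSTM_decode"))
model.add(TimeDistributed(Dense(numOutputs), name="FC_out"))       # -> (None, 50, 1)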

Related

How to fix ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)

I am using a convolutional neural network (Keras, Conv1D) to train a text classification task. When I run the model below on my multi-class text classification task, I get the error shown further down. I have spent time trying to understand it, but I don't know how to fix it. Can anyone help me, please?
The shapes of the training set and evaluation set are as follows:
df_train shape: (7198,)
df_val shape: (1800,)
np.random.seed(42)
# You need to reshape your input data to the Conv1D input format: (batch_size, steps, input_dim)
# set parameters of matrices and convolution
embedding_dim = 300
nb_filter = 64
filter_length = 5
hidden_dims = 32
stride_length = 1
from keras.layers import Embedding
embedding_layer = Embedding(len(tokenizer.word_index) + 1,
                            embedding_dim,
                            input_length=35,
                            name="Embedding")
inp = Input(shape=(35,), dtype='int32')
embeddings = embedding_layer(inp)
conv1 = Conv1D(filters=32,                 # Number of filters to use
               kernel_size=filter_length,  # n-gram range of each filter
               padding='same',             # valid: don't go off edge; same: pad before applying filter
               activation='relu',
               name="CONV1",
               kernel_regularizer=regularizers.l2(l=0.0367))(embeddings)
conv2 = Conv1D(filters=32,
               kernel_size=filter_length,
               padding='same',
               activation='relu',
               name="CONV2",
               kernel_regularizer=regularizers.l2(l=0.02))(embeddings)
conv3 = Conv1D(filters=32,
               kernel_size=filter_length,
               padding='same',
               activation='relu',
               name="CONV3",  # was a duplicate "CONV2"; layer names must be unique
               kernel_regularizer=regularizers.l2(l=0.01))(embeddings)
max1 = MaxPool1D(10, strides=1, name="MaxPool1D1")(conv1)
max2 = MaxPool1D(10, strides=1, name="MaxPool1D2")(conv2)
max3 = MaxPool1D(10, strides=1, name="MaxPool1D3")(conv3)  # was a duplicate "MaxPool1D2"
conc = concatenate([max1, max2, max3])
flat = Flatten(name="FLATTEN")(max1)  # NOTE: conc is unused; per the summary below, only the conv1 branch reaches the output
....
The error is as follows:
ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)
The model:
Model: "CNN"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_19 (InputLayer) [(None, 35)] 0
_________________________________________________________________
Embedding (Embedding) (None, 35, 300) 4094700
_________________________________________________________________
CONV1 (Conv1D) (None, 35, 32) 48032
_________________________________________________________________
MaxPool1D1 (MaxPooling1D) (None, 26, 32) 0
_________________________________________________________________
FLATTEN (Flatten) (None, 832) 0
_________________________________________________________________
Dropout (Dropout) (None, 832) 0
_________________________________________________________________
Dense (Dense) (None, 3) 2499
=================================================================
Total params: 4,145,231
Trainable params: 4,145,231
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
This error occurs when the network's input-layer shape does not match the dataset's shape. If you are receiving an error like this, you should try to:
Set the network input shape to (None, 31) so that it matches the dataset's shape.
Check that the dataset's shape is equal to (num_of_examples, 35). (Preferable)
If all of this information is correct and there is no problem with the dataset, it might be an error in the net itself, where the shapes of two adjacent layers don't match.
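For example, a minimal sketch of the second option, assuming the raw inputs are tokenized sequences (train_sequences and val_sequences are hypothetical names for the tokenizer's output):
from keras.preprocessing.sequence import pad_sequences
# Pad or truncate every tokenized document to the 35 tokens the Input layer expects
X_train = pad_sequences(train_sequences, maxlen=35)
X_val = pad_sequences(val_sequences, maxlen=35)
print(X_train.shape)  # -> (num_of_examples, 35)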

Multi-feature causal CNN - Keras implementation

I'm currently using a basic LSTM to make regression predictions and I would like to implement a causal CNN as it should be computationally more efficient.
I'm struggling to figure out how to reshape my current data to fit the causal CNN cell and represent the same data/timestep relationship as well as what the dilation rate should be set at.
My current data is of this shape: (number of examples, lookback, features) and here's a basic example of the LSTM NN I'm using right now.
lookback = 20 # height -- timeseries
n_features = 5 # width -- features at each timestep
# Build an LSTM to perform regression on time series input/output data
model = Sequential()
model.add(LSTM(units=256, return_sequences=True, input_shape=(lookback, n_features)))
model.add(Activation('elu'))
model.add(LSTM(units=256, return_sequences=True))
model.add(Activation('elu'))
model.add(LSTM(units=256))
model.add(Activation('elu'))
model.add(Dense(units=1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train,
          epochs=50, batch_size=64,
          validation_data=(X_val, y_val),
          verbose=1, shuffle=True)
prediction = model.predict(X_test)
I then created a new CNN model, although it is not causal, since 'causal' padding is only an option for Conv1D and not Conv2D, per the Keras documentation. If I understand correctly, having multiple features means I need to use Conv2D rather than Conv1D, but if I set Conv2D(padding='causal') I get the following error: Invalid padding: causal.
Anyway, I was also able to fit the data with a new shape (number of examples, lookback, features, 1) and run the following model using the Conv2D layer:
lookback = 20 # height -- timeseries
n_features = 5 # width -- features at each timestep
model = Sequential()
model.add(Conv2D(128, 3, activation='elu', input_shape=(lookback, n_features, 1)))
model.add(MaxPool2D())
model.add(Conv2D(128, 3, activation='elu'))
model.add(MaxPool2D())
model.add(Flatten())
model.add(Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train,
          epochs=50, batch_size=64,
          validation_data=(X_val, y_val),
          verbose=1, shuffle=True)
prediction = model.predict(X_test)
However, from my understanding, this does not propagate the data as causal, rather just the entire set (lookback, features, 1) as an image.
Is there any way to either reshape my data to fit into a Conv1D(padding='causal') Layer, with multiple features or somehow run the same data and input shape as Conv2D with 'causal' padding?
I believe that you can have causal padding with dilation for any number of input features. Here is the solution I would propose.
The TimeDistributed layer is key to this.
From Keras Documentation: "This wrapper applies a layer to every temporal slice of an input. The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension."
For our purposes, we want this layer to apply "something" to each feature, so we move the features to the temporal index, which is 1.
Also relevant is the Conv1D documentation.
Specifically about channels: "The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, steps, channels) (default format for temporal data in Keras)"
from tensorflow.python.keras import Sequential, backend
from tensorflow.python.keras.layers import (GlobalMaxPool1D, MaxPool1D, Conv1D, Dense,
                                            Permute, Reshape, TimeDistributed, InputLayer)
backend.clear_session()
lookback = 20
n_features = 5
filters = 128
model = Sequential()
model.add(InputLayer(input_shape=(lookback, n_features, 1)))
# Causal layers are first applied to the features independently
model.add(Permute(dims=(2, 1, 3)))  # UPDATE: must permute before the Reshape so the features become the temporal axis
model.add(Reshape(target_shape=(n_features, lookback, 1)))
# After reshape 5 input features are now treated as the temporal layer
# for the TimeDistributed layer
# When Conv1D is applied to each input feature, it thinks the shape of the layer is (20, 1)
# with the default "channels_last", therefore...
# 20 times steps is the temporal dimension
# 1 is the "channel", the new location for the feature maps
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**0)))
# You could add pooling here if you want.
# If you want interaction between features AND causal/dilation, then apply later
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**1)))
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**2)))
# Stack feature maps on top of each other so each time step can look at
# all features produced earlier
model.add(Permute(dims=(2, 1, 3))) # UPDATED to fix issue with reshape
model.add(Reshape(target_shape=(lookback, n_features * filters))) # (20 time steps, 5 features * 128 filters)
# Causal layers are applied to the 5 input features dependently
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**0))
model.add(MaxPool1D())
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**1))
model.add(MaxPool1D())
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**2))
model.add(GlobalMaxPool1D())
model.add(Dense(units=1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
Final Model Summary
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
reshape (Reshape) (None, 5, 20, 1) 0
_________________________________________________________________
time_distributed (TimeDistri (None, 5, 20, 128) 512
_________________________________________________________________
time_distributed_1 (TimeDist (None, 5, 20, 128) 49280
_________________________________________________________________
time_distributed_2 (TimeDist (None, 5, 20, 128) 49280
_________________________________________________________________
reshape_1 (Reshape) (None, 20, 640) 0
_________________________________________________________________
conv1d_3 (Conv1D) (None, 20, 128) 245888
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 10, 128) 0
_________________________________________________________________
conv1d_4 (Conv1D) (None, 10, 128) 49280
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 5, 128) 0
_________________________________________________________________
conv1d_5 (Conv1D) (None, 5, 128) 49280
_________________________________________________________________
global_max_pooling1d (Global (None, 128) 0
_________________________________________________________________
dense (Dense) (None, 1) 129
=================================================================
Total params: 443,649
Trainable params: 443,649
Non-trainable params: 0
_________________________________________________________________
Edit:
"why you need to reshape and use n_features as the temporal layer"
The reason why n_features needs to be at the temporal layer initially is because Conv1D with dilation and causal padding only works with one feature at a time, and because of how the TimeDistributed layer is implemented.
From their documentation "Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16).
You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:"
By applying the TimeDistributed layer independently to each feature, it reduces the dimension of the problem as if there was only one feature (which would easily allow for dilation and causal padding). With 5 features, they need to each be handled separately at first.
After your edits this recommendation still applies.
There shouldn't be a difference in terms of the network whether InputLayer is included as the first layer or kept separate, so you can definitely fold it into the first CNN layer if that resolves the issue.
Conv1D with causal padding is dilated convolution. For Conv2D you can use the dilation_rate parameter of the Conv2D class; you have to assign dilation_rate a 2-tuple of integers. For more information you can read the Keras documentation.
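For illustration, a minimal sketch of that parameter (the shapes here are arbitrary):
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D

model = Sequential([
    # A 3x3 kernel with dilation_rate=(2, 2) samples inputs two steps apart in
    # each spatial dimension, widening the receptive field without pooling
    Conv2D(32, (3, 3), dilation_rate=(2, 2), padding='same',
           activation='relu', input_shape=(20, 5, 1)),
])
model.summary()  # 'same' padding keeps the output at (None, 20, 5, 32)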

How can I use different sizes of training data and test data and different batch_sizes during fitting and predicting data using Keras LSTM?

My source code is shown below. It works only when the batch sizes are the same, in this case 300, and the data have the same shape, in this case (300, 50, 74). Does anyone have an idea how I can use different sizes of training and test data, and different batch_sizes, when fitting and predicting with a Keras LSTM?
shape = input_one_hot_encoded.shape
print('input_one_hot_encoded: ' + str(shape))
shape = output_one_hot_encoded.shape
print('output_one_hot_encoded: ' + str(shape))
shape = test_input_one_hot_encoded.shape
print('test_input_one_hot_encoded: ' + str(shape))
model = Sequential()
model.add(LSTM(len(dict), return_sequences=True, stateful=True,
               batch_input_shape=shape))
model.add(LSTM(len(dict), return_sequences=True, stateful=True))
model.add(LSTM(len(dict), return_sequences=True, stateful=True))
model.add(Dense(len(dict), activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
print(model.summary())
model.fit(input_one_hot_encoded, output_one_hot_encoded, epochs=20, batch_size=300)
data = model.predict(test_input_one_hot_encoded, batch_size=300)
It returns:
input_one_hot_encoded: (300, 50, 74)
output_one_hot_encoded: (300, 50, 74)
test_input_one_hot_encoded: (300, 50, 74)
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (300, 50, 74) 44104
_________________________________________________________________
lstm_2 (LSTM) (300, 50, 74) 44104
_________________________________________________________________
lstm_3 (LSTM) (300, 50, 74) 44104
_________________________________________________________________
dense_1 (Dense) (300, 50, 74) 5550
=================================================================
Total params: 137,862
Trainable params: 137,862
Non-trainable params: 0
The reason you can't use different batch sizes for training and test runs is that your model uses stateful LSTMs, i.e. the value of the stateful parameter is set to True.
Now, there are two ways to solve this:
Use stateful LSTMs while training. At the end of the training process, save the weights of your model to a file, and define a new model architecture identical to the existing one, with the only difference being that the LSTMs are not stateful:
model.save_weights("your_weights.h5")
e.g. LSTM layers:
model.add(LSTM(len(dict), return_sequences=True, stateful=False, input_shape=shape[1:]))
Just make your LSTM layers non-stateful, i.e. set the value of the stateful argument to False as above. (Note that the rebuilt model should not pin the batch size either, hence input_shape instead of batch_input_shape.)
For a more detailed description please refer to this link: https://machinelearningmastery.com/use-different-batch-sizes-training-predicting-python-keras/
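Putting the first option together, a minimal sketch using the shapes from the question (assuming len(dict) == 74, as in the summary; the weights file name is arbitrary):
from keras.models import Sequential
from keras.layers import LSTM, Dense

# 1. Train the stateful model (fixed batch size of 300), then save its weights
model.fit(input_one_hot_encoded, output_one_hot_encoded, epochs=20, batch_size=300)
model.save_weights("your_weights.h5")

# 2. Rebuild the same architecture, non-stateful and without a fixed batch size
new_model = Sequential()
new_model.add(LSTM(74, return_sequences=True, stateful=False, input_shape=(50, 74)))
new_model.add(LSTM(74, return_sequences=True, stateful=False))
new_model.add(LSTM(74, return_sequences=True, stateful=False))
new_model.add(Dense(74, activation='softmax'))
new_model.load_weights("your_weights.h5")  # identical architecture, so the weights transfer

# 3. Predict with any number of test samples and any batch size
data = new_model.predict(test_input_one_hot_encoded, batch_size=1)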
Judging from the docs regarding the input shape of LSTM cells:
3D tensor with shape (batch_size, timesteps, input_dim).
This implies that you're going to need timesteps of a constant size for each batch, hence it won't be possible to have different batch sizes for training and testing.
However, you can change the length of your input sequences, for example by using pad_sequences (see https://keras.io/preprocessing/sequence/#pad_sequence for more details).
Example:
from keras.preprocessing.sequence import pad_sequences
# define sequences
sequences = [
    [1, 2, 3, 4],
    [1, 2, 3],
    [1]
]
# pad sequence
padded = pad_sequences(sequences, maxlen=5)
print(padded)
[[0 1 2 3 4]
 [0 0 1 2 3]
 [0 0 0 0 1]]
EDIT after comments:
You need to adjust your test-data dimensions. For example, see the docs on sequential models (https://keras.io/getting-started/sequential-model-guide/). There, x_train and y_train are defined as follows:
data_dim = 16
timesteps = 8
num_classes = 10
batch_size = 32
# Generate dummy training data
x_train = np.random.random((batch_size * 10, timesteps, data_dim))
y_train = np.random.random((batch_size * 10, num_classes))
Note that the shapes are:
x_train.shape
>> (320, 8, 16)
y_train.shape
>> (320, 10)
Your shapes should read:
input_one_hot_encoded.shape
>> (300, timesteps, data_dim)
output_one_hot_encoded.shape
>> (300, num_classes)
respectively.

Multi-step forecast LSTM model

I am trying to implement a multi-step forecasting LSTM model in Keras. The shapes of the data are:
X : (5831, 48, 1)
y : (5831, 1, 12)
The model that I am trying to use is:
power_in = Input(shape=(X.shape[1], X.shape[2]))
power_lstm = LSTM(50, recurrent_dropout=0.4128, dropout=0.412563,
                  kernel_initializer=power_lstm_init, return_sequences=True)(power_in)
main_out = TimeDistributed(Dense(12, kernel_initializer=power_lstm_init))(power_lstm)
While trying to train the model like this:
hist = forecaster.fit([X], y, epochs=325, batch_size=16, validation_data=([X_valid], y_valid), verbose=1, shuffle=False)
I am getting the following error:
ValueError: Error when checking target: expected time_distributed_16 to have shape (48, 12) but got array with shape (1, 12)
How to fix this?
According to your comment:
[The] data i have is like t-48, t-47, t-46, ..... , t-1 as the past data and
t+1, t+2, ......, t+12 as the values that I want to forecast
you may not need to use a TimeDistributed layer at all:
First, just remove the return_sequences=True argument of the LSTM layer. After doing so, the LSTM layer will encode the input timeseries of the past into a vector of shape (50,). Now you can feed it directly to a Dense layer with 12 units:
# make sure the labels are of shape (num_samples, 12)
y = np.reshape(y, (-1, 12))

power_in = Input(shape=X.shape[1:])
power_lstm = LSTM(50, recurrent_dropout=0.4128, dropout=0.412563,
                  kernel_initializer=power_lstm_init)(power_in)
main_out = Dense(12, kernel_initializer=power_lstm_init)(power_lstm)
Alternatively, if you would like to use a TimeDistributed layer, and considering that the output is a sequence itself, we can explicitly enforce this temporal dependency in the model by using another LSTM layer before the Dense layer (with the addition of a RepeatVector layer after the first LSTM layer, to make its output a timeseries of length 12, i.e. the same as the output timeseries length):
# make sure the labels are of shape (num_samples, 12, 1)
y = np.reshape(y, (-1, 12, 1))

power_in = Input(shape=(48, 1))
power_lstm = LSTM(50, recurrent_dropout=0.4128, dropout=0.412563,
                  kernel_initializer=power_lstm_init)(power_in)
rep = RepeatVector(12)(power_lstm)
out_lstm = LSTM(32, return_sequences=True)(rep)
main_out = TimeDistributed(Dense(1))(out_lstm)
model = Model(power_in, main_out)
model.summary()
Model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) (None, 48, 1) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 50) 10400
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 12, 50) 0
_________________________________________________________________
lstm_4 (LSTM) (None, 12, 32) 10624
_________________________________________________________________
time_distributed_1 (TimeDist (None, 12, 1) 33
=================================================================
Total params: 21,057
Trainable params: 21,057
Non-trainable params: 0
_________________________________________________________________
Of course, in both models you may need to tune the hyper-parameters (e.g. number of LSTM layers, the dimension of LSTM layers, etc.) to be able to accurately compare them and achieve good results.
Side note: actually, in your scenario, you don't need to use a TimeDistributed layer at all, because (currently) the Dense layer is applied on the last axis. Therefore, TimeDistributed(Dense(...)) and Dense(...) are equivalent.
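A quick sketch illustrating that equivalence (the shapes are arbitrary):
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, InputLayer

m1 = Sequential([InputLayer(input_shape=(12, 32)), Dense(1)])
m2 = Sequential([InputLayer(input_shape=(12, 32)), TimeDistributed(Dense(1))])
# Both apply the same 33-parameter Dense to the last axis at every time step
print(m1.output_shape, m2.output_shape)  # (None, 12, 1) (None, 12, 1)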

Stacking convolutional network and recurrent layer

I'm working on a classifier for video sequences. It should take several video frames on input and output a label, either 0 or 1. So, it is a many-to-one network.
I already have a classifier for single frames. This classifier makes several convolutions with Conv2D, then applies GlobalAveragePooling2D. This results in a 1D vector of length 64. The original per-frame classifier then has a Dense layer with softmax activation.
Now I would like to extend this classifier to work with sequences. Ideally, sequences should be of varying length, but for now I fix the length to 4.
To extend my classifier, I'm going to replace Dense with an LSTM layer with 1 unit. So, my goal is to have the LSTM layer to take several 1D vectors of length 64, one by one, and output a label.
Schematically, what I have now:
input(99, 99, 3) - [convolutions] - features(1, 64) - [Dense] - [softmax] - label(1, 2)
Desired architecture:
4x { input(99, 99, 3) - [convolutions] - features(1, 64) } - [LSTM] - label(1, 2)
I cannot figure out, how to do it with Keras.
Here is my code for convolutions
from keras.models import Sequential  # missing from the original snippet
from keras.layers import Conv2D, BatchNormalization, GlobalAveragePooling2D, \
    LSTM, TimeDistributed
IMAGE_WIDTH = 99
IMAGE_HEIGHT = 99
IMAGE_CHANNELS = 3
convolutional_layers = Sequential([
    Conv2D(input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS),
           filters=6, kernel_size=(3, 3), strides=(2, 2), activation='relu',
           name='conv1'),
    BatchNormalization(),
    Conv2D(filters=64, kernel_size=(1, 1), strides=(1, 1), activation='relu',
           name='conv5_pixel'),
    BatchNormalization(),
    GlobalAveragePooling2D(name='avg_pool6'),
])
Here is the summary:
In [24]: convolutional_layers.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1 (Conv2D) (None, 49, 49, 6) 168
_________________________________________________________________
batch_normalization_3 (Batch (None, 49, 49, 6) 24
_________________________________________________________________
conv5_pixel (Conv2D) (None, 49, 49, 64) 448
_________________________________________________________________
batch_normalization_4 (Batch (None, 49, 49, 64) 256
_________________________________________________________________
avg_pool6 (GlobalAveragePool (None, 64) 0
=================================================================
Total params: 896
Trainable params: 756
Non-trainable params: 140
Now I want a recurrent layer to process sequences of these 64-dimensional vectors and output a label for each sequence.
I've read in the manuals that the TimeDistributed layer applies its wrapped layer to every time slice of the input data.
I continue my code:
FRAME_NUMBER = 4
td = TimeDistributed(convolutional_layers, input_shape=(FRAME_NUMBER, 64))
model = Sequential([
    td,
    LSTM(units=1)
])
The result is the exception IndexError: list index out of range.
I get the same exception for
td = TimeDistributed(convolutional_layers, input_shape=(None, FRAME_NUMBER, 64))
What am I doing wrong?
Expanding on the comments into an answer: the TimeDistributed layer applies the given layer to every time step of the input. Hence, your TimeDistributed should wrap the convolutions and receive the frames themselves, giving an input_shape=(FRAME_NUMBER, W, H, C). After applying the convolutions to every image, you get back (FRAME_NUMBER, 64), i.e. the features for every frame.
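Concretely, a minimal sketch of that fix, reusing convolutional_layers from the question (the LSTM width and the final Dense are illustrative choices, not the only option):
from keras.models import Sequential
from keras.layers import Dense, LSTM, TimeDistributed

FRAME_NUMBER = 4
model = Sequential([
    # The input is a sequence of frames, not of 64-d feature vectors:
    # (frames, height, width, channels)
    TimeDistributed(convolutional_layers,
                    input_shape=(FRAME_NUMBER, 99, 99, 3)),
    # TimeDistributed yields (None, FRAME_NUMBER, 64) per-frame features
    LSTM(units=32),
    Dense(2, activation='softmax'),  # one two-class label per sequence
])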
