Stacking convolutional network and recurrent layer - python

I'm working on a classifier for video sequences. It should take several video frames as input and output a label, either 0 or 1. So it is a many-to-one network.
I already have a classifier for single frames. This classifier applies several convolutions with Conv2D, then GlobalAveragePooling2D. This results in a 1D vector of length 64. The original per-frame classifier then has a Dense layer with softmax activation.
Now I would like to extend this classifier to work with sequences. Ideally, sequences should be of varying length, but for now I fix the length to 4.
To extend my classifier, I'm going to replace the Dense layer with an LSTM layer with 1 unit. So my goal is to have the LSTM layer take several 1D vectors of length 64, one by one, and output a label.
Schematically, what I have now:
input(99, 99, 3) - [convolutions] - features(1, 64) - [Dense] - [softmax] - label(1, 2)
Desired architecture:
4x { input(99, 99, 3) - [convolutions] - features(1, 64) } - [LSTM] - label(1, 2)
I cannot figure out how to do this with Keras.
Here is my code for the convolutions:
from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, GlobalAveragePooling2D, \
    LSTM, TimeDistributed

IMAGE_WIDTH = 99
IMAGE_HEIGHT = 99
IMAGE_CHANNELS = 3

convolutional_layers = Sequential([
    Conv2D(input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS),
           filters=6, kernel_size=(3, 3), strides=(2, 2), activation='relu',
           name='conv1'),
    BatchNormalization(),
    Conv2D(filters=64, kernel_size=(1, 1), strides=(1, 1), activation='relu',
           name='conv5_pixel'),
    BatchNormalization(),
    GlobalAveragePooling2D(name='avg_pool6'),
])
Here is the summary:
In [24]: convolutional_layers.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1 (Conv2D)               (None, 49, 49, 6)         168
_________________________________________________________________
batch_normalization_3 (Batch (None, 49, 49, 6)         24
_________________________________________________________________
conv5_pixel (Conv2D)         (None, 49, 49, 64)        448
_________________________________________________________________
batch_normalization_4 (Batch (None, 49, 49, 64)        256
_________________________________________________________________
avg_pool6 (GlobalAveragePool (None, 64)                0
=================================================================
Total params: 896
Trainable params: 756
Non-trainable params: 140
Now I want a recurrent layer to process sequences of these 64-dimensional vectors and output a label for each sequence.
I've read in the manual that the TimeDistributed wrapper applies its wrapped layer to every time slice of the input data.
Continuing my code:
FRAME_NUMBER = 4

td = TimeDistributed(convolutional_layers, input_shape=(FRAME_NUMBER, 64))
model = Sequential([
    td,
    LSTM(units=1)
])
The result is the exception IndexError: list index out of range.
I get the same exception for
td = TimeDistributed(convolutional_layers, input_shape=(None, FRAME_NUMBER, 64))
What am I doing wrong?

Expanding the comments into an answer: the TimeDistributed wrapper applies the given layer to every time step of the input. Hence, your TimeDistributed should be applied to every frame, so its input shape must be (FRAME_NUMBER, W, H, C), not (FRAME_NUMBER, 64). After the convolutions are applied to every image, you get back (FRAME_NUMBER, 64), the features for every frame, which the LSTM can then consume.
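A minimal sketch of the corrected model based on this: the input shape given to TimeDistributed must be the per-frame image shape, because the 64-vector is what comes out of the convolutions, not what goes in:

FRAME_NUMBER = 4

model = Sequential([
    # Sequences of raw frames go in; the conv stack runs once per frame
    TimeDistributed(convolutional_layers,
                    input_shape=(FRAME_NUMBER, IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)),
    # TimeDistributed output: (batch, FRAME_NUMBER, 64)
    LSTM(units=1),
])

For a two-class softmax label like the original per-frame classifier produces, one option is a larger LSTM followed by Dense(2, activation='softmax').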

Related

How to fix ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)

I am using a convolutional neural network (Keras, Conv1D) to train a multi-class text classification task. When I run the model below, I get the error shown further down. I have put time into understanding the error, but I don't know how to fix it. Can anyone help me?
The shapes of the data set and evaluation set are as follows:
df_train shape: (7198,)
df_val shape: (1800,)
np.random.seed(42)

# You need to reshape your input data according to the Conv1D layer
# input format: (batch_size, steps, input_dim).

# set parameters of matrices and convolution
embedding_dim = 300
nb_filter = 64
filter_length = 5
hidden_dims = 32
stride_length = 1

from keras.layers import Embedding, Input, Conv1D, MaxPool1D, Flatten, concatenate
from keras import regularizers

embedding_layer = Embedding(len(tokenizer.word_index) + 1,
                            embedding_dim,
                            input_length=35,
                            name="Embedding")

inp = Input(shape=(35,), dtype='int32')
embeddings = embedding_layer(inp)

conv1 = Conv1D(filters=32,                 # number of filters to use
               kernel_size=filter_length,  # n-gram range of each filter
               padding='same',             # 'valid': don't go off edge; 'same': pad before applying filter
               activation='relu',
               name="CONV1",
               kernel_regularizer=regularizers.l2(l=0.0367))(embeddings)
conv2 = Conv1D(filters=32,
               kernel_size=filter_length,
               padding='same',
               activation='relu',
               name="CONV2",
               kernel_regularizer=regularizers.l2(l=0.02))(embeddings)
conv3 = Conv1D(filters=32,
               kernel_size=filter_length,
               padding='same',
               activation='relu',
               name="CONV3",  # layer names must be unique
               kernel_regularizer=regularizers.l2(l=0.01))(embeddings)

max1 = MaxPool1D(10, strides=1, name="MaxPool1D1")(conv1)
max2 = MaxPool1D(10, strides=1, name="MaxPool1D2")(conv2)
max3 = MaxPool1D(10, strides=1, name="MaxPool1D3")(conv3)
conc = concatenate([max1, max2, max3])
flat = Flatten(name="FLATTEN")(max1)
....
....
The error is as follows:
ValueError: Input 0 is incompatible with layer CNN: expected shape=(None, 35), found shape=(None, 31)
The model :
Model: "CNN"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_19 (InputLayer)        [(None, 35)]              0
_________________________________________________________________
Embedding (Embedding)        (None, 35, 300)           4094700
_________________________________________________________________
CONV1 (Conv1D)               (None, 35, 32)            48032
_________________________________________________________________
MaxPool1D1 (MaxPooling1D)    (None, 26, 32)            0
_________________________________________________________________
FLATTEN (Flatten)            (None, 832)               0
_________________________________________________________________
Dropout (Dropout)            (None, 832)               0
_________________________________________________________________
Dense (Dense)                (None, 3)                 2499
=================================================================
Total params: 4,145,231
Trainable params: 4,145,231
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
That error occurs when the network's input layer shape does not match the dataset's shape. If you are receiving an error like this, you should try the following:
Set the network input shape to (None, 31) so that it matches the dataset's shape.
Check that the dataset's shape is equal to (num_of_examples, 35). (Preferable)
If all of this information is correct and there is no problem with the dataset, it might be an error in the network itself, where the shapes of two adjacent layers don't match.
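As a hedged sketch of the preferable option: a 31-vs-35 mismatch usually means the sequences were padded to a different length than the Embedding layer's input_length. Assuming tokenizer and df_train are as in the question, padding explicitly to 35 keeps the shapes consistent:

from keras.preprocessing.sequence import pad_sequences

sequences = tokenizer.texts_to_sequences(df_train)
X_train = pad_sequences(sequences, maxlen=35)  # pad/truncate every row to length 35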

How to concatenate different layer outputs to feed as input to a new layer in Tensorflow?

I am new to TensorFlow and Python. I have built a multi-layer network and pretrained it. Now I want to build another multi-layer network next to the first one. The weights of the first network are frozen, and I want to concatenate the features I got from the first network with the normal input of a layer of the new network. How can I concatenate the output of a specific layer in the first network with the input to this network and feed them to a layer of the new network?
Below is a simple example of concatenating two input layers of different input shapes and feeding the result to the next layer.
import tensorflow.keras.backend as K
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, concatenate, Conv2D, ZeroPadding2D, Dense
from tensorflow.keras.optimizers import Adagrad
input_img1 = Input(shape=(44,44,3))
x1 = Conv2D(3, (3, 3), activation='relu', padding='same')(input_img1)
input_img2 = Input(shape=(34,34,3))
x2 = Conv2D(3, (3, 3), activation='relu', padding='same')(input_img2)
# Zero Padding of 5 at the top, bottom, left and right side of an image tensor
x3 = ZeroPadding2D(padding = (5,5))(x2)
# Concatenate works as layers have same size output
x4 = concatenate([x1,x3])
output = Dense(18, activation='relu')(x4)
model = Model(inputs=[input_img1,input_img2], outputs=output)
model.summary()
Output -
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_4 (InputLayer)            [(None, 34, 34, 3)]  0
__________________________________________________________________________________________________
input_3 (InputLayer)            [(None, 44, 44, 3)]  0
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 34, 34, 3)    84          input_4[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 44, 44, 3)    84          input_3[0][0]
__________________________________________________________________________________________________
zero_padding2d_1 (ZeroPadding2D (None, 44, 44, 3)    0           conv2d_3[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 44, 44, 6)    0           conv2d_2[0][0]
                                                                 zero_padding2d_1[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, 44, 44, 18)   126         concatenate_1[0][0]
==================================================================================================
Total params: 294
Trainable params: 294
Non-trainable params: 0
__________________________________________________________________________________________________
If the above answer is not what you are looking for, please share pseudocode or a flow diagram of your models so we can give a better answer.
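For the frozen-network case described in the question, here is a hedged sketch; the model name pretrained, the tapped layer name 'feature_layer', and the shapes are placeholders, and the tapped layer is assumed to output a flat vector so it can be concatenated with the new input:

# Tap an intermediate layer of the pretrained model and freeze its weights
pretrained.trainable = False
features = Model(inputs=pretrained.input,
                 outputs=pretrained.get_layer('feature_layer').output)

new_input = Input(shape=(64,))                      # normal input of the new network
merged = concatenate([features.output, new_input])  # frozen features + new input
output = Dense(18, activation='relu')(merged)
combined = Model(inputs=[features.input, new_input], outputs=output)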

Tensorflow: Prediction of float value from image always returns 0

I have a model as follows:
from tensorflow import keras
from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPooling2D

model = keras.Sequential([
    Conv2D(16, (3, 3), padding='same', activation='relu',
           input_shape=(480, 640, 3), data_format="channels_last"),
    MaxPooling2D(),
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(480, activation='relu'),
    Dense(1, activation="relu")
])

model.compile(optimizer='adam',
              loss=keras.losses.MeanSquaredError(),
              metrics=['accuracy'])

epochs = 3
model.fit(
    x=train_images,
    y=train_values,
    epochs=epochs
)
The variable train_images is an array of PNG images (640x480 pixels) and train_values is an array of floats (e.g. [1.11842, -17.894, 2.03, ...]).
My goal is to predict the float value (or at least to find some approximate value), so I suppose MSE should be the loss function in this case.
However, after training the model, I only get zeros, not only from model.predict(test_images) but also from model.predict(train_images).
Note: I should mention that my training batch contains only 37 images and my test sample contains 14. I know that the size is ridiculous, but this script is just a proof of concept for something bigger.
If it helps, here is the result of model.summary():
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 480, 640, 16)      448
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 240, 320, 16)      0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 240, 320, 32)      4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 120, 160, 32)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 120, 160, 64)      18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 60, 80, 64)        0
_________________________________________________________________
flatten (Flatten)            (None, 307200)            0
_________________________________________________________________
dense (Dense)                (None, 480)               147456480
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 481
=================================================================
Total params: 147,480,545
Trainable params: 147,480,545
Non-trainable params: 0
To start with, change your output activation function: ReLU clamps everything below 0 to 0, which is not desirable for a target that can be negative.
Secondly, normalize your y values. As it stands, your values could be anywhere from -inf to +inf; range-normalize them and store the normalization parameters. At run time you can always reverse the normalization and recover the actual values.
Also increase the number of epochs. With that small a training set, I suggest trying to overfit the network; if a network can overfit, it will most likely train well.
For now, try these suggestions out; I think normalization is the most important one.
Also: I suggest making the network much deeper. You need to extract the shapes and textures in the image, and your network might not be deep enough (or, for that matter, dense enough) to do that. I suggest using Keras to load a pretrained model like VGG16, stripping the head off, adding regression layers, and transfer-learning it onto your dataset. That could work better.
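A hedged sketch of the first two suggestions, assuming train_values is a 1-D NumPy array as in the question; the normalization constants are kept so predictions can be mapped back to real values:

import numpy as np

# Normalize targets and remember the parameters for inverting later
y_mean, y_std = train_values.mean(), train_values.std()
train_values_norm = (train_values - y_mean) / y_std

# Same model as in the question, but the output layer becomes
# Dense(1, activation='linear') so negative values are reachable.
model.fit(x=train_images, y=train_values_norm, epochs=100)  # more epochs for a tiny set

# Map predictions back to the original scale
predictions = model.predict(test_images) * y_std + y_mean

And a sketch of the transfer-learning suggestion; the frozen backbone, head sizes, and weights choice here are illustrative assumptions, not a definitive recipe:

from tensorflow.keras import Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten

backbone = VGG16(weights='imagenet', include_top=False, input_shape=(480, 640, 3))
backbone.trainable = False                 # freeze the pretrained features
x = Flatten()(backbone.output)
x = Dense(256, activation='relu')(x)
out = Dense(1, activation='linear')(x)     # regression head
reg_model = Model(backbone.input, out)
reg_model.compile(optimizer='adam', loss='mse')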

Keras time series prediction with CNN+LSTM model and TimeDistributed layer wrapper

I have several data files of human activity recognition data consisting of time-ordered rows of recorded raw samples. Each row has 8 columns of EMG sensor data and 1 corresponding column of target sensor data. I'm trying to feed the 8 channels of EMG sensor data into a CNN+LSTM deep model in order to predict the 1 channel of target data. I do this by breaking down a dataset (a in the image below) into 50-row windows of raw samples (b in the image below) and then reshaping these windows into blocks of 4 windows, to act as time steps for the LSTM part of the model (c in the image below). The following image will hopefully explain it better:
I've been following the tutorial here as to how to implement my model: https://medium.com/smileinnovation/how-to-work-with-time-distributed-data-in-a-neural-network-b8b39aa4ce00
I have reshaped the data and built the model but keep coming back to the following error that I cannot figure out how to resolve:
"ValueError: Error when checking target: expected FC_out to have 2 dimensions, but got array with shape (808, 50, 1)"
My code follows and is written in Python using Keras and Tensorflow:
from keras.models import Sequential
from keras.layers import CuDNNLSTM
from keras.layers.convolutional import Conv2D
from keras.layers.core import Dense, Dropout
from keras.layers import Flatten
from keras.layers import TimeDistributed

# Code that reads in file data and shapes it into 4-window blocks omitted. That code produces the following arrays:
# x_train - shape of (808, 4, 50, 8) which equates to (samples, time steps, window length, number of channels)
# x_valid - shape of (223, 4, 50, 8) which equates to the same as x_train
# y_train - shape of (808, 50, 1) which equates to (samples, window length, number of target channels)

# Followed machine learning mastery style for ease of reading
numSteps = x_train.shape[1]
windowLength = x_train.shape[2]
numChannels = x_train.shape[3]
numOutputs = 1

# Reshape x data for use with TimeDistributed wrapper, adding an extra dimension at the end
x_train = x_train.reshape(x_train.shape[0], numSteps, windowLength, numChannels, 1)
x_valid = x_valid.reshape(x_valid.shape[0], numSteps, windowLength, numChannels, 1)

# Build model
model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3), activation=activation, name="Conv2D_1"),
                          input_shape=(numSteps, windowLength, numChannels, 1)))
model.add(TimeDistributed(Conv2D(64, (3, 3), activation=activation, name="Conv2D_2")))
model.add(Dropout(0.4, name="CNN_Drop_01"))

# Flatten for passing to LSTM layer
model.add(TimeDistributed(Flatten(name="Flatten_1")))

# LSTM and Dropout
model.add(CuDNNLSTM(28, return_sequences=True, name="LSTM_01"))
model.add(Dropout(0.4, name="Drop_01"))

# Second LSTM and Dropout
model.add(CuDNNLSTM(28, return_sequences=False, name="LSTM_02"))
model.add(Dropout(0.3, name="Drop_02"))

# Fully connected layer and further Dropout
model.add(Dense(16, activation=activation, name="FC_1"))
model.add(Dropout(0.4))

# Final fully connected layer specifying outputs
model.add(Dense(numOutputs, activation=activation, name="FC_out"))

# Compile model, produce summary and save model image to file
# NOTE: coeffDetermination refers to a function for calculating R2 and is not included in this code
model.compile(optimizer='Adam', loss='mse', metrics=[coeffDetermination])

# Now train the model
history_cb = model.fit(x_train, y_train, validation_data=(x_valid, y_valid), epochs=30, batch_size=64)
I'd be grateful if anyone can figure out what I've done wrong. Or am I just going about this the wrong way in trying to use this model configuration for time series prediction?
"ValueError: Error when checking target: expected FC_out to have 2 dimensions, but got array with shape (808, 50, 1)"
Your input is (808, 4, 50, 8, 1) and your output is (808, 50, 1).
However, model.summary() shows that the model's output shape is (None, 4, 1).
Since the number of time steps is 4, y_train should be something like (808, 4, 1).
Or, if you want the output to be (808, 50, 1), you need to change the model so that its last part produces (None, 50, 1).
Model: "sequential_10"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
time_distributed_18 (TimeDis (None, 4, 48, 6, 64)      640
_________________________________________________________________
time_distributed_19 (TimeDis (None, 4, 46, 4, 64)      36928
_________________________________________________________________
CNN_Drop_01 (Dropout)        (None, 4, 46, 4, 64)      0
_________________________________________________________________
time_distributed_20 (TimeDis (None, 4, 11776)          0
_________________________________________________________________
LSTM_01 (LSTM)               (None, 4, 28)             1322160
_________________________________________________________________
Drop_01 (Dropout)            (None, 4, 28)             0
_________________________________________________________________
Drop_02 (Dropout)            (None, 4, 28)             0
_________________________________________________________________
FC_1 (Dense)                 (None, 4, 16)             464
_________________________________________________________________
dropout_3 (Dropout)          (None, 4, 16)             0
_________________________________________________________________
FC_out (Dense)               (None, 4, 1)              17
=================================================================
Total params: 1,360,209
Trainable params: 1,360,209
Non-trainable params: 0
For many-to-many sequence prediction with different sequence lengths, check this link: https://github.com/keras-team/keras/issues/6063
dataX or input:  (nb_samples, nb_timesteps, nb_features) -> (1000, 50, 1)
dataY or output: (nb_samples, nb_timesteps, nb_features) -> (1000, 10, 1)
model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(50, 1), return_sequences=False))
model.add(RepeatVector(10))                  # repeat the encoded state 10 times
model.add(LSTM(hidden_neurons, return_sequences=True))
model.add(TimeDistributed(Dense(1)))         # one output per repeated time step
model.add(Activation('linear'))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])
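Applied to the question's model, a hedged sketch of that RepeatVector pattern; the (4, 11776) input shape is the flattened conv output taken from the summary above, and this head would replace everything from LSTM_01 onward:

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

head = Sequential()
head.add(LSTM(28, input_shape=(4, 11776), return_sequences=False))  # summarize the 4 steps
head.add(RepeatVector(50))                                          # repeat the summary 50 times
head.add(LSTM(28, return_sequences=True))
head.add(TimeDistributed(Dense(1)))                                 # (None, 50, 1) matches y_train
head.compile(optimizer='adam', loss='mse')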

Multi-feature causal CNN - Keras implementation

I'm currently using a basic LSTM to make regression predictions, and I would like to implement a causal CNN since it should be computationally more efficient.
I'm struggling to figure out how to reshape my current data to fit the causal CNN cell and represent the same data/timestep relationship, as well as what the dilation rate should be set to.
My current data is of this shape: (number of examples, lookback, features). Here's a basic example of the LSTM NN I'm using right now.
lookback = 20   # height -- timeseries
n_features = 5  # width -- features at each timestep

# Build an LSTM to perform regression on time series input/output data
model = Sequential()
model.add(LSTM(units=256, return_sequences=True, input_shape=(lookback, n_features)))
model.add(Activation('elu'))
model.add(LSTM(units=256, return_sequences=True))
model.add(Activation('elu'))
model.add(LSTM(units=256))
model.add(Activation('elu'))
model.add(Dense(units=1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train,
          epochs=50, batch_size=64,
          validation_data=(X_val, y_val),
          verbose=1, shuffle=True)
prediction = model.predict(X_test)
I then created a new CNN model (although not a causal one, since 'causal' padding is only an option for Conv1D, not Conv2D, per the Keras documentation). If I understand correctly, having multiple features means I need to use Conv2D rather than Conv1D, but if I set Conv2D(padding='causal') I get the error Invalid padding: causal.
Anyway, I was also able to fit the data with a new shape (number of examples, lookback, features, 1) and run the following model using the Conv2D layer:
lookback = 20   # height -- timeseries
n_features = 5  # width -- features at each timestep

model = Sequential()
model.add(Conv2D(128, 3, activation='elu', input_shape=(lookback, n_features, 1)))
model.add(MaxPool2D())
model.add(Conv2D(128, 3, activation='elu'))
model.add(MaxPool2D())
model.add(Flatten())
model.add(Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train,
          epochs=50, batch_size=64,
          validation_data=(X_val, y_val),
          verbose=1, shuffle=True)
prediction = model.predict(X_test)
However, from my understanding, this does not treat the data as causal; it just treats the entire set (lookback, features, 1) as an image.
Is there any way either to reshape my data to fit a Conv1D(padding='causal') layer with multiple features, or to somehow run the same data and input shape through Conv2D with 'causal' padding?
I believe that you can have causal padding with dilation for any number of input features. Here is the solution I would propose.
The TimeDistributed layer is key to this.
From Keras Documentation: "This wrapper applies a layer to every temporal slice of an input. The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension."
For our purposes, we want this layer to apply "something" to each feature, so we move the features to the temporal index, which is 1.
Also relevant is the Conv1D documentation.
Specifically about channels: "The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, steps, channels) (default format for temporal data in Keras)"
from tensorflow.python.keras import Sequential, backend
from tensorflow.python.keras.layers import GlobalMaxPool1D, Activation, MaxPool1D, Flatten, \
    Conv1D, Reshape, TimeDistributed, InputLayer, Permute, Dense

backend.clear_session()
lookback = 20
n_features = 5
filters = 128

model = Sequential()
model.add(InputLayer(input_shape=(lookback, n_features, 1)))

# Causal layers are first applied to the features independently
model.add(Permute(dims=(2, 1, 3)))  # UPDATE: must permute prior to reshaping
model.add(Reshape(target_shape=(n_features, lookback, 1)))
# After the reshape, the 5 input features are treated as the temporal dimension
# for the TimeDistributed layer.
# When Conv1D is applied to each input feature, it sees the layer shape as (20, 1)
# with the default "channels_last", therefore...
# 20 time steps is the temporal dimension
# 1 is the "channel", the new location of the feature maps
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**0)))
# You could add pooling here if you want.
# If you want interaction between features AND causal/dilation, then apply later
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**1)))
model.add(TimeDistributed(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**2)))

# Stack feature maps on top of each other so each time step can look at
# all features produced earlier
model.add(Permute(dims=(2, 1, 3)))  # UPDATED to fix issue with reshape
model.add(Reshape(target_shape=(lookback, n_features * filters)))  # (20 time steps, 5 features * 128 filters)

# Causal layers are now applied to the 5 input features dependently
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**0))
model.add(MaxPool1D())
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**1))
model.add(MaxPool1D())
model.add(Conv1D(filters, 3, activation="elu", padding="causal", dilation_rate=2**2))
model.add(GlobalMaxPool1D())
model.add(Dense(units=1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
Final Model Summary
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
reshape (Reshape)            (None, 5, 20, 1)          0
_________________________________________________________________
time_distributed (TimeDistri (None, 5, 20, 128)        512
_________________________________________________________________
time_distributed_1 (TimeDist (None, 5, 20, 128)        49280
_________________________________________________________________
time_distributed_2 (TimeDist (None, 5, 20, 128)        49280
_________________________________________________________________
reshape_1 (Reshape)          (None, 20, 640)           0
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 20, 128)           245888
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 10, 128)           0
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 10, 128)           49280
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 5, 128)            0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 5, 128)            49280
_________________________________________________________________
global_max_pooling1d (Global (None, 128)               0
_________________________________________________________________
dense (Dense)                (None, 1)                 129
=================================================================
Total params: 443,649
Trainable params: 443,649
Non-trainable params: 0
_________________________________________________________________
Edit:
"why you need to reshape and use n_features as the temporal layer"
The reason n_features needs to be at the temporal dimension initially is that Conv1D with dilation and causal padding only works with one feature at a time, and because of how the TimeDistributed layer is implemented.
From their documentation "Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16).
You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:"
By applying the TimeDistributed layer independently to each feature, it reduces the dimension of the problem as if there were only one feature (which easily allows for dilation and causal padding). With 5 features, they each need to be handled separately at first.
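A minimal sketch of that documentation example (not part of the original answer; the shapes are as quoted):

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, InputLayer, TimeDistributed

m = Sequential([
    InputLayer(input_shape=(10, 16)),  # 10 timesteps of 16-dimensional vectors
    TimeDistributed(Dense(8)),         # same Dense applied per timestep -> (None, 10, 8)
])
m.summary()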
After your edits, this recommendation still applies.
There shouldn't be a difference in terms of the network whether InputLayer is included in the first layer or kept separate, so you can definitely put it into the first CNN layer if that resolves the issue.
Conv1D with causal padding is dilated convolution. For Conv2D you can use the dilation_rate parameter of the Conv2D class; you have to assign dilation_rate a 2-tuple of integers. For more information, you can read the Keras documentation.
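A hedged illustration of that parameter; the filter count and rates are arbitrary:

from tensorflow.keras.layers import Conv2D

# Dilated 3x3 convolution; the 2-tuple sets the dilation along height and width
conv = Conv2D(filters=64, kernel_size=(3, 3),
              dilation_rate=(2, 2), padding='same', activation='relu')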
