I am trying to use the following model in Keras, where the ConvLSTM2D output is followed by Conv2D to generate segmentation-like output. Input and output should each be a time series of frames of size (2*WINDOW_H+1, 2*WINDOW_W+1).
model = Sequential()
model.add(ConvLSTM2D(3, kernel_size=3, padding="same", batch_input_shape=(1, None, 2*WINDOW_H+1, 2*WINDOW_W+1, 1), return_sequences=True, stateful=True))
model.add(Conv2D(1, kernel_size=3, padding="same"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
However, this gives the following error (when adding Conv2D):
Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5
Any pointers on where I might be going wrong would be really appreciated. Thanks!
I think you would need to wrap the Conv2D layer in TimeDistributed so that the dimensions match. Maybe like this:
model = Sequential()
model.add(ConvLSTM2D(3, kernel_size=3, padding="same", batch_input_shape=(1, None, 2*WINDOW_H+1, 2*WINDOW_W+1, 1), return_sequences=True, stateful=True))
model.add(TimeDistributed(Conv2D(1, kernel_size=3, padding="same")))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
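For reference, here is a self-contained version of that sketch (with hypothetical values WINDOW_H = WINDOW_W = 8, and assuming the tf.keras imports) to confirm the shapes line up:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import ConvLSTM2D, Conv2D, TimeDistributed

WINDOW_H = WINDOW_W = 8  # hypothetical values, just to make the shapes concrete

model = Sequential()
model.add(ConvLSTM2D(3, kernel_size=3, padding="same",
                     batch_input_shape=(1, None, 2*WINDOW_H+1, 2*WINDOW_W+1, 1),
                     return_sequences=True, stateful=True))
# TimeDistributed applies the same Conv2D to every timestep, so the 5D
# sequence (batch, time, rows, cols, channels) passes through intact.
model.add(TimeDistributed(Conv2D(1, kernel_size=3, padding="same")))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()  # final output shape: (1, None, 17, 17, 1)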
The problem in your model is that you are feeding a sequence to a regular convolutional layer. All you need to do is remove return_sequences=True from the ConvLSTM2D layer.
So this line:
model.add(ConvLSTM2D(3, kernel_size=3, padding="same", batch_input_shape=(1, None, 2*WINDOW_H+1, 2*WINDOW_W+1, 1), return_sequences=True, stateful=True))
should be like this:
model.add(ConvLSTM2D(3, kernel_size=3, padding="same", batch_input_shape=(1, None, 2*WINDOW_H+1, 2*WINDOW_W+1, 1), stateful=True))
In the ConvLSTM2D layer before the Conv2D, set return_sequences=False (the default). The layer then returns only the last timestep, so its output has ndim=4, which is what Conv2D expects.
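A minimal sketch of that variant (same hypothetical window sizes as above). Note that it returns a single frame per input sequence rather than a sequence of frames, which may or may not be what you want for per-timestep segmentation:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import ConvLSTM2D, Conv2D

WINDOW_H = WINDOW_W = 8  # hypothetical values

model = Sequential()
# With return_sequences=False (the default), only the last timestep is
# returned, so the output is 4D: (batch, rows, cols, filters).
model.add(ConvLSTM2D(3, kernel_size=3, padding="same",
                     batch_input_shape=(1, None, 2*WINDOW_H+1, 2*WINDOW_W+1, 1),
                     stateful=True))
model.add(Conv2D(1, kernel_size=3, padding="same"))  # accepts the 4D tensor
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()  # Conv2D output shape: (1, 17, 17, 1)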
Related
I have a CNN-LSTM that looks as follows:
SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu',
                                     input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
I'm using this CNN-LSTM for a multivariate time series forecasting problem. The CNN-LSTM input data comes in the 4D format [samples, subsequences, timesteps, features]. For some reason, I need TimeDistributed layers, or I get errors like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]. I think this has to do with the fact that Conv1D expects 3D input, so to preserve the extra subsequence dimension we need a wrapper layer like TimeDistributed. I don't really mind using TimeDistributed layers; they're wrappers, and if they make my model work I am happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
The resulting visualization only shows the Sequential():
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either; it throws ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build. This is strange, because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
I would like a working model together with a working tf.keras.utils.plot_model function. Any explanation as to why I need TimeDistributed, and why it makes the plot_model function behave weirdly, would be greatly appreciated.
An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper, and not the Conv1D layer:
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
                              input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
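With the input_shape moved onto the wrapper, the model is fully built at construction time, so summary and plotting both work. A quick check, assuming a hypothetical 4D training array shaped [samples, subsequences, timesteps, features]:
import numpy as np
import tensorflow as tf

# hypothetical data: 100 samples, 4 subsequences, 8 timesteps, 35 features
X_train = np.zeros((100, n_subsequences, n_steps, 35))

model = DNN_Model(X_train)
model.summary()  # no "has not yet been built" error
tf.keras.utils.plot_model(model, to_file='CNN_LSTM_Visualization.png',
                          show_shapes=True)  # now shows every layer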
Add an input layer at the beginning. Try this:
def DNN_Model(X_train):
    model = Sequential()
    # note: here X_train is passed in as the feature count (an int), not the array
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel,
                                     activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
                                     kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ....
Now, you can plot and get a summary properly.
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK
I created a CNN-LSTM for survival prediction of web sessions. My training data looks as follows:
print(x_train.shape)
(288, 3, 393)
with (samples, timesteps, features) and my model:
model = Sequential()
model.add(TimeDistributed(Conv1D(128, 5, activation='relu'),
input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(TimeDistributed(MaxPooling1D()))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(64, stateful=True, return_sequences=True))
model.add(LSTM(16, stateful=True))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=Adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])
However, with the TimeDistributed wrapper the model seems to require an extra dimension beyond my 3D data. How should I transform the data to get it to work?
Thanks a lot!
Your data are already in 3D format, which is all you need to feed a Conv1D or an LSTM. If your target is 2D, remember to set return_sequences=False in your last LSTM cell.
Using a Flatten before an LSTM is a mistake, because you destroy the 3D structure the LSTM needs.
Also pay attention to the pooling operation, so that you don't end up with a negative time dimension to reduce (I use 'same' padding in the convolution below to avoid this).
Below is an example on a binary classification task:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

n_sample, time_step, n_features = 288, 3, 393
X = np.random.uniform(0, 1, (n_sample, time_step, n_features))
y = np.random.randint(0, 2, n_sample)

model = Sequential()
model.add(Conv1D(128, 5, padding='same', activation='relu',
                 input_shape=(time_step, n_features)))
model.add(MaxPooling1D())
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(16, return_sequences=False))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

model.fit(X, y, epochs=3)
I'm not super amazing with Keras yet, so please be gentle.
My input data is a matrix of size 60000 x 784.
I'm trying to add convolutional layers after my fully connected layers, something like this:
model = Sequential()
model.add(Dense(784, input_dim=train_amplitudes.shape[1], activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Conv2D(100, kernel_size=5, activation='relu', input_shape=(28, 28)))
model.add(Conv2D(20, kernel_size=3, activation='relu'))
model.add(Dense(train_targets.shape[1], activation='linear'))
Notice that 28 * 28 = 784.
I get the error "Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=2" at the first convolution layer.
Why and how can I fix this?
What is the purpose of this specific network structure? Assuming your original data is 28x28, you should keep the input as 28x28 and apply Conv2D directly. After that you can flatten the output of the convolutional blocks and continue with the fully connected layers.
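A minimal sketch of that suggestion (filter sizes taken from the question; the Reshape layer and the output size are assumptions):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Reshape, Conv2D, Flatten, Dense

n_outputs = 10  # hypothetical; in the question this would be train_targets.shape[1]

model = Sequential()
# restore the 28x28 image structure before convolving
model.add(Reshape((28, 28, 1), input_shape=(784,)))
model.add(Conv2D(100, kernel_size=5, activation='relu'))
model.add(Conv2D(20, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(n_outputs, activation='linear'))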
In Keras, a Conv2D layer expects a 4D input tensor with shape (batch, channels, rows, cols) if data_format is "channels_first", or (batch, rows, cols, channels) if data_format is "channels_last". You're passing only rows and columns (or so you think), but the layer also requires the batch and channel dimensions. More information can be found in the Keras Conv2D documentation.
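For example, with the default channels_last format, the (60000, 784) matrix from the question would be reshaped like this before being fed to a Conv2D:
import numpy as np

train_amplitudes = np.zeros((60000, 784))    # stand-in for the real data
x = train_amplitudes.reshape(-1, 28, 28, 1)  # (batch, rows, cols, channels)
print(x.shape)                               # (60000, 28, 28, 1) -> ndim=4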
I think I've managed to fix it. This is the code that "works":
model = Sequential()
model.add(Dense(784, input_dim=train_amplitudes.shape[1], activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Reshape((28, 28, 1)))
model.add(Conv2D(100, kernel_size=5, activation='relu'))
model.add(Conv2D(20, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(train_targets.shape[1], activation='linear'))
It works in the sense that no error is produced. Whether it makes sense or produces good output is another matter, but it's good enough for me.
I have a CNN and would like to change it to an LSTM, but when I modified my code I received the following error: ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=4
I already tried changing ndim, but it didn't work.
Here is my CNN:
def build_model(X, Y, nb_classes):
    nb_filters = 32       # number of convolutional filters to use
    pool_size = (2, 2)    # size of pooling area for max pooling
    kernel_size = (3, 3)  # convolution kernel size
    nb_layers = 4
    input_shape = (1, X.shape[2], X.shape[3])

    model = Sequential()
    model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                            border_mode='valid', input_shape=input_shape))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))

    for layer in range(nb_layers - 1):
        model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
        model.add(BatchNormalization(axis=1))
        model.add(ELU(alpha=1.0))
        model.add(MaxPooling2D(pool_size=pool_size))
        model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation("softmax"))
    return model
and here is how I tried to define my LSTM:
data_dim = 41
timesteps = 20
num_classes = 10
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.5))
model.add(LSTM(128, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.25))
model.add(LSTM(64))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
What am I doing wrong?
Thanks!
The LSTM code is fine; it executes with no errors for me.
The error you are seeing relates to an internal incompatibility between the tensors within the model itself, not to the training data (in that case you would get an "Exception: Invalid input shape" instead).
What's confusing about your error is that it refers to a GRU layer, which doesn't appear anywhere in your model definition. If your model only contains LSTM layers, you should get an error that calls out the LSTM layer it conflicts with.
Perhaps check
model.get_config()
and make sure all the layers and configs are what you intended.
In particular, the first layer should say this:
'batch_input_shape': (None, 20, 41)
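A quick way to pull that out programmatically; treat this as a sketch, since the exact layout of get_config() differs between Keras versions:
cfg = model.get_config()
# Recent Keras returns a dict with a 'layers' list; very old versions
# returned the list of layer configs directly.
layers = cfg['layers'] if isinstance(cfg, dict) else cfg
print(layers[0]['config'].get('batch_input_shape'))  # expect (None, 20, 41)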
I have a Keras model defined as such:
model = Sequential()
model.add(embedding_layer)
model.add(Conv1D(filters=256, kernel_size=3, activation='relu', padding='same'))
model.add(MaxPooling1D(pool_size=3))
model.add(Flatten())
model.add(Dense(num_classes, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
After the Flatten() layer, I want to concatenate 2 additional features, i.e. if Flatten() gives me a vector of size (1, n) (model.output_shape == (None, n)), I want to concatenate a separate numpy array of size (1, 2) so model.output_shape == (None, n+2). How would I go about doing this?
I think keras.layers.merge.Concatenate is what I'm looking for here, but I don't know how to implement it. There aren't many examples online, and Keras 2.0 also uses an updated syntax. Any help would be appreciated.
I played around a bit and figured it out. For anyone who's interested: this is a good use case for Keras' functional API, which always returns tensors, on which you can do tensor operations.
from keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense
from keras.layers.merge import Concatenate
from keras.models import Model

# sequence_input and embedding_layer are assumed to be defined as before,
# e.g. sequence_input = Input(shape=(max_sequence_length,), dtype='int32')
embedded_sequence = embedding_layer(sequence_input)
x = Conv1D(filters=256, kernel_size=3, activation='relu', padding='same')(embedded_sequence)
x = MaxPooling1D(pool_size=3)(x)
x = Flatten()(x)

# additional features input
af_input = Input(shape=(data['af_train'].shape[1],), name='af_input')
x = Concatenate()([x, af_input])

# output
main_output = Dense(num_classes, activation='sigmoid', name='main_output')(x)

model = Model(inputs=[sequence_input, af_input], outputs=main_output)
model.compile(loss='binary_crossentropy', optimizer='adam')
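One thing to keep in mind with a two-input model like this: fit and predict now take a list of arrays, in the same order as the model's inputs. Something like the following, where the sequence and label array names are assumed:
model.fit([train_sequences, data['af_train']], train_labels,
          batch_size=32, epochs=10)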
I haven't tested this code, but I did something similar and it worked (may not be the most efficient way though):
model = Sequential()
model.add(embedding_layer)
model.add(Conv1D(filters=256, kernel_size=3, activation='relu', padding='same'))
model.add(MaxPooling1D(pool_size=3))
model.add(Flatten())

auxiliary_input = Input(shape=(2,), name='aux_input')

# Merge is the legacy Keras 1 API; it was removed in Keras 2, where
# Concatenate (as in the answer above) is the replacement.
final_model = Sequential()
final_model.add(Merge([model, auxiliary_input], mode='concat'))
final_model.add(Dense(num_classes, activation='sigmoid'))
final_model.compile(loss='binary_crossentropy', optimizer='adam')
There is also a part of the Keras documentation that gives an example of multiple inputs (and multiple outputs), though it uses the older API style.