I have a CNN and would like to change it to an LSTM, but when I modified my code I kept receiving the same error: ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=4
I already changed ndim but it didn't work.
Here is my CNN:
def build_model(X, Y, nb_classes):
    nb_filters = 32       # number of convolutional filters to use
    pool_size = (2, 2)    # size of pooling area for max pooling
    kernel_size = (3, 3)  # convolution kernel size
    nb_layers = 4
    input_shape = (1, X.shape[2], X.shape[3])

    model = Sequential()
    model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                            border_mode='valid', input_shape=input_shape))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))

    for layer in range(nb_layers - 1):
        model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
        model.add(BatchNormalization(axis=1))
        model.add(ELU(alpha=1.0))
        model.add(MaxPooling2D(pool_size=pool_size))
        model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation("softmax"))
    return model
And here is how I tried to define my LSTM:
data_dim = 41
timesteps = 20
num_classes = 10
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.5))
model.add(LSTM(128, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(Dropout(0.25))
model.add(LSTM(64))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
What am I doing wrong?
Thanks
The LSTM code is fine; it executes with no errors for me.
The error you are seeing relates to an internal incompatibility between the tensors within the model itself, not to the training data; a training-data mismatch would instead give you an "Exception: Invalid input shape".
What's confusing about your error is that it refers to a GRU layer, which isn't contained anywhere in your model definition. If your model only contained LSTM layers, you should get an error that calls out the LSTM layer it conflicts with.
Perhaps check
model.get_config()
and make sure all the layers and configs are what you intended.
In particular, the first layer should say this:
'batch_input_shape': (None, 20, 41)
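For example, a quick sanity check along those lines (a minimal sketch; the exact structure of the config dict varies between Keras versions):

from pprint import pprint

pprint(model.get_config())          # full layer stack with all configs
print(model.layers[0].input_shape)  # should print (None, 20, 41)

If the session still holds a stale model object from an earlier experiment (e.g. one containing a GRU layer), this makes it visible immediately.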
I have a CNN-LSTM that looks as follows:
SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu',
               input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
I'm using this CNN-LSTM for a multivariate time-series forecasting problem. The CNN-LSTM input data comes in the 4D format [samples, subsequences, timesteps, features]. For some reason, I need TimeDistributed layers, or I get errors like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]. I think this has to do with the fact that Conv1D expects 3D input, so to feed it my 4D time-series data I need a wrapper layer like TimeDistributed, which applies the convolution to each subsequence. I don't really mind using TimeDistributed layers - they're wrappers, and if they make my model work I am happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
The resulting visualization only shows the Sequential():
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either - it throws ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build. Which is strange, because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
I would like a working model together with a working tf.keras.utils.plot_model call. Any explanation as to why I need TimeDistributed, and why it makes plot_model behave weirdly, would be greatly appreciated.
An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper rather than to the Conv1D layer:
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
        input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
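With the input_shape moved onto the wrapper, the model is built as soon as the layers are added, so both summary() and plot_model() work. A quick check (a sketch: the dummy array and its 35-feature width are taken from the error message above, and plot_model additionally needs pydot and graphviz installed):

import numpy as np

# dummy input in the 4D layout [samples, subsequences, timesteps, features]
X_train_dummy = np.zeros((BATCH_SIZE, n_subsequences, n_steps, 35))
model = DNN_Model(X_train_dummy)
model.summary()  # no longer raises "This model has not yet been built"
tf.keras.utils.plot_model(model, to_file='CNN_LSTM_Visualization.png', show_shapes=True)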
Add your input layer at the beginning. Try this:
def DNN_Model(X_train):
    model = Sequential()
    # note: here the X_train argument is the number of features,
    # e.g. DNN_Model(3) below
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel,
               activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
                                     kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ....
Now you can plot and get a summary properly:
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK
I am using progressive learning to build a classification model. I first used VGG16 weights and added a Dense layer at the end to train the model.
prior = keras.applications.VGG16(
    include_top=False,
    weights='imagenet',
    input_shape=(48, 48, 3)
)
model = Sequential()
model.add(prior)
model.add(Flatten())
model.add(Dense(256, activation='relu', name='Dense_Intermediate'))
model.add(Dropout(0.1, name='Dropout_Regularization'))
model.add(Dense(2, activation='sigmoid', name='Output'))

for cnn_block_layer in model.layers[0].layers:
    cnn_block_layer.trainable = False
model.layers[0].trainable = False
I froze the VGG layers so their weights do not change. After training, I saved the model:
model.save('model65x65.h5')
I increased the dimensions of the images using ImageDataGenerator, loaded the previously saved model, and defined a new model:
model = Sequential()
model.add(Conv2D(64, kernel_size=(3, 3), input_shape=(96, 96, 3), activation='relu', padding='same'))
model.add(Conv2D(64, kernel_size=(3, 3), input_shape=(96, 96, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
prior = load_model('./model65x65.h5')
Now I want to add all but the first two layers of VGG16 to the new model, stripping out the input layer and the first convolutional layer, using:
for layer in prior.layers[0].layers[2:]:
    model.add(layer)

# re-add the feedforward layers on top
for layer in prior.layers[1:]:
    model.add(layer)

# the pretrained CNN layers are already marked non-trainable
# mark off the top layers as well
for layer in prior.layers[-4:]:
    layer.trainable = False
As there are convolutional layers in the VGG16, I keep getting this error when I try to iterate over the prior model:
AttributeError: 'Conv2D' object has no attribute 'layers'
How can I iterate over the Conv2D layers?
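One way around this (a defensive sketch, assuming the goal is simply to walk every leaf layer regardless of how deeply the loaded model is nested): recurse only into objects that actually expose a layers attribute, since plain layers such as Conv2D do not have one.

def iter_leaf_layers(m):
    # yield plain layers depth-first, recursing into nested models only
    for layer in m.layers:
        if hasattr(layer, 'layers'):  # a nested Model/Sequential
            for sub in iter_leaf_layers(layer):
                yield sub
        else:                         # a plain layer such as Conv2D
            yield layer

# e.g. re-add everything except the first two leaf layers:
for layer in list(iter_leaf_layers(prior))[2:]:
    model.add(layer)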
I'm not super amazing with keras yet, so please be gentle.
My input data is a matrix of size 60000 x 784.
I'm trying to add convolutional layers after my fully connected layers, something like this:
model = Sequential()
model.add(Dense(784, input_dim=train_amplitudes.shape[1], activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Conv2D(100, kernel_size=5, activation='relu', input_shape=(28, 28)))
model.add(Conv2D(20, kernel_size=3, activation='relu'))
model.add(Dense(train_targets.shape[1], activation='linear'))
Notice that 28 * 28 = 784.
I get the error "Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=2" at the first convolution layer.
Why and how can I fix this?
What is the purpose of this specific network structure? Assuming that your original data was 28x28, you should leave the input as 28x28 and apply Conv2D first. After that you can flatten the output of the convolutional blocks and continue with the fully connected layers.
In Keras, the input to Conv2D is a 4D tensor with shape (batch, channels, rows, cols) if data_format is "channels_first", or a 4D tensor with shape (batch, rows, cols, channels) if data_format is "channels_last". You are passing only rows and columns, but it also requires the batch and channel dimensions. More information can be found here.
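Following that advice, a minimal sketch of the usual ordering (assuming the flat 784-element rows are first reshaped back to 28x28 with a single channel, e.g. train_amplitudes.reshape(-1, 28, 28, 1)):

model = Sequential()
# 4D input (batch, rows, cols, channels); the batch axis is implicit
model.add(Conv2D(100, kernel_size=5, activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(20, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(784, activation='relu'))
model.add(Dense(train_targets.shape[1], activation='linear'))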
I think I've managed to fix it. This is the code that "works":
model = Sequential()
model.add(Dense(784, input_dim=train_amplitudes.shape[1], activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Dense(784, activation='relu'))
model.add(Reshape((28, 28, 1)))
model.add(Conv2D(100, kernel_size=5, activation='relu'))
model.add(Conv2D(20, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(train_targets.shape[1], activation='linear'))
It works in the sense that no error is produced. Whether it makes sense or produces good output is another matter, but it's good enough for me.
I'm getting a list of images to train my CNN.
model = Sequential()
model.add(Dense(32, activation='tanh', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
data, labels = ReadImages(TRAIN_DIR)
# Train the model, iterating on the data in batches of 32 samples
model.fit(np.array(data), np.array(labels), epochs=10, batch_size=32)
But I faced this error:
ValueError: Error when checking input: expected dense_1_input to have 2 dimensions, but got array with shape (391, 605, 700, 3)
You are feeding whole images to the Dense layer. Either flatten the images using .flatten() or use a model with CNN layers. The shape (391, 605, 700, 3) means you have 391 images of size 605x700 with 3 channels (RGB).
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(605, 700, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
This link has good explanations for basic CNN.
This is not a CNN. A convolutional neural network is defined by having Conv layers, and those layers work with 4D input shapes: (batch_size, image_dim_x, image_dim_y, color_channels). The Dense (aka fully connected) layers you are using take 2D input: (batch_size, data_as_a_vector).
You need to flatten the images first if you want to pass them directly to Dense layers, since a Dense layer takes input in 2 dimensions only, while you are passing whole images, which have 4 dimensions: number of images x height x width x number of channels, i.e. (391, 605, 700, 3).
You are not actually doing any convolutions on the images. To do convolutions, you need to add CNN layers after initialising the model as Sequential.
To add a Dense layer:
model = Sequential()
model.add(Flatten(input_shape=(605, 700, 3)))
model.add(Dense(32, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
To add a CNN layer and then flatten it:
model = Sequential()
model.add(Conv2D(input_shape=(605, 700, 3), filters=64, kernel_size=(3, 3),
                 padding="same", activation="relu"))
model.add(Flatten())
model.add(Dense(32, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
I'm trying to create a Keras LSTM to predict time series. My x_train is shaped like (3000, 15, 10) (examples, timesteps, features), y_train like (3000, 15, 1), and I'm trying to build a many-to-many model (10 input features per timestep produce 1 output per timestep).
The code I'm using is this:
model = Sequential()
model.add(LSTM(
    10,
    input_shape=(15, 10),
    return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(
    100,
    return_sequences=True))
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))
model.compile(loss="mse", optimizer="rmsprop")
model.fit(
    X_train, y_train,
    batch_size=512, nb_epoch=1, validation_split=0.05)
However, I can't fit the model when using:
model.add(Dense(1, activation='linear'))
>> Error when checking model target: expected dense_1 to have 2 dimensions, but got array with shape (3000, 15, 1)
or when formatting it this way:
model.add(Dense(1))
model.add(Activation("linear"))
>> Error when checking model target: expected activation_1 to have 2 dimensions, but got array with shape (3000, 15, 1)
I already tried flattening the model (model.add(Flatten())) before adding the Dense layer, but that just gives me ValueError: Input 0 is incompatible with layer flatten_1: expected ndim >= 3, found ndim=2. This confuses me, because I think my data actually is 3-dimensional, isn't it?
The code originated from https://github.com/Vict0rSch/deep_learning/tree/master/keras/recurrent
In case of Keras < 2.0: you need to use the TimeDistributed wrapper in order to apply a Dense layer element-wise to a sequence.
In case of Keras >= 2.0: the Dense layer is applied element-wise by default.
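For reference, under Keras < 2.0 the last layer of a model like the one above would be wrapped explicitly (a minimal sketch):

# Keras < 2.0 only: apply the Dense layer to every timestep
model.add(TimeDistributed(Dense(1, activation='linear')))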
Since you updated your Keras version and your error messages changed, here is what works on my machine (Keras 2.0.x).
This works:
model = Sequential()
model.add(LSTM(10,input_shape=(15, 10), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM( 100, return_sequences=True))
model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))
This also works:
model = Sequential()
model.add(LSTM(10,input_shape=(15, 10), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM( 100, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(1,return_sequences=True, activation='linear'))
Testing with:
x = np.ones((3000,15,10))
y = np.ones((3000,15,1))
Compiling and training with:
model.compile(optimizer='adam',loss='mse')
model.fit(x,y,epochs=4,verbose=2)
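In both cases you can confirm that the model emits one value per timestep by checking the prediction shape:

print(model.predict(x).shape)  # one scalar per timestep: (3000, 15, 1)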