I am trying to create a neural network model with one hidden layer and then trying to evaluate it, but I am getting an error that I am not able to understand clearly:
ValueError: Input 0 of layer sequential_1 is incompatible with the layer: : expected min_ndim=2, found ndim=1. Full shape received: [30]
It looks like I have an error with the dimensions of my input layer, but I can't quite spot what. I've googled and looked on stackoverflow, but haven't found anything that worked so far. Any help please?
Here's a minimal working example:
import tensorflow as tf
# Define Sequential model with 3 layers
input_dim = 30
num_neurons = 10
output_dim = 12
model = tf.keras.Sequential(
[
tf.keras.layers.Dense(input_dim, activation="relu", name="layer1"),
tf.keras.layers.Dense(num_neurons, activation="relu", name="layer2"),
tf.keras.layers.Dense(output_dim, name="layer3"),
]
)
model(tf.ones(input_dim))
Layers have an input and output dimension. For layers in the "middle" of the NN, they figure out their input domain from the output domain of the previous layer. The only exception is the first layer that has nothing to go by, and requires input_dim to be set. Here is how to fix your code. Note how we pass the dimensions. First (hidden) layer is input_dim x num_neurons, second (output layer) num_neurons x output_dim
You can stick more layers in between the two; they only require the first argument, their output dimension
also note I had to fix your last line as well, tf.ones needs to be a 2D shape num_observation x input_dim
import tensorflow as tf
# Define Sequential model with 1 hidden layer
input_dim = 30
num_neurons = 10
output_dim = 12
model = tf.keras.Sequential(
[
tf.keras.layers.Dense(num_neurons, input_dim = input_dim, activation="relu", name="layer1"),
tf.keras.layers.Dense(output_dim, name="layer3"),
]
)
model(tf.ones((1,input_dim)))
produces (for me; I think the numbers are essentially random initialization)
<tf.Tensor: shape=(1, 12), dtype=float32, numpy=
array([[ 0.06973769, -0.1798143 , -0.2920275 , 0.84811246, 0.44899416,
-0.10300556, 0.00831143, -0.16158538, 0.13395026, 0.4352504 ,
0.19114715, 0.44100884]], dtype=float32)>
Related
I want to remove only the last dense layer from an already saved model in .h5 file and add a new dense layer.
Information about the saved model:
I used transfer learning on the EfficientNet B0 model and added a dropout with 2 dense layers. The last dense layer had 3 nodes equal to my number of classes, as shown below:
inputs = tf.keras.layers.Input(shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3))
x = img_augmentation(inputs)
model = tf.keras.applications.EfficientNetB0(include_top=False, input_tensor=x, weights="imagenet")
# Freeze the pretrained weights
model.trainable = False
# Rebuild top
x = tf.keras.layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(5, activation=tf.nn.relu)(x)
outputs = tf.keras.layers.Dense(len(class_names), activation="softmax", name="pred")(x)
After training, I saved my model as my_h5_model.h5
Main Task: I want to use the saved model architecture with its weights and replace only the last dense layer with 4 nodes dense layer.
I tried many things as suggested by the StackOverflow community as:
Iterate over all the layers except the last layer and add them to a separate already defined sequential model
new_model = Sequential()
for layer in (model.layers[:-1]):
new_model.add(layer)
But it gives an error which state:
ValueError: Exception encountered when calling layer "block1a_se_excite" (type Multiply).
A merge layer should be called on a list of inputs. Received: inputs=Tensor("Placeholder:0", shape=(None, 1, 1, 32), dtype=float32) (not a list of tensors)
Call arguments received:
• inputs=tf.Tensor(shape=(None, 1, 1, 32), dtype=float32)
I also tried the functional approach as:
input_layer = model.input
for layer in (model.layers[:-1]):
x = layer(input_layer)
which throws an as mention below:
ValueError: Exception encountered when calling layer "stem_bn" (type BatchNormalization).
Dimensions must be equal, but are 3 and 32 for '{{node stem_bn/FusedBatchNormV3}} = FusedBatchNormV3[T=DT_FLOAT, U=DT_FLOAT, data_format="NHWC", epsilon=0.001, exponential_avg_factor=1, is_training=false](Placeholder, stem_bn/ReadVariableOp, stem_bn/ReadVariableOp_1, stem_bn/FusedBatchNormV3/ReadVariableOp, stem_bn/FusedBatchNormV3/ReadVariableOp_1)' with input shapes: [?,224,224,3], [32], [32], [32], [32].
Call arguments received:
• inputs=tf.Tensor(shape=(None, 224, 224, 3), dtype=float32)
• training=False
Lastly, I did something that came to my mind
inputs = tf.keras.layers.Input(shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3))
x = img_augmentation(inputs)
x = model.layers[:-1](x)
x = keras.layers.Dense(5, name="compress_1")(x)
which simply gave an error as:
'list' object is not callable
I did some more experiments and was able to remove the last layer and add the new dense layer
# imported a pretained saved model
from tensorflow import keras
import tensorflow as tf
model = keras.models.load_model('/content/my_h5_model.h5')
# selected all layers except last one
x= model.layers[-2].output
outputs = tf.keras.layers.Dense(4, activation="softmax", name="predictions")(x)
model = tf.keras.Model(inputs = model.input, outputs = outputs)
model.summary()
In the saved model, I had 3 nodes on dense layers, but in the current model, I added 4 layers. The last layer summary is shown below:
dropout_3 (Dropout) (None, 1280) 0 ['batch_normalization_4[0][0]']
dense_3 (Dense) (None, 5) 6405 ['dropout_3[0][0]']
predictions (Dense) (None, 4) 24 ['dense_3[0][0]']
==================================================================================================
Have you tried switching between import keras and tensorflow.keras in your import? This has worked in other issues.
I am trying to tie together a CNN layer with 2 LSTM layers and ctc_batch_cost for loss, but I'm encountering some problems. My model is supposed to work with grayscale images.
During my debugging I've figured out that if I use just a CNN layer that keeps the output size equal to the input size + LSTM and CTC, the model is able to train:
# === Without MaxPool2D ===
inp = Input(name='inp', shape=(128, 32, 1))
cnn = Conv2D(name='conv', filters=1, kernel_size=3, strides=1, padding='same')(inp)
# Go from Bx128x32x1 to Bx128x32 (B x TimeSteps x Features)
rnn_inp = Reshape((128, 32))(maxp)
blstm = Bidirectional(LSTM(256, return_sequences=True), name='blstm1')(rnn_inp)
blstm = Bidirectional(LSTM(256, return_sequences=True), name='blstm2')(blstm)
# Softmax.
dense = TimeDistributed(Dense(80, name='dense'), name='timedDense')(blstm)
rnn_outp = Activation('softmax', name='softmax')(dense)
# Model compiles, calling fit works!
But when I add a MaxPool2D layer that halves the dimensions, I get an error sequence_length(0) <= 64, similar to the one presented here.
# === With MaxPool2D ===
inp = Input(name='inp', shape=(128, 32, 1))
cnn = Conv2D(name='conv', filters=1, kernel_size=3, strides=1, padding='same')(inp)
maxp = MaxPool2D(name='maxp', pool_size=2, strides=2, padding='valid')(cnn) # -> 64x16x1
# Go from Bx64x16x1 to Bx64x16 (B x TimeSteps x Features)
rnn_inp = Reshape((64, 16))(maxp)
blstm = Bidirectional(LSTM(256, return_sequences=True), name='blstm1')(rnn_inp)
blstm = Bidirectional(LSTM(256, return_sequences=True), name='blstm2')(blstm)
# Softmax.
dense = TimeDistributed(Dense(80, name='dense'), name='timedDense')(blstm)
rnn_outp = Activation('softmax', name='softmax')(dense)
# Model compiles, but calling fit crashes with:
# InvalidArgumentError: sequence_length(0) <= 64
# [[{{node ctc_loss_1/CTCLoss}}]]
After struggling for about 3 days with this problem, I posted the above question here, on StackOverflow. About 2 hours after posting the questions I finally figured it out.
TL;DR Solution:
If you're using ctc_batch_cost:
Make sure you're passing the lengths (numbers of timesteps) of the sequences entering your RNNs as their inputs for the input_length argument.
If you're using ctc_loss:
Make sure you're passing the lengths (numbers of timesteps) of the sequences entering your RNNs as their inputs for the logit_length argument.
Solution:
The solution lies in the documentation, which, relatively sparse, can be cryptic for a machine learning newbie like myself.
The TensorFlow documentation for ctc_batch_cost reads:
tf.keras.backend.ctc_batch_cost(
y_true, y_pred, input_length, label_length
)
...
input_length tensor (samples, 1) containing the sequence length for
each batch item in y_pred.
...
input_length corresponds to logit_length from ctc_loss function's TensorFlow documentation:
tf.nn.ctc_loss(
labels, logits, label_length, logit_length, logits_time_major=True, unique=None,
blank_index=None, name=None
)
...
logit_length tensor of shape [batch_size] Length of input sequence in
logits.
...
That's where it clicked, at the word logit. So, the argument for input_length or logit_length is supposed to be a tensor/container (in my case, numpy array) of the lengths (i.e. number of timesteps) of the sequences entering the RNN (in my case LSTM) as input.
I was originally making the mistake of considering the required length to be the width of the grayscale images that act as input for the whole network (CNN + MaxPool2D + RNN), but because the MaxPool2D layer creates a tensor of different dimensions for the RNN's input, the ctc loss function crashes.
Now fit runs without crashing.
I am building a cnn_rnn network for image classification. I am getting an error while running the following python code in my jupyter notebook.
# model
model1 = Sequential()
# first convolutional layer
model1.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape(160, 120, 3)))
# second convolutional layer
model1.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
#Adding a pooling Layer
model1.add(MaxPooling2D(pool_size=(3, 3)))
#Adding dropouts
model1.add(Dropout(0.25))
# flatten and put a fully connected layer
model1.add(Flatten())
model1.add(Dense(32, activation='relu')) # fully connected
#Adding RNN N/W
model1.add(LSTM(32, return_sequences=True))
model1.add(TimeDistributed(Dense(5, activation='softmax')))
I also tried adding input_shape=(160, 120, 3) as a parameter to the LSTM function but to no avail. Please Help!
P.S: I also tried using GRU instead of LSTM but got the same error.
Update: Please note the model.summary() results
enter image description here
Your error is due to your use of Flatten and Dense BEFORE the LSTM layer.
LSTM layers require the input to be in the shape of (Batchsize x Length x Feature depth0) (or some variant), whereas your flatten changes the Conv2D output from (B x H x W x F) to (B x W * F * H) if that makes sense. If you want to use this architecture, I'd recommend using a Reshape layer to flatten the dimension you want, and using a Conv1D of kernel size 1 (same as a fully-connected layer) before the LSTM layer.
Or, if you want to use this exact code, add this before your LSTM layer and it should work:
model1.add(Reshape(target_shape=(1,54*40,32))
It's 54 and 40 due a pool_size of (3,3).
I have a training input in 3 dimensions (8,50,3).
I am trying to pass it as an input to the Sequential Model in Keras. Looking up the documentation I found that this should work:
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(50,3)))
model.add(Dense(100,init="uniform", activation='sigmoid'))
model.add(Dense(50,init="uniform", activation='relu'))
model.add(Dense(output_dim=1))
model.compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=['accuracy'])
When I try to train this model:
model.fit(train,labelTrain,epochs=1,batch_size=1,verbose=1)
I get the following error:
Error when checking model target: expected dense_148 to have 3 dimensions, but got array with shape (8, 1)
What can it mean?
Also, my first objective was to pass a 3D array where the middle dimension did not have a fixed size but I gave up after finding it impossible. Could it work?
Target means it's the expected result. The problem is in labelTrain, not in the input.
A Dense layer must have a number of neurons. You don't pass it an output shape, you pass the amount of neurons, and the output is automatically (None, neurons)
Your last layer should be:
model.add(Dense(1, activation='I recomend an activation here'))
I want to build a neural network where the two first layers are feedforward and the last one is recurrent.
here is my code :
model = Sequential()
model.add(Dense(150, input_dim=23,init='normal',activation='relu'))
model.add(Dense(80,activation='relu',init='normal'))
model.add(SimpleRNN(2,init='normal'))
adam =OP.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss="mean_squared_error", optimizer="rmsprop")
and I get this error :
Exception: Input 0 is incompatible with layer simplernn_11: expected ndim=3, found ndim=2.
model.compile(loss='mse', optimizer=adam)
It is correct that in Keras, RNN layer expects input as (nb_samples, time_steps, input_dim). However, if you want to add RNN layer after a Dense layer, you still can do that after reshaping the input for the RNN layer. Reshape can be used both as a first layer and also as an intermediate layer in a sequential model. Examples are given below:
Reshape as first layer in a Sequential model
model = Sequential()
model.add(Reshape((3, 4), input_shape=(12,)))
# now: model.output_shape == (None, 3, 4)
# note: `None` is the batch dimension
Reshape as an intermediate layer in a Sequential model
model.add(Reshape((6, 2)))
# now: model.output_shape == (None, 6, 2)
For example, if you change your code in the following way, then there will be no error. I have checked it and the model compiled without any error reported. You can change the dimension as per your need.
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN, Reshape
from keras.optimizers import Adam
model = Sequential()
model.add(Dense(150, input_dim=23,init='normal',activation='relu'))
model.add(Dense(80,activation='relu',init='normal'))
model.add(Reshape((1, 80)))
model.add(SimpleRNN(2,init='normal'))
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model.compile(loss="mean_squared_error", optimizer="rmsprop")
In Keras, you cannot put a Reccurrent layer after a Dense layer because the Dense layer gives output as (nb_samples, output_dim). However, a Recurrent layer expects input as (nb_samples, time_steps, input_dim). So, a Dense layer gives a 2-D output, but a Recurrent layer expects a 3-D input. However, you can do the reverse, i.e., put a Dense layer after a Recurrent layer.