Incompatibility with the layer in the image captioning model - python

I am working on an image captioning model using the Flickr8k dataset. When I try to fit the model, I get this error:
***ValueError: Input 0 of layer dense is incompatible with the layer: expected axis -1 of input shape to have value 2048 but received input with shape (None, 1)***
This is the code I am trying to replicate: https://www.kaggle.com/shadabhussain/automated-image-captioning-flickr8/comments
But the moment I run model.fit, I get a ValueError.
conca = Concatenate()([image_model.output, language_model.output])
x = LSTM(128, return_sequences=True)(conca)
x = LSTM(512, return_sequences=False)(x)
x = Dense(vocab_size)(x)
out = Activation('softmax')(x)
model = Model(inputs=[image_model.input, language_model.input], outputs = out)
# model.load_weights("../input/model_weights.h5")
model.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['accuracy'])
model.summary()
hist = model.fit([images, captions], next_words, batch_size=512, epochs=200)
I have attached an image of the model architecture.
Thanks.
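A quick shape check (a sketch using the variable names from the code above, not from the original post) would localize the mismatch; the traceback says the first Dense layer of image_model expects 2048-dimensional feature vectors:
# The traceback points at a Dense layer expecting 2048 features,
# so `images` should have shape (num_samples, 2048).
print(images.shape)      # expected: (num_samples, 2048)
print(captions.shape)    # must match language_model.input
print(next_words.shape)  # (num_samples, vocab_size) for categorical_crossentropy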

Related

Dimension mismatch in Keras sequence to sequence model with Attention

I am trying to build a Neural Machine Translation model with attention. I am following the tutorial on the Keras blog that shows how to build an NMT model using the sequence-to-sequence approach (without attention). I extended the model to incorporate attention in the following way:
latent_dim = 300
embedding_dim = 100
batch_size = 128
# Encoder
encoder_inputs = keras.Input(shape=(None, num_encoder_tokens))
#encoder lstm 1
encoder_lstm = tf.keras.layers.LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.4)
encoder_output, state_h, state_c = encoder_lstm(encoder_inputs)
print(encoder_output.shape)
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = keras.Input(shape=(None, num_decoder_tokens))
decoder_lstm = tf.keras.layers.LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.2)
decoder_output, decoder_fwd_state, decoder_back_state = decoder_lstm(decoder_inputs, initial_state=[state_h, state_c])
# Attention layer
attn_out = tf.keras.layers.Attention()([encoder_output, decoder_output])
# Concat attention input and decoder LSTM output
decoder_concat_input = tf.keras.layers.Concatenate(axis=-1, name='concat_layer')([decoder_output, attn_out])
#dense layer
decoder_dense = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(num_decoder_tokens, activation='softmax'))
decoder_outputs = decoder_dense(decoder_concat_input)
# Define the model
attn_model = tf.keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)
attn_model.summary()
To train the model:
attn_model.compile(
    optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"]
)
history = attn_model.fit(
    [encoder_input_data, decoder_input_data],
    decoder_target_data,
    batch_size=batch_size,
    epochs=5,
    validation_split=0.2,
)
Here are the shapes:
encoder_input_data.shape is (10000, 16, 71)
decoder_input_data.shape is (10000, 59, 92)
decoder_target_data.shape is (10000, 59, 92)
When I train this model, I get the following error:
InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 59 and 16. Shapes are [?,59] and [?,16]. for 'model/concat_layer/concat' (op: 'ConcatV2') with input shapes: [?,59,300], [?,16,300], [] and with computed input tensors: input[2] = <2>.
I understand that it is complaining about the dimensions of encoder_input_data and decoder_input_data, but this same setup works when we run the regular sequence-to-sequence model (without attention) as discussed in the Keras blog. In this case, it is throwing the error because of the Concatenate layer.
Can anyone please suggest how to fix this?
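One likely fix, sketched below: tf.keras.layers.Attention takes its inputs as [query, value] and returns a tensor with the query's time dimension, so to get one context vector per decoder step the decoder output has to be the query:
# Attention([query, value]) -> (batch, Tq, dim); use the decoder output as the query
attn_out = tf.keras.layers.Attention()([decoder_output, encoder_output])
# attn_out is now (batch, 59, 300), the same time dimension as decoder_output,
# so the Concatenate along axis=-1 succeeds.
With [encoder_output, decoder_output] as written above, the attention output inherits the encoder's time dimension (16), which is exactly the shape the Concatenate layer rejects.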

Error while trying to define input_size in the first layer of an 1D CNN

I am trying to train a 1D CNN to recognise bearing faults using data from the WCRU. I am having difficulty defining the input_shape of the first layer of my model. My 'train_X' is a vector with dimensions (60800, 1). This is the code I use:
import numpy
import keras
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense
from keras.utils import np_utils

X_train = numpy.loadtxt('training_dataX.txt', dtype=float)
Y_train = numpy.loadtxt('training_dataY.txt', dtype=int)
X_test = numpy.loadtxt('testing_dataX.txt', dtype=float)
Y_test = numpy.loadtxt('testing_dataY.txt', dtype=int)
Y_train = np_utils.to_categorical(Y_train)  # one-hot encode outputs
Y_test = np_utils.to_categorical(Y_test)
num_classes = Y_test.shape[1]
e = 0.01  # error threshold monitored by the callback to avoid overfitting

class myCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('val_loss') > e:
            print("\nReached %2.2f%% error, so stopping training!!" % (e * 100))
            self.model.stop_training = True

def baseline_model():  # build the sequential model
    model = Sequential()
    model.add(Conv1D(60, 9, activation='tanh', padding='same', input_shape=(1, 1)))
    model.add(MaxPooling1D(4))
    model.add(Conv1D(40, 9, activation='tanh', padding='same'))
    model.add(MaxPooling1D(4))
    model.add(Conv1D(40, 9, activation='tanh', padding='same'))
    model.add(Flatten())
    model.add(Dense(20, activation='tanh'))
    model.add(Dense(num_classes, activation='tanh'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = baseline_model()  # initialize fitting process
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=100, batch_size=10,
          callbacks=[myCallback()])
scores = model.evaluate(X_test, Y_test, verbose=0)  # final model evaluation
print('CNN Error: %.2f%%' % (100 - scores[1] * 100))
Unfortunately I am getting this error message and I can't figure out the reason:
ValueError: The shape of the input to "Flatten" is not fully defined (got (0, 40)).
Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.
I've tried changing the input_shape to (1,), but then I get this error:
ValueError: Input 0 is incompatible with layer conv1d_24: expected ndim=3, found ndim=2
Any suggestions would be appreciated. Thank you in advance.
First of all, since you are doing classification, the final activation function should be softmax, not tanh. Second, if each example is a vector of shape (60800, 1), you need to pass that as the input shape: with input_shape=(1, 1), each MaxPooling1D(4) reduces a length-1 sequence to length 0, which is exactly why Flatten receives (0, 40). Check the code below:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense
input_shape = (60800, 1)
num_classes = 10
model = Sequential()
model.add(Conv1D(60,9,activation='tanh',padding='same',input_shape=input_shape))
model.add(MaxPooling1D(4))
model.add(Conv1D(40,9,activation='tanh',padding='same'))
model.add(MaxPooling1D(4))
model.add(Conv1D(40,9,activation='tanh',padding='same'))
model.add(Flatten())
model.add(Dense(20,activation='tanh'))
model.add(Dense(num_classes,activation='softmax'))
model.summary()
EDITS
I set num_classes to 10 because I do not know the number of classes; you can change it accordingly.
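A related point (a sketch, assuming each training example really is one 60800-sample signal): the arrays passed to fit must be reshaped to match the declared input shape.
# Add a channel axis so each example has shape (60800, 1)
X_train = X_train.reshape(-1, 60800, 1)
X_test = X_test.reshape(-1, 60800, 1)
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=100, batch_size=10)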

Keras multi-output model wrongly calculate target dimensions: ValueError: Error when checking target

I'm trying to build a multi-output Keras model starting from a working single-output model. Keras, however, is complaining about tensor dimensions.
The single output Model:
This GRU model is training and predicting fine:
timesteps = 250
features = 2
input_tensor = Input(shape=(timesteps, features), name="input")
conv = Conv1D(filters=128, kernel_size=6, use_bias=True)(input_tensor)
b = BatchNormalization()(conv)
s_gru, states = GRU(256, return_sequences=True, return_state=True, name="gru_1")(b)
biases = keras.initializers.Constant(value=88.15)
out = Dense(1, activation='linear', name="output")(s_gru)
model = Model(inputs=input_tensor, outputs=out)
My numpy arrays are:
train_x # shape:(7110, 250, 2)
train_y # shape: (7110, 250, 1)
If I fit the model with the following code, everything is fine:
model.fit(train_x, train_y, batch_size=128, epochs=10, verbose=1)
The Problem:
I want to use a slightly modified version of the network that also outputs the GRU states:
input_tensor = Input(shape=(timesteps, features), name="input")
conv = Conv1D(filters=128, kernel_size=6, use_bias=True)(input_tensor)
b = BatchNormalization()(conv)
s_gru, states = GRU(256, return_sequences=True, return_state=True, name="gru_1")(b)
biases = keras.initializers.Constant(value=88.15)
out = Dense(1, activation='linear', name="output")(s_gru)
model = Model(inputs=input_tensor, outputs=[out, states]) # multi output
#fit the model but with a list of numpy array as y
model.compile(optimizer=optimizer, loss='mae', loss_weights=[0.5, 0.5])
history = model.fit(train_x, [train_y,train_y], batch_size=128, epochs=10, callbacks=[])
This training fails and keras is complaining about the target dimensions:
ValueError: Error when checking target: expected gru_1 to have 2 dimensions, but got array with shape (7110, 250, 1)
I'm using Keras 2.3.0 and Tensorflow 2.0.
What am I missing here?
The target for each output must match that output's shape. In this case, states has shape (7110, 256), which cannot be compared to train_y, whose shape is (7110, 250, 1) as noted in the first code block. Make sure each target you pass has the same shape as the corresponding output.
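A minimal sketch of one way to satisfy this (assuming the state output is only wanted for inspection, not training; names as in the question):
import numpy as np
# Dummy target matching the states output shape (7110, 256); weight its loss at 0
states_y = np.zeros((train_x.shape[0], 256))
model.compile(optimizer=optimizer, loss='mae', loss_weights=[1.0, 0.0])
history = model.fit(train_x, [train_y, states_y], batch_size=128, epochs=10)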

InvalidArgumentError: logits and labels must have the same first dimension

I am trying to classify images. Those images have different shapes, but this is not a problem.
However, I am trying to create a dataset using the tf.data.Dataset.from_generator function provided by Tensorflow and I have the feeling that something is not working as it should.
Here is the code:
filenames_ds = tf.data.Dataset.from_tensor_slices(categ_img[:1000]['image_name'])
labels_ds = tf.data.Dataset.from_tensor_slices(categ_img[:1000]['category_label'])
images_ds = filenames_ds.map(lambda x: tf.image.decode_jpeg(tf.read_file(x)))
labels_ds = labels_ds.map(lambda x: tf.one_hot(x, NUM_CATEGORIES))
ds = tf.data.Dataset.zip((images_ds, labels_ds)).batch(1)
I also tried to create the labels_ds like this:
labels_ds.map(lambda x: tf.expand_dims(tf.one_hot(x, NUM_CATEGORIES), axis=0))
categ_img is a pandas.DataFrame containing image paths and labels under the image_name and category_label columns respectively.
And I keep getting this error:
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [1,50] and labels shape [50]
My model is based on a pretrained ResNet model provided by Keras:
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(None, None, 3))
for layer in base_model.layers:
    layer.trainable = False

x = base_model.output
x = GlobalAveragePooling2D()(x)
for fc in FC_LAYERS:
    x = Dense(fc, activation='relu')(x)
    x = Dropout(DROPOUT)(x)
output = Dense(NUM_CATEGORIES, activation='softmax', name='fully-connected')(x)

model = Model(inputs=base_model.input, outputs=output)
optimizer = tf.keras.optimizers.SGD(lr=LEARNING_RATE)
cce = tf.keras.losses.CategoricalCrossentropy()
model.compile(optimizer, loss=cce)
return model
It is trained like this:
model_classification.fit(
    ds,
    epochs=epochs,
    steps_per_epoch=steps
)
Which seems pretty straight-forward to me.
Any help would be appreciated.
Thank you.
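A quick way to see what the dataset actually yields (a sketch, assuming eager execution in TF 2.x) is to pull a single batch and compare its shapes against what the loss expects:
# The label batch should have shape (1, NUM_CATEGORIES) to match logits of shape (1, 50)
for image, label in ds.take(1):
    print(image.shape, label.shape)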
I finally tried something that worked.
Here is the line you need to change:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
I don't know why, but this made things work.
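For reference, the choice of loss has to match the label encoding; a minimal sketch of the two valid pairings (not from the original post):
# One-hot labels of shape (batch, NUM_CATEGORIES):
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Integer class indices of shape (batch,):
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
The original error ("logits shape [1,50] and labels shape [50]") appears to come from the sparse cross-entropy op, which expects one integer label per sample rather than one-hot vectors.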

RNN with GRU in Keras

I want to implement a recurrent neural network with GRU using Keras in Python. I have a problem running the code; I have changed the variables again and again, but it doesn't work. Do you have an idea how to solve it?
inputs = 42 #number of columns input
num_hidden =50 #number of neurons in the layer
outputs = 1 #number of columns output
num_epochs = 50
batch_size = 1000
learning_rate = 0.05
#train (125973, 42) 125973 Rows and 42 Features
#Labels (125973,1) is True Results
model = tf.contrib.keras.models.Sequential()
fv=tf.contrib.keras.layers.GRU
model.add(fv(units=42, activation='tanh', input_shape= (1000,42),return_sequences=True)) #i want to send Batches to train
#model.add(tf.keras.layers.Dropout(0.15)) # Dropout overfitting
#model.add(fv((1,42),activation='tanh', return_sequences=True))
#model.add(Dropout(0.2)) # Dropout overfitting
model.add(fv(42, activation='tanh'))
model.add(tf.keras.layers.Dropout(0.15)) # Dropout overfitting
model.add(tf.keras.layers.Dense(1000,activation='softsign'))
#model.add(tf.keras.layers.Activation("softsign"))
start = time.time()
# sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
# model.compile(loss="mse", optimizer=sgd)
model.compile(loss="mse", optimizer="Adam")
inp = np.array(train)
oup = np.array(labels)
X_tr = inp[:batch_size].reshape(-1, batch_size, inputs)
model.fit(X_tr, labels, epochs=20, batch_size=batch_size)
However, I get the following error:
ValueError: Error when checking target: expected dense to have shape (1000,) but got array with shape (1,)
Here, you have specified the input sequence length to be 1000:
model.add(fv(units=42, activation='tanh', input_shape= (1000,42),return_sequences=True)) #i want to send Batches to train
However, the shape of your training data (X_tr) may not match this.
Check your X_tr variable and make sure its dimensions line up with the input layer.
If you read the error carefully, you will see there is a shape mismatch between the shape of the labels you provide, which is (None, 1), and the output shape of the model, which is (None, 1000):
ValueError: Error when checking target:  <--- the targets (labels) are being checked
expected dense to have shape (1000,)     <--- the output shape of the model
but got array with shape (1,)            <--- the shape of the labels you pass when training
Therefore you need to make them consistent. You just need to change the number of units in the last layer to 1 since there is one output per input sample:
model.add(tf.keras.layers.Dense(1, activation='softsign')) # 1 unit in the output
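Putting the pieces together, a minimal consistent setup might look like this (a sketch, assuming each of the 125973 rows is an independent sample with a single timestep and one target value, so input_shape becomes (1, 42); train and labels are the arrays from the question):
import numpy as np
import tensorflow as tf

inputs = 42
model = tf.keras.models.Sequential([
    tf.keras.layers.GRU(42, activation='tanh', input_shape=(1, inputs), return_sequences=True),
    tf.keras.layers.GRU(42, activation='tanh'),
    tf.keras.layers.Dropout(0.15),
    tf.keras.layers.Dense(1, activation='softsign'),  # one unit: one output per sample
])
model.compile(loss='mse', optimizer='adam')

X_tr = np.array(train).reshape(-1, 1, inputs)  # (125973, 1, 42)
y_tr = np.array(labels)                        # (125973, 1)
model.fit(X_tr, y_tr, epochs=20, batch_size=1000)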
