I'm performing multi-class classification with three class labels in Keras. During training, both the training and validation losses were decreasing and accuracies were increasing. After training, I tested out the model on the training set as a sanity check and there seems to be a huge discrepancy between model.evaluate and model.predict. I did find some solutions that seemed to indicate this was an issue with BatchNorm and Dropout layers, but that shouldn't result in such a huge difference. The relevant code is as shown below.
model=Sequential()
model.add(Conv2D(32, (3, 3), padding="same",input_shape=input_shape))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.25))
# ... (remaining layers omitted) ...
model.add(Dense(n_classes))
model.add(Activation("softmax"))
optimizer=Adam()
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['categorical_accuracy'])
datagen = ImageDataGenerator(horizontal_flip=True, fill_mode='nearest')
train_datagen = datagen.flow(X_train, y_train, batch_size=batch_size)
val_datagen = ImageDataGenerator().flow(X_val, y_val, batch_size=batch_size)
history=model.fit(train_datagen, steps_per_epoch=math.ceil(nb_train_samples/batch_size), verbose=2, epochs=50, validation_data=val_datagen, validation_steps=math.ceil(nb_validation_samples/batch_size), class_weight=d_class_weights)
print('model.evaluate accuracy: ', model.evaluate(X_train, y_train, batch_size=batch_size)[1])
test_pred = model.predict(ImageDataGenerator().flow(X_train, y=None, batch_size=batch_size), steps=math.ceil(nb_train_samples/batch_size))
test_result = np.zeros(test_pred.shape)
test_result[np.arange(len(test_pred)), test_pred.argmax(1)] = 1
total=0
count=0
for i in range(test_result.shape[0]):
    total += 1
    count += (test_result[i] == y_train[i]).all()
print('model.predict accuracy: ', count/total)
The output I get is as follows:
66/66 [==============================] - 12s 177ms/step - loss: 0.0010 - categorical_accuracy: 1.0000
model.evaluate accuracy: 1.0
model.predict accuracy: 0.42138063279002874
I've been trying to solve this for a while now and have failed to find anything. I'm already using categorical_crossentropy, categorical_accuracy, and softmax activation in the last layer, so I have no idea what's wrong. Any help would be greatly appreciated!
I finally found the solution: I'm only passing X_train into the predict generator, and the shuffle parameter of flow() defaults to True, so the predictions didn't correspond to the ground-truth order. Setting shuffle=False solved the problem.
test_pred = model.predict(ImageDataGenerator().flow(X_train, y=None, batch_size=batch_size, shuffle=False), steps=math.ceil(nb_train_samples/batch_size))
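Alternatively, a minimal sketch of the same sanity check without the generator, assuming X_train and y_train are the NumPy arrays used above (y_train one-hot encoded); predicting on the array directly keeps the sample order fixed, so the predictions line up with the labels:
probs = model.predict(X_train, batch_size=batch_size)   # shape (n_samples, n_classes)
pred_labels = probs.argmax(axis=1)                       # predicted class index per sample
true_labels = y_train.argmax(axis=1)                     # ground-truth class index per sample
print('model.predict accuracy: ', (pred_labels == true_labels).mean())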
Related
I am training a neural network with a constant learning rate for 45 epochs. I observed that the accuracy is highest at epoch 35 and then wiggles around and decreases. I think I need to reduce the learning rate at epoch 35. Is there any way I can continue training the model from epoch 35 after all the epochs have completed? My code is shown below:
model_nn = keras.Sequential()
model_nn.add(Dense(352, input_dim=28, activation='relu', kernel_regularizer=l2(0.001)))
model_nn.add(Dense(384, activation='relu', kernel_regularizer=l2(0.001)))
model_nn.add(Dense(288, activation='relu', kernel_regularizer=l2(0.001)))
model_nn.add(Dense(448, activation='relu', kernel_regularizer=l2(0.001)))
model_nn.add(Dense(320, activation='relu', kernel_regularizer=l2(0.001)))
model_nn.add(Dense(1, activation='sigmoid'))
auc_score = tf.keras.metrics.AUC()
model_nn.compile(loss='binary_crossentropy',
optimizer=keras.optimizers.Adam(learning_rate=0.0001),
metrics=['accuracy',auc_score])
history = model_nn.fit(X_train1, y_train1,
validation_data=(X_test, y_test),
epochs=45,
batch_size=250,
verbose=1)
_, accuracy = model_nn.evaluate(X_test, y_test)
# Saving model weights
model_nn.save('mymodel.h5')
You can do two useful things:
Use the ModelCheckpoint callback with the save_best_only=True parameter. It saves the model only when it is considered the "best", so the latest best model according to the monitored quantity is never overwritten by a worse one.
Use the ReduceLROnPlateau and EarlyStopping callbacks. ReduceLROnPlateau will reduce the learning rate when the monitored metric has stopped improving on the validation subset. EarlyStopping will stop training when a monitored metric has stopped improving at all.
In simple words, ReduceLROnPlateau helps the optimizer settle into a better minimum, EarlyStopping takes care of the number of epochs, and ModelCheckpoint saves the best model.
The code might look like this:
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

early_stopping = EarlyStopping(patience=5, min_delta=0.0001)
reduce_lr_loss = ReduceLROnPlateau(patience=2, verbose=1, min_delta=0.0001, factor=0.65)
# ModelCheckpoint requires a filepath; 'best_model.h5' is just a placeholder name
model_checkpoint = ModelCheckpoint('best_model.h5', save_best_only=True)
history = model_nn.fit(X_train1, y_train1,
validation_data=(X_test,y_test),
epochs=100,
batch_size=250,
verbose=1,
callbacks=[early_stopping, reduce_lr_loss, model_checkpoint])
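If you specifically want to pick up training from around the best epoch (epoch 35 in your case), here is a hedged sketch: load the checkpoint saved above (the 'best_model.h5' filename is just the placeholder used in the ModelCheckpoint call), recompile with a smaller learning rate, and continue fitting from that epoch.
from tensorflow.keras.models import load_model

model_nn = load_model('best_model.h5')   # weights from the best monitored epoch
model_nn.compile(loss='binary_crossentropy',
                 optimizer=keras.optimizers.Adam(learning_rate=0.00001),   # reduced learning rate
                 metrics=['accuracy', auc_score])
model_nn.fit(X_train1, y_train1,
             validation_data=(X_test, y_test),
             initial_epoch=35,   # continue the epoch count from 35
             epochs=45,
             batch_size=250,
             verbose=1)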
I'm working on a neural network and I've been training it recently; it has approximately 93% accuracy on the training data and 0% accuracy on the validation data. My first thought was overfitting, but the model doesn't save in between training runs, and I get these results in the first epoch. I'm using Keras in Python with the following model code:
model = Sequential(
[
Conv1D(320, 8, input_shape=(560, 560), activation="relu"),
# Conv1D(320, 8, activation="relu"),
# Conv1D(320, 8, activation="relu"),
# Dense(750, activation="relu"),
# Dropout(0.6),
Dense(1500, activation="relu"),
Dropout(0.6),
Dense(750, activation="relu"),
Dropout(0.6),
GlobalMaxPooling1D(keepdims=True),
Dense(1, activation='softmax')
]
)
model.compile(optimizer=Adam(learning_rate=0.00001), loss="binary_crossentropy", metrics=['accuracy'])
earlystopping = callbacks.EarlyStopping(monitor="val_accuracy",
mode="max", patience=2,
restore_best_weights=True)
model1 = model.fit(x=training_x, y=training_y, batch_size=150, epochs=5, shuffle=True, verbose=1, callbacks=[earlystopping], validation_data=(val_x, val_y))
The results I'm getting look like this:
Epoch 1/5
167/167 [==============================] - 1266s 8s/step - loss: 6.4154 - accuracy: 0.9262 - val_loss: 0.0054 - val_accuracy: 0.0000e+00
I've tried changing almost all of the hyperparameters and changing the model's architecture but I keep getting similar results. Does this have anything to do with the data? The data I'm using is a 3d NumPy array containing pixel data from a bunch of images. Any help here would be greatly appreciated.
You need to use activation='sigmoid' and optimizers.RMSprop(lr=1e-4) for binary classification. With a single output unit, softmax normalizes over that one value alone, so the model always outputs 1.0 regardless of the input, and the reported accuracy simply reflects the proportion of positive labels in each split.
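For example, a minimal sketch of the question's model with just those two changes applied (everything else left as the asker had it; learning_rate=1e-4 is the value suggested above):
from tensorflow.keras import Sequential, optimizers
from tensorflow.keras.layers import Conv1D, Dense, Dropout, GlobalMaxPooling1D

model = Sequential([
    Conv1D(320, 8, input_shape=(560, 560), activation="relu"),
    Dense(1500, activation="relu"),
    Dropout(0.6),
    Dense(750, activation="relu"),
    Dropout(0.6),
    GlobalMaxPooling1D(keepdims=True),
    Dense(1, activation="sigmoid"),   # sigmoid, not softmax, for a single output unit
])
model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
              loss="binary_crossentropy",
              metrics=['accuracy'])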
I have a 9-class dataset with 7000 images. I use MobileNetV2 and ImageDataGenerator for training and get 82% validation accuracy, but when I predict my test images, it always predicts a wrong class. I have no idea what is wrong with it. Here is my code:
My ImageGenerator:
image_gen = ImageDataGenerator(rotation_range = 20,
width_shift_range=0.12,
height_shift_range=0.12,
shear_range=0.1,
zoom_range = 0.06,
horizontal_flip=True,
fill_mode='nearest',
rescale=1./255)
My model:
Model = Sequential()
Model.add(Conv2D(filters=32,kernel_size=(3,3),input_shape=image_shape,activation='relu'))
Model.add(MaxPool2D(pool_size=(2,2)))
Model.add(Conv2D(filters=64,kernel_size=(3,3),input_shape=image_shape,activation='relu'))
Model.add(MaxPool2D(pool_size=(2,2)))
Model.add(Conv2D(filters=64,kernel_size=(3,3),input_shape=image_shape,activation='relu'))
Model.add(MaxPool2D(pool_size=(2,2)))
Model.add(Conv2D(filters=64,kernel_size=(3,3),input_shape=image_shape,activation='relu'))
Model.add(MaxPool2D(pool_size=(2,2)))
Model.add(Flatten())
Model.add(Dense(256,activation='relu'))
Model.add(Dense(9,activation='softmax'))
MobilenetV2:
height=224
width=224
img_shape=(height, width, 3)
dropout=.3
lr=.001
class_count=9 # number of classes
img_shape=(height, width, 3)
base_model=tf.keras.applications.MobileNetV2( include_top=False, input_shape=img_shape, pooling='max', weights='imagenet')
x=base_model.output
x=keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
x = Dense(512, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
bias_regularizer=regularizers.l1(0.006) ,activation='relu', kernel_initializer= tf.keras.initializers.GlorotUniform(seed=123))(x)
x=Dropout(rate=dropout, seed=123)(x)
output=Dense(class_count, activation='softmax',kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123))(x)
Model = keras.models.Model(inputs=base_model.input, outputs=output)
Model.compile( loss='categorical_crossentropy', metrics=['accuracy'],optimizer='Adamax')
My Rlronp:
rlronp=tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=1, verbose=1, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0)
My train_image_gen:
train_image_gen = image_gen.flow_from_directory(train_path,
target_size=image_shape[:2],
color_mode='rgb',
batch_size=batch_size,
class_mode='categorical')
My test_image_gen:
test_image_gen = image_gen.flow_from_directory(test_path,
target_size=image_shape[:2],
color_mode='rgb',
batch_size=batch_size,
class_mode='categorical',shuffle=False)
My earlystop:
early_stop = EarlyStopping(monitor='val_loss',patience=4)
My Model fit:
results = Model.fit(train_image_gen,epochs=20,
validation_data=test_image_gen,callbacks=[rlronp,early_stop],class_weight=class_weight
)
Training and accuracy:
Epoch 20/20 200/200 [==============================] - 529s 3s/step -
loss: 0.3995 - accuracy: 0.9925 - val_loss: 0.8637 - val_accuracy: 0.8258
My problem is that when I predict an image from the test set, it predicts the wrong class 90% of the time.
For example, here it should be the 3rd class, but the maximum is on the 2nd class.
array([[0.08064549, 0.04599327, 0.27055973, 0.05219262, 0.055945 ,
0.25723988, 0.07608379, 0.10404343, 0.05729679]], dtype=float32)
I tried collecting my own dataset with 156 classes and 2.5k images, but it was even worse.
My metrics after 20 epochs:
accuracy: 0.9925; val_accuracy: 0.8258
Clearly the model is overfitted. You can:
Try regularization techniques such as L2, L1, or Dropout; they will help (for example, see the sketch below).
Try to collect more data (or use data augmentation).
Or look into other neural network architectures.
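For instance, a sketch of adding L2 weight decay and an extra Dropout layer to the Dense head of the first (plain CNN) model from the question; the regularization strengths here are illustrative only, not tuned values:
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dense, Dropout

# ... keep the Conv2D / MaxPool2D blocks and Flatten() from the question, then:
Model.add(Dense(256, activation='relu',
                kernel_regularizer=regularizers.l2(0.001)))   # L2 penalty on the dense weights
Model.add(Dropout(0.3))                                       # randomly drop 30% of activations
Model.add(Dense(9, activation='softmax'))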
The best method is to plot val_loss vs. loss:
r = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=15)
import matplotlib.pyplot as plt
plt.plot(r.history['loss'], label='loss')
plt.plot(r.history['val_loss'], label='val_loss')
plt.legend()
Check the point where loss and val_loss meet each other, note the number of epochs at that intersection (say x), and train the model for only x epochs.
Hope you will find this useful.
The model is overfitted; use a Dropout layer, I think it will help:
Model.add(Dropout(0.2))
Building a sequence
simple_seq= [x for x in list(range(1000)) if x % 3 == 0]
after reshape and split
x_train, x_test shape = (159, 5, 1)
y_train, y_test shape = (159, 2)
Model
model = Sequential(name='acc_test')
model.add(Conv1D(
kernel_size = 2,
filters= 128,
strides= 1,
use_bias= True,
activation= 'relu',
padding='same',
input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(AveragePooling1D(pool_size =(2), strides= [1]))
model.add(Flatten())
model.add(Dense(2))
optimizer = Adam(lr=0.001)
model.compile( optimizer= optimizer, loss= 'mse', metrics=['accuracy'])
Train
hist = model.fit(
x=x_train,
y = y_train,
epochs=100,
validation_split=0.2)
The Result:
Epoch 100/100
127/127 [==============================] - 0s 133us/sample - loss: 0.0096 - acc: 1.0000 - val_loss: 0.6305 - val_acc: 1.0000
But if using this model to predict:
x_test[-1:] = array([[[9981],
[9984],
[9987],
[9990],
[9993]]])
model.predict(x_test[-1:])
result is: array([[10141.571, 10277.236]], dtype=float32)
How can the val_acc be 1 if the result is so far from the truth? The result was:
step    1          2
true    9996       9999
pred    10141.571  10277.236
The accuracy metric is only valid for classification tasks, so if you use accuracy as the metric on a regression task, the reported values may not be meaningful at all. From your code, it looks like you have a regression task, so accuracy shouldn't be used here.
Below is a list of the metrics that you can use in Keras on regression problems.
Mean Squared Error: mean_squared_error, MSE or mse
Mean Absolute Error: mean_absolute_error, MAE, mae
Mean Absolute Percentage Error: mean_absolute_percentage_error, MAPE, mape
Cosine Proximity: cosine_proximity, cosine
You can read about some theory at link and see some Keras example code at link.
Sorry, I'm a little short on time, but I am sure these links will really help you. :)
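As a concrete illustration, a sketch of the compile call from the question with regression metrics swapped in for accuracy (the model definition stays as above; 'mae' and 'mape' are standard Keras metric aliases):
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=0.001)
model.compile(optimizer=optimizer,
              loss='mse',
              metrics=['mae', 'mape'])   # mean absolute error / mean absolute percentage error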
From the range of your true/predicted values and the loss you used, it seems like you're trying to solve a regression problem, not classification.
So if I understood you correctly, you're trying to predict two numeric values from the input, rather than predicting which of two classes is valid for that input.
If so, you shouldn't use the accuracy metric. It simply compares the indices of the maximal value in each label/prediction pair (a bit simplified): 9996 < 9999 and 10141.571 < 10277.236, so both argmax to the second position and the sample counts as "correct".
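A tiny sketch of what that comparison amounts to, using the numbers from the question (an illustration of the argmax comparison, not the exact Keras internals):
import numpy as np

y_true = np.array([[9996., 9999.]])
y_pred = np.array([[10141.571, 10277.236]])
# the reported "accuracy" only checks whether the argmax positions agree
print(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))   # 1.0, despite the large numeric error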
I am trying to perform transfer learning with a ResNet50 model pretrained on ImageNet weights, on the PASCAL VOC 2012 dataset. As it is a multi-label dataset, I am using a sigmoid activation function in the final layer and binary_crossentropy loss. The metrics are precision, recall, and accuracy. Below is the code I used to build the model for 20 classes (PASCAL VOC has 20 classes).
img_height,img_width = 128,128
num_classes = 20
#If imagenet weights are being loaded,
#input must have a static square shape (one of (128, 128), (160, 160), (192, 192), or (224, 224))
base_model = applications.resnet50.ResNet50(weights= 'imagenet', include_top=False, input_shape= (img_height,img_width,3))
x = base_model.output
x = GlobalAveragePooling2D()(x)
#x = Dropout(0.7)(x)
predictions = Dense(num_classes, activation= 'sigmoid')(x)
model = Model(inputs = base_model.input, outputs = predictions)
for layer in model.layers[-2:]:
    layer.trainable = True
for layer in model.layers[:-3]:
    layer.trainable = False
adam = Adam(lr=0.0001)
model.compile(optimizer= adam, loss='binary_crossentropy', metrics=['accuracy',precision_m,recall_m])
#print(model.summary())
X_train, X_test, Y_train, Y_test = train_test_split(x_train, y, random_state=42, test_size=0.2)
savingcheckpoint = ModelCheckpoint('ResnetTL.h5',monitor='val_loss',verbose=1,save_best_only=True,mode='min')
earlystopcheckpoint = EarlyStopping(monitor='val_loss',patience=10,verbose=1,mode='min',restore_best_weights=True)
model.fit(X_train, Y_train, epochs=epochs, validation_data=(X_test,Y_test), batch_size=batch_size,callbacks=[savingcheckpoint,earlystopcheckpoint],shuffle=True)
model.save_weights('ResnetTLweights.h5')
It ran for 35 epochs until early stopping, and the metrics are as follows (without the Dropout layer):
loss: 0.1195 - accuracy: 0.9551 - precision_m: 0.8200 - recall_m: 0.5420 - val_loss: 0.3535 - val_accuracy: 0.8358 - val_precision_m: 0.0583 - val_recall_m: 0.0757
Even with the Dropout layer, I don't see much difference.
loss: 0.1584 - accuracy: 0.9428 - precision_m: 0.7212 - recall_m: 0.4333 - val_loss: 0.3508 - val_accuracy: 0.8783 - val_precision_m: 0.0595 - val_recall_m: 0.0403
With dropout, for a few epochs the model reaches a validation precision and accuracy of 0.2, but not above that.
I see that the precision and recall on the validation set are pretty low compared to the training set, both with and without the dropout layer. How should I interpret this? Does it mean the model is overfitting? If so, what should I do? As of now the model predictions are quite random (totally incorrect). The dataset size is 11000 images.
Can you please modify the code as below and try executing it?
From:
predictions = Dense(num_classes, activation= 'sigmoid')(x)
To:
predictions = Dense(num_classes, activation= 'softmax')(x)
From:
model.compile(optimizer= adam, loss='binary_crossentropy', metrics=['accuracy',precision_m,recall_m])
To:
model.compile(optimizer= adam, loss='categorical_crossentropy', metrics=['accuracy',precision_m,recall_m])
This question is pretty old, but I'll answer it in case it is helpful to someone else:
In this example, you froze all layers except the last two (the Global Average Pooling and the final Dense layer). There is a cleaner way to achieve the same state:
rn50 = applications.resnet50.ResNet50(weights='imagenet', include_top=False,
input_shape=(img_height, img_width, 3))
x = rn50.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(num_classes, activation= 'sigmoid')(x)
model = Model(inputs=rn50.input, outputs=predictions)
rn50.trainable = False # <- this
model.compile(...)
In this case, features are extracted by the ResNet50 network and fed to a linear classifier (the sigmoid Dense layer above), but ResNet50's weights are not being trained. This is called feature extraction, not fine-tuning.
The only weights being trained are those of your classifier, which was instantiated with weights drawn from a random distribution and therefore should be trained fully. You should be using Adam with its default learning rate:
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])
So you can train it for a few epochs and, once that's done, unfreeze the backbone and "fine-tune" it:
rn50.trainable = False
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=50)

rn50.trainable = True
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.00001), loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=60, initial_epoch=50)
There is a nice article about this on the Keras website: https://keras.io/guides/transfer_learning/