Very bad predictions while achieving high validation accuracy - python

I have trained a model that achieved a validation accuracy of 96% and very low validation loss, but when I test it on other images the prediction accuracy is bad in comparison with the validation one. I have tried to process the images in the validation and test phases with the same parameters, but the issue still occurs. Any ideas how I can fix it?
Here is the code:
import datetime
import time

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator

directory = '/Users/anastalib/PycharmProjects/pythonProject1/Banana2'
img_width, img_height = 100, 100

# Rescale to [0, 1] and hold out 20% of the images for validation.
img_datagen = ImageDataGenerator(validation_split=0.2, rescale=1. / 255)
# test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = img_datagen.flow_from_directory(directory,
                                                  shuffle=True,
                                                  batch_size=16,
                                                  subset='training',
                                                  target_size=(img_width, img_height))
valid_generator = img_datagen.flow_from_directory(directory,
                                                  shuffle=False,
                                                  batch_size=16,
                                                  subset='validation',
                                                  target_size=(img_width, img_height))

# Frozen ResNet50 backbone plus a small trainable classification head.
resnet_model = Sequential()
pretrained_model = tf.keras.applications.ResNet50(include_top=False,
                                                  input_shape=(100, 100, 3),
                                                  pooling='avg',
                                                  weights='imagenet')
for layer in pretrained_model.layers:
    layer.trainable = False
resnet_model.add(pretrained_model)
resnet_model.add(Flatten())
resnet_model.add(Dense(256, activation='relu'))
resnet_model.add(Dropout(0.5))
resnet_model.add(Dense(3, activation='softmax'))
resnet_model.summary()

resnet_model.compile(optimizer='adam',
                     loss='categorical_crossentropy',
                     metrics=['accuracy'])

t1 = time.time()
print(datetime.datetime.now())
# model.fit accepts generators directly; fit_generator is deprecated.
history = resnet_model.fit(train_generator,
                           validation_data=valid_generator,
                           steps_per_epoch=train_generator.n // train_generator.batch_size,
                           validation_steps=valid_generator.n // valid_generator.batch_size,
                           epochs=5)
print("Training took %s seconds" % (time.time() - t1))

# Predict a single image, applying the same rescaling as the generators.
path = 'overripe.jpg'
img = tf.keras.utils.load_img(path, target_size=(100, 100))
x = tf.keras.utils.img_to_array(img)
x = x / 255.
x = np.expand_dims(x, axis=0)
images = np.vstack([x])
classes = resnet_model.predict(images, batch_size=10)
print(np.argmax(classes))

What you describe seems to be a case of poor generalization: your training/validation data are not representative of the real data.
What you can try is to pool the validation and test data, split them again, retrain the model, and see if the metrics improve.
A possible explanation is that there is a bias in your dataset, so the model is not really learning but is keying on something specific. Here you can find a better description of the phenomenon: https://pair.withgoogle.com/explorables/saliency/?linkId=8403074
Generally speaking, you can solve this with more data or with other solutions specific to the use case (in my experience with medical images, color normalization was particularly important, as an example).
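One way to act on the pool-and-resplit suggestion is to gather every labelled image into a single directory tree and cut a fresh random split. A minimal sketch, assuming a hypothetical all_images/<class_name>/ layout and an 80/20 split (neither is from the question):

import glob
import os
import random
import shutil

random.seed(42)  # reproducible split
for class_dir in os.listdir('all_images'):
    files = glob.glob(os.path.join('all_images', class_dir, '*'))
    random.shuffle(files)
    split = int(0.8 * len(files))
    for subset, subset_files in (('train', files[:split]),
                                 ('valid', files[split:])):
        dest = os.path.join('resplit', subset, class_dir)
        os.makedirs(dest, exist_ok=True)
        for f in subset_files:
            shutil.copy(f, dest)

After this, the two directory trees under resplit/ can be fed to flow_from_directory as before.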

Related

Low accuracy after testing hyperparameters

I am using the VGG19 pre-trained model with ImageNet weights to do transfer learning on 4 classes with Keras. However, I do not know if there really is a difference between these 4 classes; I'd like to find out. The goal is to discover whether these classes make sense or whether there is no real difference between these image classes.
These classes are made up of abstract paintings from the same individual.
I tried different models with different hyperparameters (Adam/SGD, learning rate, dropout, L2 regularization, FC layer size, batch size, unfreezing), and also weighted classes, as the data is a little unbalanced:
import numpy as np
from sklearn.utils import class_weight
from tensorflow.keras import optimizers, regularizers
from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input
from tensorflow.keras.layers import (Activation, BatchNormalization, Dense,
                                     Dropout, GlobalMaxPooling2D)
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

batch_size = 32
unfreeze = 17
dropout = 0.2
fc = 256
lr = 1e-4
l2_reg = 0.1
epochs = 20  # not specified in the original snippet; illustrative value

train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='nearest'
)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory(
    'C:/Users/train',
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
    'C:/Users/test',
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical')

base_model = VGG19(
    weights="imagenet",
    input_shape=(224, 224, 3),
    include_top=False,
)
last_layer = base_model.get_layer('block5_pool')
last_output = last_layer.output
# Note: a Flatten() here would be immediately overwritten by the pooling
# below, so it is omitted.
x = GlobalMaxPooling2D()(last_output)
x = Dense(fc)(x)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout)(x)
x = Dense(fc, activation='relu', kernel_regularizer=regularizers.l2(l2=l2_reg))(x)
x = Dense(4, activation='softmax')(x)
model = Model(base_model.input, x)

# Freeze everything, then unfreeze the layers from index `unfreeze` onwards.
for layer in model.layers:
    layer.trainable = False
for layer in model.layers[unfreeze:]:
    layer.trainable = True

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(learning_rate=lr),
              metrics=['accuracy'])

class_weights = class_weight.compute_class_weight(
    'balanced',
    classes=np.unique(train_generator.classes),
    y=train_generator.classes)
class_weights_dict = dict(enumerate(class_weights))

history = model.fit(train_generator, epochs=epochs,
                    validation_data=validation_generator,
                    validation_steps=392 // batch_size,
                    steps_per_epoch=907 // batch_size,
                    class_weight=class_weights_dict)  # apply the computed weights
plot_model_history(history)
I also did feature extraction at every layer and fed the extracted features to an SVM (one per layer); the accuracy of these SVMs was about 40%, which is higher than this model's (30 to 33%). So I may be wrong, but I think this model could achieve higher accuracy. A sketch of that per-layer extraction is shown below.
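The per-layer extraction looks roughly like this; a sketch under assumptions: 'block4_pool' is just one example layer name, and the generator must be created with shuffle=False so the labels line up with the predictions.

from sklearn.svm import SVC
from tensorflow.keras.models import Model

# Extract features from one intermediate VGG19 layer and fit an SVM on them.
feature_extractor = Model(base_model.input,
                          base_model.get_layer('block4_pool').output)
features = feature_extractor.predict(train_generator)  # generator needs shuffle=False
features = features.reshape(len(features), -1)         # one flat vector per image
labels = train_generator.classes                        # aligned only when shuffle=False
svm = SVC(kernel='rbf')
svm.fit(features, labels)
print("SVM training accuracy:", svm.score(features, labels))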
I have a few questions about my model.
First, is my code correct, or am I doing something wrong?
If the validation accuracy for a 4-class classification task is ~30% (assuming the data are balanced or weighted), is it likely that other hyperparameters could improve it to something significantly better, or very unlikely?
What else can I try to get better accuracy?
When and how can I conclude that these classes do not make sense?

Keras model predicts same class

I am new to the field of deep learning and I tried to train a model for image classification. I used a pre-trained model (ResNet50) and added my own layers.
The dataset I use for training contains about 1000 images per class, and I separated it into a train and a test set.
My problem is that if I evaluate the model with model.evaluate(test_set_generator), I get an accuracy of about 90%,
but if I load an image and predict with model.predict(img), the result is always the same class.
My generators:
img_height = 128
img_width = 128
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    data_dir_path,
    target_size=(img_height, img_width),
    batch_size=16,
    shuffle=True,
    class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
    test_dir_path,
    target_size=(img_height, img_width),
    batch_size=16,
    class_mode='categorical')
My model:
base_model = tf.keras.applications.ResNet50(input_shape=(img_height, img_width, 3),
                                            include_top=False,
                                            weights='imagenet')
prediction_layer = tf.keras.layers.Dense(5)
model = models.Sequential()
model.add(base_model)
model.add(tf.keras.layers.GlobalAveragePooling2D())
model.add(prediction_layer)
base_learning_rate = 0.0005
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=base_learning_rate),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
How I am loading an Image:
test_image = image.load_img(path_to_image, target_size=(128, 128))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
I tried to load and predict every image from my test set, and I always got the same result (this is a small excerpt of the output, but more or less every output looks the same):
[[ -38774.88 -228962.86 20932.826 -169404.3 -265980.06 ]]
[[ -54851.016 -320424.4 31585.99 -236997.28 -374307.2 ]]
[[ -36518.344 -212326.48 18832.361 -156810.19 -244721.2 ]]
[[ -31010.965 -196458.73 19816.562 -146228.39 -230922.06 ]]
[[ -37712.95 -222710.1 19780.334 -164643.36 -256392.48 ]]
I can't understand why the evaluation gets correct results and the prediction doesn't. I predicted the test_set_generator with model.predict(test_set_generator) and got results that looked fine to me; the results were not always the same.
I tried changing the learning rate, adding more layers, adding a dropout layer, different numbers of epochs and steps per epoch, a different pre-trained model, and different batch sizes.
I am thankful for any suggestions.
Your model expects the image values to be in the range (0, 1).
Try with:
test_image = image.load_img(path_to_image, target_size=(128, 128))
test_image = image.img_to_array(test_image) / 255  # <- division by 255
test_image = np.expand_dims(test_image, axis=0)
There are two errors in your code.
First, when you call a Dense layer without an activation parameter, it defaults to a linear activation; in a multi-class problem we want a softmax activation:
prediction_layer = tf.keras.layers.Dense(5, activation="softmax")
Secondly, the loss: you are using binary_crossentropy, a loss for binary classification, but here we again have a multi-class problem, so you need to use the categorical_crossentropy loss:
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=base_learning_rate),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy'])

How to fix overfitting, or where is the fault in my code?

I'm using the pre-trained VGG16 model for a 100-class classification problem. The dataset is tiny-imagenet; each class has 500 images, and I randomly chose 100 classes from tiny-imagenet, with 400 images per class for training and 100 for validation. So I changed the input_shape of VGG16 to the 32 x 32 size.
The results always look like overfitting: training accuracy is high, but val_acc always gets stuck at almost 40%.
I used dropout, L2 regularization, data augmentation ..., but val_acc still gets stuck at almost 40%.
What can I do about the overfitting, or where should I correct my code?
Thanks
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 32, 32
epochs = 50
learning_rate = 1e-4
steps_per_epoch = 2500
train_path = './training_set_100A/'
valid_path = './testing_set_100A/'
test_path = './testing_set_100A/'
class_num = 100

train_batches = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=20, zoom_range=0.15,
    width_shift_range=0.2, height_shift_range=0.2,
    shear_range=0.15,
    horizontal_flip=True, fill_mode="nearest"
).flow_from_directory(
    train_path, target_size=(img_width, img_height),
    batch_size=32, shuffle=True)
valid_batches = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    valid_path, target_size=(img_width, img_height),
    batch_size=10, shuffle=False)
test_batches = ImageDataGenerator(rescale=1. / 255).flow_from_directory(
    test_path, target_size=(img_width, img_height),
    batch_size=10, shuffle=False)

VGG16Model = VGG16(weights='imagenet', include_top=False)
input_tensor = Input(shape=(img_width, img_height, 3), name='image_input')
output_vgg16_conv = VGG16Model(input_tensor)
x = Flatten()(output_vgg16_conv)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(class_num, activation='softmax')(x)
funcmodel = Model([input_tensor], [x])
funcmodel.summary()
funcmodel.compile(optimizer=SGD(learning_rate=learning_rate, momentum=0.9),
                  loss='categorical_crossentropy', metrics=['accuracy'])
train_history = funcmodel.fit(train_batches,
                              steps_per_epoch=steps_per_epoch,
                              validation_data=valid_batches,
                              validation_steps=1000,
                              epochs=epochs, verbose=1)
It seems you followed examples implementing this from other sites, but your training set is very small for training the two new Dense layers of 4096 units each.
You have to either lower the size of your layers or add a lot more samples, e.g. 20,000 instead of 500.
1) 50 epochs is too many; try running fewer epochs.
2) Check your validation accuracy at every epoch.
3) VGG is too deep for your small (32 x 32) image data. Try building your own network with fewer parameters, or try LeNet; a minimal sketch follows below.
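For the third point, a LeNet-style network sized for 32 x 32 inputs might look like this; a minimal sketch where the filter counts and Dense sizes are the classic LeNet choices, not values from the question:

from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Flatten
from tensorflow.keras.models import Sequential

# LeNet-style network for 32x32 RGB inputs; layer sizes are illustrative.
small_model = Sequential([
    Conv2D(6, (5, 5), activation='relu', input_shape=(32, 32, 3)),
    AveragePooling2D(pool_size=(2, 2)),
    Conv2D(16, (5, 5), activation='relu'),
    AveragePooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(120, activation='relu'),
    Dense(84, activation='relu'),
    Dense(100, activation='softmax'),  # 100 classes, as in the question
])
small_model.compile(optimizer='adam', loss='categorical_crossentropy',
                    metrics=['accuracy'])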

My model has high accuracy and val_accuracy but gives wrong results on test data

I have created some images using OpenCV and I am running a deep neural network classifier on them.
It gives around 97% accuracy and 95% val_accuracy, but when I test it, it gives wrong predictions.
Here is my code to create the images:
import cv2
import numpy as np
import random
import os

size = 64

def circle(i, d):
    img = np.zeros(shape=(size, size, 3))
    point = (random.randint(1, size), random.randint(1, size))
    img = cv2.circle(img, point, random.randint(1, size), (255, 255, 0),
                     thickness=2, lineType=8)
    if not os.path.exists(d + "/circle"):
        os.makedirs(d + "/circle")
    cv2.imwrite(d + "/circle/" + str(i) + "circle.png", img)
    # print("created circle" + str(i))

def rectangle(i, d):
    img = np.zeros(shape=(size, size, 3))
    point = (random.randint(1, size), random.randint(1, size))
    w = random.randint(1, size)
    h = random.randint(1, size)
    point2 = (point[0] + w, point[1] + h)
    img = cv2.rectangle(img, point, point2, (255, 255, 0), 2)
    if not os.path.exists(d + "/react"):
        os.makedirs(d + "/react")
    cv2.imwrite(d + "/react/" + str(i) + "react.png", img)
    # print("created rectangle" + str(i))

def traingle(i, d):
    img = np.zeros(shape=(size, size, 3))
    point1 = (random.randint(1, size), random.randint(1, size))
    point2 = (random.randint(1, size), random.randint(1, size))
    point3 = (random.randint(1, size), random.randint(1, size))
    img = cv2.line(img, point1, point2, (255, 255, 0), 2)
    img = cv2.line(img, point2, point3, (255, 255, 0), 2)
    img = cv2.line(img, point3, point1, (255, 255, 0), 2)
    if not os.path.exists(d + "/tra"):
        os.makedirs(d + "/tra")
    cv2.imwrite(d + "/tra/" + str(i) + "tra.png", img)
    # print("created triangle" + str(i))

if not os.path.exists("data_train"):
    os.makedirs('data_train')
for i in range(1, 2000):
    circle(i, "data_train")
    rectangle(i, "data_train")
    traingle(i, "data_train")
print("Created training data")
if not os.path.exists("data_test"):
    os.makedirs('data_test')
for i in range(1, 500):
    circle(i, "data_test")
    rectangle(i, "data_test")
    traingle(i, "data_test")
And here is my code for classification.
# importing libraries
from keras import backend as K
from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 64, 64
train_data_dir = 'data_train'
validation_data_dir = 'data_test'
nb_train_samples = 5997
nb_validation_samples = 1497
epochs = 3
batch_size = 15

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape, activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(180, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(3, activation="softmax"))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['categorical_accuracy'])

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=False)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size, class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size, class_mode='categorical')

model.fit_generator(train_generator,
                    steps_per_epoch=nb_train_samples,
                    epochs=epochs, validation_data=validation_generator,
                    validation_steps=nb_validation_samples)
I have tried to:
1. Change the number of hidden layers.
2. Add a dropout layer before the final layer and after the first layer.
3. Add a conv layer.
Please suggest what I am doing wrong.
Thanks in advance.
The most likely reason for this issue is that your test set and training set are not from the same sample. This is very common in classification problems. Before training, you should compare the class distributions and the feature distributions of the training and test sets. If they are not close to each other, the rules learned from the training set don't generalize to the test set.
For example, suppose the training set class distribution is 70% class 1, 20% class 2 and 10% class 3. Since the cross-validation data come from the training set, the model has high training and cross-validation accuracy. However, the model may not perform well if the test set class distribution is more like 10% class 1, 20% class 2 and 70% class 3.
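As a quick way to run that check with the generators from the question, something like this (a minimal sketch):

from collections import Counter

# Relative class frequencies of the training and validation generators.
train_counts = Counter(train_generator.classes)
valid_counts = Counter(validation_generator.classes)
n_train, n_valid = sum(train_counts.values()), sum(valid_counts.values())
print("train:", {k: round(v / n_train, 3) for k, v in train_counts.items()})
print("valid:", {k: round(v / n_valid, 3) for k, v in valid_counts.items()})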
Another probable reason for this issue is overfitting, since you are getting high training and validation accuracy.
The commonly used methodologies are:
Cross-validation: a standard way to estimate the out-of-sample prediction error is to use 5-fold cross-validation.
Early stopping: its rules provide guidance as to how many iterations can be run before the learner begins to overfit; a Keras sketch follows below.
Pruning: pruning is used extensively while building tree-related models. It simply removes the nodes which add little predictive power for the problem at hand.
Regularization: it introduces a cost term for bringing in more features with the objective function, so it tries to push the coefficients for many variables to zero and hence reduce the cost term.
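For the early-stopping point, Keras ships an EarlyStopping callback; a minimal sketch against the model and generators from the question (patience=3 is an illustrative choice):

from tensorflow.keras.callbacks import EarlyStopping

# Stop once val_loss has not improved for 3 consecutive epochs and
# roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)
model.fit(train_generator,
          epochs=50,
          validation_data=validation_generator,
          callbacks=[early_stop])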

Why is my validation accuracy stuck around 65% and how do I increase it?

I'm making an image-classification CNN with 5 classes, each having 693 images with a width and height of 224 px, using VGG16, but my validation accuracy gets stuck after 15-20 epochs around 60%-65%.
I'm already using some data augmentation, batch normalization, and dropout, and I have frozen the first 5 layers, but I can't seem to increase my accuracy above 65%.
These are my own layers:
img_rows, img_cols, img_channel = 224, 224, 3

base_model = applications.VGG16(weights='imagenet', include_top=False,
                                input_shape=(img_rows, img_cols, img_channel))
for layer in base_model.layers[:5]:
    layer.trainable = False

add_model = Sequential()
add_model.add(Flatten(input_shape=base_model.output_shape[1:]))
add_model.add(Dropout(0.5))
add_model.add(Dense(512, activation='relu'))
add_model.add(BatchNormalization())
add_model.add(Dropout(0.5))
add_model.add(Dense(5, activation='softmax'))

model = Model(inputs=base_model.input, outputs=add_model(base_model.output))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=0.0001),
              metrics=['accuracy'])
model.summary()
And this is my dataset and training setup:
batch_size = 64
epochs = 25

train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=.1,
    height_shift_range=.1,
    horizontal_flip=True)
train_datagen.fit(x_train)

history = model.fit_generator(
    train_datagen.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=x_train.shape[0] // batch_size,
    epochs=epochs,
    validation_data=(x_test, y_test),
    callbacks=[ModelCheckpoint('VGG16-transferlearning.model', monitor='val_acc',
                               save_best_only=True)]
)
I want to get higher accuracy, because what I get now is just not enough, so any help or suggestions would be appreciated.
A few things you can try:
Reduce your batch size.
Choose another optimizer: RMSprop, SGD...
Increase the initial learning rate, and then use the ReduceLROnPlateau callback (a sketch follows below).
But, as usual, it depends on the data you are using. Is it well balanced?
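For the last point, a minimal sketch with ReduceLROnPlateau, reusing the model and data from the question (the starting rate, factor, and patience values are illustrative):

from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Recompile with a higher starting learning rate, then halve it whenever
# val_loss plateaus for 2 epochs; all values here are illustrative.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=1e-3),
              metrics=['accuracy'])
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                              patience=2, min_lr=1e-6)
history = model.fit(train_datagen.flow(x_train, y_train, batch_size=batch_size),
                    epochs=epochs,
                    validation_data=(x_test, y_test),
                    callbacks=[reduce_lr])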
