ResNet50 Model Always Predicts 1 Class - python

I am working on a ResNet50 model to predict covid/non-covid presence in chest x-rays. However, my model currently only predicts class label 1... I have tried 3 different optimizers, 2 different loss functions, changing the learning rate multiple times from 1e-6 to 0.5, and changing the weights on the class labels...
Does anyone have any ideas what the issue could be? Why does it always predict class label 1?
Here is the code:
# import data
# train_ds = tf.keras.utils.image_dataset_from_directory(
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
DATASET_PATH+"Covid/",
labels="inferred",
batch_size=64,
image_size=(256, 256),
shuffle=True,
seed=COVID_SEED,
validation_split=0.2,
subset="training",
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
DATASET_PATH+"Covid/",
labels="inferred",
batch_size=64,
image_size=(256, 256),
shuffle=True,
seed=COVID_SEED,
validation_split=0.2,
subset="validation",
)
# split data
train_X = list()
train_y = list()
test_X = list()
test_y = list()
for image_batch_train, labels_batch_train in train_ds:
for index in range(0, len(image_batch_train)):
train_X.append(image_batch_train[index])
train_y.append(labels_batch_train[index])
for image_batch, labels_batch in val_ds:
for index in range(0, len(image_batch)):
test_X.append(image_batch[index])
test_y.append(labels_batch[index])
Conv_Base = ResNet50(weights=None, input_shape=(256, 256, 3), classes=2)
# The Convolutional Base of the Pre-Trained Model will be added as a Layer in this Model
for layer in Conv_Base.layers[:-8]:
layer.trainable = False
model = Sequential()
model.add(Conv_Base)
model.add(Flatten())
model.add(Dense(units = 1024, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(units = 1, activation = 'sigmoid'))
model.summary()
opt = Adadelta(learning_rate=0.3)
model.compile(optimizer = opt, loss = 'BinaryCrossentropy', metrics = ['accuracy'])
# try to add class weights to make it predict 0, since we currently only predict class label 1
class_weight = {0: 50.,
1: 1.}
r=model.fit(x = train_ds, validation_data = val_ds, epochs = COVID_EPOCHS, class_weight=class_weight)
#print the class labels of prediction
predictions = model.predict(val_ds)
predictions = np.ndarray.flatten(predictions)
predictions = np.where(predictions < 0, 0, 1) # Convert to 0 and 1.
np.set_printoptions(threshold=np.inf)
print(predictions)

Well done! I'll leave an answer here as well because I think you need to do more besides normalization.
When the weights are None (see here) the resnet weights are randomized. You are using a large convolutional feature extractor (the first layers of a Resnet) but this extractor was not trained on anything. You may achieve decent performance because the Dense layer that succeeds it compensates for this random initialization but chances are it's not what you're aiming for. Keep in mind your resnet weights are not trainable, so the feature extraction will never change.
The reason I suggested imagenet weights is because you're working with images and therefore it's reasonable to assume that your convolutional feature extractor needs to extract important image features such as colors, shapes, edges etc. The fact that the imagenet resnet was trained on 1000 classes or so is irrelevant because you chop it off before it reaches the output layer, which is where the class number bottleneck occurs. I would pursue the weights = 'imagenet' thing.

Related

Low accuracy after testing hyperparameters

I am using VGG19 pre-trained model with ImageNet weights to do transfer-learning on 4 classes with keras. However I do not know if there really is a difference between these 4 classes, I'd like to discover it. The goal would be to discover if these classes make sense or if there is no difference between these images classes.
These classes are made up of abstract paintings from the same individual.
I tried different models with different hyperparameters (Adam/SGD, learning rate, dropout, l2 regularization, FC layers size, batch size, unfreeze, and also weighted classes as the data is a little bit unbalanced
batch_size = 32
unfreeze = 17
dropout = 0.2
fc = 256
lr = 1e-4
l2_reg = 0.1
train_datagen = ImageDataGenerator(
preprocessing_function = preprocess_input,
horizontal_flip=True,
vertical_flip=True,
fill_mode='nearest'
)
test_datagen = ImageDataGenerator(preprocessing_function = preprocess_input)
train_generator = train_datagen.flow_from_directory(
'C:/Users/train',
target_size=(224, 224),
batch_size=batch_size,
class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
'C:/Users/test',
target_size=(224, 224),
batch_size=batch_size,
class_mode='categorical')
base_model = VGG19(
weights="imagenet",
input_shape=(224, 224, 3),
include_top=False,
)
last_layer = base_model.get_layer('block5_pool')
last_output = last_layer.output
x = Flatten()(last_output)
x = GlobalMaxPooling2D()(last_output)
x = Dense(fc)(x)
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout)(x)
x = Dense(fc, activation='relu', kernel_regularizer = regularizers.l2(l2=l2_reg))(x)
x = layers.Dense(4, activation='softmax')(x)
model = Model(base_model.input, x)
for layer in model.layers:
layer.trainable = False
for layer in model.layers[unfreeze:]:
layer.trainable = True
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.SGD(learning_rate = lr),
metrics=['accuracy'])
class_weights = class_weight.compute_class_weight('balanced',
np.unique(train_generator.classes),
train_generator.classes)
class_weights_dict = dict(enumerate(class_weights))
history = model.fit(train_generator, epochs=epochs, validation_data=validation_generator,
validation_steps=392//batch_size,
steps_per_epoch=907//batch_size)
plot_model_history(history)
I also did feature extractions at every layer, and fed the extracted features to a SVM (for each layer), and the accuracy of these SVM was about 40%, which is higher than this model (30 to 33%). So, I may be wrong but I think this model could achieve a higher accuracy.
I have a few questions about my model.
First, is my code correct, or am I doing something wrong ?
If the validation set accuracy for a 4-classes classification task is ~30% (assuming the data are balanced or weighted), is it likely or very not likely to be able to improve it to something significantly better with other hyperparameters ?
What else can I try to have a better accuracy ?
When and how can I conclude that these classes do not make sense ?

GradCAM with Transfer-Learning and augmentation layers at imput

I'm trying to implement GradCAM to a transfer-learning model. For that reason I need an additional output from the last Convolutional Layer of the base model.
My model consists of preprocessing/augmentation layers, pretrained MobileNet and a custom head. When MobileNet is implemented one functional layer I always get a disconnected graph error. And because of augmentation layers at the beginning I didn't manage to implement MobileNet as single layers, as other solutions proposed. Thanks a lot for any help!
# transfer-learning model
base_model = MobileNetV2(input_shape=(224, 224, 3), include_top=False, weights='imagenet')
inputs = Input(shape=(224, 224, 3))
augmented = RandomFlip("horizontal")(inputs)
augmented = RandomRotation(0.1)(augmented)
augmented = RandomZoom(height_factor=(0.0, 0.3), width_factor=(0.0, 0.3),
fill_mode='constant')(augmented)
mobilenet = base_model(augmented)
pooling = GlobalAveragePooling2D()(mobilenet)
dropout = Dropout(0.5)(pooling)
outputs = Dense(len(classes), activation="softmax")(dropout)
model = Model(inputs=inputs, outputs=outputs)
model.summary()
And here's my model for GradCAM:
gradModel = Model(inputs=[model.inputs],
outputs=[model.get_layer('mobilenetv2_1.00_224').get_layer('Conv_1').output,
model.output])
I had a similar problem and ended up implementing the augmentation at the dataset level rather than in the model layers.
train_ds = tf.keras.utils.image_dataset_from_directory(
train_dir,
validation_split=0.3,
label_mode='categorical',
subset="training",
seed=s,
color_mode="rgb",
image_size=image_size,
batch_size=batch_size,
)
data_augmentation = tf.keras.Sequential([
tf.keras.layers.RandomFlip("horizontal"),
tf.keras.layers.RandomRotation(0.1),
tf.keras.layers.RandomZoom(height_factor=(0.0, 0.3), width_factor=(0.0, 0.3), fill_mode='constant')
])
train_ds = train_ds.map(
lambda x, y: (data_augmentation(x, training=True), y)
)
I would then feed this data into the model and it had the desired effect.
model.fit(train_ds, EPOCHS)

how to reducing validation loss and improving the test result in CNN Model

How may I improve the valid accuracy? Besides that, my test accuracy is also low. I am trying to do categorical image classification on pictures about weeds detection in the agriculture field.
Dataset: The total number of images is 5539 with 12 classes where 70% (3870 images) of Training set 15% (837 images) of Validation and 15% (832 images) of Testing set
#data augmentation by applying Augmentor
train_aug = Augmentor.Pipeline(source_directory="/content/dataset/train",
output_directory="/content/dataset/train")
# Defining augmentation parameters and generating 17600 samples
train_aug.flip_left_right(probability=0.4)
train_aug.flip_top_bottom(probability=0.8)
train_aug.rotate(probability=0.5, max_left_rotation=5, max_right_rotation=10)
train_aug.skew(0.4, 0.5)
train_aug.zoom(probability = 0.2, min_factor = 1.1, max_factor = 1.5)
train_aug.sample(17600)
def cnn_model():
Model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters = 32, kernel_size = (3,3) , activation ='relu',input_shape=(224,224,3)),
tf.keras.layers.MaxPool2D(2,2),
tf.keras.layers.Conv2D(filters = 96, kernel_size = (3,3) , activation ='relu'),
tf.keras.layers.MaxPool2D(2,2),
tf.keras.layers.Conv2D(filters = 150, kernel_size = (3,3) , activation ='relu'),
tf.keras.layers.MaxPool2D(2,2),
tf.keras.layers.Conv2D(filters = 64, kernel_size = (3,3) , activation ='relu'),
tf.keras.layers.MaxPool2D(2,2),
tf.keras.layers.Flatten() ,
tf.keras.layers.Dense(512,activation='relu') ,
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(416, [enter image description here][1]activation='relu') ,
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(12,activation='softmax') ,
])
Model.summary()
return Model
Model = cnn_model()
from tensorflow.keras.callbacks import ModelCheckpoint
checkpoint = ModelCheckpoint(filepath, monitor='accuracy', verbose=1, save_best_only=False, save_weights_only=True, mode='auto')
callbacks_list = [checkpoint]
model= Model.compile(tf.keras.optimizers.Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
History = Model.fit_generator(generator= train_data, steps_per_epoch= 3333//BATCH_SIZE , epochs= NO_OF_EPOCHS , validation_data= valid_data, validation_steps=1 ,callbacks=callbacks_list)
How may I increase my valid accuracy where my training accuracy is 98% and validation accuracy is 71%?
I would adjust the number of filters to size to 32, then 64, 128, 256. Then I would replace the flatten layer with
tf.keras.layers.GlobalAveragePooling2D()
I would also remove the checkpoint callback and replace with
es=tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=3,
verbose=1, restore_best_weights=True)
rlronp=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.5, patience=1,
verbose=1)
callback_list=[es, rlronp]
the early stopping callback will monitor validation loss and if it fails to reduce after 3 consecutive epochs it will halt training and restore the weights from the best epoch to the model. The ReduceLROnPlateau callback will monitor validation loss and reduce the learning rate by a factor of .5 if the loss does not reduce at the end of an epoch. Run this and if it does not do much better you can try to use a class_weight dictionary to try to compensate for the class imbalance. To calculate the dictionary find the class that has the HIGHEST number of samples. Then the weight for each class is
weight for class=highest number of samples/samples in class. So create a dictionary of the
form class integer:weight. Be careful to keep the order of the classes correct.
Also to help with the imbalance you can try image augmentation. If you use ImageDataGenerator.flow_from_directory to read in your data you can use the generator to provide image augmentation like horizontal flip. If not you can use the Keras augmentation layers directly in your model. Documentation is here..

Fine-tuning ResNet50 with Keras - val_loss keeps increasing

I am trying to customize resnet50 using keras with a tensorflow backend. However, upon tranining my val_loss keeps increasing. Trying different learning rates and batch sizes does not resolve the problem.
Using different preprocessing methods such as rescaling or using the preprocess_input function for resnet50 inside the ImageDataGenerator did not not solve the problem either.
This is the code I am using
Importing and preprocessing data:
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.resnet50 import preprocess_input, decode_predictions
IMAGE_SIZE = 224
BATCH_SIZE = 32
num_classes = 27
main_path = "C:/Users/aaron/Desktop/DATEN/data"
gesamt_path = os.path.join(main_path, "ML_DATA")
labels = listdir(gesamt_path)
data_generator = ImageDataGenerator(#rescale=1./255,
validation_split=0.20,
preprocessing_function=preprocess_input)
train_generator = data_generator.flow_from_directory(gesamt_path, target_size=(IMAGE_SIZE, IMAGE_SIZE), shuffle=True, seed=13,
class_mode='categorical', batch_size=BATCH_SIZE, subset="training")
validation_generator = data_generator.flow_from_directory(gesamt_path, target_size=(IMAGE_SIZE, IMAGE_SIZE), shuffle=False, seed=13,
class_mode='categorical', batch_size=BATCH_SIZE, subset="validation")
Defining and training the model
img_width = 224
img_height = 224
model = keras.applications.resnet50.ResNet50()
classes = list(iter(train_generator.class_indices))
model.layers.pop()
for layer in model.layers:
layer.trainable=False
last = model.layers[-1].output
x = Dense(len(classes), activation="softmax")(last)
finetuned_model = Model(model.input, x)
finetuned_model.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
for c in train_generator.class_indices:
classes[train_generator.class_indices[c]] = c
finetuned_model.classes = classes
earlystopCallback = keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=8, verbose=1, mode='auto')
tbCallBack = keras.callbacks.TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=True)
history = finetuned_model.fit_generator(train_generator,
validation_data=validation_generator,
epochs=85, verbose=1,callbacks=[tbCallBack,earlystopCallback])
You need to match the preprocessing used for the pretrained network, not come up your own preprocessing. Double check the network input tensor, i.e. whether the channel-wise average of your input matches that of the data used for the pretrained network.
It could be that your new data is very different from the data used for the pretrained network. In that case, all BN layers gonna migrate their pretrained mean/var to new values, so an increasing loss is also possible (but eventually the loss should decrease).
In your training you are using a pretrained model (resnet50) changing only the last layer because you want to predict only a few classes and not the 1000 classes the pretrained model was trained on (that's the meaning of transfer learning).
You are freezing all weights and you are not letting your model to train. Try:
model = keras.applications.resnet50.ResNet50(include_top=False, pooling='avg')
for layer in model.layers:
layer.trainable=False
last = model.output
x = Dense(512, activation='relu')(last)
x = Dropout(0.5)(x)
#x = BatchNormalization()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
#x = BatchNormalization()(x)
x = Dense(len(classes), activation="softmax")(x)
You can modify the code above, change 512 number of neurons, add or not dropout/batchnormalization, use as many dense layers as you want....
There is known ""problem"" (strange design) regarding BN in Keras and your bad result may be related to this issue.

How to use InceptionV3 bottlenecks as input in Keras 2.0

I want to use bottlenecks for transfer learning using InceptionV3 in Keras.
I've used some of the tips on creating, loading and using bottlenecks from
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
My problem is that I don't know how to use a bottleneck (numpy array) as input to an InceptionV3 with a new top layer.
I get the following error:
ValueError: Error when checking input: expected input_3 to have shape
(None, None, None, 3) but got array with shape (248, 8, 8, 2048)
248 refers to the total number of images in this case.
I know that this line is wrong, but I dont't know how to correct it:
model = Model(inputs=base_model.input, outputs=predictions)
What is the correct way to input the bottleneck into InceptionV3?
Creating the InceptionV3 bottlenecks:
def create_bottlenecks():
datagen = ImageDataGenerator(rescale=1. / 255)
model = InceptionV3(include_top=False, weights='imagenet')
# Generate bottlenecks for all training images
generator = datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
nb_train_samples = len(generator.filenames)
bottlenecks_train = model.predict_generator(generator, int(math.ceil(nb_train_samples / float(batch_size))), verbose=1)
np.save(open(train_bottlenecks_file, 'w'), bottlenecks_train)
# Generate bottlenecks for all validation images
generator = datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
nb_validation_samples = len(generator.filenames)
bottlenecks_validation = model.predict_generator(generator, int(math.ceil(nb_validation_samples / float(batch_size))), verbose=1)
np.save(open(validation_bottlenecks_file, 'w'), bottlenecks_validation)
Loading the bottlenecks:
def load_bottlenecks(src_dir, bottleneck_file):
datagen = ImageDataGenerator(rescale=1. / 255)
generator = datagen.flow_from_directory(
src_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical',
shuffle=False)
num_classes = len(generator.class_indices)
# load the bottleneck features saved earlier
bottleneck_data = np.load(bottleneck_file)
# get the class lebels for the training data, in the original order
bottleneck_class_labels = generator.classes
# convert the training labels to categorical vectors
bottleneck_class_labels = to_categorical(bottleneck_class_labels, num_classes=num_classes)
return bottleneck_data, bottleneck_class_labels
Starting training:
def start_training():
global nb_train_samples, nb_validation_samples
create_bottlenecks()
train_data, train_labels = load_bottlenecks(train_data_dir, train_bottlenecks_file)
validation_data, validation_labels = load_bottlenecks(validation_data_dir, validation_bottlenecks_file)
nb_train_samples = len(train_data)
nb_validation_samples = len(validation_data)
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 2 classes
predictions = Dense(2, activation='softmax')(x)
# What is the correct input? Obviously not base_model.input.
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
layer.trainable = False
model.compile(optimizer=optimizers.SGD(lr=0.01, momentum=0.9), loss='categorical_crossentropy', metrics=['accuracy'])
# train the model on the new data for a few epochs
history = model.fit(train_data, train_labels,
epochs=epochs,
batch_size=batch_size,
validation_data=(validation_data, validation_labels),
)
Any help would be appreciated!
This error happens when you try to train your model with input data in a different shape from the shape your model supports.
Your model supports (None, None, None, 3), meaning:
Any number of images
Any height
Any width
3 channels
So, you must make sure that train_data (and validation_data) matches this shape.
The system is telling that train_data.shape = (248,8,8,2048)
I see that train_data comes from load_botlenecks. Is it really supposed to be coming from there? What is train data supposed to be? An image? Something else? What is a bottleneck?
Your model starts in the Inception model, and the Inception model takes images.
But if bottlenecks are already results of the Inception model, and you want to feed only bottlenecks, then the Inception model should not participate of anything at all.
Start from:
inputTensor = Input((8,8,2048)) #Use (None,None,2048) if bottlenecks vary in size
x = GlobalAveragePooling2D()(inputTensor)
.....
Create the model with:
model = Model(inputTensor, predictions)
The idea is:
Inception model: Image -> Inception -> Bottlenecks
Your model: Bottlenecks -> Model -> Labels
The combination of the two models is only necessary when you don't have the bottlenecks preloaded, but you have your own images for which you want to predict the bottlenecks first. (Of course you can work with separate models as well)
Then you're going to input only images (the bottlenecks will be created by Inception and passed to your model, everything internally):
Combined model: Image -> Inception ->(bottlenecks)-> Model -> Labels
For that:
inputImage = Input((None,None,3))
bottleNecks = base_model(inputImage)
predictions = model(bottleNecks)
fullModel = Model(inputImage, predictions)

Categories