Simple tf.keras Resnet50 model not converging

I'm using the ResNet50V2 model from keras.applications for image classification, but I have had persistent problems getting the model to converge to any meaningful accuracy. I previously built the same model on the same data in MATLAB and reached around 75% accuracy, but here training hovers around 30% accuracy and the loss does not drop. I suspect there is a really simple mistake somewhere, but I can't find it.
import tensorflow as tf
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./224,
    validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
    main_dir,
    class_mode='categorical',
    batch_size=32,
    target_size=(224, 224),
    shuffle=True,
    subset='training')
validation_generator = train_datagen.flow_from_directory(
    main_dir,
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=True,
    subset='validation')
IMG_SHAPE = (224, 224, 3)
base_model = tf.keras.applications.ResNet50V2(
    input_shape=IMG_SHAPE,
    include_top=False,
    weights='imagenet')
maxpool_layer = tf.keras.layers.GlobalMaxPooling2D()
prediction_layer = tf.keras.layers.Dense(4, activation='softmax')
model = tf.keras.Sequential([
    base_model,
    maxpool_layer,
    prediction_layer
])
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // 32,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // 32,
    epochs=20)

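Your last layer already applies a softmax activation, so the loss should not be created with from_logits=True. That flag tells categorical cross-entropy to treat the model outputs as raw logits rather than probabilities, so you would only need it if the last layer had no softmax. Either drop from_logits=True from CategoricalCrossentropy or remove the softmax activation from the final Dense layer, but not both.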
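To make the mismatch concrete, here is a minimal pure-Python sketch (not Keras code; the logits and helper names are made up for illustration). It assumes that a loss configured for logits applies softmax to whatever it receives, so feeding it probabilities means softmax is applied twice.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_probs(probs, true_index):
    """Cross-entropy when the model already outputs probabilities."""
    return -math.log(probs[true_index])

def cross_entropy_from_logits(logits, true_index):
    """Cross-entropy when the loss itself applies softmax to raw scores."""
    return cross_entropy_from_probs(softmax(logits), true_index)

# Hypothetical raw scores for 4 classes; class 0 is correct and the
# network is very confident about it.
logits = [10.0, 0.0, 0.0, 0.0]
probs = softmax(logits)  # what a softmax output layer would emit

# Matched setup: softmax layer + loss that expects probabilities.
correct_loss = cross_entropy_from_probs(probs, 0)

# Mismatched setup: softmax layer + loss that expects logits (double softmax).
double_softmax_loss = cross_entropy_from_logits(probs, 0)

# Best case for the mismatched setup: a perfect one-hot prediction.
floor_loss = cross_entropy_from_logits([1.0, 0.0, 0.0, 0.0], 0)

print(f"matched loss:        {correct_loss:.4f}")
print(f"double-softmax loss: {double_softmax_loss:.4f}")
print(f"mismatched floor:    {floor_loss:.4f}")
```

With four classes, the mismatched loss in this sketch cannot fall below ln(1 + 3/e), roughly 0.74, even for a perfect prediction, because probabilities are confined to [0, 1] before the second softmax. That squashing is consistent with a loss that barely moves during training.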

Related

Tensorflow: loss and accuracy stay flat training CNN on image classification

I copied and pasted this TensorFlow tutorial into a Jupyter notebook. (As of this writing, the tutorial has switched to the flower dataset instead of the dog one, but the question still applies.)
https://www.tensorflow.org/tutorials/images/classification
The first part (without augmentation) runs fine and I get similar results.
But with data augmentation, my loss and accuracy stay flat across all epochs. I've already checked these posts on SO:
Keras accuracy does not change
How to fix flatlined accuracy and NaN loss in tensorflow image classification
Tensorflow: loss decreasing, but accuracy stable
None of these applied: the dataset is a standard one, so I don't have a corrupted-data problem, and I printed a couple of augmented images and they look fine (see below).
I've tried adding more fully connected layers to increase the model capacity and dropout to limit overfitting, but nothing changes.
Any ideas as to why? Have I missed something in the code?
I know training a DL model involves a lot of trial and error, but I'm sure there must be some logic or intuition beyond randomly turning the knobs until something happens.
Thanks!
Source data:
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
Params:
batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150
Preprocessing stage:
image_gen = ImageDataGenerator(rescale=1./255,
                               rotation_range=20,
                               width_shift_range=0.15,
                               height_shift_range=0.15,
                               horizontal_flip=True,
                               zoom_range=0.2)
train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH))
augmented_images = [train_data_gen[0][0][i] for i in range(5)]
plotImages(augmented_images)
image_gen_val = ImageDataGenerator(rescale=1./255)
val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                 directory=validation_dir,
                                                 target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                 class_mode='binary')
Model:
model_new = Sequential([
    Conv2D(16, 2, padding='same', activation='relu',
           input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Conv2D(32, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(0.2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1)
])
model_new.compile(optimizer='adam',
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])
model_new.summary()
history = model_new.fit(
    train_data_gen,
    steps_per_epoch=total_train // batch_size,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps=total_val // batch_size
)

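As suggested by @today, class_mode='binary' was missing from the training data generator. Without it, flow_from_directory falls back to its default class_mode='categorical' and yields one-hot labels, which do not match a single-unit output trained with BinaryCrossentropy.
Now the model is able to train properly.
train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH),
                                               class_mode='binary')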
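A rough pure-Python sketch of why the one-hot labels flatten the curves. It assumes the single output logit is broadcast against both entries of a one-hot label and the two binary cross-entropy terms are averaged; this matches the flat loss described in the question, though some TensorFlow versions may reject the shape mismatch with an error instead. The helper names and candidate logits are hypothetical.

```python
import math

def bce_with_logit(target, logit):
    """Binary cross-entropy for one target (0 or 1) and one raw logit."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))

def loss_binary_label(label, logit):
    """Matched setup: one 0/1 label per image, one output logit."""
    return bce_with_logit(label, logit)

def loss_one_hot_label(one_hot, logit):
    """Mismatched setup (assumption): the single logit is compared against
    every entry of the one-hot label and the terms are averaged."""
    return sum(bce_with_logit(t, logit) for t in one_hot) / len(one_hot)

candidate_logits = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]

# The per-image loss is identical for both classes under the mismatch...
cat_losses = [loss_one_hot_label([1, 0], z) for z in candidate_logits]
dog_losses = [loss_one_hot_label([0, 1], z) for z in candidate_logits]

# ...so the best the network can do is output logit 0 (probability 0.5).
best_logit = candidate_logits[cat_losses.index(min(cat_losses))]

print("best logit under mismatch:", best_logit)
print("loss at that logit:", min(cat_losses))
print("matched loss for a confident correct answer:", loss_binary_label(1, 4.0))
```

Under that assumption, the loss carries no information about which class an image belongs to, and the optimum sits at ln 2 with every prediction at 0.5, which would look like a flat loss and chance-level accuracy.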

Retrain Vgg16 keras model on imagenet data set for 1000 classes giving less accuracy on pretrained weight

I am using the Keras VGG16 model with pre-trained weights on the ImageNet dataset, as per the Keras documentation.
I am trying to replicate the results mentioned in the Keras documentation, but training with the code below shows only 1% accuracy. How can I replicate results similar to those in the Keras documentation? Note: this is not fine-tuning; I am simply trying to reproduce similar results within a few epochs on the 1000 classes of the ImageNet dataset.
model = VGG16(weights='imagenet', include_top=True)
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory(
    directory="/home/svarada2/Downloads/ILSVRC2012_img_train",
    target_size=(224, 224),
    color_mode="rgb",
    batch_size=32,
    class_mode="categorical",
    shuffle=True,
    seed=42
)
STEP_SIZE_TRAIN = train_generator.n // train_generator.batch_size
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit_generator(generator=train_generator,
                    steps_per_epoch=STEP_SIZE_TRAIN,
                    epochs=10
)

Why is my validation accuracy stuck around 65% and how do i increase it?

I'm making an image classification CNN with 5 classes with each having 693 images with a width and height of 224px using VGG16, but my validation accuracy is stuck after 15-20 epochs around 60% - 65%.
I'm already using some data augmentation, batch normalization, and dropout, and I have frozen the first 5 layers, but I can't seem to push my accuracy above 65%.
These are my own layers:
img_rows, img_cols, img_channel = 224, 224, 3
base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_rows, img_cols, img_channel))
for layer in base_model.layers[:5]:
    layer.trainable = False
add_model = Sequential()
add_model.add(Flatten(input_shape=base_model.output_shape[1:]))
add_model.add(Dropout(0.5))
add_model.add(Dense(512, activation='relu'))
add_model.add(BatchNormalization())
add_model.add(Dropout(0.5))
add_model.add(Dense(5, activation='softmax'))
model = Model(inputs=base_model.input, outputs=add_model(base_model.output))
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizers.Adam(lr=0.0001),
              metrics=['accuracy'])
model.summary()
And this is how I feed my dataset to the model:
batch_size = 64
epochs = 25
train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=.1,
    height_shift_range=.1,
    horizontal_flip=True)
train_datagen.fit(x_train)
history = model.fit_generator(
    train_datagen.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=x_train.shape[0] // batch_size,
    epochs=epochs,
    validation_data=(x_test, y_test),
    callbacks=[ModelCheckpoint('VGG16-transferlearning.model', monitor='val_acc', save_best_only=True)]
)
I want to get a higher accuracy because what I get now is just not enough so any help or suggestions would be appreciated.

A few things you can try:
Reduce your batch size.
Choose another optimizer: RMSprop, SGD...
Start from a higher learning rate and then use the ReduceLROnPlateau callback to lower it when validation stops improving.
But, as usual, it depends on the data you are using. Are the classes well balanced?
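The reduce-on-plateau idea can be sketched in plain Python. This is a simplified illustration of the scheduling logic only, not the Keras ReduceLROnPlateau implementation; the function name, the default factor and patience, and the validation losses are all made up for the example.

```python
def reduce_on_plateau(val_losses, initial_lr, factor=0.5, patience=2, min_lr=1e-6):
    """Return the learning rate in effect at each epoch.

    Simplified rule: after `patience` consecutive epochs without a new best
    validation loss, multiply the learning rate by `factor` (never going
    below `min_lr`) and start counting again.
    """
    lr = initial_lr
    best = float("inf")
    stale_epochs = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)  # rate used during this epoch
        if loss < best:
            best = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                lr = max(lr * factor, min_lr)
                stale_epochs = 0
    return schedule

# Hypothetical validation losses that stall twice.
val_losses = [1.20, 0.95, 0.90, 0.91, 0.92, 0.89, 0.90, 0.90]
schedule = reduce_on_plateau(val_losses, initial_lr=1e-3)

for epoch, (loss, lr) in enumerate(zip(val_losses, schedule), start=1):
    print(f"epoch {epoch}: val_loss={loss:.2f} lr={lr:g}")
```

Starting high lets the early epochs make fast progress, while the automatic reductions give the optimizer smaller steps once the validation loss stalls.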

Epoch does not start while training CNN with keras VGGFace Framework

I am trying to use the VGGFace implementation with the Keras framework on my own dataset consisting of 12 classes of face images. I have applied augmentation to some classes that have very little data in the training set.
After fine-tuning with resnet50, when I try to train my model, it gets stuck at the first epoch, i.e., it does not start training but keeps displaying Epoch 1/50.
Here's what it looks like:
Layer (type) Output Shape Param #
=================================================================
model_1 (Model) (None, 12) 23585740
=================================================================
Total params: 23,585,740
Trainable params: 23,532,620
Non-trainable params: 53,120
_________________________________________________________________
Found 1774 images belonging to 12 classes.
Found 313 images belonging to 12 classes.
Epoch 1/50
Here's my code:
train_data_path = 'dataset_cfps/train'
validation_data_path = 'dataset_cfps/validation'
# Parameters
img_width, img_height = 224, 224
vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
# vgg_model = VGGFace(include_top=False, input_shape=(224, 224, 3))
last_layer = vggface.get_layer('avg_pool').output
x = Flatten(name='flatten')(last_layer)
out = Dense(12, activation='sigmoid', name='classifier')(x)
custom_vgg_model = Model(vggface.input, out)
# Create the model
model = models.Sequential()
# Add the convolutional base model
model.add(custom_vgg_model)
# Add new layers
# model.add(layers.Flatten())
# model.add(layers.Dense(1024, activation='relu'))
# model.add(BatchNormalization())
# model.add(layers.Dropout(0.5))
# model.add(layers.Dense(12, activation='sigmoid'))
# Show a summary of the model. Check the number of trainable parameters
model.summary()
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1./255)
train_batchsize = 16
val_batchsize = 16
train_generator = train_datagen.flow_from_directory(
    train_data_path,
    target_size=(img_width, img_height),
    batch_size=train_batchsize,
    class_mode='categorical')
validation_generator = validation_datagen.flow_from_directory(
    validation_data_path,
    target_size=(img_width, img_height),
    batch_size=val_batchsize,
    class_mode='categorical',
    shuffle=True)
# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-3),
              metrics=['acc'])
# Train the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples/train_generator.batch_size,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples/validation_generator.batch_size,
    verbose=1)
# Save the model
model.save('facenet_resnet.h5')
Does anyone know what the problem could be? And how can I make my model better (if there's something I could do)? Feel free to suggest improvements.

Waiting did not solve it; I solved it by restarting the whole program.

Just wait a few hours (depending on your GPU). Eventually it will report the loss and val_loss for each epoch.
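Separately from the two answers above, one detail in the question's code may be worth checking: steps_per_epoch and validation_steps are computed with /, which yields floats (1774 / 16 is 110.875). The sketch below computes whole step counts from the image counts printed in the question, assuming the final partial batch should still be processed. It is not claimed to be the cause of the hang, and the helper name is hypothetical.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of whole batches needed to see every sample once,
    counting a final partial batch as one extra step."""
    return math.ceil(num_samples / batch_size)

# Image counts reported by flow_from_directory in the question.
train_steps = steps_for(1774, 16)
val_steps = steps_for(313, 16)

print("steps_per_epoch:", train_steps)
print("validation_steps:", val_steps)
```

Using // instead of math.ceil would drop the last partial batch; either choice gives an integer step count.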

A huge time to download the weights of deep networks

Hi, I feel that something is wrong with the way my code is running. I'm trying to load VGG and ResNet for deep learning. This is the code I used:
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
# path to the model weights files.
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'fc_model.h5'
# dimensions of our images.
img_width, img_height = 150, 150
train_data_dir = 'cats_and_dogs_small/train'
validation_data_dir = 'cats_and_dogs_small/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16
# build the VGG16 network
model = applications.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')
# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)
# add the model on top of the convolutional base
model.add(top_model)
# set the first 25 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in model.layers[:25]:
    layer.trainable = False
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')
# fine-tune the model
model.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples)
At the line model = applications.VGG16(weights='imagenet', include_top=False), the program starts to download the weights.
This process would take around 5/6 days to complete, and it gets stuck in the middle. Is there a simple way to avoid this whole process, such as downloading the weights manually? Is there something I'm missing?
Please help.
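One possible approach, sketched under assumptions: the weights argument of the Keras application models accepts a path to a local weights file as well as 'imagenet', and downloaded weights are normally cached under ~/.keras/models. The helper below (hypothetical names) only decides which value to pass; it assumes the no-top VGG16 file has been fetched by other means and saved under its usual filename.

```python
from pathlib import Path

# Usual filename of the VGG16 weights without the classifier head.
NOTOP_FILE = "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"

def cached_weights_path(filename, cache_dir=None):
    """Location where the weights file is expected.
    Assumption: the default Keras cache is ~/.keras/models."""
    base = Path(cache_dir) if cache_dir else Path.home() / ".keras" / "models"
    return base / filename

def resolve_weights(filename, cache_dir=None):
    """Return a local path if the file is already on disk, otherwise
    'imagenet' so that Keras falls back to downloading it."""
    path = cached_weights_path(filename, cache_dir)
    return str(path) if path.is_file() else "imagenet"

weights_arg = resolve_weights(NOTOP_FILE)
print("value to pass as weights=", weights_arg)
# The result would then be used as:
# model = applications.VGG16(weights=weights_arg, include_top=False)
```

If the file is placed in the cache directory beforehand, no download should be needed when the model is built.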
