I copied/pasted this TensorFlow tutorial into a Jupyter notebook. (As of this writing they have changed the tutorial to the flower dataset instead of the dog one, but the question still applies.)
https://www.tensorflow.org/tutorials/images/classification
The first part (without augmentation) runs fine and I get similar results.
But with data augmentation, my loss and accuracy stay flat across all epochs. I've already checked these posts on SO:
Keras accuracy does not change
How to fix flatlined accuracy and NaN loss in tensorflow image classification
Tensorflow: loss decreasing, but accuracy stable
None of these applied: the dataset is a standard one, so I don't have the problem of corrupted data, and I printed a couple of augmented images and they look fine (see below).
I've tried adding more fully connected layers to increase the model capacity, dropout to limit overfitting, ... nothing changed. Here are the curves:
Any ideas as to why? Have I missed something in the code?
I know training a DL model is a lot of trial and error, but I'm sure there must be some logic or intuition beyond randomly turning the knobs until something happens.
Thanks!
Source data:
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
Params:
batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150
Preprocessing stage:
image_gen = ImageDataGenerator(rescale=1./255,
rotation_range=20,
width_shift_range=0.15,
height_shift_range=0.15,
horizontal_flip=True,
zoom_range=0.2)
train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH))
augmented_images = [train_data_gen[0][0][i] for i in range(5)]
plotImages(augmented_images)
image_gen_val = ImageDataGenerator(rescale=1./255)
val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
directory=validation_dir,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='binary')
Model:
model_new = Sequential([
Conv2D(16, 2, padding='same', activation='relu',
input_shape=(IMG_HEIGHT, IMG_WIDTH ,3)),
MaxPooling2D(),
Conv2D(32, 2, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(64, 2, padding='same', activation='relu'),
MaxPooling2D(),
Dropout(0.2),
Flatten(),
Dense(512, activation='relu'),
Dense(1)
])
model_new.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
model_new.summary()
history = model_new.fit(
train_data_gen,
steps_per_epoch= total_train // batch_size,
epochs=epochs,
validation_data=val_data_gen,
validation_steps= total_val // batch_size
)
As suggested by @today, class_mode='binary' was missing from the training data generator.
Now the model is able to train properly.
train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='binary')
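For anyone hitting the same flat curves: without that argument, flow_from_directory defaults to class_mode='categorical' and yields one-hot labels of shape (batch, 2), which does not match the single-logit output trained with BinaryCrossentropy(from_logits=True). A quick sanity check (a minimal sketch, assuming the generators defined above):
# Sketch: inspect the label shapes coming out of the generator.
# With the default class_mode='categorical' the labels are one-hot, shape (batch, 2);
# with class_mode='binary' they are a flat 0/1 vector, shape (batch,), which is
# what the single-logit output with BinaryCrossentropy expects.
images, labels = next(train_data_gen)
print(images.shape)   # e.g. (128, 150, 150, 3)
print(labels.shape)   # (128, 2) with the default, (128,) with class_mode='binary'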
Related
I created a CNN using TensorFlow to identify pneumonia, and sometimes it returns a very small number as a prediction. Why is this happening?
I have attached the link for the dataset.
Here is how I process and load the data.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator( rescale = 1.0/255. )
val_datagen = ImageDataGenerator( rescale = 1.0/255. )
test_datagen = ImageDataGenerator( rescale = 1.0/255. )
train_generator = train_datagen.flow_from_directory('/kaggle/input/chest-xray-pneumonia/chest_xray/chest_xray/train/',
batch_size=20,
class_mode='binary',
target_size=(350, 350))
validation_generator = val_datagen.flow_from_directory('/kaggle/input/chest-xray-pneumonia/chest_xray/chest_xray/val/',
batch_size=20,
class_mode = 'binary',
target_size = (350, 350))
test_generator = test_datagen.flow_from_directory('/kaggle/input/chest-xray-pneumonia/chest_xray/chest_xray/test/',
batch_size=20,
class_mode = 'binary',
target_size = (350, 350))
And here the Model, compile and fit functions
import tensorflow as tf
model = tf.keras.models.Sequential([
# Note the input shape is the desired size of the image: 350x350 with 3 color channels
tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(350, 350, 3)),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# Flatten the results to feed into a DNN
tf.keras.layers.Flatten(),
# 1024-neuron hidden layer
tf.keras.layers.Dense(1024, activation='relu'),
# Only 1 output neuron. It will contain a value from 0-1, where 0 is for one class (normal) and 1 for the other (pneumonia)
tf.keras.layers.Dense(1, activation='sigmoid')
])
Compile the model:
from tensorflow.keras.optimizers import RMSprop
model.compile(optimizer=RMSprop(learning_rate=0.001),
loss='binary_crossentropy',
metrics = ['accuracy'])
Model fit:
history = model.fit(train_generator,
validation_data=validation_generator,
steps_per_epoch=200,
epochs=2000,
validation_steps=200,
callbacks=[callbacks],
verbose=2)
The evaluation metrics are as follows: loss: 0.2351 - accuracy: 0.9847.
The prediction shows a very small number for negative pneumonia, and for positive it shows more than 0.50.
I have two questions:
Why do I get a very small number like 2.xxxx * 10e-20?
Why can't I get the following values to be null?
val_acc = history.history[ 'val_accuracy' ]
val_loss = history.history['val_loss' ]
There is no problem with your code, nor with the results you get.
This is a binary classification problem (2 classes: positive or negative pneumonia), and the output of your model is a single neuron giving values between 0 and 1.
So if the output is higher than 0.5, it means positive pneumonia. Otherwise, when you have a very small value like 2 * 10e-20, it means negative pneumonia.
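As a minimal sketch (assuming the trained model above and an already-preprocessed image batch x, rescaled by 1/255 and sized 350x350; the names here are illustrative), the sigmoid output can be turned into a class label by thresholding at 0.5:
import numpy as np

# Sketch: threshold the sigmoid output at 0.5 to get a class label.
# Class indices follow the alphabetical folder order used by flow_from_directory
# (here presumably 0 = NORMAL, 1 = PNEUMONIA).
probs = model.predict(x)             # shape (n, 1), values in [0, 1]
labels = (probs > 0.5).astype(int)   # 1 = positive pneumonia, 0 = negative
print(probs, labels)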
For your second question: you are not supposed to get null accuracy and loss values, simply because the model is well trained and reaches 98% accuracy on the training data.
I am following F. Chollet's book "Deep Learning with Python" and can't get one example working.
In particular, I am running an example from the chapter "Training a convnet from scratch on a small dataset".
My training dataset has 2000 samples and I am trying to extend it with augmentation using ImageDataGenerator. Even though my code is exactly the same as the book's, I am getting this error:
Your input ran out of data; interrupting training. Make sure that your
dataset or generator can generate at least steps_per_epoch * epochs
batches (in this case, 10000 batches).
from keras import layers
from keras import models
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
# creating model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
# model compilation
model.compile(loss='binary_crossentropy',
optimizer=optimizers.RMSprop(lr=1e-4),
metrics=['acc'])
# model.summary()
# generating train and validation sets with rescaling 0-255 -> 0-1
train_dir = 'c:\\Work\\Code\\Python\\DL\\cats_and_dogs_small\\train\\'
validation_dir = 'c:\\Work\\Code\\Python\\DL\\cats_and_dogs_small\\validation\\'
train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,)
# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
# This is the target directory
train_dir,
# All images will be resized to 150x150
target_size=(150, 150),
batch_size=32,
# Since we use binary_crossentropy loss, we need binary labels
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
validation_dir,
target_size=(150, 150),
batch_size=32,
class_mode='binary')
for data_batch, labels_batch in train_generator:
print('data batch shape:', data_batch.shape)
print('labels batch shape:', labels_batch.shape)
break
history = model.fit_generator(
train_generator,
steps_per_epoch=100,
epochs=100,
validation_data=validation_generator,
validation_steps=50)
Here is the link to the GitHub page for this book's samples, where you can check the code as well.
I am not sure what I am doing wrong and would appreciate any advice. Thank you.
It seems the batch_size should be 20, not 32.
Since you have steps_per_epoch = 100, the fit call will execute next() on the train generator 100 times before going to the next epoch.
Now, in train_generator the batch_size is 32, so with 2000 training samples it can only generate about 2000 / 32 ≈ 62 full batches.
So around the 63rd call to next() on train_generator there is nothing left to serve, and you get the "Your input ran out of data" warning.
Ideally,
steps_per_epoch = total_training_samples // batch_size
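As a sketch of that fix (assuming the generators and model defined in the question), the step counts can be derived from the generator attributes instead of being hard-coded:
# Sketch: derive the steps from the generators (2000 samples / batch_size 32 -> 62).
steps_per_epoch = train_generator.samples // train_generator.batch_size
validation_steps = validation_generator.samples // validation_generator.batch_size

history = model.fit_generator(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=validation_steps)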
The above answer describes your issue well. I would like to add one point: you can also get the correct steps_per_epoch value from the generators themselves by adding these lines:
train_steps = len(train_generator)
val_steps = len(validation_generator)
I experienced the same issue while dealing with data generation with augmentation.
The solutions provided above are correct: steps_per_epoch = total_training_samples // batch_size.
But since you are augmenting the images, I believe you would not mind passing the same image through again with different augmentations.
For me, using the same Keras version with the TensorFlow backend that Chollet used removed this error. In that case, the generator keeps feeding images indefinitely, since it loops back once it has gone through the available images.
Hopefully this helps.
Just starting out in ML and created my first CNN to detect the orientation of an image of a face. I got the training and testing accuracy up to around 96-99% over 2 different sets of 1000 pictures (128x128 RGB). However, when I go to predict an image from the test set on its own, the model rarely predicts correctly. I think there must be a difference in the way I load data into the model during testing vs prediction. Here is how I load the data into the model to train and test:
datagen = ImageDataGenerator()
train_it = datagen.flow_from_directory('twoThousandTransformed/', class_mode='categorical', batch_size=32, color_mode="rgb", target_size=(64,64))
val_it = datagen.flow_from_directory('validation/', class_mode='categorical', batch_size=32, color_mode="rgb", target_size=(64,64))
test_it = datagen.flow_from_directory('test/', class_mode='categorical', batch_size=32, color_mode='rgb', target_size=(64,64))
And here is how I load an image to make a prediction:
image_path='inputPicture/02001.png'
image = tf.keras.preprocessing.image.load_img(image_path)
input_arr = keras.preprocessing.image.img_to_array(image)
reshaped_image = np.resize(input_arr, (64,64,3))
input_arr = np.array([reshaped_image])
predictions = model.predict(input_arr)
print(predictions)
classes = np.argmax(predictions, axis = 1)
print(classes)
There must be some difference in the way the ImageDataGenerator handles the images vs. how I am doing it in the prediction. Can y'all help a noobie out? Thanks!
Edit: Below is my model
imageInput = Input(shape=(64,64,3))
conv1 = Conv2D(128, kernel_size=16, activation='relu')(imageInput)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(64, kernel_size=12, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(64, kernel_size=4, activation='relu')(pool2)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
flat = Flatten()(pool3)
hidden1 = Dense(16, activation='relu')(flat)
hidden2 = Dense(16, activation='relu')(hidden1)
hidden3 = Dense(10, activation='relu')(hidden2)
output = Dense(4, activation='softmax')(hidden3)
model = Model(inputs=imageInput, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_it, steps_per_epoch=16, validation_data=val_it, validation_steps=8, epochs=25)
print('here we go!')
_, accuracy = model.evaluate(test_it)
print('Accuracy: %.2f' % (accuracy*100))
One thing you can try is to replicate the chosen image so it resembles the batch size with which you trained the model. Also, given such high training accuracy, your model may be overfitting, so try adding dropout or reducing the number of layers in your network if the first thing doesn't work out.
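As a rough sketch of the dropout suggestion (reusing the layer sizes from the model in the question; the 0.2 rate is an illustrative choice, not a prescription), Dropout layers can be inserted after each pooling step like this:
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model

# Sketch only: same architecture as in the question, with Dropout(0.2)
# after each pooling layer to reduce overfitting.
imageInput = Input(shape=(64, 64, 3))
x = Conv2D(128, kernel_size=16, activation='relu')(imageInput)
x = Dropout(0.2)(MaxPooling2D(pool_size=(2, 2))(x))
x = Conv2D(64, kernel_size=12, activation='relu')(x)
x = Dropout(0.2)(MaxPooling2D(pool_size=(2, 2))(x))
x = Conv2D(64, kernel_size=4, activation='relu')(x)
x = Dropout(0.2)(MaxPooling2D(pool_size=(2, 2))(x))
x = Flatten()(x)
x = Dense(16, activation='relu')(x)
x = Dense(16, activation='relu')(x)
x = Dense(10, activation='relu')(x)
output = Dense(4, activation='softmax')(x)
model = Model(inputs=imageInput, outputs=output)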
I'm using the ResNet50v2 model from keras.applications for image classification but I have had persisting problems trying to get the model to converge on any meaningful accuracy. Previously, I have developed this same model with the same data in Matlab and reached around 75% accuracy but now the training just hovers around 30% accuracy and the loss does not drop. I'm thinking that there is a really simple mistake somewhere but I can't find it.
import tensorflow as tf
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./224,
validation_split=0.2)
train_generator = train_datagen.flow_from_directory(main_dir,
class_mode='categorical',
batch_size=32,
target_size=(224,224),
shuffle=True,
subset='training')
validation_generator = train_datagen.flow_from_directory(main_dir,
target_size=(224, 224),
batch_size=32,
class_mode='categorical',
shuffle=True,
subset='validation')
IMG_SHAPE = (224, 224, 3)
base_model = tf.keras.applications.ResNet50V2(
input_shape=IMG_SHAPE,
include_top=False,
weights='imagenet')
maxpool_layer = tf.keras.layers.GlobalMaxPooling2D()
prediction_layer = tf.keras.layers.Dense(4, activation='softmax')
model = tf.keras.Sequential([
base_model,
maxpool_layer,
prediction_layer
])
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
model.fit(
train_generator,
steps_per_epoch = train_generator.samples // 32,
validation_data = validation_generator,
validation_steps = validation_generator.samples // 32,
epochs = 20)
Since your last layer contains a softmax activation, your loss should not use from_logits=True. However, if you didn't have the softmax activation, you would need from_logits=True. This is because categorical_crossentropy handles probability outputs differently from raw logits.
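A minimal sketch of the two consistent options (assuming the model and opt defined above):
# Option 1: keep the softmax output layer and feed probabilities to the loss.
model.compile(optimizer=opt,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

# Option 2 (alternative): output raw logits and keep from_logits=True.
# prediction_layer = tf.keras.layers.Dense(4)  # no activation
# model.compile(optimizer=opt,
#               loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
#               metrics=['accuracy'])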
I am new to Deep Learning and just have a question if the method I am using is correct.
Also, if anybody has suggestions on what to change on the model creation it would also be appreciated.
[Attached image: the graphs look similar]
I am using a CNN model to train on candlestick charts, based on 'buy', 'sell', and 'do not trade' pictures that look similar to the attached picture. (I tried a different number of bars, but the results were similar.)
I based the code of this post:
https://towardsdatascience.com/making-a-i-that-looks-into-trade-charts-62e7d51edcba
I have made a few changes but kept the model training code similar (small changes did not produce significant accuracy improvements).
# Input the size of your sample images
img_width, img_height = 150, 150
nb_filters1 = 32
nb_filters2 = 32
nb_filters3 = 64
conv1_size = 3
conv2_size = 2
conv3_size = 5
pool_size = 2
# We have 3 classes: buy, sell, and do not trade
classes_num = 3
batch_size = 128
lr = 0.001
chanDim =3
model = Sequential()
model.add(Convolution2D(nb_filters1, conv1_size, conv1_size, border_mode ='same', input_shape=(img_height, img_width , 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))
model.add(Convolution2D(nb_filters2, conv2_size, conv2_size, border_mode ="same"))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size), dim_ordering='th'))
model.add(Convolution2D(nb_filters3, conv3_size, conv3_size, border_mode ='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size), dim_ordering='th'))
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(classes_num, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.rmsprop(),
metrics=['accuracy'])
train_datagen = ImageDataGenerator(
#rescale=1. / 255,
horizontal_flip=False)
test_datagen = ImageDataGenerator(
#rescale=1. / 255,
horizontal_flip=False)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_height, img_width),
#shuffle=True,
batch_size=batch_size,
class_mode='categorical'
)
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_height, img_width),
batch_size=batch_size,
#shuffle=True,
class_mode='categorical')
With this, I get an accuracy of 38%, and if I remove the 'no trade' option, I get an accuracy of 52%.
Accuracy does not improve drastically from before training to after training, which is why I am assuming the settings are not 100% right.
When predicting, the results always lean to one side (52% buy, 48% sell) and don't change much after a few hundred images.
Any suggestions?
I assume your three options are "buy", "sell", and "no trade". The reason the accuracy jumps to 52% is that the model is then differentiating between 2 options instead of 3.
With regard to the lower-than-expected accuracy, I recommend changing the optimizer to Adam. Also consider moving the dropout into the middle of the network; I have found success adding a Dropout(0.2) after each pooling layer. This way nodes are dropped throughout the network, which allows for more "diversity" in the node paths taken.
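A rough sketch of those two changes (assuming tf.keras 2.x and the model and variables from the question; only one pooling block is shown, and the same Dropout(0.2) would follow each of the others):
from tensorflow.keras.layers import Dropout
from tensorflow.keras.optimizers import Adam

# Sketch: Dropout(0.2) after each pooling layer so nodes are dropped throughout the network.
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))
model.add(Dropout(0.2))
# ... repeat for the remaining conv/pool blocks, then the dense head as before ...

# Sketch: compile with the Adam optimizer instead of RMSprop.
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=lr),
              metrics=['accuracy'])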