Every time I run my model, the names of the metrics "Precision", "Recall", "Sensitivity", and "Specificity" change: the first run reports "precision", a later run "precision_11", then "precision_12", and so on.
How can I keep the names stable?
Here is the code:
model.compile(optimizer="sgd",
loss="categorical_crossentropy",
metrics=[keras.metrics.Precision(), keras.metrics.Recall(), keras.metrics.SpecificityAtSensitivity(0.5), keras.metrics.SensitivityAtSpecificity(0.5), 'accuracy'])
# fit the model
# Run the cell. It will take some time to execute
r = model.fit_generator(
training_set,
validation_data=test_set,
epochs=5,
steps_per_epoch=len(training_set),
validation_steps=len(test_set)
)
Here is the output:
Epoch 1/5
164/164 [==============================] - 111s 675ms/step - loss: 5.4092 - precision_22: 0.7641 - recall_12: 0.7641 - specificity_at_sensitivity_7: 0.8196 - sensitivity_at_specificity_9: 0.8196 - accuracy: 0.7641 - val_loss: 1.8738 - val_precision_22: 0.7965 - val_recall_12: 0.7965 - val_specificity_at_sensitivity_7: 0.8622 - val_sensitivity_at_specificity_9: 0.8622 - val_accuracy: 0.7965
Epoch 2/5
164/164 [==============================] - 109s 665ms/step - loss: 1.4624 - precision_22: 0.8702 - recall_12: 0.8702 - specificity_at_sensitivity_7: 0.9192 - sensitivity_at_specificity_9: 0.9192 - accuracy: 0.8702 - val_loss: 3.0408 - val_precision_22: 0.7340 - val_recall_12: 0.7340 - val_specificity_at_sensitivity_7: 0.8061 - val_sensitivity_at_specificity_9: 0.8061 - val_accuracy: 0.7340
Epoch 3/5
164/164 [==============================] - 110s 670ms/step - loss: 1.1008 - precision_22: 0.8882 - recall_12: 0.8882 - specificity_at_sensitivity_7: 0.9360 - sensitivity_at_specificity_9: 0.9360 - accuracy: 0.8882 - val_loss: 0.8237 - val_precision_22: 0.8830 - val_recall_12: 0.8830 - val_specificity_at_sensitivity_7: 0.9391 - val_sensitivity_at_specificity_9: 0.9391 - val_accuracy: 0.8830
Epoch 4/5
164/164 [==============================] - 109s 666ms/step - loss: 0.7959 - precision_22: 0.9031 - recall_12: 0.9031 - specificity_at_sensitivity_7: 0.9481 - sensitivity_at_specificity_9: 0.9481 - accuracy: 0.9031 - val_loss: 0.6393 - val_precision_22: 0.8926 - val_recall_12: 0.8926 - val_specificity_at_sensitivity_7: 0.9551 - val_sensitivity_at_specificity_9: 0.9551 - val_accuracy: 0.8926
Epoch 5/5
164/164 [==============================] - 109s 666ms/step - loss: 0.7639 - precision_22: 0.9100 - recall_12: 0.9100 - specificity_at_sensitivity_7: 0.9540 - sensitivity_at_specificity_9: 0.9540 - accuracy: 0.9100 - val_loss: 3.9008 - val_precision_22: 0.6843 - val_recall_12: 0.6843 - val_specificity_at_sensitivity_7: 0.7580 - val_sensitivity_at_specificity_9: 0.7580 - val_accuracy: 0.6843
@TFer2, here is the full code:
# -*- coding: utf-8 -*-
"""Vgg19.ipynb
Automatically generated by Colaboratory.
Original file is located at
https://colab.research.google.com/drive/1C71ob5s4BWiK0GF2eVOf00bpNhB1Nu_f
"""
# import the libraries as shown below
from keras.layers import Input, Lambda, Dense, Flatten
from keras.models import Model
#from keras.applications.resnet50 import ResNet50
#from keras.applications.vgg16 import VGG16
#from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
import numpy as np
from glob import glob
import matplotlib.pyplot as plt
from keras.applications.vgg19 import VGG19
from keras.applications.vgg19 import preprocess_input
import keras
# re-size all the images to this
IMAGE_SIZE = [224, 224]
train_path = 'drive/My Drive/chest_xray/train'
valid_path = 'drive/My Drive/chest_xray/test'
# Load the VGG19 base model without its top classification layers
# Here we will be using imagenet weights
vgg = VGG19(input_shape=IMAGE_SIZE + [3], weights='imagenet', include_top=False)
# don't train existing weights
for layer in vgg.layers:
    layer.trainable = False
# useful for getting number of output classes
folders = glob('drive/My Drive/chest_xray/train/*')
# our layers - you can add more if you want
x = Flatten()(vgg.output)
prediction = Dense(len(folders), activation='softmax')(x)
# create a model object
model = Model(inputs=vgg.input, outputs=prediction)
# view the structure of the model
model.summary()
# tell the model what cost and optimization method to use
# model.compile(
#loss='categorical_crossentropy',
#optimizer='adam',
#metrics=['accuracy']
#)
model.compile(optimizer="adam",
loss="categorical_crossentropy",
metrics=[keras.metrics.Precision(), keras.metrics.Recall(), keras.metrics.SpecificityAtSensitivity(0.5), keras.metrics.SensitivityAtSpecificity(0.5), keras.metrics.AUC(), 'accuracy'])
# Use the Image Data Generator to import the images from the dataset
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
# Make sure you provide the same target size as initialized for the image size
training_set = train_datagen.flow_from_directory('drive/My Drive/chest_xray/train',
target_size = (224, 224),
batch_size = 32,
class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('drive/My Drive/chest_xray/test',
target_size = (224, 224),
batch_size = 32,
class_mode = 'categorical')
# fit the model
# Run the cell. It will take some time to execute
r = model.fit_generator(
training_set,
validation_data=test_set,
epochs=5,
steps_per_epoch=len(training_set),
validation_steps=len(test_set)
)
# plot the loss
plt.plot(r.history['loss'], label='train loss')
plt.plot(r.history['val_loss'], label='val loss')
plt.legend()
plt.savefig('LossVal_loss(Vgg19)')  # save before show(), which may leave an empty figure behind
plt.show()
# plot the accuracy
plt.plot(r.history['accuracy'], label='train acc')
plt.plot(r.history['val_accuracy'], label='val acc')
plt.legend()
plt.savefig('AccVal_acc(Vgg19)')
plt.show()
# plot the recall
plt.plot(r.history['recall_1'], label='train recall')
plt.plot(r.history['val_recall_1'], label='val recall')
plt.legend()
plt.savefig('RecallVal_recall(Vgg19)')
plt.show()
# plot the precision
plt.plot(r.history['precision_1'], label='train precision')
plt.plot(r.history['val_precision_1'], label='val precision')
plt.legend()
plt.savefig('PrecisionVal_precision(Vgg19)')
plt.show()
# plot the specificity_at_sensitivity
plt.plot(r.history['specificity_at_sensitivity_1'], label='train specificity_at_sensitivity')
plt.plot(r.history['val_specificity_at_sensitivity_1'], label='val specificity_at_sensitivity')
plt.legend()
plt.savefig('specificity_at_sensitivityVal_specificity_at_sensitivity(Vgg19)')
plt.show()
# plot the sensitivity_at_specificity
plt.plot(r.history['sensitivity_at_specificity_1'], label='train sensitivity_at_specificity')
plt.plot(r.history['val_sensitivity_at_specificity_1'], label='val sensitivity_at_specificity')
plt.legend()
plt.savefig('sensitivity_at_specificityVal_sensitivity_at_specificity(Vgg19)')
plt.show()
# plot the AUC/ROC
plt.plot(r.history['auc_1'], label='train AUC')
plt.plot(r.history['val_auc_1'], label='val AUC')
plt.legend()
plt.savefig('AUCVal_AUC(Vgg19)')
plt.show()
I recently ran into the same odd behaviour. I did not track down the cause, but I worked around it by explicitly passing the name argument to the constructor of each metric that you instantiate in the metrics list of the compile call (e.g. Recall). The docs for each metric, e.g. Recall, describe this optional name argument.
I would therefore suggest compiling your model as follows:
model.compile(optimizer="sgd",
loss="categorical_crossentropy",
metrics=[keras.metrics.Precision(name='precision'),
keras.metrics.Recall(name='recall'),
keras.metrics.SpecificityAtSensitivity(0.5, name='specificity_at_sensitivity'),
keras.metrics.SensitivityAtSpecificity(0.5, name='sensitivity_at_specificity'),
'accuracy'])
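If you already have histories recorded under the auto-numbered names, a small lookup helper can resolve the right key whatever suffix was appended. This is only a sketch and not part of the Keras API: find_metric_key and the sample history dict are made up for illustration, and it assumes the suffix always has the form _<number>.

```python
import re


def find_metric_key(history, base_name):
    """Return the key in `history` that matches `base_name`, ignoring an
    optional numeric suffix such as the `_22` in 'precision_22'."""
    pattern = re.compile(r"^" + re.escape(base_name) + r"(_\d+)?$")
    matches = [key for key in history if pattern.match(key)]
    if len(matches) != 1:
        raise KeyError(f"expected exactly one key for {base_name!r}, found {matches}")
    return matches[0]


# Hypothetical dict shaped like `r.history` from the question.
history = {
    "loss": [5.4092, 1.4624],
    "precision_22": [0.7641, 0.8702],
    "val_precision_22": [0.7965, 0.7340],
    "recall_12": [0.7641, 0.8702],
}

precision_key = find_metric_key(history, "precision")
val_precision_key = find_metric_key(history, "val_precision")
print(precision_key, val_precision_key)
```

With such a helper, the plotting code could read r.history[find_metric_key(r.history, 'recall')] instead of hard-coding 'recall_1'. Naming the metrics explicitly, as suggested above, remains the simpler fix.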
I'm a beginner in deep learning and I'm trying to train a model to classify different ASL hand signs using Mobilenet_v2 and Inception.
Here is my code that creates an ImageDataGenerator for the training and validation sets.
# Reformat Images and Create Batches
IMAGE_RES = 224
BATCH_SIZE = 32
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
validation_split = 0.4
)
train_generator = datagen.flow_from_directory(
base_dir,
target_size = (IMAGE_RES,IMAGE_RES),
batch_size = BATCH_SIZE,
subset = 'training'
)
val_generator = datagen.flow_from_directory(
base_dir,
target_size= (IMAGE_RES, IMAGE_RES),
batch_size = BATCH_SIZE,
subset = 'validation'
)
Here is the code to train the model:
# Do transfer learning with Tensorflow Hub
URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL,
input_shape=(IMAGE_RES, IMAGE_RES, 3))
# Freeze pre-trained model
feature_extractor.trainable = False
# Attach a classification head
model = tf.keras.Sequential([
feature_extractor,
layers.Dense(5, activation='softmax')
])
model.summary()
# Train the model
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
EPOCHS = 5
history = model.fit(train_generator,
steps_per_epoch=len(train_generator),
epochs=EPOCHS,
validation_data = val_generator,
validation_steps=len(val_generator)
)
Epoch 1/5
94/94 [==============================] - 19s 199ms/step - loss: 0.7333 - accuracy: 0.7730 - val_loss: 0.6276 - val_accuracy: 0.7705
Epoch 2/5
94/94 [==============================] - 18s 190ms/step - loss: 0.1574 - accuracy: 0.9893 - val_loss: 0.5118 - val_accuracy: 0.8145
Epoch 3/5
94/94 [==============================] - 18s 191ms/step - loss: 0.0783 - accuracy: 0.9980 - val_loss: 0.4850 - val_accuracy: 0.8235
Epoch 4/5
94/94 [==============================] - 18s 196ms/step - loss: 0.0492 - accuracy: 0.9997 - val_loss: 0.4541 - val_accuracy: 0.8395
Epoch 5/5
94/94 [==============================] - 18s 193ms/step - loss: 0.0349 - accuracy: 0.9997 - val_loss: 0.4590 - val_accuracy: 0.8365
I've tried using data augmentation but the model still overfits so I'm wondering if I've done something wrong in my code.
Your dataset is very small. Try splitting it with different random seeds and check whether the problem persists.
If it does, add regularization and reduce the complexity of the neural network.
Also experiment with different optimizers and a smaller learning rate (try a learning-rate scheduler).
It seems your dataset is very small, with some true outputs separated by only a small distance between inputs on the input-output curve. That is why the model fits those points so easily.
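To make the learning-rate scheduler suggestion a bit more concrete, here is a minimal sketch of a step-decay schedule in plain Python. The function name make_step_decay and all the numbers are assumptions chosen for illustration, not values taken from the question. A function of the epoch index like this is the kind of thing that can be handed to a scheduler callback such as Keras' LearningRateScheduler.

```python
def make_step_decay(initial_lr=1e-3, drop=0.5, epochs_per_drop=2):
    """Build a schedule that multiplies the rate by `drop`
    every `epochs_per_drop` epochs (all values are illustrative)."""
    def schedule(epoch):
        # `epoch` is a zero-based epoch index
        return initial_lr * drop ** (epoch // epochs_per_drop)
    return schedule


schedule = make_step_decay()
rates = [schedule(epoch) for epoch in range(6)]
print(rates)
```

The returned schedule takes only the epoch index, so the rate depends on the starting value alone and does not compound with whatever rate the optimizer currently holds.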
I am currently working on a project that involves training a regression model, saving it, and then loading it to make further predictions. However, I'm having a problem: each time I call model.predict on images, it gives the same prediction for every image. I am not sure what the problem is; maybe it's in the training stage, or I'm just doing something wrong.
I was following this tutorial
All of the files are in this github repo
Here are some bits from the code:
(This part is training the model and saving it)
model = create_cnn(400, 400, 3, regress=True)
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="mean_absolute_percentage_error", optimizer=opt)
model.fit(X, Y, epochs=70, batch_size=8)
model.save("D:/statispic2/final-statispic_model.hdf5")
The next code part is from loading the model and making predictions.
model = load_model("D:/statispic2/statispic_model.hdf5") # Loading the model
prediction = model.predict(images_ready_for_prediction) #images ready for prediction include a numpy array
#that is loaded with the images just like I loaded them for the training stage.
print(prediction_list)
After trying it out this is the output prediction from the model:
[[0.05169942] # I gave it 5 images as parameters
[0.05169942]
[0.05169942]
[0.05169942]
[0.05169942]]
If anything is unclear, or you would like to see some more code, please let me know.
People saying that regression and CNNs are two completely different things have clearly missed some basics in their ML course. Yes, they are completely different, but they should not be compared ;)
A CNN is a type of deep neural network that became famous for its use on images. It is therefore a framework for solving problems, and it can solve both regression AND classification problems.
Regression refers to the type of output you are predicting, so comparing the two directly is quite silly, to be honest.
I can't comment directly on the answers misleading you in this thread, since I need a certain number of reputation points to do so.
However, back to the problem. Do you encounter it before or after saving the model? If before, I would try scaling your output values to an easier distribution. If it only happens after you save, I would check the version of your framework and its documentation on how models are saved.
It could also simply be that there is no information in the pictures.
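As a rough illustration of scaling the output values, here is a sketch of min-max scaling the regression targets to [0, 1] before training and mapping predictions back afterwards. The helper names and the sample targets are made up; the question does not say what range its targets actually cover.

```python
def fit_min_max(values):
    """Return the (min, max) of the training targets."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all target values are identical; nothing to scale")
    return lo, hi


def scale(values, lo, hi):
    """Map targets into the range [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(values, lo, hi):
    """Map scaled predictions back to the original units."""
    return [v * (hi - lo) + lo for v in values]


raw_targets = [120.0, 4500.0, 980.0, 15000.0]  # hypothetical target values
lo, hi = fit_min_max(raw_targets)
scaled_targets = scale(raw_targets, lo, hi)     # train on these
restored = unscale(scaled_targets, lo, hi)      # apply to model output
print(scaled_targets)
```

The same lo and hi have to be kept and reused when predicting with the loaded model, otherwise the outputs cannot be converted back.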
No, no, no! Regression is completely different from CNN. Do a little research and the differences will quickly become apparent. In the meantime, I'll share two code samples with you right here.
Regression:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline
import sklearn
from sklearn.datasets import load_boston
boston = load_boston()
# Now we will load the data into a pandas dataframe and then will print the first few rows of the data using the head() function.
bos = pd.DataFrame(boston.data)
bos.head()
bos.columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']
bos.head()
bos['MEDV'] = boston.target
bos.describe()
bos.isnull().sum()
sns.distplot(bos['MEDV'])
plt.show()
sns.pairplot(bos)
corr_mat = bos.corr().round(2)
sns.heatmap(data=corr_mat, annot=True)
sns.lmplot(x = 'RM', y = 'MEDV', data = bos)
X = bos[['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX','PTRATIO', 'B', 'LSTAT']]
y = bos['MEDV']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 10)
# Training the Model
# We will now train our model using the LinearRegression function from the sklearn library.
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(X_train, y_train)
# Prediction
# We will now make prediction on the test data using the LinearRegression function and plot a scatterplot between the test data and the predicted value.
prediction = lm.predict(X_test)
plt.scatter(y_test, prediction)
df1 = pd.DataFrame({'Actual': y_test, 'Predicted':prediction})
df2 = df1.head(10)
df2
df2.plot(kind = 'bar')
from sklearn import metrics
from sklearn.metrics import r2_score
print('MAE', metrics.mean_absolute_error(y_test, prediction))
print('MSE', metrics.mean_squared_error(y_test, prediction))
print('RMSE', np.sqrt(metrics.mean_squared_error(y_test, prediction)))
print('R squared error', r2_score(y_test, prediction))
Result:
MAE 4.061419182954711
MSE 34.413968453138565
RMSE 5.866341999333023
R squared error 0.6709339839115628
CNN:
# keras imports for the dataset and building our neural network
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPool2D, Flatten
from keras.utils import np_utils
# to calculate accuracy
from sklearn.metrics import accuracy_score
# loading the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# building the input vector from the 28x28 pixels
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalizing the data to help with the training
X_train /= 255
X_test /= 255
# one-hot encoding using keras' numpy-related utilities
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential model
model = Sequential()
# convolutional layer
model.add(Conv2D(25, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu', input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=(1,1)))
# flatten output of conv
model.add(Flatten())
# hidden layer
model.add(Dense(100, activation='relu'))
# output layer
model.add(Dense(10, activation='softmax'))
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# training the model for 10 epochs
model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data=(X_test, Y_test))
Result:
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 27s 451us/step - loss: 0.2037 - accuracy: 0.9400 - val_loss: 0.0866 - val_accuracy: 0.9745
Epoch 2/10
60000/60000 [==============================] - 27s 451us/step - loss: 0.0606 - accuracy: 0.9819 - val_loss: 0.0553 - val_accuracy: 0.9812
Epoch 3/10
60000/60000 [==============================] - 27s 445us/step - loss: 0.0352 - accuracy: 0.9892 - val_loss: 0.0533 - val_accuracy: 0.9824
Epoch 4/10
60000/60000 [==============================] - 27s 446us/step - loss: 0.0226 - accuracy: 0.9930 - val_loss: 0.0572 - val_accuracy: 0.9825
Epoch 5/10
60000/60000 [==============================] - 27s 448us/step - loss: 0.0148 - accuracy: 0.9959 - val_loss: 0.0516 - val_accuracy: 0.9834
Epoch 6/10
60000/60000 [==============================] - 27s 443us/step - loss: 0.0088 - accuracy: 0.9976 - val_loss: 0.0574 - val_accuracy: 0.9824
Epoch 7/10
60000/60000 [==============================] - 26s 442us/step - loss: 0.0089 - accuracy: 0.9973 - val_loss: 0.0526 - val_accuracy: 0.9847
Epoch 8/10
60000/60000 [==============================] - 26s 440us/step - loss: 0.0047 - accuracy: 0.9988 - val_loss: 0.0593 - val_accuracy: 0.9838
Epoch 9/10
60000/60000 [==============================] - 28s 469us/step - loss: 0.0056 - accuracy: 0.9986 - val_loss: 0.0559 - val_accuracy: 0.9836
Epoch 10/10
60000/60000 [==============================] - 27s 449us/step - loss: 0.0059 - accuracy: 0.9981 - val_loss: 0.0663 - val_accuracy: 0.9820
CNN is deep learning. You use regression models for calculating a number, like the price of a car.
I had the exact same issue after saving and reloading my model with pickle.dump and pickle.load. What I had missed was that I was not normalizing the features (vector X) before predicting with the model. I hope this helps.
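One way to avoid that mistake is to store the normalization statistics next to the model and apply them again at prediction time. The following is only a sketch with made-up data and helper names (fit_standardizer, apply_standardizer). For images, the equivalent would be applying the same pixel rescaling used during training.

```python
import json
import statistics


def fit_standardizer(rows):
    """Compute per-column mean and standard deviation on the training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return {"means": means, "stds": stds}


def apply_standardizer(rows, stats):
    """Normalize rows with previously computed statistics."""
    return [
        [(x - m) / s for x, m, s in zip(row, stats["means"], stats["stds"])]
        for row in rows
    ]


train_rows = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]  # hypothetical features
stats = fit_standardizer(train_rows)

# Persist the statistics with the model, then reload them before predicting.
serialized = json.dumps(stats)
loaded_stats = json.loads(serialized)

new_rows = [[2.0, 400.0]]
normalized_new = apply_standardizer(new_rows, loaded_stats)
print(normalized_new)
```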
Changing the optimizer from Adam() to RMSprop() with a learning rate of >0.001 worked for me.
I am trying to train on a dataset of 6000 training and 1000 validation images, but I have a problem: the program just freezes and hangs during training without any error message.
1970/6000 [========>.....................] - ETA: 1:50:11 - loss: 1.2256 - accuracy: 0.5956
1971/6000 [========>.....................] - ETA: 1:50:08 - loss: 1.2252 - accuracy: 0.5958
1972/6000 [========>.....................] - ETA: 1:50:08 - loss: 1.2248 - accuracy: 0.5960
1973/6000 [========>.....................] - ETA: 1:50:06 - loss: 1.2245 - accuracy: 0.5962
1974/6000 [========>.....................] - ETA: 1:50:04 - loss: 1.2241 - accuracy: 0.5964
1975/6000 [========>.....................] - ETA: 1:50:02 - loss: 1.2243 - accuracy: 0.5961
1976/6000 [========>.....................] - ETA: 1:50:00 - loss: 1.2239 - accuracy: 0.5963
1977/6000 [========>.....................] - ETA: 1:49:58 - loss: 1.2236 - accuracy: 0.5965
1978/6000 [========>.....................] - ETA: 1:49:57 - loss: 1.2241 - accuracy: 0.5962
1979/6000 [========>.....................] - ETA: 1:49:56 - loss: 1.2237 - accuracy: 0.5964
1980/6000 [========>.....................] - ETA: 1:49:55 - loss: 1.2242 - accuracy: 0.5961
1981/6000 [========>.....................] - ETA: 1:49:53 - loss: 1.2252 - accuracy: 0.5958
1982/6000 [========>.....................] - ETA: 1:49:52 - loss: 1.2257 - accuracy: 0.5955
I waited 5-6 minutes, but nothing seemed to happen.
I tried to solve it as follows:
Changed steps_per_epoch to 100 and increased epochs to 20
Suspected the ReduceLROnPlateau callback, so I added cooldown=1
but neither of these two changes solved the problem.
Hardware configuration:
I5-8300h
Gtx 1060 6GB
Dependencies:
Keras 2.3.1
TensorFlow 2.0.0(GPU-Version)
The code is provided below:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
import tensorflow as tf
from skimage import exposure, color
from keras.optimizers import Adam
from tqdm import tqdm
from keras.models import Model
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D,Convolution2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint, Callback
from keras import regularizers
from keras.applications.densenet import DenseNet121
from keras_preprocessing.image import ImageDataGenerator
from sklearn.utils import class_weight
from collections import Counter
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth=True
session = tf.compat.v1.Session(config=config)
# Histogram equalization
def HE(img):
    img_eq = exposure.equalize_hist(img)
    return img_eq
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip( images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()
train_datagen = ImageDataGenerator(
rescale=1. / 255,
rotation_range=40,
zoom_range=0.2,
horizontal_flip=True,
fill_mode='nearest',
preprocessing_function=HE,
)
validation_datagen = ImageDataGenerator(
rescale=1./255
)
test_datagen = ImageDataGenerator(
rescale=1./255
)
#get image and label with augmentation
train = train_datagen.flow_from_directory(
'train/train_deep/',
target_size=(224,224),
class_mode='categorical',
shuffle=False,
batch_size = 20,
)
test = test_datagen.flow_from_directory(
'test_deep/',
batch_size=1,
target_size = (224,224),
)
val = validation_datagen.flow_from_directory(
'train/validate_deep/',
target_size=(224,224),
batch_size = 20,
)
#Training
X_train, y_train = next(train)
class_names = ['No DR', 'Mild', 'Moderate', 'Severe', 'Proliferative DR']
counter = Counter(train.classes)
class_weights = class_weight.compute_class_weight(
'balanced',
np.unique(train.classes),
train.classes)
#X_test , y_test = next(test)
#X_test=np.reshape(X_test,(X_test.shape[0],X_test.shape[1],X_test.shape[2]))
#Training parameter
batch_size =32
Epoch = 2
model = DenseNet121(include_top=True, weights=None, input_tensor=None, input_shape=(224,224,3), pooling=None, classes=5)
model.compile(loss='categorical_crossentropy',
optimizer=Adam(learning_rate=0.01),
metrics=['accuracy'])
model.summary()
filepath="weights-improvement-{epoch:02d}-{val_loss:.2f}.hdf5"
checkpointer = ModelCheckpoint(filepath,monitor='val_loss', verbose=1, save_best_only=True,save_weights_only=True)
lr_reduction = ReduceLROnPlateau(monitor='val_loss', patience=5, verbose=2, factor=0.2,cooldown=1)
callbacks_list = [checkpointer, lr_reduction]
#Validation
X_val , y_val = next(val)
#history = model.fit(X_train,y_train,epochs=Epoch,validation_data = (X_val,y_val))
history = model.fit_generator(
train,
epochs=Epoch,
steps_per_epoch=6000,
class_weight=class_weights,
validation_data=val,
validation_steps=1000,
use_multiprocessing = False,
max_queue_size=100,
workers = 1,
callbacks=callbacks_list
)
# Score trained model.
scores = model.evaluate(X_val, y_val, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
#predict
test.reset()
pred=model.predict_generator(test,
steps=25,)
print(pred)
for i in pred:
    print(np.argmax(i))
This code would work well if you used Keras < 2.0.0 (I do not recommend that you use old versions).
Your error comes from the fact that you are using Keras > 2.0.0 or Keras inside TensorFlow.
The exact error from your code springs from these lines:
history = model.fit_generator( # change .fit_generator() to .fit()
train,
epochs=Epoch,
steps_per_epoch=6000, #change this to 6000//32
class_weight=class_weights,
validation_data=val,
validation_steps=1000, #change this to 1000//32
use_multiprocessing = False,
max_queue_size=100,
workers = 1,
callbacks=callbacks_list
)
The parameters "steps_per_epoch" and "validation_steps" have to be equal to the length of the dataset divided by the batch size.
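The step counts can be computed rather than hard-coded. Here is a minimal sketch; steps_for is a hypothetical helper name. Note that the generators in the question are created with batch_size = 20 (the separate batch_size = 32 variable is never passed to them), so under that assumption the divisor would be 20 rather than 32. Rounding up with math.ceil keeps the last partial batch, whereas // drops it.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


# Sample counts from the question; batch size as passed to flow_from_directory.
train_steps = steps_for(6000, 20)
val_steps = steps_for(1000, 20)

# The same counts with a batch size of 32, where the division is not exact.
train_steps_32 = steps_for(6000, 32)
val_steps_32 = steps_for(1000, 32)

print(train_steps, val_steps, train_steps_32, val_steps_32)
```

With a Keras directory iterator, len(train) and len(val) should give the same numbers, which is what the first question in this thread passes as steps_per_epoch.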
I thought that pre-built models were supposed to be highly accurate once you simply change the last layer to fit your needs. I am unsure why I get such a low validation accuracy when fitting the model. Any suggestions for layers, or for what I should do to get my accuracy to 80%? I am trying to classify whether a plane is an Airbus or a Boeing aircraft. Could the extremely low-resolution images be the cause?
40% Fitted Example:
Epoch 10/10
20/149 [===>..........................] - ETA: 24s - loss: 0.6908 - accuracy: 0.5500
40/149 [=======>......................] - ETA: 21s - loss: 0.6916 - accuracy: 0.5250
60/149 [===========>..................] - ETA: 17s - loss: 0.6918 - accuracy: 0.5167
80/149 [===============>..............] - ETA: 13s - loss: 0.6917 - accuracy: 0.5125
100/149 [===================>..........] - ETA: 9s - loss: 0.6918 - accuracy: 0.5200
120/149 [=======================>......] - ETA: 5s - loss: 0.6924 - accuracy: 0.5167
140/149 [===========================>..] - ETA: 1s - loss: 0.6924 - accuracy: 0.5071
149/149 [==============================] - 33s 225ms/step - loss: 0.6925 - accuracy: 0.5034 - val_loss: 0.6965 - val_accuracy: 0.4706
Any idea of what is going wrong?
Here is the full script:
from keras.applications.resnet50 import ResNet50
from keras.applications import VGG19
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
from sklearn.utils import shuffle
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras import models
from keras import layers
import matplotlib.pyplot as plt
import cv2
import os
'''Resnet-50 classifier that attempts to predict whether a photo of an aircraft is the type of a Boeing or an airbus'''
boeing_dir = '#' # Paths of images/folders used to create training data
airbus_dir = '#'
conv_base = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
model = models.Sequential()
model.add(conv_base)
for layer in model.layers:
    layer.trainable = False
model.add(layers.Dropout(0.15))
model.add(layers.GaussianNoise(0.15))
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dense(2, activation='softmax'))
model.layers[0].trainable = False
print('model constructed')
boeing_data = []
boeing_label = []
airbus_data = []
airbus_label = []
'''iterate through each file, resize, append to variable accordingly'''
for filename in os.listdir(boeing_dir):
    if filename.endswith(".jpg") or filename.endswith(".png") or filename.endswith(".jpeg"):
        path_b = os.path.join(boeing_dir, filename)
        im = cv2.imread(path_b)
        im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
        im = cv2.resize(im, (224, 224))
        boeing_data.append(im)
        boeing_label.append(0)
for filename in os.listdir(airbus_dir):
    if filename.endswith(".jpg") or filename.endswith(".png") or filename.endswith(".jpeg"):
        path_b = os.path.join(airbus_dir, filename)
        im = cv2.imread(path_b)
        im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
        im = cv2.resize(im, (224, 224))
        airbus_data.append(im)
        airbus_label.append(1)
training_data = boeing_data + airbus_data # Concatenate Boeing and Airbus data
training_label = boeing_label + airbus_label
training_data = np.array(training_data)
training_label = np.asarray(training_label) # Turn Data into numpy arrays
training_data, training_label = shuffle(training_data, training_label) # Shuffle
print(training_data.shape)
print(training_label)
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(training_data, training_label, epochs=10, batch_size=20, validation_split=0.1, verbose=1)
print('finished training')
My model trains fine on a CPU machine, but I am running into an issue when trying to rerun it on our cluster (using a single GPU and the same dataset). When training on the GPU machine, validation loss and accuracy do not improve from epoch to epoch (see below). This was not the case on the CPU machine (I was able to reach a validation accuracy of ~0.8 after 20 epochs).
Details:
Keras 2.1.3
TensorFlow backend
70/20/10 train/dev/test
~ 7000 images
model is based on ResNet50
Code
import sys
import math
import os
import glob
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense
from keras import backend as k
from keras.callbacks import ModelCheckpoint, CSVLogger, EarlyStopping
############ Training parameters ##################
img_width, img_height = 224, 224
batch_size = 32
epochs = 100
############ Define the data ##################
train_data_dir = '/mnt/data/train'
validation_data_dir = '/mnt/data/validate'
train_data_dir_class1 = os.path.join(train_data_dir,'class1', '*.jpg')
train_data_dir_class2 = os.path.join(train_data_dir, 'class2', '*.jpg')
validation_data_dir_class1 = os.path.join(validation_data_dir, 'class1', '*.jpg')
validation_data_dir_class2 = os.path.join(validation_data_dir, 'class2', '*.jpg')
# number of training and validation samples
nb_train_samples = len(glob.glob(train_data_dir_class1)) + len(glob.glob(train_data_dir_class2))
nb_validation_samples = len(glob.glob(validation_data_dir_class1)) + len(glob.glob(validation_data_dir_class2))
############ Define the model ##################
model = applications.resnet50.ResNet50(weights = "imagenet",
include_top = False,
input_shape = (img_width, img_height, 3))
for layer in model.layers:
    layer.trainable = False
# Adding a FC layer
x = model.output
x = Flatten()(x)
predictions = Dense(1, activation = "sigmoid")(x)
# creating the final model
model_final = Model(inputs = model.input, outputs = predictions)
# compile the model
model_final.compile(loss = "binary_crossentropy",
optimizer = optimizers.Adam(lr = 0.001,
beta_1 = 0.9,
beta_2 = 0.999,
epsilon = 1e-10),
metrics = ["accuracy"])
# train and test generators
train_datagen = ImageDataGenerator(rescale = 1./255,
horizontal_flip = True,
fill_mode = "nearest",
zoom_range = 0.3,
width_shift_range = 0.3,
height_shift_range = 0.3,
rotation_range = 30)
test_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = train_datagen.flow_from_directory(train_data_dir,
target_size = (img_height, img_width),
batch_size = batch_size,
class_mode = "binary",
seed = 2018)
validation_generator = test_datagen.flow_from_directory(validation_data_dir,
target_size = (img_height, img_width),
class_mode = "binary",
seed = 2018)
early = EarlyStopping(monitor = 'val_loss', min_delta = 10e-5, patience = 10, verbose = 1, mode = 'auto')
performance_log = CSVLogger('/mnt/results/vanilla_model_log.csv', separator = ',', append = False)
# Train the model
model_final.fit_generator(generator = train_generator,
steps_per_epoch = math.ceil(train_generator.samples / batch_size),
epochs = epochs,
validation_data = validation_generator,
validation_steps = math.ceil(validation_generator.samples / batch_size),
callbacks = [early, performance_log])
# Save the model
model_final.save('/mnt/results/vanilla_model.h5')
Training Log
Epoch 1/100
151/151 [==============================] - 237s 2s/step - loss: 0.7234 - acc: 0.5240 - val_loss: 0.9899 - val_acc: 0.5425
Epoch 2/100
151/151 [==============================] - 65s 428ms/step - loss: 0.6491 - acc: 0.6228 - val_loss: 1.0248 - val_acc: 0.5425
Epoch 3/100
151/151 [==============================] - 65s 429ms/step - loss: 0.6091 - acc: 0.6648 - val_loss: 1.0377 - val_acc: 0.5425
Epoch 4/100
151/151 [==============================] - 64s 426ms/step - loss: 0.5829 - acc: 0.6968 - val_loss: 1.0459 - val_acc: 0.5425
Epoch 5/100
151/151 [==============================] - 64s 427ms/step - loss: 0.5722 - acc: 0.7070 - val_loss: 1.0472 - val_acc: 0.5425
Epoch 6/100
151/151 [==============================] - 64s 427ms/step - loss: 0.5582 - acc: 0.7166 - val_loss: 1.0501 - val_acc: 0.5425
Epoch 7/100
151/151 [==============================] - 64s 424ms/step - loss: 0.5535 - acc: 0.7188 - val_loss: 1.0492 - val_acc: 0.5425
Epoch 8/100
151/151 [==============================] - 64s 426ms/step - loss: 0.5377 - acc: 0.7287 - val_loss: 1.0209 - val_acc: 0.5425
Epoch 9/100
151/151 [==============================] - 64s 425ms/step - loss: 0.5328 - acc: 0.7368 - val_loss: 1.0062 - val_acc: 0.5425
Epoch 10/100
151/151 [==============================] - 65s 432ms/step - loss: 0.5296 - acc: 0.7381 - val_loss: 1.0016 - val_acc: 0.5425
Epoch 11/100
151/151 [==============================] - 65s 430ms/step - loss: 0.5231 - acc: 0.7419 - val_loss: 1.0021 - val_acc: 0.5425
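The 151 steps per epoch shown in the log follow from the math.ceil(train_generator.samples / batch_size) expression in the code. Here is a minimal sketch of that arithmetic, assuming Python 3 (true division) and a hypothetical sample count of 4820. The real count is not stated in the question; any count from 4801 to 4832 gives the same 151 steps at batch size 32.

```python
import math

def steps_for(samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be partial)."""
    return math.ceil(samples / batch_size)

batch_size = 32
hypothetical_samples = 4820  # assumption: the real count is not given in the question

print(steps_for(hypothetical_samples, batch_size))

# Range of sample counts that is consistent with 151 steps at batch size 32
lowest = (151 - 1) * batch_size + 1
highest = 151 * batch_size
print(lowest, highest)
```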
Since I was able to get good results on a CPU machine, I hypothesized that the validation loss/accuracy must be calculated incorrectly at the end of each epoch. To test this theory, I used the training set as the validation set: if validation loss/accuracy is calculated correctly, we should see roughly the same values for training and validation loss and accuracy. As you can see below, the validation loss values differ from the training loss values, which makes me believe the validation loss is calculated incorrectly at the end of each epoch. Why does this happen? What are the possible solutions?
Modified Code
import sys
import math
import os
import glob
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense
from keras import backend as k
from keras.callbacks import ModelCheckpoint, CSVLogger, EarlyStopping
############ Training parameters ##################
img_width, img_height = 224, 224
batch_size = 32
epochs = 100
############ Define the data ##################
train_data_dir = '/mnt/data/train'
validation_data_dir = '/mnt/data/train' # redefined validation set to test accuracy of validation loss/accuracy calculations
train_data_dir_class1 = os.path.join(train_data_dir,'class1', '*.jpg')
train_data_dir_class2 = os.path.join(train_data_dir, 'class2', '*.jpg')
validation_data_dir_class1 = os.path.join(validation_data_dir, 'class1', '*.jpg')
validation_data_dir_class2 = os.path.join(validation_data_dir, 'class2', '*.jpg')
# number of training and validation samples
nb_train_samples = len(glob.glob(train_data_dir_class1)) + len(glob.glob(train_data_dir_class2))
nb_validation_samples = len(glob.glob(validation_data_dir_class1)) + len(glob.glob(validation_data_dir_class2))
############ Define the model ##################
model = applications.resnet50.ResNet50(weights = "imagenet",
                                       include_top = False,
                                       input_shape = (img_width, img_height, 3))
for layer in model.layers:
    layer.trainable = False
# Adding a FC layer
x = model.output
x = Flatten()(x)
predictions = Dense(1, activation = "sigmoid")(x)
# creating the final model
model_final = Model(inputs = model.input, outputs = predictions)
# compile the model
model_final.compile(loss = "binary_crossentropy",
                    optimizer = optimizers.Adam(lr = 0.001,
                                                beta_1 = 0.9,
                                                beta_2 = 0.999,
                                                epsilon = 1e-10),
                    metrics = ["accuracy"])
# train and test generators
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   horizontal_flip = True,
                                   fill_mode = "nearest",
                                   zoom_range = 0.3,
                                   width_shift_range = 0.3,
                                   height_shift_range = 0.3,
                                   rotation_range = 30)
test_datagen = ImageDataGenerator(rescale = 1./255)
train_generator = train_datagen.flow_from_directory(train_data_dir,
                                                    target_size = (img_height, img_width),
                                                    batch_size = batch_size,
                                                    class_mode = "binary",
                                                    seed = 2018)
validation_generator = test_datagen.flow_from_directory(validation_data_dir,
                                                        target_size = (img_height, img_width),
                                                        class_mode = "binary",
                                                        seed = 2018)
early = EarlyStopping(monitor = 'val_loss', min_delta = 10e-5, patience = 10, verbose = 1, mode = 'auto')
performance_log = CSVLogger('/mnt/results/vanilla_model_log.csv', separator = ',', append = False)
# Train the model
model_final.fit_generator(generator = train_generator,
                          steps_per_epoch = math.ceil(train_generator.samples / batch_size),
                          epochs = epochs,
                          validation_data = validation_generator,
                          validation_steps = math.ceil(validation_generator.samples / batch_size),
                          callbacks = [early, performance_log])
# Save the model
model_final.save('/mnt/results/vanilla_model.h5')
Training log for the modified code:
Epoch 1/100
151/151 [==============================] - 251s 2s/step - loss: 0.6804 - acc: 0.5910 - val_loss: 0.6923 - val_acc: 0.5469
Epoch 2/100
151/151 [==============================] - 87s 578ms/step - loss: 0.6258 - acc: 0.6523 - val_loss: 0.6938 - val_acc: 0.5469
Epoch 3/100
151/151 [==============================] - 88s 580ms/step - loss: 0.5946 - acc: 0.6874 - val_loss: 0.7001 - val_acc: 0.5469
Epoch 4/100
151/151 [==============================] - 88s 580ms/step - loss: 0.5718 - acc: 0.7086 - val_loss: 0.7036 - val_acc: 0.5469
Epoch 5/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5634 - acc: 0.7157 - val_loss: 0.7067 - val_acc: 0.5469
Epoch 6/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5467 - acc: 0.7243 - val_loss: 0.7099 - val_acc: 0.5469
Epoch 7/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5392 - acc: 0.7317 - val_loss: 0.7096 - val_acc: 0.5469
Epoch 8/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5287 - acc: 0.7387 - val_loss: 0.7083 - val_acc: 0.5469
Epoch 9/100
151/151 [==============================] - 87s 575ms/step - loss: 0.5306 - acc: 0.7385 - val_loss: 0.7088 - val_acc: 0.5469
Epoch 10/100
151/151 [==============================] - 87s 577ms/step - loss: 0.5303 - acc: 0.7318 - val_loss: 0.7111 - val_acc: 0.5469
Epoch 11/100
151/151 [==============================] - 87s 578ms/step - loss: 0.5157 - acc: 0.7474 - val_loss: 0.7143 - val_acc: 0.5469
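One way to make the comparison in the log above explicit is to compute the per-epoch gap between the training and validation metrics. The sketch below is only an illustration: it hard-codes the first three epochs of the modified run, and the column names (epoch, acc, loss, val_acc, val_loss) are an assumption about how the CSVLogger file is laid out, so check them against the actual file.

```python
import csv
import io

# First three epochs of the modified run, copied from the training log above.
# Assumption: the column layout mirrors what CSVLogger writes; verify against the real file.
LOG_TEXT = """epoch,acc,loss,val_acc,val_loss
0,0.5910,0.6804,0.5469,0.6923
1,0.6523,0.6258,0.5469,0.6938
2,0.6874,0.5946,0.5469,0.7001
"""

def metric_gaps(csv_file):
    """Return (epoch, val_loss - loss, val_acc - acc) for every row of the log."""
    gaps = []
    for row in csv.DictReader(csv_file):
        gaps.append((
            int(row["epoch"]),
            float(row["val_loss"]) - float(row["loss"]),
            float(row["val_acc"]) - float(row["acc"]),
        ))
    return gaps

def distinct_values(csv_file, column):
    """Collect the distinct values a column takes across epochs."""
    return {float(row[column]) for row in csv.DictReader(csv_file)}

for epoch, loss_gap, acc_gap in metric_gaps(io.StringIO(LOG_TEXT)):
    print(f"epoch {epoch}: val_loss - loss = {loss_gap:+.4f}, val_acc - acc = {acc_gap:+.4f}")

# A validation accuracy that never changes is worth investigating on its own.
print(distinct_values(io.StringIO(LOG_TEXT), "val_acc"))
```

To run this on the real file, pass open('/mnt/results/vanilla_model_log.csv', newline='') in place of the io.StringIO object. Some gap is expected even when everything works, because the training generator applies augmentation while the validation generator only rescales, so the gap alone does not prove the validation metrics are wrong.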
A very quick idea that might help.
I think the image labels may be assigned inconsistently by the two image data generators, so the model is trained with one labeling and validated with another.
If the two generators give different label distributions, that would explain why training accuracy goes up while validation accuracy stays around 50%.
I haven't fully checked the ImageDataGenerator documentation, but I hope this helps.
The classes argument of flow_from_directory() describes how the training labels are set up:
classes: optional list of class subdirectories (e.g. ['dogs',
'cats']). Default: None. If not provided, the list of classes will be
automatically inferred from the subdirectory names/structure under
directory, where each subdirectory will be treated as a different
class (and the order of the classes, which will map to the label
indices, will be alphanumeric). The dictionary containing the mapping
from class names to class indices can be obtained via the attribute
class_indices.
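To check whether the two generators really map class names to different indices, it may help to see how an alphanumeric ordering of subdirectory names turns into label indices. The sketch below is a plain-Python approximation of the behaviour the documentation describes, not Keras's own code. infer_class_indices is a hypothetical helper, and the class1/class2 directory names are taken from the question.

```python
import os
import tempfile

def infer_class_indices(directory):
    """Approximate the documented behaviour: sorted subdirectory names -> 0, 1, 2, ..."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

# Build two throwaway directory trees that stand in for the train and validation folders.
with tempfile.TemporaryDirectory() as root:
    for split in ("train", "validation"):
        for class_name in ("class2", "class1"):  # creation order should not matter
            os.makedirs(os.path.join(root, split, class_name))

    train_indices = infer_class_indices(os.path.join(root, "train"))
    validation_indices = infer_class_indices(os.path.join(root, "validation"))

print(train_indices)
print(validation_indices)
print(train_indices == validation_indices)
```

With the real generators, the equivalent check is to compare train_generator.class_indices with validation_generator.class_indices. If both directories contain the same subdirectory names, the two mappings should match, which would argue against a label mismatch being the cause.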