I have created a transfer learning model with RESNET-50. I am using K-fold cross-validation in my model. However, my model is not performing properly. Although I didn't get any error messages, my accuracy values are very weird. They are the same across all the folds. Please find my code below:
root_path = 'D:/regionGrowing_MLT/NewSavedRGBImages'
datasetFolderName=root_path
MODEL_FILENAME=root_path+"model_cv.h5"
sourceFiles=[]
classLabels=['Benign', 'Malignant']
X=[]
Y=[]
img_rows, img_cols = 224, 224 # input image dimensions
train_path=datasetFolderName+'/Training/'
validation_path=datasetFolderName+'/validation/'
test_path=datasetFolderName+'/test/'
def transferBetweenFolders(source, dest, splitRate):
global sourceFiles
sourceFiles=os.listdir(source)
if(len(sourceFiles)!=0):
transferFileNumbers=int(len(sourceFiles)*splitRate)
transferIndex=random.sample(range(0, len(sourceFiles)), transferFileNumbers)
for eachIndex in transferIndex:
shutil.move(source+str(sourceFiles[eachIndex]), dest+str(sourceFiles[eachIndex]))
else:
print("No file moved. Source empty!")
def transferAllClassBetweenFolders(source, dest, splitRate):
for label in classLabels:
transferBetweenFolders(datasetFolderName+'/'+source+'/'+label+'/',
datasetFolderName+'/'+dest+'/'+label+'/',
splitRate)
def my_metrics(y_true, y_pred):
accuracy=accuracy_score(y_true, y_pred)
precision=precision_score(y_true, y_pred,average='weighted')
f1Score=f1_score(y_true, y_pred, average='weighted')
print("Accuracy : {}".format(accuracy))
print("Precision : {}".format(precision))
print("f1Score : {}".format(f1Score))
cm=confusion_matrix(y_true, y_pred)
print(cm)
return accuracy, precision, f1Score
transferAllClassBetweenFolders('Training', 'test', 0.20)
def prepareNameWithLabels(folderName):
sourceFiles=os.listdir(datasetFolderName+'/Training/'+folderName)
for val in sourceFiles:
X.append(val)
for i in range(len(classLabels)):
if(folderName==classLabels[i]):
Y.append(i)
# Organize file names and class labels in X and Y variables
for i in range(len(classLabels)):
prepareNameWithLabels(classLabels[i])
X=np.asarray(X)
Y=np.asarray(Y)
batch_size = 32
epoch=100
activationFunction='relu'
def getModel():
model = Sequential()
model.add(Flatten())
model.add(Dense(256, activation='relu', name='fc1'))
model.add(Dense(128, activation='relu', name='fc2'))
model.add(layers.Dropout(0.5)) #### used for regularization (to aviod overfitting)
model.add(Dense(2, activation='softmax'))
# model.summary()
model.compile(optimizer=optimizers.Adam(learning_rate=2e-5),
loss='binary_crossentropy',
metrics=['accuracy'])
return model
model=getModel()
# ===============Stratified K-Fold======================
skf = StratifiedKFold(n_splits=5, shuffle=True)
skf.get_n_splits(X, Y)
foldNum=0
for train_index, val_index in skf.split(X, Y):
#First cut all images from validation to train (if any exists)
transferAllClassBetweenFolders('validation', 'Training', 1.0)
foldNum+=1
print("Results for fold",foldNum)
X_train, X_val = X[train_index], X[val_index]
Y_train, Y_val = Y[train_index], Y[val_index]
# Move validation images of this fold from train folder to the validation folder
for eachIndex in range(len(X_val)):
classLabel=''
for i in range(len(classLabels)):
if(Y_val[eachIndex]==i):
classLabel=classLabels[i]
#Then, copy the validation images to the validation folder
shutil.move(datasetFolderName+'/Training/'+classLabel+'/'+X_val[eachIndex],
datasetFolderName+'/validation/'+classLabel+'/'+X_val[eachIndex])
train_datagen = ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
rotation_range=40,
shear_range=0.2,
width_shift_range=0.2,
height_shift_range=0.2,
zoom_range=0.20,
fill_mode="nearest")
validation_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
#Start ImageClassification Model
train_generator = train_datagen.flow_from_directory(
train_path,
target_size=(img_rows, img_cols),
batch_size=batch_size,
class_mode='categorical',
subset='training')
validation_generator = validation_datagen.flow_from_directory(
validation_path,
target_size=(img_rows, img_cols),
batch_size=batch_size,
class_mode=None, # only data, no labels
shuffle=False)
# fit model
history=model.fit_generator(train_generator,
epochs=epoch)
predictions = model.predict_generator(validation_generator, verbose=1)
yPredictions = np.argmax(predictions, axis=1)
true_classes = validation_generator.classes
# evaluate validation performance
print("***Performance on Validation data***")
valAcc, valPrec, valFScore = my_metrics(true_classes, yPredictions)
My results are below:
Epoch 1/100
72/72 [==============================] - 25s 350ms/step - loss: 0.6931 - accuracy: 0.5013
Epoch 2/100
72/72 [==============================] - 25s 349ms/step - loss: 0.6931 - accuracy: 0.5013
Epoch 3/100
72/72 [==============================] - 25s 348ms/step - loss: 0.6931 - accuracy: 0.5013
Epoch 4/100
72/72 [==============================] - 25s 348ms/step - loss: 0.6931 - accuracy: 0.5013
Epoch 5/100
72/72 [==============================] - 25s 352ms/step - loss: 0.6931 - accuracy: 0.5013
Epoch 6/100
72/72 [==============================] - 25s 351ms/step - loss: 0.6931 - accuracy: 0.5013
Epoch 7/100
72/72 [==============================] - 25s 353ms/step - loss: 0.6931 - accuracy: 0.5013
As can be seen above, the accuracy values are the same for all epochs which shouldn't be the case. I am not sure exactly where I am making the error. Any suggestions would be appreciated. Thank you.
Related
I am running tf.keras.callbacks.ModelCheckpoint with the accuracy metric but loss is used to save the best checkpoints. I have tested this in different places (my computer and collab) and two different code and faced the same issue. Here is an example code and the results:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os
import shutil
def get_uncompiled_model():
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
return model
def get_compiled_model():
model = get_uncompiled_model()
model.compile(
optimizer="rmsprop",
loss="sparse_categorical_crossentropy",
metrics=["accuracy"],
)
return model
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Preprocess the data (these are NumPy arrays)
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255
y_train = y_train.astype("float32")
y_test = y_test.astype("float32")
# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]
ckpt_folder = os.path.join(os.getcwd(), 'ckpt')
if os.path.exists(ckpt_folder):
shutil.rmtree(ckpt_folder)
ckpt_path = os.path.join(r'D:\deep_learning\tf_keras\semantic_segmentation\logs', 'mymodel_{epoch}')
callbacks = [
tf.keras.callbacks.ModelCheckpoint(
# Path where to save the model
# The two parameters below mean that we will overwrite
# the current checkpoint if and only if
# the `val_loss` score has improved.
# The saved model name will include the current epoch.
filepath=ckpt_path,
montior="val_accuracy",
# save the model weights with best validation accuracy
mode='max',
save_best_only=True, # only save the best weights
save_weights_only=False,
# only save model weights (not whole model)
verbose=1
)
]
model = get_compiled_model()
model.fit(
x_train, y_train, epochs=3, batch_size=1, callbacks=callbacks, validation_split=0.2, steps_per_epoch=1
)
1/1 [==============================] - ETA: 0s - loss: 2.6475 - accuracy: 0.0000e+00
Epoch 1: val_loss improved from -inf to 2.32311, saving model to D:\deep_learning\tf_keras\semantic_segmentation\logs\mymodel_1
1/1 [==============================] - 6s 6s/step - loss: 2.6475 - accuracy: 0.0000e+00 - val_loss: 2.3231 - val_accuracy: 0.1142
Epoch 2/3
1/1 [==============================] - ETA: 0s - loss: 1.9612 - accuracy: 1.0000
Epoch 2: val_loss improved from 2.32311 to 2.34286, saving model to D:\deep_learning\tf_keras\semantic_segmentation\logs\mymodel_2
1/1 [==============================] - 5s 5s/step - loss: 1.9612 - accuracy: 1.0000 - val_loss: 2.3429 - val_accuracy: 0.1187
Epoch 3/3
1/1 [==============================] - ETA: 0s - loss: 2.8378 - accuracy: 0.0000e+00
Epoch 3: val_loss did not improve from 2.34286
1/1 [==============================] - 5s 5s/step - loss: 2.8378 - accuracy: 0.0000e+00 - val_loss: 2.2943 - val_accuracy: 0.1346
In your code, You write montior instead of monitor, and the function doesn't have this word as param then use the default value, If you write like below, You get what you want:
callbacks = [
tf.keras.callbacks.ModelCheckpoint(
filepath=ckpt_path,
monitor="val_accuracy",
mode='max',
save_best_only=True,
save_weights_only=False,
verbose=1
)
]
I wrote this very simple code
model = keras.models.Sequential()
model.add(layers.Dense(13000, input_dim=X_train.shape[1], activation='relu', trainable=False))
model.add(layers.Dense(1, input_dim=13000, activation='linear'))
model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])
model.fit(X_train, y_train, batch_size=X_train.shape[0], epochs=1000000, verbose=1)
The data is MNIST but only for digits '0' and '1'.
I have a very strange issue, where the loss is monotonically decreasing to zero, as expected, yet the accuracy instead of increasing, is also decreasing.
Here is a sample output
12665/12665 [==============================] - 0s 11us/step - loss: 0.0107 - accuracy: 0.2355
Epoch 181/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0114 - accuracy: 0.2568
Epoch 182/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0128 - accuracy: 0.2726
Epoch 183/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0133 - accuracy: 0.2839
Epoch 184/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0134 - accuracy: 0.2887
Epoch 185/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0110 - accuracy: 0.2842
Epoch 186/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0101 - accuracy: 0.2722
Epoch 187/1000000
12665/12665 [==============================] - 0s 11us/step - loss: 0.0094 - accuracy: 0.2583
Since we only have two classes, the benchmark for lowest possible accuracy should be 0.5, and furthermore we are monitoring accuracy on the training set, so it should very going up to 100%, I expect overfitting and I am overfitting according to the loss function.
At the final epoch, this is the situation
12665/12665 [==============================] - 0s 11us/step - loss: 9.9710e-06 - accuracy: 0.0758
a 7% accuracy when the worst theoretical possibility if you guess randomly is 50%. This is no accident. Something is going on here.
Can anyone see the problem?
Entire code
from tensorflow import keras
import numpy as np
from matplotlib import pyplot as plt
import keras
from keras.callbacks import Callback
from keras import layers
import warnings
class EarlyStoppingByLossVal(Callback):
def __init__(self, monitor='val_loss', value=0.00001, verbose=0):
super(Callback, self).__init__()
self.monitor = monitor
self.value = value
self.verbose = verbose
def on_epoch_end(self, epoch, logs={}):
current = logs.get(self.monitor)
if current is None:
warnings.warn("Early stopping requires %s available!" % self.monitor, RuntimeWarning)
if current < self.value:
if self.verbose > 0:
print("Epoch %05d: early stopping THR" % epoch)
self.model.stop_training = True
def load_mnist():
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = np.reshape(train_images, (train_images.shape[0], train_images.shape[1] * train_images.shape[2]))
test_images = np.reshape(test_images, (test_images.shape[0], test_images.shape[1] * test_images.shape[2]))
train_labels = np.reshape(train_labels, (train_labels.shape[0],))
test_labels = np.reshape(test_labels, (test_labels.shape[0],))
train_images = train_images[(train_labels == 0) | (train_labels == 1)]
test_images = test_images[(test_labels == 0) | (test_labels == 1)]
train_labels = train_labels[(train_labels == 0) | (train_labels == 1)]
test_labels = test_labels[(test_labels == 0) | (test_labels == 1)]
train_images, test_images = train_images / 255, test_images / 255
return train_images, train_labels, test_images, test_labels
X_train, y_train, X_test, y_test = load_mnist()
train_acc = []
train_errors = []
test_acc = []
test_errors = []
width_list = [13000]
for width in width_list:
print(width)
model = keras.models.Sequential()
model.add(layers.Dense(width, input_dim=X_train.shape[1], activation='relu', trainable=False))
model.add(layers.Dense(1, input_dim=width, activation='linear'))
model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])
callbacks = [EarlyStoppingByLossVal(monitor='loss', value=0.00001, verbose=1)]
model.fit(X_train, y_train, batch_size=X_train.shape[0], epochs=1000000, verbose=1, callbacks=callbacks)
train_errors.append(model.evaluate(X_train, y_train)[0])
test_errors.append(model.evaluate(X_test, y_test)[0])
train_acc.append(model.evaluate(X_train, y_train)[1])
test_acc.append(model.evaluate(X_test, y_test)[1])
plt.plot(width_list, train_errors, marker='D')
plt.xlabel("width")
plt.ylabel("train loss")
plt.show()
plt.plot(width_list, test_errors, marker='D')
plt.xlabel("width")
plt.ylabel("test loss")
plt.show()
plt.plot(width_list, train_acc, marker='D')
plt.xlabel("width")
plt.ylabel("train acc")
plt.show()
plt.plot(width_list, test_acc, marker='D')
plt.xlabel("width")
plt.ylabel("test acc")
plt.show()
A linear activation in the last layer for a (binary) classification problem is meaningless; change your last layer to:
model.add(layers.Dense(1, input_dim=width, activation='sigmoid'))
Linear activations for the last layer are used for regression problems and not for classification ones.
I'm a beginner in deep learning and I'm trying to train a deep learning model to classify different ASL hand signs using Mobilenet_v2 and Inception.
Here are my codes create an ImageDataGenerator for creating the training and validation set.
# Reformat Images and Create Batches
IMAGE_RES = 224
BATCH_SIZE = 32
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
rescale=1./255,
validation_split = 0.4
)
train_generator = datagen.flow_from_directory(
base_dir,
target_size = (IMAGE_RES,IMAGE_RES),
batch_size = BATCH_SIZE,
subset = 'training'
)
val_generator = datagen.flow_from_directory(
base_dir,
target_size= (IMAGE_RES, IMAGE_RES),
batch_size = BATCH_SIZE,
subset = 'validation'
)
Here are the codes to train the models:
# Do transfer learning with Tensorflow Hub
URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL,
input_shape=(IMAGE_RES, IMAGE_RES, 3))
# Freeze pre-trained model
feature_extractor.trainable = False
# Attach a classification head
model = tf.keras.Sequential([
feature_extractor,
layers.Dense(5, activation='softmax')
])
model.summary()
# Train the model
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
EPOCHS = 5
history = model.fit(train_generator,
steps_per_epoch=len(train_generator),
epochs=EPOCHS,
validation_data = val_generator,
validation_steps=len(val_generator)
)
Epoch 1/5
94/94 [==============================] - 19s 199ms/step - loss: 0.7333 - accuracy: 0.7730 - val_loss: 0.6276 - val_accuracy: 0.7705
Epoch 2/5
94/94 [==============================] - 18s 190ms/step - loss: 0.1574 - accuracy: 0.9893 - val_loss: 0.5118 - val_accuracy: 0.8145
Epoch 3/5
94/94 [==============================] - 18s 191ms/step - loss: 0.0783 - accuracy: 0.9980 - val_loss: 0.4850 - val_accuracy: 0.8235
Epoch 4/5
94/94 [==============================] - 18s 196ms/step - loss: 0.0492 - accuracy: 0.9997 - val_loss: 0.4541 - val_accuracy: 0.8395
Epoch 5/5
94/94 [==============================] - 18s 193ms/step - loss: 0.0349 - accuracy: 0.9997 - val_loss: 0.4590 - val_accuracy: 0.8365
I've tried using data augmentation but the model still overfits so I'm wondering if I've done something wrong in my code.
Your data is very small. Try splitting with random seeds and check if the problem still persists.
If it does, then use regularizations and decrease the complexity of neural network.
Also experiment with different optimizers and smaller learning rate (try lr scheduler)
It seems like your dataset is very small with some true outputs separated only by a small distance of inputs in the input-output curve. That is why it is fitting easily to those points.
I trained LSTM classification model, but got weird results (0 accuracy). Here is my dataset with preprocessing steps:
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
import numpy as np
url = 'https://raw.githubusercontent.com/MislavSag/trademl/master/trademl/modeling/random_forest/X_TEST.csv'
X_TEST = pd.read_csv(url, sep=',')
url = 'https://raw.githubusercontent.com/MislavSag/trademl/master/trademl/modeling/random_forest/labeling_info_TEST.csv'
labeling_info_TEST = pd.read_csv(url, sep=',')
# TRAIN TEST SPLIT
X_train, X_test, y_train, y_test = train_test_split(
X_TEST.drop(columns=['close_orig']), labeling_info_TEST['bin'],
test_size=0.10, shuffle=False, stratify=None)
### PREPARE LSTM
x = X_train['close'].values.reshape(-1, 1)
y = y_train.values.reshape(-1, 1)
x_test = X_test['close'].values.reshape(-1, 1)
y_test = y_test.values.reshape(-1, 1)
train_val_index_split = 0.75
train_generator = keras.preprocessing.sequence.TimeseriesGenerator(
data=x,
targets=y,
length=30,
sampling_rate=1,
stride=1,
start_index=0,
end_index=int(train_val_index_split*X_TEST.shape[0]),
shuffle=False,
reverse=False,
batch_size=128
)
validation_generator = keras.preprocessing.sequence.TimeseriesGenerator(
data=x,
targets=y,
length=30,
sampling_rate=1,
stride=1,
start_index=int((train_val_index_split*X_TEST.shape[0] + 1)),
end_index=None, #int(train_test_index_split*X.shape[0])
shuffle=False,
reverse=False,
batch_size=128
)
test_generator = keras.preprocessing.sequence.TimeseriesGenerator(
data=x_test,
targets=y_test,
length=30,
sampling_rate=1,
stride=1,
start_index=0,
end_index=None,
shuffle=False,
reverse=False,
batch_size=128
)
# convert generator to inmemory 3D series (if enough RAM)
def generator_to_obj(generator):
xlist = []
ylist = []
for i in range(len(generator)):
x, y = train_generator[i]
xlist.append(x)
ylist.append(y)
X_train = np.concatenate(xlist, axis=0)
y_train = np.concatenate(ylist, axis=0)
return X_train, y_train
X_train_lstm, y_train_lstm = generator_to_obj(train_generator)
X_val_lstm, y_val_lstm = generator_to_obj(validation_generator)
X_test_lstm, y_test_lstm = generator_to_obj(test_generator)
# test for shapes
print('X and y shape train: ', X_train_lstm.shape, y_train_lstm.shape)
print('X and y shape validate: ', X_val_lstm.shape, y_val_lstm.shape)
print('X and y shape test: ', X_test_lstm.shape, y_test_lstm.shape)
and here is my model with resuslts:
### MODEL
model = keras.models.Sequential([
keras.layers.LSTM(124, return_sequences=True, input_shape=[None, 1]),
keras.layers.LSTM(258),
keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train_lstm, y_train_lstm, epochs=10, batch_size=128,
validation_data=[X_val_lstm, y_val_lstm])
# history = model.fit_generator(train_generator, epochs=40, validation_data=validation_generator, verbose=1)
score, acc = model.evaluate(X_val_lstm, y_val_lstm,
batch_size=128)
historydf = pd.DataFrame(history.history)
historydf.head(10)
Why do I get 0 accuracy?
You're using sigmoid activation, which means your labels must be in range 0 and 1. But in your case, the labels are 1. and -1.
Just replace -1 with 0.
for i, y in enumerate(y_train_lstm):
if y == -1.:
y_train_lstm[i,:] = 0.
for i, y in enumerate(y_val_lstm):
if y == -1.:
y_val_lstm[i,:] = 0.
for i, y in enumerate(y_test_lstm):
if y == -1.:
y_test_lstm[i,:] = 0.
Sidenote:
The signals are very close, it would be hard to distinguish them. So, probably accuracy won't be high with simple models.
After training with 0. and 1. labels,
model = keras.models.Sequential([
keras.layers.LSTM(124, return_sequences=True, input_shape=(30, 1)),
keras.layers.LSTM(258),
keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train_lstm, y_train_lstm, epochs=5, batch_size=128,
validation_data=(X_val_lstm, y_val_lstm))
# history = model.fit_generator(train_generator, epochs=40, validation_data=validation_generator, verbose=1)
score, acc = model.evaluate(X_val_lstm, y_val_lstm,
batch_size=128)
historydf = pd.DataFrame(history.history)
historydf.head(10)
Epoch 1/5
12/12 [==============================] - 5s 378ms/step - loss: 0.7386 - accuracy: 0.4990 - val_loss: 0.6959 - val_accuracy: 0.4896
Epoch 2/5
12/12 [==============================] - 4s 318ms/step - loss: 0.6947 - accuracy: 0.5133 - val_loss: 0.6959 - val_accuracy: 0.5104
Epoch 3/5
12/12 [==============================] - 4s 318ms/step - loss: 0.6941 - accuracy: 0.4895 - val_loss: 0.6930 - val_accuracy: 0.5104
Epoch 4/5
12/12 [==============================] - 4s 332ms/step - loss: 0.6946 - accuracy: 0.5269 - val_loss: 0.6946 - val_accuracy: 0.5104
Epoch 5/5
12/12 [==============================] - 4s 334ms/step - loss: 0.6931 - accuracy: 0.4901 - val_loss: 0.6929 - val_accuracy: 0.5104
3/3 [==============================] - 0s 73ms/step - loss: 0.6929 - accuracy: 0.5104
loss accuracy val_loss val_accuracy
0 0.738649 0.498980 0.695888 0.489583
1 0.694708 0.513256 0.695942 0.510417
2 0.694117 0.489463 0.692987 0.510417
3 0.694554 0.526852 0.694613 0.510417
4 0.693118 0.490143 0.692936 0.510417
Source code in colab: https://colab.research.google.com/drive/10yRf4TfGDnp_4F2HYoxPyTlF18no-8Dr?usp=sharing
I'm using ImageDataGenerator and flow_from_directory to generate my data, and
using model.fit_generator to fit the data.
This defaults to outputting the accuracy for training data set only.
There doesn't seem to be an option to output validation accuracy to the terminal.
Here is the relevant portion of my code:
#train data generator
print('Starting Preprocessing')
train_datagen = ImageDataGenerator(preprocessing_function = preprocess)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size = (img_height, img_width),
batch_size = batch_size,
class_mode = 'categorical') #class_mode = 'categorical'
#same for validation
val_datagen = ImageDataGenerator(preprocessing_function = preprocess)
validation_generator = val_datagen.flow_from_directory(
validation_data_dir,
target_size = (img_height, img_width),
batch_size=batch_size,
class_mode='categorical')
########################Model Creation###################################
#create the base pre-trained model
print('Finished Preprocessing, starting model creating \n')
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(12, activation='softmax')(x)
model = Model(input=base_model.input, output=predictions)
for layer in model.layers[:-34]:
layer.trainable = False
for layer in model.layers[-34:]:
layer.trainable = True
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.001, momentum=0.92),
loss='categorical_crossentropy',
metrics = ['accuracy'])
#############SAVE Model #######################################
file_name = str(datetime.datetime.now()).split(' ')[0] + '_{epoch:02d}.hdf5'
filepath = os.path.join(save_dir, file_name)
checkpoints =ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
save_best_only=False, save_weights_only=False,
mode='auto', period=2)
###############Fit Model #############################
model.fit_generator(
train_generator,
steps_per_epoch =total_samples//batch_size,
epochs = epochs,
validation_data=validation_generator,
validation_steps=total_validation//batch_size,
callbacks = [checkpoints],
shuffle= True)
UPDATE OUTPUT:
Throughout training, I'm only getting the output of training accuracy,
but at the end of training, I"m getting both training, validation accuracy.
Epoch 1/10
1/363 [..............................] - ETA: 1:05:58 - loss: 2.4976 - acc: 0.0640
2/363 [..............................] - ETA: 51:33 - loss: 2.4927 - acc: 0.0760
3/363 [..............................] - ETA: 48:55 - loss: 2.5067 - acc: 0.0787
4/363 [..............................] - ETA: 47:26 - loss: 2.5110 - acc: 0.0770
5/363 [..............................] - ETA: 46:30 - loss: 2.5021 - acc: 0.0824
6/363 [..............................] - ETA: 45:56 - loss: 2.5063 - acc: 0.0820
The idea is that you go through you validation set after each epoch, not after each batch.
If after every batch, you had to evaluate the performances of the model on the whole validation set, you would loose a lot of time.
After each epoch, you will have the corresponding losses and accuracies both for training and validation. But during one epoch, you will only have access to the training loss and accuracy.
Validation loss and validation accuracy gets printed for every epoch once you specify the validation_split.
model.fit(X, Y, epochs=1000, batch_size=10, validation_split=0.2)
I have used the above in my code, and val_loss and val_acc are getting printed for every epoch, but not after every batch.
Hope that answers your question.
Epoch 1/500
1267/1267 [==============================] - 0s 376us/step - loss: 0.6428 - acc: 0.6409 - val_loss: 0.5963 - val_acc: 0.6656
In fit_generator,
fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, **validation_data=None, validation_steps=None**, validation_freq=1, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0)
since there is no validation_split parameter, you can create two different ImageDataGenerator flow, one for training and one for validating and then place that 'validation_generator' in validation_data. Then it will print the validation loss and accuracy.