I want to binary-classify breast cancer histopathological images from the BreakHis dataset (https://www.kaggle.com/ambarish/breakhis) using transfer learning with Inception ResNet v2. The goal is to freeze all layers and train the fully connected layer by adding two neurons to the model. In particular, I initially want to consider the images for the magnification factor 40X (Benign: 625, Malignant: 1370). Here is a summary of what I do:
I read the images and resize them to 150x150
I partition the dataset into training, validation and test set
I load the pre-trained network Inception Resnet v2
I freeze all the layers and add the two neurons for binary
classification (1 = "benign", 0 = "malignant")
I compile the model using the Adam optimizer
I carry out the training
I make the prediction
I calculate the accuracy
This is the code:
data = dataset[dataset["Magnificant"]=="40X"]

def preprocessing(dataset, img_size):
    # images
    X = []
    # labels
    y = []
    i = 0
    for image in list(dataset["Path"]):
        # Resize and read the images
        X.append(cv2.resize(cv2.imread(image, cv2.IMREAD_COLOR),
                            (img_size, img_size), interpolation=cv2.INTER_CUBIC))
        basename = os.path.basename(image)
        # Get labels
        if dataset.loc[i][2] == "benign":
            y.append(1)
        else:
            y.append(0)
        i = i + 1
    return X, y
X, y = preprocessing(data, 150)
X = np.array(X)
y = np.array(y)
# Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify = y_40, shuffle=True, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=1)
conv_base = InceptionResNetV2(weights='imagenet', include_top=False, input_shape=[150, 150, 3])
# Freezing
for layer in conv_base.layers:
    layer.trainable = False
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(learning_rate=0.0002)
loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)
model.compile(loss=loss, optimizer=opt, metrics = ["accuracy", tf.metrics.AUC()])
batch_size = 32
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow(X_train, y_train, batch_size=batch_size)
val_generator = val_datagen.flow(X_val, y_val, batch_size=batch_size)
ntrain = len(X_train)
nval = len(X_val)
len(y_train)
epochs = 70
history = model.fit_generator(train_generator,
                              steps_per_epoch=ntrain // batch_size,
                              epochs=epochs,
                              validation_data=val_generator,
                              validation_steps=nval // batch_size)
This is the output of the training at the last epoch:
Epoch 70/70
32/32 [==============================] - 3s 84ms/step - loss: 0.0499 - accuracy: 0.9903 - auc_5: 0.9996 - val_loss: 0.5661 - val_accuracy: 0.8250 - val_auc_5: 0.8521
I make the prediction:
test_datagen = ImageDataGenerator(rescale=1./255)
x = X_test
y_pred = model.predict(test_datagen.flow(x))
y_p = []
for i in range(len(y_pred)):
    if y_pred[i] > 0.5:
        y_p.append(1)
    else:
        y_p.append(0)
I calculate the accuracy:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_p)
print(accuracy)
This is the accuracy value I get: 0.5459098497495827
Why do I get such low accuracy? I have done several tests but I always get similar results.
Update
I have made the following changes, but I always get the same results (showing only the modified parts of the code):
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify = y, shuffle=True, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, stratify = y_train, shuffle=True, random_state=1)
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
ntrain = len(X_train)
nval = len(X_val)
len(y_train)
epochs = 70
history = model.fit_generator(train_generator,
                              steps_per_epoch=ntrain // batch_size,
                              epochs=epochs,
                              validation_data=val_generator,
                              validation_steps=nval // batch_size,
                              callbacks=[callback])
Update 2
I also changed from_logits from True to False, but that alone did not solve the problem. I still get about 57% accuracy.
This is the model.fit output over 30 epochs:
Epoch 1/30
32/32 [==============================] - 23s 202ms/step - loss: 0.7994 - accuracy: 0.6010 - auc: 0.5272 - val_loss: 0.5338 - val_accuracy: 0.7688 - val_auc: 0.7943
Epoch 2/30
32/32 [==============================] - 3s 87ms/step - loss: 0.5778 - accuracy: 0.7206 - auc: 0.7521 - val_loss: 0.4763 - val_accuracy: 0.7781 - val_auc: 0.8155
Epoch 3/30
32/32 [==============================] - 3s 85ms/step - loss: 0.5311 - accuracy: 0.7581 - auc: 0.7710 - val_loss: 0.4740 - val_accuracy: 0.7719 - val_auc: 0.8212
Epoch 4/30
32/32 [==============================] - 3s 85ms/step - loss: 0.4684 - accuracy: 0.7718 - auc: 0.8219 - val_loss: 0.4270 - val_accuracy: 0.8031 - val_auc: 0.8611
Epoch 5/30
32/32 [==============================] - 3s 83ms/step - loss: 0.4280 - accuracy: 0.7943 - auc: 0.8617 - val_loss: 0.4496 - val_accuracy: 0.7969 - val_auc: 0.8468
Epoch 6/30
32/32 [==============================] - 3s 88ms/step - loss: 0.4237 - accuracy: 0.8250 - auc: 0.8673 - val_loss: 0.3993 - val_accuracy: 0.7937 - val_auc: 0.8840
Epoch 7/30
32/32 [==============================] - 3s 85ms/step - loss: 0.4130 - accuracy: 0.8513 - auc: 0.8767 - val_loss: 0.4207 - val_accuracy: 0.7781 - val_auc: 0.8692
Epoch 8/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3446 - accuracy: 0.8485 - auc: 0.9077 - val_loss: 0.4229 - val_accuracy: 0.7937 - val_auc: 0.8730
Epoch 9/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3690 - accuracy: 0.8514 - auc: 0.9003 - val_loss: 0.4300 - val_accuracy: 0.8062 - val_auc: 0.8696
Epoch 10/30
32/32 [==============================] - 3s 100ms/step - loss: 0.3204 - accuracy: 0.8533 - auc: 0.9270 - val_loss: 0.4235 - val_accuracy: 0.7969 - val_auc: 0.8731
Epoch 11/30
32/32 [==============================] - 3s 86ms/step - loss: 0.3555 - accuracy: 0.8508 - auc: 0.9124 - val_loss: 0.4124 - val_accuracy: 0.8000 - val_auc: 0.8797
Epoch 12/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3243 - accuracy: 0.8481 - auc: 0.9308 - val_loss: 0.3979 - val_accuracy: 0.7969 - val_auc: 0.8908
Epoch 13/30
32/32 [==============================] - 3s 85ms/step - loss: 0.3017 - accuracy: 0.8744 - auc: 0.9348 - val_loss: 0.4239 - val_accuracy: 0.8094 - val_auc: 0.8758
Epoch 14/30
32/32 [==============================] - 3s 89ms/step - loss: 0.3317 - accuracy: 0.8521 - auc: 0.9221 - val_loss: 0.4238 - val_accuracy: 0.8094 - val_auc: 0.8704
Epoch 15/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2840 - accuracy: 0.8908 - auc: 0.9490 - val_loss: 0.4131 - val_accuracy: 0.8281 - val_auc: 0.8858
Epoch 16/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2583 - accuracy: 0.8905 - auc: 0.9511 - val_loss: 0.3841 - val_accuracy: 0.8375 - val_auc: 0.9007
Epoch 17/30
32/32 [==============================] - 3s 87ms/step - loss: 0.2810 - accuracy: 0.8648 - auc: 0.9470 - val_loss: 0.3928 - val_accuracy: 0.8438 - val_auc: 0.8972
Epoch 18/30
32/32 [==============================] - 3s 89ms/step - loss: 0.2622 - accuracy: 0.8923 - auc: 0.9550 - val_loss: 0.3732 - val_accuracy: 0.8438 - val_auc: 0.9089
Epoch 19/30
32/32 [==============================] - 3s 84ms/step - loss: 0.2486 - accuracy: 0.8990 - auc: 0.9579 - val_loss: 0.4077 - val_accuracy: 0.8250 - val_auc: 0.8924
Epoch 20/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2412 - accuracy: 0.9074 - auc: 0.9635 - val_loss: 0.4249 - val_accuracy: 0.8219 - val_auc: 0.8787
Epoch 21/30
32/32 [==============================] - 3s 84ms/step - loss: 0.2386 - accuracy: 0.9095 - auc: 0.9657 - val_loss: 0.4177 - val_accuracy: 0.8094 - val_auc: 0.8904
Epoch 22/30
32/32 [==============================] - 3s 99ms/step - loss: 0.2313 - accuracy: 0.8996 - auc: 0.9668 - val_loss: 0.4089 - val_accuracy: 0.8406 - val_auc: 0.8890
Epoch 23/30
32/32 [==============================] - 3s 86ms/step - loss: 0.2424 - accuracy: 0.9067 - auc: 0.9654 - val_loss: 0.4033 - val_accuracy: 0.8500 - val_auc: 0.8953
Epoch 24/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2315 - accuracy: 0.9045 - auc: 0.9626 - val_loss: 0.3903 - val_accuracy: 0.8250 - val_auc: 0.9030
Epoch 25/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2001 - accuracy: 0.9321 - auc: 0.9788 - val_loss: 0.4276 - val_accuracy: 0.8000 - val_auc: 0.8855
Epoch 26/30
32/32 [==============================] - 3s 87ms/step - loss: 0.2118 - accuracy: 0.9212 - auc: 0.9695 - val_loss: 0.4335 - val_accuracy: 0.8125 - val_auc: 0.8897
Epoch 27/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2463 - accuracy: 0.8941 - auc: 0.9665 - val_loss: 0.4112 - val_accuracy: 0.8438 - val_auc: 0.8882
Epoch 28/30
32/32 [==============================] - 3s 85ms/step - loss: 0.2130 - accuracy: 0.9033 - auc: 0.9771 - val_loss: 0.3834 - val_accuracy: 0.8406 - val_auc: 0.9021
Epoch 29/30
32/32 [==============================] - 3s 86ms/step - loss: 0.2021 - accuracy: 0.9229 - auc: 0.9754 - val_loss: 0.3855 - val_accuracy: 0.8469 - val_auc: 0.9008
Epoch 30/30
32/32 [==============================] - 3s 88ms/step - loss: 0.1859 - accuracy: 0.9314 - auc: 0.9824 - val_loss: 0.4018 - val_accuracy: 0.8375 - val_auc: 0.8928
You have to change from_logits=True to from_logits=False in your loss function, since your final layer already applies a sigmoid. Credits again to @Frightera.
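Concretely, since your last layer is Dense(1, activation='sigmoid'), the model already outputs probabilities, so the compile step should look like this (a minimal sketch of just that change):

# The sigmoid output produces probabilities, so the loss must not expect logits
loss = tf.keras.losses.BinaryCrossentropy(from_logits=False)
opt = tf.keras.optimizers.Adam(learning_rate=0.0002)
model.compile(loss=loss, optimizer=opt, metrics=["accuracy", tf.metrics.AUC()])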
It seems like your model is overfitting somewhere. It would be best if you could check for that.
Run a 10-fold cross-validation test; it will give a much more reliable picture of the real performance.
Add the F1 score to your metrics. It accounts for both false positives and false negatives, so it gives a more honest view of the true positives than accuracy alone.
Add some augmentations (apart from the rescaling one) to make the model robust to variations in the dataset; a sketch is given after this list.
Tweak the training parameters if you feel it is needed.
If these changes fail, then the model may simply not be learning the relevant features of these images, and you should try a different architecture!
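For example, the augmentation and F1 pieces might look roughly like this (a sketch: the augmentation ranges are arbitrary, and sklearn's f1_score is applied to the thresholded test predictions y_p you already compute):

from sklearn.metrics import f1_score
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Extra augmentations on top of the rescaling already used for the training generator
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=20,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   zoom_range=0.1,
                                   horizontal_flip=True,
                                   vertical_flip=True)

# F1 on the held-out test set, computed from the 0/1 predictions
print("F1 score:", f1_score(y_test, y_p))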
Related
I'm doing a species classification task from kaggle (https://www.kaggle.com/competitions/yum-or-yuck-butterfly-mimics-2022/overview). I decided to use transfer learning to tackle this problem since there aren't that many images. The model is as follows:
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet")

for layer in base_model.layers:
    layer.trainable = False
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(1024, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(512, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=output)
As per the guidelines for transfer learning (https://keras.io/guides/transfer_learning/), I'm freezing the ResNet layers and running the base model in inference mode (training=False). However, the results show that the model is not learning properly; convergence doesn't seem likely even after nearly 200 epochs:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="categorical_crossentropy",
    metrics="accuracy",
)

stop_early = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=0.0001,
    patience=20,
    restore_best_weights=True
)

history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=200,
                    callbacks=[stop_early])
Epoch 1/200
22/22 [==============================] - 19s 442ms/step - loss: 1.9317 - accuracy: 0.1794 - val_loss: 1.8272 - val_accuracy: 0.1618
Epoch 2/200
22/22 [==============================] - 9s 398ms/step - loss: 1.8250 - accuracy: 0.1882 - val_loss: 1.7681 - val_accuracy: 0.2197
Epoch 3/200
22/22 [==============================] - 9s 402ms/step - loss: 1.7927 - accuracy: 0.2294 - val_loss: 1.7612 - val_accuracy: 0.2139
Epoch 4/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7930 - accuracy: 0.2000 - val_loss: 1.7640 - val_accuracy: 0.2139
Epoch 5/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7872 - accuracy: 0.2132 - val_loss: 1.7489 - val_accuracy: 0.3121
Epoch 6/200
22/22 [==============================] - 9s 389ms/step - loss: 1.7700 - accuracy: 0.2574 - val_loss: 1.7378 - val_accuracy: 0.2543
Epoch 7/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7676 - accuracy: 0.2353 - val_loss: 1.7229 - val_accuracy: 0.3064
Epoch 8/200
22/22 [==============================] - 9s 427ms/step - loss: 1.7721 - accuracy: 0.2353 - val_loss: 1.7225 - val_accuracy: 0.2948
Epoch 9/200
22/22 [==============================] - 9s 399ms/step - loss: 1.7522 - accuracy: 0.2588 - val_loss: 1.7267 - val_accuracy: 0.2948
Epoch 10/200
22/22 [==============================] - 9s 395ms/step - loss: 1.7434 - accuracy: 0.2735 - val_loss: 1.7151 - val_accuracy: 0.2948
Epoch 11/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7500 - accuracy: 0.2632 - val_loss: 1.7083 - val_accuracy: 0.3064
Epoch 12/200
22/22 [==============================] - 9s 425ms/step - loss: 1.7307 - accuracy: 0.2721 - val_loss: 1.6899 - val_accuracy: 0.3179
Epoch 13/200
22/22 [==============================] - 9s 407ms/step - loss: 1.7439 - accuracy: 0.2794 - val_loss: 1.7045 - val_accuracy: 0.2948
Epoch 14/200
22/22 [==============================] - 9s 404ms/step - loss: 1.7376 - accuracy: 0.2706 - val_loss: 1.7118 - val_accuracy: 0.2659
Epoch 15/200
22/22 [==============================] - 9s 419ms/step - loss: 1.7588 - accuracy: 0.2647 - val_loss: 1.6684 - val_accuracy: 0.3237
Epoch 16/200
22/22 [==============================] - 9s 394ms/step - loss: 1.7289 - accuracy: 0.2824 - val_loss: 1.6733 - val_accuracy: 0.3064
Epoch 17/200
22/22 [==============================] - 9s 387ms/step - loss: 1.7184 - accuracy: 0.2809 - val_loss: 1.7185 - val_accuracy: 0.2659
Epoch 18/200
22/22 [==============================] - 9s 408ms/step - loss: 1.7242 - accuracy: 0.2765 - val_loss: 1.6961 - val_accuracy: 0.2717
Epoch 19/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7218 - accuracy: 0.2853 - val_loss: 1.6757 - val_accuracy: 0.3006
Epoch 20/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7248 - accuracy: 0.2882 - val_loss: 1.6716 - val_accuracy: 0.3064
Epoch 21/200
22/22 [==============================] - 9s 401ms/step - loss: 1.7134 - accuracy: 0.2838 - val_loss: 1.6666 - val_accuracy: 0.2948
Epoch 22/200
22/22 [==============================] - 9s 393ms/step - loss: 1.7140 - accuracy: 0.2941 - val_loss: 1.6427 - val_accuracy: 0.3064
I need to unfreeze the layers and turn off inference mode in order for the model to learn. I tested the same scenario with EfficientNet and the same thing happened. Finally, I also tried Xception, and with that model freezing the layers and running in inference mode was fine. So they seem to behave differently, even though they all contain batch-norm layers.
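Roughly, the variant that does learn is the following (a sketch of what I mean by unfreezing and turning off inference; the optimizer and learning rate are just placeholders, not my exact settings):

# Sketch: base model left trainable and called without training=False
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet")
base_model.trainable = True  # layers are NOT frozen

inputs = tf.keras.layers.Input(shape=(224, 224, 3))
x = base_model(inputs)  # no training=False, so BatchNorm runs in training mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=output)

# A lower learning rate is usually advisable when the whole backbone is trainable
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])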
I'm not understanding what is going on here. Why would I need to turn inference off? Could anyone have a clue about this?
EDIT:
results from Resnet50:
results from Xception:
I was just following a TensorFlow example from the book Hands-On Machine Learning with Scikit-Learn and TensorFlow but got weird results.
The example:
import tensorflow as tf
from tensorflow import keras
tf.__version__
keras.__version__
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000] / 255.0, y_train_full[5000:] / 255.0
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer='sgd',
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=50, validation_data=(X_valid, y_valid))
As the epochs evolve, we should see the accuracy improve, as indicated in the book:
Train on 55000 samples, validate on 5000 samples
Epoch 1/30
55000/55000 [==========] - 3s 55us/sample - loss: 1.4948 - acc: 0.5757 - val_loss: 1.0042 - val_acc: 0.7166
Epoch 2/30
55000/55000 [==========] - 3s 55us/sample - loss: 0.8690 - acc: 0.7318 - val_loss: 0.7549 - val_acc: 0.7616
[...]
Epoch 50/50
55000/55000 [==========] - 4s 72us/sample - loss: 0.3607 - acc: 0.8752 - val_loss: 0.3706 - val_acc: 0.8728
But when I ran I got the following:
Epoch 1/30
1719/1719 [==============================] - 3s 2ms/step - loss: 0.0623 - accuracy: 0.1005 - val_loss: 0.0011 - val_accuracy: 0.0914
Epoch 2/30
1719/1719 [==============================] - 3s 2ms/step - loss: 8.7637e-04 - accuracy: 0.1011 - val_loss: 5.2079e-04 - val_accuracy: 0.0914
Epoch 3/30
1719/1719 [==============================] - 3s 2ms/step - loss: 4.9200e-04 - accuracy: 0.1019 - val_loss: 3.4211e-04 - val_accuracy: 0.0914
[...]
Epoch 49/50
1719/1719 [==============================] - 3s 2ms/step - loss: 3.1710e-05 - accuracy: 0.0992 - val_loss: 3.2966e-05 - val_accuracy: 0.0914
Epoch 50/50
1719/1719 [==============================] - 3s 2ms/step - loss: 2.7711e-05 - accuracy: 0.1022 - val_loss: 3.1833e-05 - val_accuracy: 0.0914
So, as you can see, my run got a much lower accuracy that never improved: it stayed at 0.0914 instead of reaching 0.8728.
Is there something wrong in my TensorFlow installation, setup or even in the code?
You cannot divide the labels as in y_valid, y_train = y_train_full[:5000] / 255.0, y_train_full[5000:] / 255.0; only the pixel values should be rescaled, while the labels must stay as integer class indices. The corrected code is as follows:
import tensorflow as tf
from tensorflow import keras
tf.__version__
keras.__version__
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_train_full = X_train_full / 255.0
X_test = X_test / 255.0
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer='sgd',
              metrics=['accuracy'])
history = model.fit(X_train_full, y_train_full, epochs=5, validation_data=(X_test, y_test))
It gives accuracy like:
Epoch 1/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.9880 - accuracy: 0.6923 - val_loss: 0.5710 - val_accuracy: 0.8054
Epoch 2/5
1875/1875 [==============================] - 2s 944us/step - loss: 0.5281 - accuracy: 0.8227 - val_loss: 0.5112 - val_accuracy: 0.8228
Epoch 3/5
1875/1875 [==============================] - 2s 913us/step - loss: 0.4720 - accuracy: 0.8391 - val_loss: 0.4782 - val_accuracy: 0.8345
Epoch 4/5
1875/1875 [==============================] - 2s 915us/step - loss: 0.4492 - accuracy: 0.8462 - val_loss: 0.4568 - val_accuracy: 0.8410
Epoch 5/5
1875/1875 [==============================] - 2s 935us/step - loss: 0.4212 - accuracy: 0.8550 - val_loss: 0.4469 - val_accuracy: 0.8444
Also, the Adam optimizer may give better results than SGD.
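For example, swapping the optimizer is a one-line change to the compile call above (a sketch, keeping everything else the same):

model.compile(loss="sparse_categorical_crossentropy",
              optimizer=keras.optimizers.Adam(),  # instead of 'sgd'
              metrics=['accuracy'])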
As the title suggests I am training my IRV2 network using the following EarlyStopping definition:
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode="auto")
However, the training doesn't stop when I get three equal values of val_loss:
history = model.fit(
    X_train_s,
    y_train_categorical,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(X_validation_s, y_validation_categorical),
    callbacks=[callback]
)
This is my model:
def Inception_Resnet_V2_Binary(x_train, batch_size):
    conv_base = InceptionResNetV2(weights='imagenet', include_top=False, input_shape=image_size, pooling="avg")
    for layer in conv_base.layers:
        layer.trainable = False

    model = models.Sequential()
    model.add(conv_base)
    model.add(layers.Dense(2, activation='softmax'))

    steps_per_epoch = len(x_train) / batch_size
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=0.0002,
        decay_steps=steps_per_epoch * 2,
        decay_rate=0.7)
    opt = Adam(learning_rate=lr_schedule)
    model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy", tf.metrics.AUC()])
    return model
This is the training output:
Epoch 28/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5149 - accuracy: 0.7398 - auc_1: 0.8339 - val_loss: 0.5217 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 29/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5127 - accuracy: 0.7441 - auc_1: 0.8354 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 30/70
151/151 [==============================] - 12s 79ms/step - loss: 0.5144 - accuracy: 0.7384 - auc_1: 0.8321 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 31/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5152 - accuracy: 0.7402 - auc_1: 0.8332 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 32/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5143 - accuracy: 0.7410 - auc_1: 0.8347 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 33/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5124 - accuracy: 0.7404 - auc_1: 0.8352 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 34/70
151/151 [==============================] - 12s 81ms/step - loss: 0.5106 - accuracy: 0.7441 - auc_1: 0.8363 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 35/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5129 - accuracy: 0.7389 - auc_1: 0.8342 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 36/70
151/151 [==============================] - 12s 79ms/step - loss: 0.5122 - accuracy: 0.7400 - auc_1: 0.8341 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 37/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5160 - accuracy: 0.7424 - auc_1: 0.8346 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 38/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5175 - accuracy: 0.7367 - auc_1: 0.8318 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
That happens because the loss is actually still decreasing, but by a very small amount. If you don't set min_delta in the EarlyStopping callback, training will treat even a negligible improvement as a real improvement. You can solve the problem by adding the min_delta argument, e.g. min_delta=0.001:
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode="auto", min_delta=0.001)
You can set min_delta as you see fit; min_delta=0.001 means that changes in the monitored loss smaller than that value count as no improvement.
That val_loss is not a float with only 4 decimal digits; you are simply not seeing the entire value. patience is meant to stop a network that is overfitting from running for hours; if val_loss keeps decreasing, just let the network run (and judging from the loss, the learning rate seems a bit high).
val_loss is a float32, and the 4 digits shown are only the most significant ones. To see what is really going on, you can't rely on the fit output; you need a callback of some sort that prints val_loss in the format you want.
You can find some examples here:
https://keras.io/guides/writing_your_own_callbacks/
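For instance, a minimal callback along those lines could look like this (just a sketch; the print format is arbitrary):

import tensorflow as tf

class PrintFullValLoss(tf.keras.callbacks.Callback):
    # Print val_loss with more precision at the end of every epoch
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        val_loss = logs.get("val_loss")
        if val_loss is not None:
            print(f"\nepoch {epoch + 1}: val_loss = {val_loss:.10f}")

# usage: model.fit(..., callbacks=[callback, PrintFullValLoss()])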
This is my model:
def get_model2():
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        0.001,
        decay_steps=100000,
        decay_rate=0.96,
        staircase=True)
    model = Sequential()
    model.add(Dense(1024, activation='relu', input_shape=[44]))
    model.add(Dropout(0.2))
    model.add(Dense(2048, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(4098, activation='relu'))
    model.add(Dense(2048, activation='relu'))
    model.add(Dense(1024, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss=my_binary_crossentropy, optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule), metrics=['accuracy'])
    return model
And this is my training:
from sklearn.model_selection import RepeatedStratifiedKFold

model = get_model2()
model.save_weights('model.h5')
o = 0
hlavny_list = []

skf = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=2027)
for train_index, test_index in skf.split(X, Y):
    o = o + 1
    X_test, X_train = X[train_index], X[test_index]
    y_test, y_train = Y[train_index], Y[test_index]
    model.load_weights('model.h5')
    model.fit(X_train, y_train, epochs=10000, batch_size=256, validation_data=(X_test, y_test), callbacks=[early_stop])
    vys = model.predict_classes(X_test)
    a, b, c, d = potrebne_miery(y_true=y_test, y_pred=vys)
    hl_list = [a, b, c, d]
    hlavny_list.append(hl_list)
    if (o % 4 == 0):
        np.savetxt('/content/drive/My Drive/siete/model_t_t_1_9_moja_loss_v2.csv', np.array(hlavny_list), delimiter=',')

np.savetxt('/content/drive/My Drive/siete/model_t_t_1_9_moja_loss_v2.csv', np.array(hlavny_list), delimiter=',')
Nothing special here, except my own loss function, which looks like:
import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.python.ops import clip_ops, math_ops

def my_binary_crossentropy(target, output, from_logits=False):
    target = ops.convert_to_tensor_v2(target)
    output = ops.convert_to_tensor_v2(output)
    target = tf.cast(target, tf.float32)
    epsilon_ = 0.01
    output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)
    # Compute cross entropy from probabilities (positive class weighted by 8).
    bce = 8 * target * math_ops.log(output + epsilon_)
    bce += (1 - target) * math_ops.log(1 - output + epsilon_)
    return -bce
Before, I was using binary cross-entropy and everything worked fine, no problem. But when I changed the loss function, a problem occurred that I don't understand. The model is trained in a loop, and sometimes (I don't know why or when) it behaves as if it doesn't train at all.
Epoch 1/10000
124/124 [==============================] - 6s 47ms/step - loss: 1.1125 - accuracy: 0.4600 - val_loss: 0.7640 - val_accuracy: 0.9312
Epoch 2/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.6418 - accuracy: 0.8598 - val_loss: 0.5307 - val_accuracy: 0.8718
Epoch 3/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.5434 - accuracy: 0.8768 - val_loss: 0.5416 - val_accuracy: 0.8736
Epoch 4/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.5167 - accuracy: 0.8820 - val_loss: 0.5383 - val_accuracy: 0.9165
Epoch 5/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.4948 - accuracy: 0.8898 - val_loss: 0.5136 - val_accuracy: 0.9156
Epoch 6/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.4693 - accuracy: 0.8910 - val_loss: 0.5088 - val_accuracy: 0.9130
Epoch 7/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.4533 - accuracy: 0.8925 - val_loss: 0.5163 - val_accuracy: 0.8551
Epoch 8/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.4257 - accuracy: 0.8883 - val_loss: 0.5490 - val_accuracy: 0.9189
Epoch 9/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.4237 - accuracy: 0.8919 - val_loss: 0.5302 - val_accuracy: 0.8172
Epoch 10/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.4072 - accuracy: 0.8859 - val_loss: 0.5591 - val_accuracy: 0.9278
Epoch 11/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.3831 - accuracy: 0.8908 - val_loss: 0.5563 - val_accuracy: 0.8937
Epoch 00011: early stopping
32695
24925
221726
5339
Epoch 1/10000
124/124 [==============================] - 6s 48ms/step - loss: 4.1699 - accuracy: 0.8661 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 2/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 3/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 4/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 5/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 6/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 00006: early stopping
Neither the loss, the val_loss, nor the accuracy is decreasing. I think there is some problem in the loss function, because the issue only appeared after switching to the new loss; with the old one I ran this loop maybe 10,000 times without error. It happens in roughly 1 in 4 cycles. What is wrong? I would be very grateful for help. Thank you.
Your dropout values are too low and make learning harder for the model. Use a higher dropout value to overcome the problem you have.
Start by building a simple model with one hidden layer and popular hyperparameters, then fine-tune the hyperparameters while extending the model as you go; a sketch of such a baseline is given below.
That is the simplest and best way to debug this, in my opinion.
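A minimal baseline along those lines might look like this (a sketch; the layer width, dropout rate, and learning rate are illustrative placeholders to tune, and the input size of 44 is taken from your model):

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout

def get_baseline_model(input_dim=44):
    # One hidden layer with common default hyperparameters (illustrative values)
    model = Sequential([
        Dense(64, activation='relu', input_shape=(input_dim,)),
        Dropout(0.5),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy',
                  optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  metrics=['accuracy'])
    return model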
Let me know if you need further help.
I have a classification model that is clearly overfitting and the validation accuracy doesn't change.
I've tried using feature selection and feature extraction methods but they didn't help.
feature selection method:
fs = SelectKBest(f_classif, k=10)
fs.fit(x, y)
feature_index = fs.get_support(True)
feature_index = feature_index.tolist()
best_features = []
# makes list of best features
for index in feature_index:
    best_features.append(x[:, index])
x = (np.array(best_features, dtype=np.float)).T
model:
def model_and_print(x, y, Epochs, Batch_Size, loss, opt, class_weight, callback):
    # fix random seed for reproducibility
    seed = 7
    np.random.seed(seed)
    # define 10-fold cross validation test harness
    kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    # K-fold Cross Validation model evaluation
    fold_no = 1
    for train, test in kfold.split(x, y):
        # create model
        model = Sequential()
        model.add(Dropout(0.2, input_shape=(len(x[0]),)))
        model.add(Dense(6, activation='relu', kernel_constraint=maxnorm(3)))
        model.add(Dropout(0.2))
        model.add(Dense(8, activation='relu', kernel_constraint=maxnorm(3)))
        model.add(Dropout(0.4))
        model.add(Dense(1, activation=tf.nn.sigmoid))
        # compile model
        model.compile(optimizer=opt,
                      loss=loss, metrics=['accuracy'])
        history = model.fit(x[train], y[train], validation_data=(x[test], y[test]), epochs=Epochs,
                            batch_size=Batch_Size, verbose=1)
def main():
    data = ["data.pkl", "data_list.pkl", "data_mean.pkl"]
    df = pd.read_pickle(data[2])
    x, y = data_frame_to_feature_and_target_arrays(df)
    # hyper parameters
    Epochs = 200
    Batch_Size = 1
    learning_rate = 0.003
    optimizer = optimizers.Adam(learning_rate=learning_rate)
    loss = "binary_crossentropy"
    model_and_print(x, y, Epochs, Batch_Size, loss, optimizer, class_weight, es_callback)

if __name__ == "__main__":
    main()
output for part of one fold:
1/73 [..............................] - ETA: 0s - loss: 0.6470 - accuracy: 1.0000
62/73 [========================>.....] - ETA: 0s - loss: 0.5665 - accuracy: 0.7097
73/73 [==============================] - 0s 883us/step - loss: 0.5404 - accuracy: 0.7534 - val_loss: 0.5576 - val_accuracy: 0.5000
Epoch 100/200
1/73 [..............................] - ETA: 0s - loss: 0.4743 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.6388 - accuracy: 0.6522
73/73 [==============================] - 0s 806us/step - loss: 0.6316 - accuracy: 0.6575 - val_loss: 0.5592 - val_accuracy: 0.5000
Epoch 101/200
1/73 [..............................] - ETA: 0s - loss: 0.6005 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.5656 - accuracy: 0.7101
73/73 [==============================] - 0s 806us/step - loss: 0.5641 - accuracy: 0.7123 - val_loss: 0.5629 - val_accuracy: 0.5000
Epoch 102/200
1/73 [..............................] - ETA: 0s - loss: 0.2126 - accuracy: 1.0000
65/73 [=========================>....] - ETA: 0s - loss: 0.5042 - accuracy: 0.8000
73/73 [==============================] - 0s 847us/step - loss: 0.5340 - accuracy: 0.7671 - val_loss: 0.5608 - val_accuracy: 0.5000
Epoch 103/200
1/73 [..............................] - ETA: 0s - loss: 0.8801 - accuracy: 0.0000e+00
68/73 [==========================>...] - ETA: 0s - loss: 0.5754 - accuracy: 0.6471
73/73 [==============================] - 0s 819us/step - loss: 0.5780 - accuracy: 0.6575 - val_loss: 0.5639 - val_accuracy: 0.5000
Epoch 104/200
1/73 [..............................] - ETA: 0s - loss: 0.0484 - accuracy: 1.0000
70/73 [===========================>..] - ETA: 0s - loss: 0.5711 - accuracy: 0.7571
73/73 [==============================] - 0s 806us/step - loss: 0.5689 - accuracy: 0.7534 - val_loss: 0.5608 - val_accuracy: 0.5000
Epoch 105/200
1/73 [..............................] - ETA: 0s - loss: 0.1237 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.5953 - accuracy: 0.7101
73/73 [==============================] - 0s 820us/step - loss: 0.5922 - accuracy: 0.7260 - val_loss: 0.5672 - val_accuracy: 0.5000
Epoch 106/200
1/73 [..............................] - ETA: 0s - loss: 0.3360 - accuracy: 1.0000
67/73 [==========================>...] - ETA: 0s - loss: 0.5175 - accuracy: 0.7313
73/73 [==============================] - 0s 847us/step - loss: 0.5320 - accuracy: 0.7397 - val_loss: 0.5567 - val_accuracy: 0.5000
Epoch 107/200
1/73 [..............................] - ETA: 0s - loss: 0.1384 - accuracy: 1.0000
67/73 [==========================>...] - ETA: 0s - loss: 0.5435 - accuracy: 0.6866
73/73 [==============================] - 0s 833us/step - loss: 0.5541 - accuracy: 0.6575 - val_loss: 0.5629 - val_accuracy: 0.5000
Epoch 108/200
1/73 [..............................] - ETA: 0s - loss: 0.2647 - accuracy: 1.0000
69/73 [===========================>..] - ETA: 0s - loss: 0.6047 - accuracy: 0.6232
73/73 [==============================] - 0s 820us/step - loss: 0.5948 - accuracy: 0.6301 - val_loss: 0.5660 - val_accuracy: 0.5000
Epoch 109/200
1/73 [..............................] - ETA: 0s - loss: 0.8837 - accuracy: 0.0000e+00
66/73 [==========================>...] - ETA: 0s - loss: 0.5250 - accuracy: 0.7576
73/73 [==============================] - 0s 861us/step - loss: 0.5357 - accuracy: 0.7397 - val_loss: 0.5583 - val_accuracy: 0.5000
Epoch 110/200
final accuracy:
Score for fold 10: loss of 0.5600861310958862; accuracy of 50.0%
My question is: what can I do about the overfitting, given that I have already tried feature extraction and dropout layers, and why is the validation accuracy not changing?