EarlyStopping does not stop training - python

As the title suggests, I am training my IRV2 network using the following EarlyStopping definition:
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode="auto")
However, the training doesn't stop even when I get three equal values of val_loss:
history = model.fit(
    X_train_s,
    y_train_categorical,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(X_validation_s, y_validation_categorical),
    callbacks=[callback]
)
This is my model:
def Inception_Resnet_V2_Binary(x_train, batch_size):
    conv_base = InceptionResNetV2(weights='imagenet', include_top=False, input_shape=image_size, pooling="avg")
    for layer in conv_base.layers:
        layer.trainable = False
    model = models.Sequential()
    model.add(conv_base)
    model.add(layers.Dense(2, activation='softmax'))
    steps_per_epoch = len(x_train) / batch_size
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=0.0002,
        decay_steps=steps_per_epoch * 2,
        decay_rate=0.7)
    opt = Adam(learning_rate=lr_schedule)
    model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy", tf.metrics.AUC()])
    return model
This is the training output:
Epoch 28/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5149 - accuracy: 0.7398 - auc_1: 0.8339 - val_loss: 0.5217 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 29/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5127 - accuracy: 0.7441 - auc_1: 0.8354 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 30/70
151/151 [==============================] - 12s 79ms/step - loss: 0.5144 - accuracy: 0.7384 - auc_1: 0.8321 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 31/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5152 - accuracy: 0.7402 - auc_1: 0.8332 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 32/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5143 - accuracy: 0.7410 - auc_1: 0.8347 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 33/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5124 - accuracy: 0.7404 - auc_1: 0.8352 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 34/70
151/151 [==============================] - 12s 81ms/step - loss: 0.5106 - accuracy: 0.7441 - auc_1: 0.8363 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 35/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5129 - accuracy: 0.7389 - auc_1: 0.8342 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 36/70
151/151 [==============================] - 12s 79ms/step - loss: 0.5122 - accuracy: 0.7400 - auc_1: 0.8341 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 37/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5160 - accuracy: 0.7424 - auc_1: 0.8346 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 38/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5175 - accuracy: 0.7367 - auc_1: 0.8318 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246

That happens because the loss is actually still decreasing, just by a very small amount. If you don't set min_delta in the EarlyStopping callback, training counts even a negligible improvement as a real one. You can solve the problem by simply adding the min_delta argument:
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode="auto", min_delta=0.001)
Set min_delta to whatever you feel is suitable; with min_delta=0.001, any change in the loss smaller than that value is treated as no improvement.
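To see why three visually equal values need not trigger the stop, the patience/min_delta bookkeeping can be sketched in plain Python. This is a simplified model of what EarlyStopping does in "min" mode, not Keras's actual implementation:

```python
def should_stop(val_losses, patience=3, min_delta=0.001):
    """Return True if training would be stopped after these epochs.

    An epoch only counts as an improvement when it beats the best
    value seen so far by MORE than min_delta (simplified 'min' mode).
    """
    best = float("inf")
    wait = 0
    for v in val_losses:
        if best - v > min_delta:  # genuine improvement
            best = v
            wait = 0
        else:
            wait += 1  # negligible change or regression
            if wait >= patience:
                return True
    return False

# With min_delta=0, the tiny 0.0001 drops keep resetting the patience counter:
losses = [0.5217, 0.5216, 0.5216, 0.5215]
print(should_stop(losses, min_delta=0.0))    # False: never stops
print(should_stop(losses, min_delta=0.001))  # True: stops after 3 flat epochs
```

This is why a sequence like the one in the log above can run forever with the default min_delta=0.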

That val_loss is not a float with only four decimal places; you are simply not seeing the entire value. patience should be used to keep an overfitting network from running for hours; if val_loss keeps decreasing, just let the network run (and judging from the loss, the learning rate seems a bit high).
val_loss is a float32, and the four digits shown are only the most significant ones. To see what's really going on you can't rely on the fit output; you will need a callback of some sort that prints val_loss in the format you want.
You can find some examples here:
https://keras.io/guides/writing_your_own_callbacks/
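The hidden-precision point is easy to demonstrate, and a minimal callback sketch follows. The names fmt_val_loss and PreciseValLoss are made up for illustration; the guide linked above documents the full Callback API:

```python
# Two val_loss values that differ only beyond the 4th decimal place
# look identical in Keras's default fit() progress output:
a, b = 0.521649, 0.521612
assert f"{a:.4f}" == f"{b:.4f}" == "0.5216"

def fmt_val_loss(epoch, val_loss, digits=8):
    """Format helper a custom callback could use to show the full value."""
    return f"epoch {epoch}: val_loss = {val_loss:.{digits}f}"

# Sketch of the callback itself (requires TensorFlow; the class is skipped
# if TF is not installed so the demo above still runs on its own):
try:
    import tensorflow as tf

    class PreciseValLoss(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            logs = logs or {}
            if "val_loss" in logs:
                print(fmt_val_loss(epoch + 1, logs["val_loss"]))
except ImportError:
    pass
```

An instance would then be passed alongside EarlyStopping, e.g. callbacks=[callback, PreciseValLoss()].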

Related

Transfer learning model running on inference does not learn

I'm doing a species classification task from kaggle (https://www.kaggle.com/competitions/yum-or-yuck-butterfly-mimics-2022/overview). I decided to use transfer learning to tackle this problem since there aren't that many images. The model is as follows:
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet")
for layer in base_model.layers:
    layer.trainable = False
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(1024, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(512, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=output)
As per the transfer-learning guidelines (https://keras.io/guides/transfer_learning/), I'm freezing the ResNet layers and running the base model in inference mode (training=False). However, the results show that the model is not learning properly; convergence doesn't look possible even after nearly 200 epochs:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="categorical_crossentropy",
    metrics="accuracy",
)
stop_early = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=0.0001,
    patience=20,
    restore_best_weights=True
)
history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=200,
                    callbacks=[stop_early])
22/22 [==============================] - 19s 442ms/step - loss: 1.9317 - accuracy: 0.1794 - val_loss: 1.8272 - val_accuracy: 0.1618
Epoch 2/200
22/22 [==============================] - 9s 398ms/step - loss: 1.8250 - accuracy: 0.1882 - val_loss: 1.7681 - val_accuracy: 0.2197
Epoch 3/200
22/22 [==============================] - 9s 402ms/step - loss: 1.7927 - accuracy: 0.2294 - val_loss: 1.7612 - val_accuracy: 0.2139
Epoch 4/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7930 - accuracy: 0.2000 - val_loss: 1.7640 - val_accuracy: 0.2139
Epoch 5/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7872 - accuracy: 0.2132 - val_loss: 1.7489 - val_accuracy: 0.3121
Epoch 6/200
22/22 [==============================] - 9s 389ms/step - loss: 1.7700 - accuracy: 0.2574 - val_loss: 1.7378 - val_accuracy: 0.2543
Epoch 7/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7676 - accuracy: 0.2353 - val_loss: 1.7229 - val_accuracy: 0.3064
Epoch 8/200
22/22 [==============================] - 9s 427ms/step - loss: 1.7721 - accuracy: 0.2353 - val_loss: 1.7225 - val_accuracy: 0.2948
Epoch 9/200
22/22 [==============================] - 9s 399ms/step - loss: 1.7522 - accuracy: 0.2588 - val_loss: 1.7267 - val_accuracy: 0.2948
Epoch 10/200
22/22 [==============================] - 9s 395ms/step - loss: 1.7434 - accuracy: 0.2735 - val_loss: 1.7151 - val_accuracy: 0.2948
Epoch 11/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7500 - accuracy: 0.2632 - val_loss: 1.7083 - val_accuracy: 0.3064
Epoch 12/200
22/22 [==============================] - 9s 425ms/step - loss: 1.7307 - accuracy: 0.2721 - val_loss: 1.6899 - val_accuracy: 0.3179
Epoch 13/200
22/22 [==============================] - 9s 407ms/step - loss: 1.7439 - accuracy: 0.2794 - val_loss: 1.7045 - val_accuracy: 0.2948
Epoch 14/200
22/22 [==============================] - 9s 404ms/step - loss: 1.7376 - accuracy: 0.2706 - val_loss: 1.7118 - val_accuracy: 0.2659
Epoch 15/200
22/22 [==============================] - 9s 419ms/step - loss: 1.7588 - accuracy: 0.2647 - val_loss: 1.6684 - val_accuracy: 0.3237
Epoch 16/200
22/22 [==============================] - 9s 394ms/step - loss: 1.7289 - accuracy: 0.2824 - val_loss: 1.6733 - val_accuracy: 0.3064
Epoch 17/200
22/22 [==============================] - 9s 387ms/step - loss: 1.7184 - accuracy: 0.2809 - val_loss: 1.7185 - val_accuracy: 0.2659
Epoch 18/200
22/22 [==============================] - 9s 408ms/step - loss: 1.7242 - accuracy: 0.2765 - val_loss: 1.6961 - val_accuracy: 0.2717
Epoch 19/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7218 - accuracy: 0.2853 - val_loss: 1.6757 - val_accuracy: 0.3006
Epoch 20/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7248 - accuracy: 0.2882 - val_loss: 1.6716 - val_accuracy: 0.3064
Epoch 21/200
22/22 [==============================] - 9s 401ms/step - loss: 1.7134 - accuracy: 0.2838 - val_loss: 1.6666 - val_accuracy: 0.2948
Epoch 22/200
22/22 [==============================] - 9s 393ms/step - loss: 1.7140 - accuracy: 0.2941 - val_loss: 1.6427 - val_accuracy: 0.3064
I need to unfreeze the layers and turn inference mode off in order for the model to learn. I tested the same scenario with EfficientNet and the same thing happened. Finally, I also tried Xception, and there freezing the layers and running in inference mode worked fine. So they seem to behave differently, even though they all contain batchnorm layers.
I don't understand what is going on here. Why would I need to turn inference mode off? Does anyone have a clue about this?
EDIT:
Results from ResNet50: (training-curve image omitted)
Results from Xception: (training-curve image omitted)

NMT LSTM gives an incorrect response, and a big loss

I am writing a neural network for translating text from Russian to English, but I ran into a problem: my network produces a large loss, and its output is very far from the correct answer.
Below is the LSTM that I build using Keras:
def make_model(in_vocab, out_vocab, in_timesteps, out_timesteps, n):
    model = Sequential()
    model.add(Embedding(in_vocab, n, input_length=in_timesteps, mask_zero=True))
    model.add(LSTM(n))
    model.add(Dropout(0.3))
    model.add(RepeatVector(out_timesteps))
    model.add(LSTM(n, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(Dense(out_vocab, activation='softmax'))
    model.compile(optimizer=optimizers.RMSprop(lr=0.001), loss='sparse_categorical_crossentropy')
    return model
And here is the training output:
Epoch 1/10
3/3 [==============================] - 5s 1s/step - loss: 8.3635 - accuracy: 0.0197 - val_loss: 8.0575 - val_accuracy: 0.0563
Epoch 2/10
3/3 [==============================] - 2s 806ms/step - loss: 7.9505 - accuracy: 0.0334 - val_loss: 8.2927 - val_accuracy: 0.0743
Epoch 3/10
3/3 [==============================] - 2s 812ms/step - loss: 7.7977 - accuracy: 0.0349 - val_loss: 8.2959 - val_accuracy: 0.0571
Epoch 4/10
3/3 [==============================] - 3s 825ms/step - loss: 7.6700 - accuracy: 0.0389 - val_loss: 8.5628 - val_accuracy: 0.0751
Epoch 5/10
3/3 [==============================] - 3s 829ms/step - loss: 7.5595 - accuracy: 0.0411 - val_loss: 8.5854 - val_accuracy: 0.0743
Epoch 6/10
3/3 [==============================] - 3s 807ms/step - loss: 7.4604 - accuracy: 0.0406 - val_loss: 8.7633 - val_accuracy: 0.0743
Epoch 7/10
3/3 [==============================] - 2s 815ms/step - loss: 7.3475 - accuracy: 0.0436 - val_loss: 8.9103 - val_accuracy: 0.0743
Epoch 8/10
3/3 [==============================] - 3s 825ms/step - loss: 7.2548 - accuracy: 0.0455 - val_loss: 9.0493 - val_accuracy: 0.0721
Epoch 9/10
3/3 [==============================] - 2s 814ms/step - loss: 7.1751 - accuracy: 0.0449 - val_loss: 9.0740 - val_accuracy: 0.0788
Epoch 10/10
3/3 [==============================] - 3s 831ms/step - loss: 7.1132 - accuracy: 0.0479 - val_loss: 9.2443 - val_accuracy: 0.0773
And the parameters that I pass for training:
model = make_model(russian_vocab_size,            # the sizes of the tokenized vocabularies
                   english_vocab_size,
                   max_russian_sequence_length,   # maximum sentence lengths
                   max_english_sequence_length,
                   512)
model.fit(preproc_russian_sentences,  # all tokenized Russian sentences, with shape (X, Y)
          preproc_english_sentences,  # all tokenized English sentences, with shape (X, Y, 1)
          epochs=10,
          batch_size=1024,
          validation_split=0.2,
          callbacks=None,
          verbose=1)
Thank you in advance.

How to increase the model accuracy and decrease the batch size, respectively

I am working on transfer learning for multiclass classification of an image dataset that consists of 12 classes, so I am using VGG19. However, the accuracy of the model is much lower than expected, and the train and validation accuracy do not increase. Besides that, I am trying to decrease the batch size, which still shows as 383.
My code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import shutil
import os
import glob as gb
import tensorflow as tf
from tensorflow.keras.models import Sequential
from zipfile import ZipFile
import cv2
from tensorflow.keras.layers import Flatten,Dense,BatchNormalization,Activation,Dropout
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras import optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # needed for ImageDataGenerator() below
IMAGE_SHAPE = (256, 256)
BATCH_SIZE = 32
#--------------------------------------------------Train--------------------------------
train = ImageDataGenerator()
train_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale= 1./255, fill_mode= 'nearest')
train_data = train_generator.flow_from_directory(directory="/content/dataset_base/train",target_size=IMAGE_SHAPE , color_mode="rgb" , class_mode='categorical', batch_size=BATCH_SIZE , shuffle = True )
#--------------------------------------------------valid-------------------------------
valid = ImageDataGenerator()
validation_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
valid_data = validation_generator.flow_from_directory(directory="/content/dataset_base/valid", target_size=IMAGE_SHAPE , color_mode="rgb" , class_mode='categorical' , batch_size=BATCH_SIZE , shuffle = True )
#--------------------------------------------------Test---------------------------------------------------------------------------------------------------
test = ImageDataGenerator()
test_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
test_data = test_generator.flow_from_directory(directory="/content/dataset_base/valid",target_size=IMAGE_SHAPE , color_mode="rgb" , class_mode='categorical' , batch_size=1 , shuffle = False )
test_data.reset()
for image_batch, labels_batch in train_data:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
#Defining the VGG Convolutional Neural Net
from tensorflow.keras.applications import VGG19
base_model = VGG19(weights='imagenet', input_shape=(256, 256, 3), include_top=False)
from tensorflow.keras.layers import Flatten,Dense,BatchNormalization,Activation,Dropout
from tensorflow.keras import optimizers
inputs = tf.keras.Input(shape=(256, 256, 3))
x = base_model(inputs, training=False)
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dense(256, activation='relu')(x)
outputs = Dense(12,activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
model.summary()
Model: "model_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_10 (InputLayer) [(None, 256, 256, 3)] 0
_________________________________________________________________
vgg19 (Functional) (None, 8, 8, 512) 20024384
_________________________________________________________________
flatten_8 (Flatten) (None, 32768) 0
_________________________________________________________________
dense_32 (Dense) (None, 256) 8388864
_________________________________________________________________
dense_33 (Dense) (None, 256) 65792
_________________________________________________________________
dense_34 (Dense) (None, 12) 3084
=================================================================
Total params: 28,482,124
Trainable params: 28,482,124
Non-trainable params: 0
model.compile(optimizer=optimizers.Adam(learning_rate=0.05), loss='categorical_crossentropy', metrics=["accuracy"])
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, period=1)
history = model.fit(
    train_data,
    validation_data=valid_data,
    batch_size=256,
    epochs=10,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(
            monitor='val_loss',
            patience=55,
            restore_best_weights=True
        )
    ]
)
model_final.save_weights("vgg16_1.h5")
The result after 10 epochs:
Epoch 1/10
383/383 [==============================] - 214s 557ms/step - loss: 2.4934 - accuracy: 0.0781 - val_loss: 2.4919 - val_accuracy: 0.0833
Epoch 2/10
383/383 [==============================] - 219s 572ms/step - loss: 2.4918 - accuracy: 0.0847 - val_loss: 2.4888 - val_accuracy: 0.0833
Epoch 3/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4930 - accuracy: 0.0840 - val_loss: 2.4918 - val_accuracy: 0.0833
Epoch 4/10
383/383 [==============================] - 220s 574ms/step - loss: 2.4919 - accuracy: 0.0842 - val_loss: 2.4934 - val_accuracy: 0.0833
Epoch 5/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4928 - accuracy: 0.0820 - val_loss: 2.4893 - val_accuracy: 0.0833
Epoch 6/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4921 - accuracy: 0.0842 - val_loss: 2.4920 - val_accuracy: 0.0833
Epoch 7/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4922 - accuracy: 0.0858 - val_loss: 2.4910 - val_accuracy: 0.0833
Epoch 8/10
383/383 [==============================] - 219s 573ms/step - loss: 2.4920 - accuracy: 0.0862 - val_loss: 2.4912 - val_accuracy: 0.0833
Epoch 9/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4926 - accuracy: 0.0813 - val_loss: 2.4943 - val_accuracy: 0.0833
Epoch 10/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4920 - accuracy: 0.0829 - val_loss: 2.4948 - val_accuracy: 0.0833
I updated my code. Now I am getting good accuracy on the train set but lower accuracy on the validation set; it is obviously overfitting. How can I get good accuracy on the validation set too?
Updated code:
# Separate into train and test data
train_df, test_df = train_test_split(image_df, train_size=0.8, shuffle=True, random_state=1)
# Create the generators
train_generator, test_generator, train_images, val_images, test_images = create_gen()
Found 3545 validated image filenames belonging to 12 classes.
Found 886 validated image filenames belonging to 12 classes.
Found 1108 validated image filenames belonging to 12 classes.
from tensorflow.keras.applications.vgg16 import VGG16
vggmodel = VGG16(weights='imagenet', include_top=True)
vggmodel.trainable = False
for layers in vggmodel.layers[:19]:
    print(layers)
    layers.trainable = False
<tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7ff3dc0e10d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3dc0e8910>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0f6910>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca0f65d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca08d810>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca08d150>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca089f90>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca090c10>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca09b8d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff4291d8290>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca0825d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0aae10>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3cb71b7d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0a7a10>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca03c090>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca03c250>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0a7810>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0aa410>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca0434d0>
X = vggmodel.layers[-17].output
X = tf.keras.layers.Dropout(0.09)(X)
X = tf.keras.layers.Flatten()(X)
predictions = Dense(12, activation="softmax")(X)
model_final = Model(vggmodel.input, predictions)
model_final.compile(optimizer=optimizers.Adam(learning_rate=0.005), loss='categorical_crossentropy', metrics=["accuracy"])
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, period=1)
rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.005, patience=1, verbose=1)
History = model_final.fit(train_images, validation_data=val_images, batch_size=64, epochs=60,
                          callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=60, restore_best_weights=True)])
tf.keras.callbacks.ModelCheckpoint(filepaths, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch')
model_final.save_weights("vgg16_1.h5")
Epoch 1/60
111/111 [==============================] - 49s 436ms/step - loss: 2.3832 - accuracy: 0.5396 - val_loss: 1.7135 - val_accuracy: 0.6095
Epoch 2/60
111/111 [==============================] - 39s 348ms/step - loss: 1.0699 - accuracy: 0.7385 - val_loss: 1.6680 - val_accuracy: 0.6208
Epoch 3/60
111/111 [==============================] - 38s 338ms/step - loss: 0.8366 - accuracy: 0.7853 - val_loss: 1.3777 - val_accuracy: 0.6975
Epoch 4/60
111/111 [==============================] - 37s 337ms/step - loss: 0.6213 - accuracy: 0.8299 - val_loss: 1.3766 - val_accuracy: 0.7269
Epoch 5/60
111/111 [==============================] - 38s 339ms/step - loss: 0.6173 - accuracy: 0.8446 - val_loss: 1.9170 - val_accuracy: 0.7133
Epoch 6/60
111/111 [==============================] - 38s 338ms/step - loss: 0.5782 - accuracy: 0.8511 - val_loss: 1.9968 - val_accuracy: 0.6501
Epoch 7/60
111/111 [==============================] - 37s 337ms/step - loss: 0.5672 - accuracy: 0.8564 - val_loss: 1.6436 - val_accuracy: 0.7088
Epoch 8/60
111/111 [==============================] - 38s 340ms/step - loss: 0.3971 - accuracy: 0.8894 - val_loss: 1.6819 - val_accuracy: 0.7314
Epoch 9/60
111/111 [==============================] - 38s 342ms/step - loss: 0.3657 - accuracy: 0.9038 - val_loss: 1.9244 - val_accuracy: 0.7133
Epoch 10/60
111/111 [==============================] - 37s 337ms/step - loss: 0.4003 - accuracy: 0.9016 - val_loss: 1.8337 - val_accuracy: 0.7246
Epoch 11/60
111/111 [==============================] - 38s 341ms/step - loss: 0.6439 - accuracy: 0.8731 - val_loss: 1.8070 - val_accuracy: 0.7460
Epoch 12/60
111/111 [==============================] - 37s 338ms/step - loss: 0.2917 - accuracy: 0.9190 - val_loss: 1.7533 - val_accuracy: 0.7494
Epoch 13/60
111/111 [==============================] - 37s 336ms/step - loss: 0.4685 - accuracy: 0.9032 - val_loss: 1.9534 - val_accuracy: 0.7393
Epoch 14/60
111/111 [==============================] - 38s 339ms/step - loss: 0.3936 - accuracy: 0.9061 - val_loss: 1.8643 - val_accuracy: 0.7280
Epoch 15/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2598 - accuracy: 0.9368 - val_loss: 1.7242 - val_accuracy: 0.7856
Epoch 16/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2884 - accuracy: 0.9360 - val_loss: 1.7374 - val_accuracy: 0.7517
Epoch 17/60
111/111 [==============================] - 38s 341ms/step - loss: 0.2487 - accuracy: 0.9362 - val_loss: 1.6373 - val_accuracy: 0.7889
Epoch 18/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1683 - accuracy: 0.9532 - val_loss: 1.6612 - val_accuracy: 0.7698
Epoch 19/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1354 - accuracy: 0.9591 - val_loss: 1.7372 - val_accuracy: 0.7889
Epoch 20/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2793 - accuracy: 0.9329 - val_loss: 2.0405 - val_accuracy: 0.7596
Epoch 21/60
111/111 [==============================] - 38s 338ms/step - loss: 0.3049 - accuracy: 0.9306 - val_loss: 1.8485 - val_accuracy: 0.7912
Epoch 22/60
111/111 [==============================] - 37s 338ms/step - loss: 0.2724 - accuracy: 0.9399 - val_loss: 1.8225 - val_accuracy: 0.7856
Epoch 23/60
111/111 [==============================] - 37s 337ms/step - loss: 0.2088 - accuracy: 0.9475 - val_loss: 2.1015 - val_accuracy: 0.7675
Epoch 24/60
111/111 [==============================] - 38s 341ms/step - loss: 0.2112 - accuracy: 0.9470 - val_loss: 2.2647 - val_accuracy: 0.7404
Epoch 25/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2172 - accuracy: 0.9467 - val_loss: 2.4213 - val_accuracy: 0.7675
Epoch 26/60
111/111 [==============================] - 37s 336ms/step - loss: 0.3093 - accuracy: 0.9300 - val_loss: 2.3260 - val_accuracy: 0.7630
Epoch 27/60
111/111 [==============================] - 37s 338ms/step - loss: 0.3036 - accuracy: 0.9427 - val_loss: 2.4329 - val_accuracy: 0.7460
Epoch 28/60
111/111 [==============================] - 37s 338ms/step - loss: 0.2641 - accuracy: 0.9436 - val_loss: 2.6936 - val_accuracy: 0.7472
Epoch 29/60
111/111 [==============================] - 37s 333ms/step - loss: 0.2258 - accuracy: 0.9509 - val_loss: 2.3055 - val_accuracy: 0.7788
Epoch 30/60
111/111 [==============================] - 37s 334ms/step - loss: 0.2921 - accuracy: 0.9436 - val_loss: 2.3668 - val_accuracy: 0.7517
Epoch 31/60
111/111 [==============================] - 37s 334ms/step - loss: 0.2830 - accuracy: 0.9447 - val_loss: 2.1422 - val_accuracy: 0.7720
Epoch 32/60
111/111 [==============================] - 37s 337ms/step - loss: 0.3584 - accuracy: 0.9312 - val_loss: 3.2875 - val_accuracy: 0.7122
Epoch 33/60
111/111 [==============================] - 37s 337ms/step - loss: 0.3279 - accuracy: 0.9413 - val_loss: 2.3641 - val_accuracy: 0.7686
Epoch 34/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2326 - accuracy: 0.9526 - val_loss: 2.8010 - val_accuracy: 0.7754
Epoch 35/60
111/111 [==============================] - 38s 337ms/step - loss: 0.3131 - accuracy: 0.9388 - val_loss: 2.6276 - val_accuracy: 0.7698
Epoch 36/60
111/111 [==============================] - 37s 335ms/step - loss: 0.1961 - accuracy: 0.9585 - val_loss: 2.4269 - val_accuracy: 0.7912
Epoch 37/60
111/111 [==============================] - 37s 336ms/step - loss: 0.1915 - accuracy: 0.9599 - val_loss: 2.9607 - val_accuracy: 0.7630
Epoch 38/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2457 - accuracy: 0.9512 - val_loss: 3.2177 - val_accuracy: 0.7438
Epoch 39/60
111/111 [==============================] - 38s 346ms/step - loss: 0.1575 - accuracy: 0.9670 - val_loss: 2.7473 - val_accuracy: 0.7675
Epoch 40/60
111/111 [==============================] - 38s 343ms/step - loss: 0.1841 - accuracy: 0.9591 - val_loss: 3.1237 - val_accuracy: 0.7415
Epoch 41/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2344 - accuracy: 0.9498 - val_loss: 2.7585 - val_accuracy: 0.7630
Epoch 42/60
111/111 [==============================] - 37s 337ms/step - loss: 0.2115 - accuracy: 0.9588 - val_loss: 3.0896 - val_accuracy: 0.7314
Epoch 43/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1407 - accuracy: 0.9721 - val_loss: 2.9159 - val_accuracy: 0.7381
Epoch 44/60
111/111 [==============================] - 38s 336ms/step - loss: 0.2045 - accuracy: 0.9616 - val_loss: 2.7607 - val_accuracy: 0.7630
Epoch 45/60
111/111 [==============================] - 38s 338ms/step - loss: 0.1343 - accuracy: 0.9698 - val_loss: 2.8174 - val_accuracy: 0.7709
Epoch 46/60
111/111 [==============================] - 37s 335ms/step - loss: 0.2094 - accuracy: 0.9605 - val_loss: 3.3286 - val_accuracy: 0.7630
Epoch 47/60
111/111 [==============================] - 38s 340ms/step - loss: 0.3178 - accuracy: 0.9433 - val_loss: 4.0310 - val_accuracy: 0.7201
Epoch 48/60
111/111 [==============================] - 39s 350ms/step - loss: 0.2973 - accuracy: 0.9515 - val_loss: 3.3076 - val_accuracy: 0.7698
Epoch 49/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2969 - accuracy: 0.9535 - val_loss: 3.3232 - val_accuracy: 0.7630
Epoch 50/60
111/111 [==============================] - 38s 339ms/step - loss: 0.1693 - accuracy: 0.9678 - val_loss: 3.3474 - val_accuracy: 0.7528
Epoch 51/60
111/111 [==============================] - 38s 338ms/step - loss: 0.2769 - accuracy: 0.9484 - val_loss: 3.5787 - val_accuracy: 0.7573
Epoch 52/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2483 - accuracy: 0.9611 - val_loss: 3.3097 - val_accuracy: 0.7822
Epoch 53/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1674 - accuracy: 0.9670 - val_loss: 3.9156 - val_accuracy: 0.7494
Epoch 54/60
111/111 [==============================] - 38s 340ms/step - loss: 0.1372 - accuracy: 0.9701 - val_loss: 3.4078 - val_accuracy: 0.7810
Epoch 55/60
111/111 [==============================] - 39s 350ms/step - loss: 0.1391 - accuracy: 0.9726 - val_loss: 3.4077 - val_accuracy: 0.7709
Epoch 56/60
111/111 [==============================] - 38s 342ms/step - loss: 0.2218 - accuracy: 0.9628 - val_loss: 3.5559 - val_accuracy: 0.7517
Epoch 57/60
111/111 [==============================] - 38s 340ms/step - loss: 0.1355 - accuracy: 0.9749 - val_loss: 3.4460 - val_accuracy: 0.7709
Epoch 58/60
111/111 [==============================] - 38s 344ms/step - loss: 0.2096 - accuracy: 0.9650 - val_loss: 3.7931 - val_accuracy: 0.7460
Epoch 59/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1943 - accuracy: 0.9701 - val_loss: 3.6928 - val_accuracy: 0.7573
Epoch 60/60
111/111 [==============================] - 37s 336ms/step - loss: 0.1337 - accuracy: 0.9743 - val_loss: 3.5595 - val_accuracy: 0.7641
Result:
Test Loss: 3.83508
Accuracy on the test set: 78.61%
Since you are using generators to provide the input to model.fit, you should NOT specify the batch size in model.fit; it is already set to 32 in the generators.
I would lower the initial learning rate to .001. Also, in callbacks I recommend using the ReduceLROnPlateau callback with the settings shown below:
rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                              patience=1, verbose=1)
Add rlronp to the list of callbacks. Also, in your model I would add a dropout layer
after the second dense layer with the code below. This will help prevent over-fitting:
x = tf.keras.layers.Dropout(.3)(x)
The 383 in the log is not the batch size. It's the number of steps per epoch, i.e. data_size / batch_size (rounded up).
Training probably fails to make progress because the learning rate is either very low or very high. Try adjusting the learning rate.
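The steps-per-epoch arithmetic is easy to check. With batch_size=32, the 383 shown by the progress bar is ceil(num_samples / 32); the exact dataset size isn't stated in the question, so the sample counts below are hypothetical values consistent with the log:

```python
import math

batch_size = 32
# Any training-set size from 12225 to 12256 yields the 383 steps in the log
# (these counts are illustrative; the real dataset size isn't given above).
for num_samples in (12225, 12240, 12256):
    assert math.ceil(num_samples / batch_size) == 383
```

So seeing "383/383" per epoch says nothing about the batch size itself, only about how many batches one epoch contains.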

Why does accuracy not increase in training but loss and val_loss decrease?

Let me say up front that I am not at all familiar with neural networks; this is the first time I have tried to develop one.
The task is to predict a week of pollution values from the previous month of data.
The unstructured data has 15 features: (data preview image omitted)
The quantity to be predicted is 'gas', for a total of 168 hours, i.e. the hours in the following week.
MinMaxScaler(feature_range=(0, 1)) is applied to the data, and then the data is split into train and test sets. Since only one year of hourly measurements is available, the data is resampled into series of 672 hourly samples, each starting at midnight of a different day of the year. Therefore, from about 8000 hourly readings, about 600 series of 672 samples are obtained.
The 'date' column is removed from the initial data, and the shapes of train_x and train_y are:
Shape of train_x and train_y
train_x[0] contains the 672 hourly readings for the first 4 weeks of the data set and consists of all features, including 'gas'.
train_y[0], on the other hand, contains the 168 hourly readings for the week that begins where the month in train_x[0] ends.
Train_x[0], where column 0 is 'gas', and train_y[0], with only the gas column for the week following train_x[0]:
TRAIN X SHAPE = (631, 672, 14)
TRAIN Y SHAPE = (631, 168, 1)
After organizing the data in this way (if it's wrong, please let me know), I built the neural network as follows:
train_x, train_y = to_supervised(train, n_input)
train_x = train_x.astype(float)
train_y = train_y.astype(float)
# define parameters
verbose, epochs, batch_size = 1, 200, 50
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
# define model
model = Sequential()
opt = optimizers.RMSprop(learning_rate=1e-3)
model.add(layers.GRU(14, activation='relu', input_shape=(n_timesteps, n_features),return_sequences=False, stateful=False))
model.add(layers.Dense(1, activation='relu'))
#model.add(layers.Dense(14, activation='linear'))
model.add(layers.Dense(n_outputs, activation='sigmoid'))
model.summary()
model.compile(loss='mse', optimizer=opt, metrics=['accuracy'])
train_y = np.concatenate(train_y).reshape(len(train_y), 168)
callback_early_stopping = EarlyStopping(monitor='val_loss',
                                        patience=5, verbose=1)
callback_tensorboard = TensorBoard(log_dir='./23_logs/',
                                   histogram_freq=0,
                                   write_graph=False)
callback_reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                       factor=0.1,
                                       min_lr=1e-4,
                                       patience=0,
                                       verbose=1)
callbacks = [callback_early_stopping,
             callback_tensorboard,
             callback_reduce_lr]
history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                    verbose=verbose, shuffle=False,
                    validation_split=0.2, callbacks=callbacks)
When I fit the network, I get:
11/11 [==============================] - 5s 305ms/step - loss: 0.1625 - accuracy: 0.0207 - val_loss: 0.1905 - val_accuracy: 0.0157
Epoch 2/200
11/11 [==============================] - 2s 179ms/step - loss: 0.1594 - accuracy: 0.0037 - val_loss: 0.1879 - val_accuracy: 0.0157
Epoch 3/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1571 - accuracy: 0.0040 - val_loss: 0.1855 - val_accuracy: 0.0079
Epoch 4/200
11/11 [==============================] - 2s 165ms/step - loss: 0.1550 - accuracy: 0.0092 - val_loss: 0.1832 - val_accuracy: 0.0079
Epoch 5/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1529 - accuracy: 0.0102 - val_loss: 0.1809 - val_accuracy: 0.0079
Epoch 6/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1508 - accuracy: 0.0085 - val_loss: 0.1786 - val_accuracy: 0.0079
Epoch 7/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1487 - accuracy: 0.0023 - val_loss: 0.1763 - val_accuracy: 0.0079
Epoch 8/200
11/11 [==============================] - 2s 158ms/step - loss: 0.1467 - accuracy: 0.0023 - val_loss: 0.1740 - val_accuracy: 0.0079
Epoch 9/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1446 - accuracy: 0.0034 - val_loss: 0.1718 - val_accuracy: 0.0000e+00
Epoch 10/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1426 - accuracy: 0.0034 - val_loss: 0.1695 - val_accuracy: 0.0000e+00
Epoch 11/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1406 - accuracy: 0.0034 - val_loss: 0.1673 - val_accuracy: 0.0000e+00
Epoch 12/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1387 - accuracy: 0.0034 - val_loss: 0.1651 - val_accuracy: 0.0000e+00
Epoch 13/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1367 - accuracy: 0.0052 - val_loss: 0.1629 - val_accuracy: 0.0000e+00
Epoch 14/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1348 - accuracy: 0.0052 - val_loss: 0.1608 - val_accuracy: 0.0000e+00
Epoch 15/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1328 - accuracy: 0.0052 - val_loss: 0.1586 - val_accuracy: 0.0000e+00
Epoch 16/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1309 - accuracy: 0.0052 - val_loss: 0.1565 - val_accuracy: 0.0000e+00
Epoch 17/200
11/11 [==============================] - 2s 171ms/step - loss: 0.1290 - accuracy: 0.0052 - val_loss: 0.1544 - val_accuracy: 0.0000e+00
Epoch 18/200
11/11 [==============================] - 2s 174ms/step - loss: 0.1271 - accuracy: 0.0052 - val_loss: 0.1523 - val_accuracy: 0.0000e+00
Epoch 19/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1253 - accuracy: 0.0052 - val_loss: 0.1502 - val_accuracy: 0.0000e+00
Epoch 20/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1234 - accuracy: 0.0052 - val_loss: 0.1482 - val_accuracy: 0.0000e+00
Epoch 21/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1216 - accuracy: 0.0052 - val_loss: 0.1461 - val_accuracy: 0.0000e+00
Epoch 22/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1198 - accuracy: 0.0052 - val_loss: 0.1441 - val_accuracy: 0.0000e+00
Epoch 23/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1180 - accuracy: 0.0052 - val_loss: 0.1421 - val_accuracy: 0.0000e+00
Epoch 24/200
11/11 [==============================] - 2s 163ms/step - loss: 0.1162 - accuracy: 0.0052 - val_loss: 0.1401 - val_accuracy: 0.0000e+00
Epoch 25/200
11/11 [==============================] - 2s 167ms/step - loss: 0.1145 - accuracy: 0.0052 - val_loss: 0.1381 - val_accuracy: 0.0000e+00
Epoch 26/200
11/11 [==============================] - 2s 188ms/step - loss: 0.1127 - accuracy: 0.0052 - val_loss: 0.1361 - val_accuracy: 0.0000e+00
Epoch 27/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1110 - accuracy: 0.0052 - val_loss: 0.1342 - val_accuracy: 0.0000e+00
Epoch 28/200
11/11 [==============================] - 2s 189ms/step - loss: 0.1093 - accuracy: 0.0052 - val_loss: 0.1323 - val_accuracy: 0.0000e+00
Epoch 29/200
11/11 [==============================] - 2s 183ms/step - loss: 0.1076 - accuracy: 0.0079 - val_loss: 0.1304 - val_accuracy: 0.0000e+00
Epoch 30/200
11/11 [==============================] - 2s 172ms/step - loss: 0.1059 - accuracy: 0.0079 - val_loss: 0.1285 - val_accuracy: 0.0000e+00
Epoch 31/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1042 - accuracy: 0.0079 - val_loss: 0.1266 - val_accuracy: 0.0000e+00
Epoch 32/200
Accuracy always remains very low, and sometimes (as in this case) val_accuracy becomes 0 and never changes, while loss and val_loss decrease but do not converge well. I realize that I am certainly doing many things wrong, and I cannot understand how to fix it. I have tried other hyperparameters and also other networks such as LSTM, but I did not get satisfactory results.
How can I improve the model so that the accuracy is at least decent? Any advice is welcome, thank you very much!

Keras model sometimes doesn't train

This is my model.
def get_model2():
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        0.001,
        decay_steps=100000,
        decay_rate=0.96,
        staircase=True)
    model = Sequential()
    model.add(Dense(1024, activation='relu', input_shape=[44]))
    model.add(Dropout(0.2))
    model.add(Dense(2048, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(4098, activation='relu'))
    model.add(Dense(2048, activation='relu'))
    model.add(Dense(1024, activation='relu'))
    model.add(Dense(512, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss=my_binary_crossentropy,
                  optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule),
                  metrics=['accuracy'])
    return model
And this is my training:
from sklearn.model_selection import RepeatedStratifiedKFold

model = get_model2()
model.save_weights('model.h5')
o = 0
hlavny_list = []
skf = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=2027)
for train_index, test_index in skf.split(X, Y):
    o = o + 1
    X_test, X_train = X[train_index], X[test_index]
    y_test, y_train = Y[train_index], Y[test_index]
    model.load_weights('model.h5')
    model.fit(X_train, y_train, epochs=10000, batch_size=256,
              validation_data=(X_test, y_test), callbacks=[early_stop])
    vys = model.predict_classes(X_test)
    a, b, c, d = potrebne_miery(y_true=y_test, y_pred=vys)
    hl_list = [a, b, c, d]
    hlavny_list.append(hl_list)
    if (o % 4 == 0):
        np.savetxt('/content/drive/My Drive/siete/model_t_t_1_9_moja_loss_v2.csv', np.array(hlavny_list), delimiter=',')
np.savetxt('/content/drive/My Drive/siete/model_t_t_1_9_moja_loss_v2.csv', np.array(hlavny_list), delimiter=',')
Nothing special here, except my own loss function, which looks like:
import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.python.ops import clip_ops, math_ops

def my_binary_crossentropy(target, output, from_logits=False):
    target = ops.convert_to_tensor_v2(target)
    output = ops.convert_to_tensor_v2(output)
    target = tf.cast(target, tf.float32)
    epsilon_ = 0.01
    output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)
    # Compute cross entropy from probabilities (positive class weighted by 8).
    bce = 8 * target * math_ops.log(output + epsilon_)
    bce += (1 - target) * math_ops.log(1 - output + epsilon_)
    return -bce
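For reference, a plain-NumPy sketch of the same weighted cross entropy (the 8x positive-class weight and the 0.01 epsilon are taken from the code above) makes individual per-example values easy to inspect:

```python
import numpy as np

def weighted_bce(target, output, eps=0.01, pos_weight=8.0):
    # Clip predictions away from 0 and 1, as in the TF version above.
    output = np.clip(output, eps, 1.0 - eps)
    bce = pos_weight * target * np.log(output + eps)
    bce += (1 - target) * np.log(1 - output + eps)
    return -bce

t = np.array([1.0, 0.0])   # one positive, one negative example
p = np.array([0.9, 0.1])   # confident, correct predictions
print(weighted_bce(t, p))  # positive example carries the 8x weight
```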
Before, I was using binary crossentropy and everything worked fine, no problem. But when I changed the loss function, a problem occurred which I don't understand. The model is trained in a loop, and sometimes, I don't know why or when, the model behaves as if it is not training.
Epoch 1/10000
124/124 [==============================] - 6s 47ms/step - loss: 1.1125 - accuracy: 0.4600 - val_loss: 0.7640 - val_accuracy: 0.9312
Epoch 2/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.6418 - accuracy: 0.8598 - val_loss: 0.5307 - val_accuracy: 0.8718
Epoch 3/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.5434 - accuracy: 0.8768 - val_loss: 0.5416 - val_accuracy: 0.8736
Epoch 4/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.5167 - accuracy: 0.8820 - val_loss: 0.5383 - val_accuracy: 0.9165
Epoch 5/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.4948 - accuracy: 0.8898 - val_loss: 0.5136 - val_accuracy: 0.9156
Epoch 6/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.4693 - accuracy: 0.8910 - val_loss: 0.5088 - val_accuracy: 0.9130
Epoch 7/10000
124/124 [==============================] - 6s 47ms/step - loss: 0.4533 - accuracy: 0.8925 - val_loss: 0.5163 - val_accuracy: 0.8551
Epoch 8/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.4257 - accuracy: 0.8883 - val_loss: 0.5490 - val_accuracy: 0.9189
Epoch 9/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.4237 - accuracy: 0.8919 - val_loss: 0.5302 - val_accuracy: 0.8172
Epoch 10/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.4072 - accuracy: 0.8859 - val_loss: 0.5591 - val_accuracy: 0.9278
Epoch 11/10000
124/124 [==============================] - 6s 46ms/step - loss: 0.3831 - accuracy: 0.8908 - val_loss: 0.5563 - val_accuracy: 0.8937
Epoch 00011: early stopping
32695
24925
221726
5339
Epoch 1/10000
124/124 [==============================] - 6s 48ms/step - loss: 4.1699 - accuracy: 0.8661 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 2/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 3/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 4/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 5/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 6/10000
124/124 [==============================] - 6s 47ms/step - loss: 4.1813 - accuracy: 0.8664 - val_loss: 4.1812 - val_accuracy: 0.8664
Epoch 00006: early stopping
Neither loss, val_loss, nor accuracy is decreasing. I think there is some problem in the loss function, because this problem only appeared after I introduced the new loss function; I had run this loop maybe 10,000 times without error before. It happens maybe 1 in 4 cycles. What is wrong? I will be very grateful for help. Thank you.
Your dropout values are too low and make learning harder for the model. Use a higher dropout value to overcome the problem you have.
Start by building a simple model with one hidden layer and popular hyperparameters.
You can then fine-tune the hyperparameters, updating the model as you go.
That is the simplest and best way to debug this, from my point of view.
Let me know if you need further help.
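As an illustration of that advice, a minimal baseline might look like the sketch below. The 44-feature input is taken from the question; the hidden-layer size and dropout rate are placeholder hyperparameters to tune:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout

def get_simple_model(n_features=44):
    """One hidden layer plus dropout: a deliberately small baseline."""
    model = Sequential([
        Dense(64, activation='relu', input_shape=[n_features]),
        Dropout(0.5),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy',
                  optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  metrics=['accuracy'])
    return model
```

If this baseline trains stably across all folds, layers can be added back one at a time to find where instability begins.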
