I'm doing a species classification task from Kaggle (https://www.kaggle.com/competitions/yum-or-yuck-butterfly-mimics-2022/overview). I decided to use transfer learning to tackle this problem, since there aren't that many images. The model is as follows:
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet")
for layer in base_model.layers:
    layer.trainable = False
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(1024, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(512, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=output)
As per the guidelines in the Keras transfer learning guide (https://keras.io/guides/transfer_learning/), I'm freezing the ResNet layers and running the base model in inference mode (training=False). However, the results show that the model is not learning properly, and convergence doesn't look possible even after nearly 200 epochs:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
stop_early = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=0.0001,
    patience=20,
    restore_best_weights=True
)
history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=200,
                    callbacks=[stop_early])
Epoch 1/200
22/22 [==============================] - 19s 442ms/step - loss: 1.9317 - accuracy: 0.1794 - val_loss: 1.8272 - val_accuracy: 0.1618
Epoch 2/200
22/22 [==============================] - 9s 398ms/step - loss: 1.8250 - accuracy: 0.1882 - val_loss: 1.7681 - val_accuracy: 0.2197
Epoch 3/200
22/22 [==============================] - 9s 402ms/step - loss: 1.7927 - accuracy: 0.2294 - val_loss: 1.7612 - val_accuracy: 0.2139
Epoch 4/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7930 - accuracy: 0.2000 - val_loss: 1.7640 - val_accuracy: 0.2139
Epoch 5/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7872 - accuracy: 0.2132 - val_loss: 1.7489 - val_accuracy: 0.3121
Epoch 6/200
22/22 [==============================] - 9s 389ms/step - loss: 1.7700 - accuracy: 0.2574 - val_loss: 1.7378 - val_accuracy: 0.2543
Epoch 7/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7676 - accuracy: 0.2353 - val_loss: 1.7229 - val_accuracy: 0.3064
Epoch 8/200
22/22 [==============================] - 9s 427ms/step - loss: 1.7721 - accuracy: 0.2353 - val_loss: 1.7225 - val_accuracy: 0.2948
Epoch 9/200
22/22 [==============================] - 9s 399ms/step - loss: 1.7522 - accuracy: 0.2588 - val_loss: 1.7267 - val_accuracy: 0.2948
Epoch 10/200
22/22 [==============================] - 9s 395ms/step - loss: 1.7434 - accuracy: 0.2735 - val_loss: 1.7151 - val_accuracy: 0.2948
Epoch 11/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7500 - accuracy: 0.2632 - val_loss: 1.7083 - val_accuracy: 0.3064
Epoch 12/200
22/22 [==============================] - 9s 425ms/step - loss: 1.7307 - accuracy: 0.2721 - val_loss: 1.6899 - val_accuracy: 0.3179
Epoch 13/200
22/22 [==============================] - 9s 407ms/step - loss: 1.7439 - accuracy: 0.2794 - val_loss: 1.7045 - val_accuracy: 0.2948
Epoch 14/200
22/22 [==============================] - 9s 404ms/step - loss: 1.7376 - accuracy: 0.2706 - val_loss: 1.7118 - val_accuracy: 0.2659
Epoch 15/200
22/22 [==============================] - 9s 419ms/step - loss: 1.7588 - accuracy: 0.2647 - val_loss: 1.6684 - val_accuracy: 0.3237
Epoch 16/200
22/22 [==============================] - 9s 394ms/step - loss: 1.7289 - accuracy: 0.2824 - val_loss: 1.6733 - val_accuracy: 0.3064
Epoch 17/200
22/22 [==============================] - 9s 387ms/step - loss: 1.7184 - accuracy: 0.2809 - val_loss: 1.7185 - val_accuracy: 0.2659
Epoch 18/200
22/22 [==============================] - 9s 408ms/step - loss: 1.7242 - accuracy: 0.2765 - val_loss: 1.6961 - val_accuracy: 0.2717
Epoch 19/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7218 - accuracy: 0.2853 - val_loss: 1.6757 - val_accuracy: 0.3006
Epoch 20/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7248 - accuracy: 0.2882 - val_loss: 1.6716 - val_accuracy: 0.3064
Epoch 21/200
22/22 [==============================] - 9s 401ms/step - loss: 1.7134 - accuracy: 0.2838 - val_loss: 1.6666 - val_accuracy: 0.2948
Epoch 22/200
22/22 [==============================] - 9s 393ms/step - loss: 1.7140 - accuracy: 0.2941 - val_loss: 1.6427 - val_accuracy: 0.3064
I need to unfreeze the layers and turn off inference mode for the model to learn. I tested the same scenario with EfficientNet and the same thing happened. Finally, I also tried Xception, and freezing the layers and running in inference mode was fine. So the architectures seem to behave differently, even though they all contain BatchNorm layers.
I don't understand what is going on here. Why would I need to turn inference mode off? Does anyone have a clue?
EDIT:
Results from ResNet50: (training-curve plot omitted)
Results from Xception: (training-curve plot omitted)
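One detail worth cross-checking in this kind of setup: each keras.applications family documents its own preprocess_input (ResNet50's converts RGB to BGR and subtracts the ImageNet channel means, while Xception's scales inputs to [-1, 1]), so the frozen base only sees the statistics it was trained on if that preprocessing is applied. Below is a minimal sketch of the frozen-base pattern with the preprocessing wired into the graph; weights=None here is only so the sketch builds without a download, in practice you would pass weights="imagenet":

```python
import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(224, 224, 3))
# Raw [0, 255] RGB in; ResNet50-specific preprocessing happens inside the graph.
x = tf.keras.applications.resnet50.preprocess_input(inputs)
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3), include_top=False,
    weights=None)  # use weights="imagenet" for actual transfer learning
base_model.trainable = False  # one-call equivalent of freezing every layer
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs, output)
```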
As the title suggests, I am training my IRV2 (InceptionResNetV2) network using the following EarlyStopping definition:
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode="auto")
However, the training doesn't stop when I get three equal values of val_loss:
history = model.fit(
    X_train_s,
    y_train_categorical,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(X_validation_s, y_validation_categorical),
    callbacks=[callback]
)
This is my model:
def Inception_Resnet_V2_Binary(x_train, batch_size):
    conv_base = InceptionResNetV2(weights='imagenet', include_top=False, input_shape=image_size, pooling="avg")
    for layer in conv_base.layers:
        layer.trainable = False
    model = models.Sequential()
    model.add(conv_base)
    model.add(layers.Dense(2, activation='softmax'))
    steps_per_epoch = len(x_train) / batch_size
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=0.0002,
        decay_steps=steps_per_epoch * 2,
        decay_rate=0.7)
    opt = Adam(learning_rate=lr_schedule)
    model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy", tf.metrics.AUC()])
    return model
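For reference, ExponentialDecay with these settings multiplies the learning rate by decay_rate every decay_steps optimizer steps, and it does so continuously since staircase defaults to False: lr(step) = initial_learning_rate * decay_rate ** (step / decay_steps). With 151 steps per epoch (the count shown in the training output), that is roughly a factor of 0.7 every two epochs. A quick numeric sketch, no TensorFlow needed:

```python
initial_lr = 0.0002
decay_rate = 0.7
decay_steps = 151 * 2  # steps_per_epoch * 2, with 151 steps/epoch as in the log

def lr_at(step):
    # Same formula ExponentialDecay applies with staircase=False
    return initial_lr * decay_rate ** (step / decay_steps)

print(lr_at(0))    # 0.0002
print(lr_at(302))  # ~0.00014, one decay interval later
```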
This is the training output:
Epoch 28/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5149 - accuracy: 0.7398 - auc_1: 0.8339 - val_loss: 0.5217 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 29/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5127 - accuracy: 0.7441 - auc_1: 0.8354 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 30/70
151/151 [==============================] - 12s 79ms/step - loss: 0.5144 - accuracy: 0.7384 - auc_1: 0.8321 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 31/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5152 - accuracy: 0.7402 - auc_1: 0.8332 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 32/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5143 - accuracy: 0.7410 - auc_1: 0.8347 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 33/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5124 - accuracy: 0.7404 - auc_1: 0.8352 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 34/70
151/151 [==============================] - 12s 81ms/step - loss: 0.5106 - accuracy: 0.7441 - auc_1: 0.8363 - val_loss: 0.5216 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 35/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5129 - accuracy: 0.7389 - auc_1: 0.8342 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8245
Epoch 36/70
151/151 [==============================] - 12s 79ms/step - loss: 0.5122 - accuracy: 0.7400 - auc_1: 0.8341 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 37/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5160 - accuracy: 0.7424 - auc_1: 0.8346 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
Epoch 38/70
151/151 [==============================] - 12s 78ms/step - loss: 0.5175 - accuracy: 0.7367 - auc_1: 0.8318 - val_loss: 0.5215 - val_accuracy: 0.7365 - val_auc_1: 0.8246
That happens because the loss is actually still decreasing, just by a very small amount. If you don't set min_delta in the EarlyStopping callback, training will count a negligible improvement as a real improvement. You can solve the problem by simply adding the min_delta argument, e.g. min_delta=0.001:
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode="auto", min_delta=0.001)
Set min_delta to whatever you find suitable; with min_delta=0.001, changes in loss smaller than that value count as no improvement.
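You can see the effect without training anything by driving the callback by hand with a plateauing val_loss like the one in the log; this sketch assumes TF 2.x callback behavior and uses a SimpleNamespace as a stand-in for the model:

```python
import types
import tensorflow as tf

# Tiny improvements below min_delta, mimicking the log above.
es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                      mode="auto", min_delta=0.001)
dummy = types.SimpleNamespace(stop_training=False)  # stand-in for the model
es.set_model(dummy)
es.on_train_begin()

for epoch, val_loss in enumerate([0.5217, 0.5216, 0.5216, 0.5216, 0.5215]):
    es.on_epoch_end(epoch, logs={'val_loss': val_loss})
    if dummy.stop_training:
        print(f"stopped after epoch {epoch + 1}")
        break
```

After the first epoch sets the best value, the next three changes are all smaller than min_delta, so patience runs out and the callback flips stop_training.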
That val_loss is not a float with only 4 decimal places; you are just not seeing the entire value. patience should be used to stop a network that is overfitting from running for hours; if val_loss keeps decreasing, just let the network run (and judging from the loss, the learning rate seems a bit high).
val_loss is a float32, and those 4 decimals are only the most significant digits. To see what's really going on, you can't rely on the fit output; you need a callback of some sort that prints val_loss in the format you want.
You can find some examples here:
https://keras.io/guides/writing_your_own_callbacks/
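A minimal sketch of such a callback (assuming TF 2.x); it just prints the monitored value with more digits at the end of every epoch:

```python
import tensorflow as tf

class PreciseValLoss(tf.keras.callbacks.Callback):
    """Print val_loss with full precision, since the progress bar rounds to 4 decimals."""

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if "val_loss" in logs:
            print(f"Epoch {epoch + 1}: val_loss = {logs['val_loss']:.10f}")
```

Pass an instance in the callbacks list alongside EarlyStopping, e.g. callbacks=[callback, PreciseValLoss()].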
I am working on transfer learning for multiclass classification on an image dataset that consists of 12 classes, so I am using VGG19. However, the accuracy of the model is much lower than expected; in addition, the train and validation accuracy do not increase. Besides that, I am trying to decrease the batch size, which is still 383.
My code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import shutil
import os
import glob as gb
import tensorflow as tf
from tensorflow.keras.models import Sequential
from zipfile import ZipFile
import cv2
from tensorflow.keras.layers import Flatten,Dense,BatchNormalization,Activation,Dropout
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras import optimizers
IMAGE_SHAPE = (256, 256)
BATCH_SIZE = 32
#--------------------------------------------------Train--------------------------------
train = ImageDataGenerator()
train_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, fill_mode='nearest')
train_data = train_generator.flow_from_directory(directory="/content/dataset_base/train", target_size=IMAGE_SHAPE, color_mode="rgb", class_mode='categorical', batch_size=BATCH_SIZE, shuffle=True)
#--------------------------------------------------valid--------------------------------
valid = ImageDataGenerator()
validation_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
valid_data = validation_generator.flow_from_directory(directory="/content/dataset_base/valid", target_size=IMAGE_SHAPE, color_mode="rgb", class_mode='categorical', batch_size=BATCH_SIZE, shuffle=True)
#--------------------------------------------------Test---------------------------------
test = ImageDataGenerator()
test_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
test_data = test_generator.flow_from_directory(directory="/content/dataset_base/valid", target_size=IMAGE_SHAPE, color_mode="rgb", class_mode='categorical', batch_size=1, shuffle=False)
test_data.reset()
for image_batch, labels_batch in train_data:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
#Defining the VGG Convolutional Neural Net
base_model = VGG19(weights='imagenet', input_shape=(256, 256, 3), include_top=False)
from tensorflow.keras.layers import Flatten,Dense,BatchNormalization,Activation,Dropout
from tensorflow.keras import optimizers
inputs = tf.keras.Input(shape=(256, 256, 3))
x = base_model(inputs, training=False)
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dense(256, activation='relu')(x)
outputs = Dense(12,activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
model.summary()
Model: "model_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_10 (InputLayer)        [(None, 256, 256, 3)]     0
_________________________________________________________________
vgg19 (Functional)           (None, 8, 8, 512)         20024384
_________________________________________________________________
flatten_8 (Flatten)          (None, 32768)             0
_________________________________________________________________
dense_32 (Dense)             (None, 256)               8388864
_________________________________________________________________
dense_33 (Dense)             (None, 256)               65792
_________________________________________________________________
dense_34 (Dense)             (None, 12)                3084
=================================================================
Total params: 28,482,124
Trainable params: 28,482,124
Non-trainable params: 0
model.compile(optimizer = optimizers.Adam(learning_rate=0.05), loss='categorical_crossentropy', metrics=["accuracy"])
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, period=1)
history = model.fit(
    train_data,
    validation_data=valid_data,
    batch_size=256,
    epochs=10,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(
            monitor='val_loss',
            patience=55,
            restore_best_weights=True
        )
    ]
)
model_final.save_weights("vgg16_1.h5")
The result after 10 epochs:
Epoch 1/10
383/383 [==============================] - 214s 557ms/step - loss: 2.4934 - accuracy: 0.0781 - val_loss: 2.4919 - val_accuracy: 0.0833
Epoch 2/10
383/383 [==============================] - 219s 572ms/step - loss: 2.4918 - accuracy: 0.0847 - val_loss: 2.4888 - val_accuracy: 0.0833
Epoch 3/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4930 - accuracy: 0.0840 - val_loss: 2.4918 - val_accuracy: 0.0833
Epoch 4/10
383/383 [==============================] - 220s 574ms/step - loss: 2.4919 - accuracy: 0.0842 - val_loss: 2.4934 - val_accuracy: 0.0833
Epoch 5/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4928 - accuracy: 0.0820 - val_loss: 2.4893 - val_accuracy: 0.0833
Epoch 6/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4921 - accuracy: 0.0842 - val_loss: 2.4920 - val_accuracy: 0.0833
Epoch 7/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4922 - accuracy: 0.0858 - val_loss: 2.4910 - val_accuracy: 0.0833
Epoch 8/10
383/383 [==============================] - 219s 573ms/step - loss: 2.4920 - accuracy: 0.0862 - val_loss: 2.4912 - val_accuracy: 0.0833
Epoch 9/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4926 - accuracy: 0.0813 - val_loss: 2.4943 - val_accuracy: 0.0833
Epoch 10/10
383/383 [==============================] - 220s 573ms/step - loss: 2.4920 - accuracy: 0.0829 - val_loss: 2.4948 - val_accuracy: 0.0833
I updated my code. Now I am getting good accuracy on the training set but lower accuracy on the validation set; it is obviously overfitting. How can I get good accuracy on the validation set too?
Updated code:
# Separate in train and test data
train_df, test_df = train_test_split(image_df, train_size=0.8, shuffle=True, random_state=1)
# Create the generators
train_generator,test_generator,train_images,val_images,test_images = create_gen()
Found 3545 validated image filenames belonging to 12 classes.
Found 886 validated image filenames belonging to 12 classes.
Found 1108 validated image filenames belonging to 12 classes.
from tensorflow.keras.applications.vgg16 import VGG16
vggmodel = VGG16(weights='imagenet', include_top=True)
vggmodel.trainable = False
for layers in (vggmodel.layers)[:19]:
    print(layers)
    layers.trainable = False
<tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7ff3dc0e10d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3dc0e8910>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0f6910>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca0f65d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca08d810>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca08d150>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca089f90>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca090c10>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca09b8d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff4291d8290>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca0825d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0aae10>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3cb71b7d0>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0a7a10>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca03c090>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca03c250>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0a7810>
<tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ff3ca0aa410>
<tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ff3ca0434d0>
X= vggmodel.layers[-17].output
X=tf.keras.layers.Dropout(0.09)(X)
X = tf.keras.layers.Flatten()(X)
predictions = Dense(12, activation="softmax")(X)
model_final = Model(vggmodel.input, predictions)
model_final.compile(optimizer = optimizers.Adam(learning_rate=0.005), loss='categorical_crossentropy', metrics=["accuracy"])
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, period=1)
rlronp=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss", factor=0.005,patience=1, verbose=1)
History = model_final.fit(train_images, validation_data=val_images, batch_size=64, epochs=60,
                          callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=60, restore_best_weights=True)])
tf.keras.callbacks.ModelCheckpoint(filepaths, monitor='val_loss', verbose=0, save_best_only=False,save_weights_only=False, mode='auto', save_freq='epoch')
model_final.save_weights("vgg16_1.h5")
Epoch 1/60
111/111 [==============================] - 49s 436ms/step - loss: 2.3832 - accuracy: 0.5396 - val_loss: 1.7135 - val_accuracy: 0.6095
Epoch 2/60
111/111 [==============================] - 39s 348ms/step - loss: 1.0699 - accuracy: 0.7385 - val_loss: 1.6680 - val_accuracy: 0.6208
Epoch 3/60
111/111 [==============================] - 38s 338ms/step - loss: 0.8366 - accuracy: 0.7853 - val_loss: 1.3777 - val_accuracy: 0.6975
Epoch 4/60
111/111 [==============================] - 37s 337ms/step - loss: 0.6213 - accuracy: 0.8299 - val_loss: 1.3766 - val_accuracy: 0.7269
Epoch 5/60
111/111 [==============================] - 38s 339ms/step - loss: 0.6173 - accuracy: 0.8446 - val_loss: 1.9170 - val_accuracy: 0.7133
Epoch 6/60
111/111 [==============================] - 38s 338ms/step - loss: 0.5782 - accuracy: 0.8511 - val_loss: 1.9968 - val_accuracy: 0.6501
Epoch 7/60
111/111 [==============================] - 37s 337ms/step - loss: 0.5672 - accuracy: 0.8564 - val_loss: 1.6436 - val_accuracy: 0.7088
Epoch 8/60
111/111 [==============================] - 38s 340ms/step - loss: 0.3971 - accuracy: 0.8894 - val_loss: 1.6819 - val_accuracy: 0.7314
Epoch 9/60
111/111 [==============================] - 38s 342ms/step - loss: 0.3657 - accuracy: 0.9038 - val_loss: 1.9244 - val_accuracy: 0.7133
Epoch 10/60
111/111 [==============================] - 37s 337ms/step - loss: 0.4003 - accuracy: 0.9016 - val_loss: 1.8337 - val_accuracy: 0.7246
Epoch 11/60
111/111 [==============================] - 38s 341ms/step - loss: 0.6439 - accuracy: 0.8731 - val_loss: 1.8070 - val_accuracy: 0.7460
Epoch 12/60
111/111 [==============================] - 37s 338ms/step - loss: 0.2917 - accuracy: 0.9190 - val_loss: 1.7533 - val_accuracy: 0.7494
Epoch 13/60
111/111 [==============================] - 37s 336ms/step - loss: 0.4685 - accuracy: 0.9032 - val_loss: 1.9534 - val_accuracy: 0.7393
Epoch 14/60
111/111 [==============================] - 38s 339ms/step - loss: 0.3936 - accuracy: 0.9061 - val_loss: 1.8643 - val_accuracy: 0.7280
Epoch 15/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2598 - accuracy: 0.9368 - val_loss: 1.7242 - val_accuracy: 0.7856
Epoch 16/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2884 - accuracy: 0.9360 - val_loss: 1.7374 - val_accuracy: 0.7517
Epoch 17/60
111/111 [==============================] - 38s 341ms/step - loss: 0.2487 - accuracy: 0.9362 - val_loss: 1.6373 - val_accuracy: 0.7889
Epoch 18/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1683 - accuracy: 0.9532 - val_loss: 1.6612 - val_accuracy: 0.7698
Epoch 19/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1354 - accuracy: 0.9591 - val_loss: 1.7372 - val_accuracy: 0.7889
Epoch 20/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2793 - accuracy: 0.9329 - val_loss: 2.0405 - val_accuracy: 0.7596
Epoch 21/60
111/111 [==============================] - 38s 338ms/step - loss: 0.3049 - accuracy: 0.9306 - val_loss: 1.8485 - val_accuracy: 0.7912
Epoch 22/60
111/111 [==============================] - 37s 338ms/step - loss: 0.2724 - accuracy: 0.9399 - val_loss: 1.8225 - val_accuracy: 0.7856
Epoch 23/60
111/111 [==============================] - 37s 337ms/step - loss: 0.2088 - accuracy: 0.9475 - val_loss: 2.1015 - val_accuracy: 0.7675
Epoch 24/60
111/111 [==============================] - 38s 341ms/step - loss: 0.2112 - accuracy: 0.9470 - val_loss: 2.2647 - val_accuracy: 0.7404
Epoch 25/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2172 - accuracy: 0.9467 - val_loss: 2.4213 - val_accuracy: 0.7675
Epoch 26/60
111/111 [==============================] - 37s 336ms/step - loss: 0.3093 - accuracy: 0.9300 - val_loss: 2.3260 - val_accuracy: 0.7630
Epoch 27/60
111/111 [==============================] - 37s 338ms/step - loss: 0.3036 - accuracy: 0.9427 - val_loss: 2.4329 - val_accuracy: 0.7460
Epoch 28/60
111/111 [==============================] - 37s 338ms/step - loss: 0.2641 - accuracy: 0.9436 - val_loss: 2.6936 - val_accuracy: 0.7472
Epoch 29/60
111/111 [==============================] - 37s 333ms/step - loss: 0.2258 - accuracy: 0.9509 - val_loss: 2.3055 - val_accuracy: 0.7788
Epoch 30/60
111/111 [==============================] - 37s 334ms/step - loss: 0.2921 - accuracy: 0.9436 - val_loss: 2.3668 - val_accuracy: 0.7517
Epoch 31/60
111/111 [==============================] - 37s 334ms/step - loss: 0.2830 - accuracy: 0.9447 - val_loss: 2.1422 - val_accuracy: 0.7720
Epoch 32/60
111/111 [==============================] - 37s 337ms/step - loss: 0.3584 - accuracy: 0.9312 - val_loss: 3.2875 - val_accuracy: 0.7122
Epoch 33/60
111/111 [==============================] - 37s 337ms/step - loss: 0.3279 - accuracy: 0.9413 - val_loss: 2.3641 - val_accuracy: 0.7686
Epoch 34/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2326 - accuracy: 0.9526 - val_loss: 2.8010 - val_accuracy: 0.7754
Epoch 35/60
111/111 [==============================] - 38s 337ms/step - loss: 0.3131 - accuracy: 0.9388 - val_loss: 2.6276 - val_accuracy: 0.7698
Epoch 36/60
111/111 [==============================] - 37s 335ms/step - loss: 0.1961 - accuracy: 0.9585 - val_loss: 2.4269 - val_accuracy: 0.7912
Epoch 37/60
111/111 [==============================] - 37s 336ms/step - loss: 0.1915 - accuracy: 0.9599 - val_loss: 2.9607 - val_accuracy: 0.7630
Epoch 38/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2457 - accuracy: 0.9512 - val_loss: 3.2177 - val_accuracy: 0.7438
Epoch 39/60
111/111 [==============================] - 38s 346ms/step - loss: 0.1575 - accuracy: 0.9670 - val_loss: 2.7473 - val_accuracy: 0.7675
Epoch 40/60
111/111 [==============================] - 38s 343ms/step - loss: 0.1841 - accuracy: 0.9591 - val_loss: 3.1237 - val_accuracy: 0.7415
Epoch 41/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2344 - accuracy: 0.9498 - val_loss: 2.7585 - val_accuracy: 0.7630
Epoch 42/60
111/111 [==============================] - 37s 337ms/step - loss: 0.2115 - accuracy: 0.9588 - val_loss: 3.0896 - val_accuracy: 0.7314
Epoch 43/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1407 - accuracy: 0.9721 - val_loss: 2.9159 - val_accuracy: 0.7381
Epoch 44/60
111/111 [==============================] - 38s 336ms/step - loss: 0.2045 - accuracy: 0.9616 - val_loss: 2.7607 - val_accuracy: 0.7630
Epoch 45/60
111/111 [==============================] - 38s 338ms/step - loss: 0.1343 - accuracy: 0.9698 - val_loss: 2.8174 - val_accuracy: 0.7709
Epoch 46/60
111/111 [==============================] - 37s 335ms/step - loss: 0.2094 - accuracy: 0.9605 - val_loss: 3.3286 - val_accuracy: 0.7630
Epoch 47/60
111/111 [==============================] - 38s 340ms/step - loss: 0.3178 - accuracy: 0.9433 - val_loss: 4.0310 - val_accuracy: 0.7201
Epoch 48/60
111/111 [==============================] - 39s 350ms/step - loss: 0.2973 - accuracy: 0.9515 - val_loss: 3.3076 - val_accuracy: 0.7698
Epoch 49/60
111/111 [==============================] - 38s 339ms/step - loss: 0.2969 - accuracy: 0.9535 - val_loss: 3.3232 - val_accuracy: 0.7630
Epoch 50/60
111/111 [==============================] - 38s 339ms/step - loss: 0.1693 - accuracy: 0.9678 - val_loss: 3.3474 - val_accuracy: 0.7528
Epoch 51/60
111/111 [==============================] - 38s 338ms/step - loss: 0.2769 - accuracy: 0.9484 - val_loss: 3.5787 - val_accuracy: 0.7573
Epoch 52/60
111/111 [==============================] - 37s 336ms/step - loss: 0.2483 - accuracy: 0.9611 - val_loss: 3.3097 - val_accuracy: 0.7822
Epoch 53/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1674 - accuracy: 0.9670 - val_loss: 3.9156 - val_accuracy: 0.7494
Epoch 54/60
111/111 [==============================] - 38s 340ms/step - loss: 0.1372 - accuracy: 0.9701 - val_loss: 3.4078 - val_accuracy: 0.7810
Epoch 55/60
111/111 [==============================] - 39s 350ms/step - loss: 0.1391 - accuracy: 0.9726 - val_loss: 3.4077 - val_accuracy: 0.7709
Epoch 56/60
111/111 [==============================] - 38s 342ms/step - loss: 0.2218 - accuracy: 0.9628 - val_loss: 3.5559 - val_accuracy: 0.7517
Epoch 57/60
111/111 [==============================] - 38s 340ms/step - loss: 0.1355 - accuracy: 0.9749 - val_loss: 3.4460 - val_accuracy: 0.7709
Epoch 58/60
111/111 [==============================] - 38s 344ms/step - loss: 0.2096 - accuracy: 0.9650 - val_loss: 3.7931 - val_accuracy: 0.7460
Epoch 59/60
111/111 [==============================] - 37s 337ms/step - loss: 0.1943 - accuracy: 0.9701 - val_loss: 3.6928 - val_accuracy: 0.7573
Epoch 60/60
111/111 [==============================] - 37s 336ms/step - loss: 0.1337 - accuracy: 0.9743 - val_loss: 3.5595 - val_accuracy: 0.7641
Result:
Test Loss: 3.83508
Accuracy on the test set: 78.61%
Since you are using generators to provide the input to model.fit, you should NOT specify the batch size in model.fit; it is already set to 32 in the generators.
I would lower the initial learning rate to 0.001. Also, in the callbacks I recommend using the ReduceLROnPlateau callback with the settings shown below:
rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                              patience=1, verbose=1)
Add rlronp to the list of callbacks. In your model I would also add a dropout layer
after the second dense layer, with the code below; this will help prevent over-fitting:
x = tf.keras.layers.Dropout(0.3)(x)
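Putting those suggestions together, a sketch of the whole setup; a small convolutional stack stands in for the frozen VGG19 base here only so the example builds without downloading weights:

```python
import tensorflow as tf

# Toy stand-in for the frozen VGG19 base (keeps the sketch self-contained).
inputs = tf.keras.Input(shape=(256, 256, 3))
x = tf.keras.layers.Conv2D(8, 3, strides=4, activation="relu")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)

# Same head shape as the question, plus dropout after the second dense layer.
x = tf.keras.layers.Dense(256, activation="relu")(x)
x = tf.keras.layers.Dense(256, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
outputs = tf.keras.layers.Dense(12, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Lower initial learning rate (0.001 instead of 0.05).
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy", metrics=["accuracy"])

# Halve the LR when val_loss plateaus, and pass it in the callbacks list.
rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                              factor=0.5, patience=1, verbose=1)
# model.fit(train_data, validation_data=valid_data, callbacks=[rlronp])  # no batch_size here
```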
The 383 in the log is not the batch size. It's the number of steps per epoch, which is data_size / batch_size.
The reason training does not work properly is probably a very low or very high learning rate. Try adjusting the learning rate.
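A quick sanity check of that steps-vs-batch-size relationship; the 12,250 image count here is hypothetical, the real number is whatever flow_from_directory reports in its "Found N images" message:

```python
import math

batch_size = 32
num_images = 12250  # hypothetical; use the count flow_from_directory prints
steps_per_epoch = math.ceil(num_images / batch_size)
print(steps_per_epoch)  # 383
```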
I am trying to run an autoencoder for dimensionality reduction on a Fraud Detection dataset (https://www.kaggle.com/kartik2112/fraud-detection?select=fraudTest.csv) and am receiving very high loss values for each iteration. Below is the autoencoder code.
nb_epoch = 100
batch_size = 128
input_dim = X_train.shape[1]
encoding_dim = 14
hidden_dim = int(encoding_dim / 2)
learning_rate = 1e-7
input_layer = Input(shape=(input_dim,))
encoder = Dense(encoding_dim, activation="tanh",
                activity_regularizer=regularizers.l1(learning_rate))(input_layer)
encoder = Dense(hidden_dim, activation="relu")(encoder)
decoder = Dense(hidden_dim, activation='tanh')(encoder)
decoder = Dense(input_dim, activation='relu')(decoder)
autoencoder = Model(inputs=input_layer, outputs=decoder)
autoencoder.compile(metrics=['accuracy'],
                    loss='mean_squared_error',
                    optimizer='adam')
cp = ModelCheckpoint(filepath="autoencoder_fraud.h5",
                     save_best_only=True,
                     verbose=0)
tb = TensorBoard(log_dir='./logs',
                 histogram_freq=0,
                 write_graph=True,
                 write_images=True)
history = autoencoder.fit(X_train, X_train,
                          epochs=nb_epoch,
                          batch_size=batch_size,
                          shuffle=True,
                          validation_data=(X_test, X_test),
                          verbose=1,
                          callbacks=[cp, tb]).history
Here is a snippet of the loss values:
Epoch 1/100
10131/10131 [==============================] - 32s 3ms/step - loss: 52445827358.6230 - accuracy: 0.3389 - val_loss: 9625651200.0000 - val_accuracy: 0.5083
Epoch 2/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52393605025.8066 - accuracy: 0.5083 - val_loss: 9621398528.0000 - val_accuracy: 0.5083
Epoch 3/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52486496629.1354 - accuracy: 0.5082 - val_loss: 9617147904.0000 - val_accuracy: 0.5083
Epoch 4/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52514002255.9432 - accuracy: 0.5070 - val_loss: 9612887040.0000 - val_accuracy: 0.5083
Epoch 5/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52436489238.6388 - accuracy: 0.5076 - val_loss: 9608664064.0000 - val_accuracy: 0.5083
Epoch 6/100
10131/10131 [==============================] - 31s 3ms/step - loss: 52430005774.7556 - accuracy: 0.5081 - val_loss: 9604417536.0000 - val_accuracy: 0.5083
Epoch 7/100
10131/10131 [==============================] - 31s 3ms/step - loss: 52474495714.5898 - accuracy: 0.5079 - val_loss: 9600195584.0000 - val_accuracy: 0.5083
Epoch 8/100
10131/10131 [==============================] - 31s 3ms/step - loss: 52423052560.0695 - accuracy: 0.5076 - val_loss: 9595947008.0000 - val_accuracy: 0.5083
Epoch 9/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52442358260.0742 - accuracy: 0.5072 - val_loss: 9591708672.0000 - val_accuracy: 0.5083
Epoch 10/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52402494704.5369 - accuracy: 0.5089 - val_loss: 9587487744.0000 - val_accuracy: 0.5083
Epoch 11/100
10131/10131 [==============================] - 31s 3ms/step - loss: 52396583628.3553 - accuracy: 0.5081 - val_loss: 9583238144.0000 - val_accuracy: 0.5083
Epoch 12/100
10131/10131 [==============================] - 31s 3ms/step - loss: 52349824708.2700 - accuracy: 0.5076 - val_loss: 9579020288.0000 - val_accuracy: 0.5083
Epoch 13/100
10131/10131 [==============================] - 31s 3ms/step - loss: 52332072133.6850 - accuracy: 0.5083 - val_loss: 9574786048.0000 - val_accuracy: 0.5083
Epoch 14/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52353680011.6731 - accuracy: 0.5086 - val_loss: 9570555904.0000 - val_accuracy: 0.5083
Epoch 15/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52347432594.5456 - accuracy: 0.5088 - val_loss: 9566344192.0000 - val_accuracy: 0.5083
Epoch 16/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52327825554.3435 - accuracy: 0.5076 - val_loss: 9562103808.0000 - val_accuracy: 0.5083
Epoch 17/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52347251610.1255 - accuracy: 0.5080 - val_loss: 9557892096.0000 - val_accuracy: 0.5083
Epoch 18/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52292632667.3636 - accuracy: 0.5079 - val_loss: 9553654784.0000 - val_accuracy: 0.5083
Epoch 19/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52354135093.7671 - accuracy: 0.5083 - val_loss: 9549425664.0000 - val_accuracy: 0.5083
Epoch 20/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52295668148.2006 - accuracy: 0.5086 - val_loss: 9545219072.0000 - val_accuracy: 0.5083
Epoch 21/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52314219115.3320 - accuracy: 0.5079 - val_loss: 9540980736.0000 - val_accuracy: 0.5083
Epoch 22/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52328022934.0829 - accuracy: 0.5079 - val_loss: 9536788480.0000 - val_accuracy: 0.5083
Epoch 23/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52268139834.5172 - accuracy: 0.5074 - val_loss: 9532554240.0000 - val_accuracy: 0.5083
Epoch 24/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52308370726.3040 - accuracy: 0.5077 - val_loss: 9528341504.0000 - val_accuracy: 0.5083
Epoch 25/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52224468101.4070 - accuracy: 0.5081 - val_loss: 9524126720.0000 - val_accuracy: 0.5083
Epoch 26/100
10131/10131 [==============================] - 30s 3ms/step - loss: 52200100823.1694 - accuracy: 0.5080 - val_loss: 9519915008.0000 - val_accuracy: 0.5083
Any advice or solution would be highly appreciated. Thank you.
I have scaled the numerical data using StandardScaler and encoded the
categorical data using LabelEncoder.
First of all, check which numerical data you scaled.
I suspect you wrongly scaled cc_num, because cc_num is a categorical column: a credit-card number is an identifier, not a quantity.
That should fix your problem with the huge loss, but it doesn't mean your model will be good.
You should first take a close look at the features and try to find useful relationships between the label and the features (data preprocessing/feature engineering).
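To make the point concrete, here is a minimal, library-free sketch of treating the two kinds of columns differently (the column names "amount" and "cc_num" and the toy values are made up for illustration; with scikit-learn you would apply StandardScaler and LabelEncoder to the respective column subsets):

```python
from statistics import mean, pstdev

# Hypothetical toy rows: 'amount' is numerical, 'cc_num' is an identifier.
rows = [
    {"amount": 10.0, "cc_num": 4539148803436467},
    {"amount": 250.0, "cc_num": 4539148803436467},
    {"amount": 40.0, "cc_num": 6011111111111117},
]

# Standardize only the truly numerical column...
amounts = [r["amount"] for r in rows]
mu, sigma = mean(amounts), pstdev(amounts)
for r in rows:
    r["amount"] = (r["amount"] - mu) / sigma

# ...and label-encode the categorical identifier instead of scaling it.
cc_ids = {cc: i for i, cc in enumerate(sorted({r["cc_num"] for r in rows}))}
for r in rows:
    r["cc_num"] = cc_ids[r["cc_num"]]
```

Scaling cc_num as if it were a magnitude produces enormous feature values (card numbers are 16-digit integers), which is consistent with a loss in the billions.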
I should state up front that I am not at all familiar with neural networks; this is the first time I have tried to develop one.
The problem is predicting a week's pollution forecast based on the previous month.
The raw (unstructured) data has 15 features:
[screenshot: Start data]
The quantity to predict is 'gas', for a total of 168 hours, i.e. the number of hours in the following week.
MinMaxScaler(feature_range=(0, 1)) is applied to the data, which is then split into train and test sets. Since only one year of hourly measurements is available, the data is resampled into series of 672 hourly samples, each series starting at midnight of a day of the year. From roughly 8000 hourly readings, this yields about 600 series of 672 samples.
The 'date' column is removed from the initial data. The shapes of train_x and train_y are:
TRAIN X SHAPE = (631, 672, 14)
TRAIN Y SHAPE = (631, 168, 1)
In train_x[0] there are 672 hourly readings covering the first four weeks of the dataset, consisting of all features, 'gas' included (column 0 is 'gas').
In train_y[0], on the other hand, there are the 168 hourly 'gas' readings for the week that begins where the month in train_x[0] ends.
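For concreteness, overlapping windows like these can be built roughly as follows (a pure-Python sketch with toy sizes; the actual to_supervised helper is not shown in the question, so this is only my guess at its shape):

```python
def make_windows(data, n_in=672, n_out=168, step=24):
    """Slice hourly rows into (input window, target window) pairs.

    data: list of rows, each a list of features with 'gas' at column 0.
    Each window starts `step` hours (one day) after the previous one.
    """
    x, y = [], []
    for start in range(0, len(data) - n_in - n_out + 1, step):
        # Input: n_in consecutive hours, all features.
        x.append(data[start:start + n_in])
        # Target: the next n_out hours, 'gas' column only.
        y.append([[row[0]] for row in data[start + n_in:start + n_in + n_out]])
    return x, y

# Toy check: 50 "hours", 3 features, input 20 h, target 10 h, stride 5 h.
data = [[float(h), h % 24, h % 7] for h in range(50)]
x, y = make_windows(data, n_in=20, n_out=10, step=5)
```

With the real sizes (one year of hourly data, n_in=672, n_out=168, daily stride) this produces shapes on the order of (631, 672, 14) and (631, 168, 1), matching the question.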
After organizing the data in this way (if it's wrong, please let me know), I built the neural network as follows:
import numpy as np
from tensorflow.keras import Sequential, layers, optimizers
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard, ReduceLROnPlateau

train_x, train_y = to_supervised(train, n_input)  # custom windowing helper (defined elsewhere)
train_x = train_x.astype(float)
train_y = train_y.astype(float)

# define parameters
verbose, epochs, batch_size = 1, 200, 50
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]

# define model
model = Sequential()
opt = optimizers.RMSprop(learning_rate=1e-3)
model.add(layers.GRU(14, activation='relu', input_shape=(n_timesteps, n_features),
                     return_sequences=False, stateful=False))
model.add(layers.Dense(1, activation='relu'))
# model.add(layers.Dense(14, activation='linear'))
model.add(layers.Dense(n_outputs, activation='sigmoid'))
model.summary()
model.compile(loss='mse', optimizer=opt, metrics=['accuracy'])

train_y = np.concatenate(train_y).reshape(len(train_y), 168)

callback_early_stopping = EarlyStopping(monitor='val_loss',
                                        patience=5, verbose=1)
callback_tensorboard = TensorBoard(log_dir='./23_logs/',
                                   histogram_freq=0,
                                   write_graph=False)
callback_reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                       factor=0.1,
                                       min_lr=1e-4,
                                       patience=0,
                                       verbose=1)
callbacks = [callback_early_stopping,
             callback_tensorboard,
             callback_reduce_lr]

history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                    verbose=verbose, shuffle=False,
                    validation_split=0.2, callbacks=callbacks)
When I fit the network, I get:
11/11 [==============================] - 5s 305ms/step - loss: 0.1625 - accuracy: 0.0207 - val_loss: 0.1905 - val_accuracy: 0.0157
Epoch 2/200
11/11 [==============================] - 2s 179ms/step - loss: 0.1594 - accuracy: 0.0037 - val_loss: 0.1879 - val_accuracy: 0.0157
Epoch 3/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1571 - accuracy: 0.0040 - val_loss: 0.1855 - val_accuracy: 0.0079
Epoch 4/200
11/11 [==============================] - 2s 165ms/step - loss: 0.1550 - accuracy: 0.0092 - val_loss: 0.1832 - val_accuracy: 0.0079
Epoch 5/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1529 - accuracy: 0.0102 - val_loss: 0.1809 - val_accuracy: 0.0079
Epoch 6/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1508 - accuracy: 0.0085 - val_loss: 0.1786 - val_accuracy: 0.0079
Epoch 7/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1487 - accuracy: 0.0023 - val_loss: 0.1763 - val_accuracy: 0.0079
Epoch 8/200
11/11 [==============================] - 2s 158ms/step - loss: 0.1467 - accuracy: 0.0023 - val_loss: 0.1740 - val_accuracy: 0.0079
Epoch 9/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1446 - accuracy: 0.0034 - val_loss: 0.1718 - val_accuracy: 0.0000e+00
Epoch 10/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1426 - accuracy: 0.0034 - val_loss: 0.1695 - val_accuracy: 0.0000e+00
Epoch 11/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1406 - accuracy: 0.0034 - val_loss: 0.1673 - val_accuracy: 0.0000e+00
Epoch 12/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1387 - accuracy: 0.0034 - val_loss: 0.1651 - val_accuracy: 0.0000e+00
Epoch 13/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1367 - accuracy: 0.0052 - val_loss: 0.1629 - val_accuracy: 0.0000e+00
Epoch 14/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1348 - accuracy: 0.0052 - val_loss: 0.1608 - val_accuracy: 0.0000e+00
Epoch 15/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1328 - accuracy: 0.0052 - val_loss: 0.1586 - val_accuracy: 0.0000e+00
Epoch 16/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1309 - accuracy: 0.0052 - val_loss: 0.1565 - val_accuracy: 0.0000e+00
Epoch 17/200
11/11 [==============================] - 2s 171ms/step - loss: 0.1290 - accuracy: 0.0052 - val_loss: 0.1544 - val_accuracy: 0.0000e+00
Epoch 18/200
11/11 [==============================] - 2s 174ms/step - loss: 0.1271 - accuracy: 0.0052 - val_loss: 0.1523 - val_accuracy: 0.0000e+00
Epoch 19/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1253 - accuracy: 0.0052 - val_loss: 0.1502 - val_accuracy: 0.0000e+00
Epoch 20/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1234 - accuracy: 0.0052 - val_loss: 0.1482 - val_accuracy: 0.0000e+00
Epoch 21/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1216 - accuracy: 0.0052 - val_loss: 0.1461 - val_accuracy: 0.0000e+00
Epoch 22/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1198 - accuracy: 0.0052 - val_loss: 0.1441 - val_accuracy: 0.0000e+00
Epoch 23/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1180 - accuracy: 0.0052 - val_loss: 0.1421 - val_accuracy: 0.0000e+00
Epoch 24/200
11/11 [==============================] - 2s 163ms/step - loss: 0.1162 - accuracy: 0.0052 - val_loss: 0.1401 - val_accuracy: 0.0000e+00
Epoch 25/200
11/11 [==============================] - 2s 167ms/step - loss: 0.1145 - accuracy: 0.0052 - val_loss: 0.1381 - val_accuracy: 0.0000e+00
Epoch 26/200
11/11 [==============================] - 2s 188ms/step - loss: 0.1127 - accuracy: 0.0052 - val_loss: 0.1361 - val_accuracy: 0.0000e+00
Epoch 27/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1110 - accuracy: 0.0052 - val_loss: 0.1342 - val_accuracy: 0.0000e+00
Epoch 28/200
11/11 [==============================] - 2s 189ms/step - loss: 0.1093 - accuracy: 0.0052 - val_loss: 0.1323 - val_accuracy: 0.0000e+00
Epoch 29/200
11/11 [==============================] - 2s 183ms/step - loss: 0.1076 - accuracy: 0.0079 - val_loss: 0.1304 - val_accuracy: 0.0000e+00
Epoch 30/200
11/11 [==============================] - 2s 172ms/step - loss: 0.1059 - accuracy: 0.0079 - val_loss: 0.1285 - val_accuracy: 0.0000e+00
Epoch 31/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1042 - accuracy: 0.0079 - val_loss: 0.1266 - val_accuracy: 0.0000e+00
Epoch 32/200
Accuracy always stays very low, and sometimes (as in this case) val_accuracy drops to 0 and never changes, while loss and val_loss decrease but do not converge well. I realize I am certainly doing many things wrong, and I cannot figure out how to fix it. I have of course tried other hyperparameters, and other networks such as LSTM, but I did not get satisfactory results.
How can I improve the model so that the accuracy is at least decent? Any advice is welcome, thank you very much!
I am building a DNN with Keras to classify between background and signal events (HEP). Nevertheless, the loss and the accuracy are not changing.
I have already tried changing the optimizer's parameters, normalizing the data, changing the number of layers, neurons, and epochs, initializing the weights, etc.
Here's the model:
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import adam  # lowercase alias available in older Keras releases

epochs = 20
num_features = 2
num_classes = 2
batch_size = 32

# model
print("\n Building model...")
model = Sequential()
model.add(Dropout(0.2))
model.add(Dense(128, input_shape=(2,), activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation=tf.nn.softmax))

print("\n Compiling model...")
opt = adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0,
           amsgrad=False)

# compile model
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy'])

print("\n Fitting model...")
history = model.fit(x_train, y_train, epochs=epochs,
                    batch_size=batch_size, validation_data=(x_test, y_test))
I'm expecting a change in the loss but it won't decrease from 0.69-ish.
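(As a sanity check on that number: 0.69 is about ln 2, the cross-entropy a model gets by always predicting 50/50 on two balanced classes, so a loss stuck there means the network is doing no better than chance:)

```python
import math

# Cross-entropy of a constant p = 0.5 prediction on two balanced classes:
chance_loss = -math.log(0.5)
print(round(chance_loss, 4))  # 0.6931
```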
The epoch logs:
Building model...
Compiling model...
Fitting model...
Train on 18400 samples, validate on 4600 samples
Epoch 1/20
18400/18400 [==============================] - 1s 71us/step - loss: 0.6939 - acc: 0.4965 - val_loss: 0.6933 - val_acc: 0.5000
Epoch 2/20
18400/18400 [==============================] - 1s 60us/step - loss: 0.6935 - acc: 0.5045 - val_loss: 0.6933 - val_acc: 0.5000
Epoch 3/20
18400/18400 [==============================] - 1s 69us/step - loss: 0.6937 - acc: 0.4993 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 4/20
18400/18400 [==============================] - 1s 65us/step - loss: 0.6939 - acc: 0.4984 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 5/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6936 - acc: 0.5000 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 6/20
18400/18400 [==============================] - 1s 57us/step - loss: 0.6937 - acc: 0.4913 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 7/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6935 - acc: 0.5008 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 8/20
18400/18400 [==============================] - 1s 63us/step - loss: 0.6936 - acc: 0.5013 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 9/20
18400/18400 [==============================] - 1s 67us/step - loss: 0.6936 - acc: 0.4924 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 10/20
18400/18400 [==============================] - 1s 61us/step - loss: 0.6933 - acc: 0.5067 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 11/20
18400/18400 [==============================] - 1s 64us/step - loss: 0.6938 - acc: 0.4972 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 12/20
18400/18400 [==============================] - 1s 64us/step - loss: 0.6936 - acc: 0.4991 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 13/20
18400/18400 [==============================] - 1s 70us/step - loss: 0.6937 - acc: 0.4960 - val_loss: 0.6935 - val_acc: 0.5000
Epoch 14/20
18400/18400 [==============================] - 1s 63us/step - loss: 0.6935 - acc: 0.4992 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 15/20
18400/18400 [==============================] - 1s 61us/step - loss: 0.6937 - acc: 0.4940 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 16/20
18400/18400 [==============================] - 1s 68us/step - loss: 0.6933 - acc: 0.5067 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 17/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6938 - acc: 0.4997 - val_loss: 0.6935 - val_acc: 0.5000
Epoch 18/20
18400/18400 [==============================] - 1s 56us/step - loss: 0.6936 - acc: 0.4972 - val_loss: 0.6941 - val_acc: 0.5000
Epoch 19/20
18400/18400 [==============================] - 1s 57us/step - loss: 0.6934 - acc: 0.5061 - val_loss: 0.6954 - val_acc: 0.5000
Epoch 20/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6936 - acc: 0.5037 - val_loss: 0.6939 - val_acc: 0.5000
Update: my data preparation contains this:
np.random.shuffle(x_train)
np.random.shuffle(y_train)
np.random.shuffle(x_test)
np.random.shuffle(y_test)
And I'm thinking it's scrambling the class of each data point, since the shuffles are done independently.
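That is indeed the likely culprit: four independent shuffles destroy the pairing between each example and its label, so the network is asked to fit random labels and can never beat 0.69 loss. One fix, sketched here in plain Python (with NumPy, the equivalent is indexing both arrays with a single np.random.permutation(len(x_train))), is to shuffle a shared index:

```python
import random

def shuffle_together(x, y, seed=None):
    """Shuffle features and labels with ONE permutation so pairs stay aligned."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    rng.shuffle(idx)
    return [x[i] for i in idx], [y[i] for i in idx]

# Toy data: label i belongs to feature row i.
x_train = [[0.1], [0.2], [0.3], [0.4]]
y_train = [0, 0, 1, 1]
x_train, y_train = shuffle_together(x_train, y_train, seed=42)
```

After such a shuffle, every (feature row, label) pair is the same as before; only the order changed.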