I am trying to train a siamese CNN in Keras on about 200 pairs of images and notice that the validation accuracy doesn't change across epochs.
Train on 144 samples, validate on 16 samples
Epoch 1/20
144/144 [==============================] - 51s 352ms/step - loss: 0.3041 - accuracy: 0.4375 - val_loss: 0.4816 - val_accuracy: 0.5000
Epoch 2/20
144/144 [==============================] - 56s 387ms/step - loss: 0.2819 - accuracy: 0.5208 - val_loss: 0.4816 - val_accuracy: 0.5000
Epoch 3/20
144/144 [==============================] - 47s 325ms/step - loss: 0.2784 - accuracy: 0.4861 - val_loss: 0.4816 - val_accuracy: 0.5000
Epoch 00003: ReduceLROnPlateau reducing learning rate to 0.0001500000071246177.
Epoch 4/20
144/144 [==============================] - 50s 349ms/step - loss: 0.2865 - accuracy: 0.4306 - val_loss: 0.4816 - val_accuracy: 0.5000
Epoch 5/20
144/144 [==============================] - 54s 377ms/step - loss: 0.2936 - accuracy: 0.4375 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 00005: ReduceLROnPlateau reducing learning rate to 4.500000213738531e-05.
Epoch 6/20
144/144 [==============================] - 50s 349ms/step - loss: 0.2980 - accuracy: 0.4097 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 7/20
144/144 [==============================] - 47s 324ms/step - loss: 0.2824 - accuracy: 0.4931 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 00007: ReduceLROnPlateau reducing learning rate to 1.3500000204658135e-05.
Epoch 8/20
144/144 [==============================] - 48s 336ms/step - loss: 0.2888 - accuracy: 0.4722 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 9/20
144/144 [==============================] - 45s 315ms/step - loss: 0.2572 - accuracy: 0.5417 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 00009: ReduceLROnPlateau reducing learning rate to 4.050000006827758e-06.
Epoch 10/20
144/144 [==============================] - 45s 313ms/step - loss: 0.2827 - accuracy: 0.5139 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 11/20
144/144 [==============================] - 46s 318ms/step - loss: 0.2660 - accuracy: 0.5764 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 00011: ReduceLROnPlateau reducing learning rate to 1.2149999747634864e-06.
Epoch 12/20
144/144 [==============================] - 58s 401ms/step - loss: 0.2869 - accuracy: 0.4583 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 13/20
144/144 [==============================] - 60s 417ms/step - loss: 0.2779 - accuracy: 0.5486 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 00013: ReduceLROnPlateau reducing learning rate to 3.644999992502562e-07.
Epoch 14/20
144/144 [==============================] - 51s 357ms/step - loss: 0.2959 - accuracy: 0.4722 - val_loss: 0.4815 - val_accuracy: 0.5000
Epoch 15/20
144/144 [==============================] - 49s 343ms/step - loss: 0.2729 - accuracy: 0.5069 - val_loss: 0.4815 - val_accuracy: 0.5000
My network looks like this:
input_shape = X_train.shape[1:]
model = Sequential()
model.add(Conv2D(nb_filters, nb_conv, padding='valid', input_shape=(1, img_rows, img_cols), data_format='channels_first'))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes, activation='sigmoid'))
left_input = Input(input_shape)
right_input = Input(input_shape)
processed_a = model(left_input)
processed_b = model(right_input)
distance = Lambda(euclidean_distance, output_shape=eucl_dist_output_shape)([processed_a, processed_b])
siamese_net = Model([left_input, right_input], distance)
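The compile step is not shown above; for reference, the helper functions and the contrastive loss this architecture is usually paired with (following the classic Keras mnist_siamese example, so treat this as an assumption about the missing code) look like this:
from tensorflow.keras import backend as K

def euclidean_distance(vects):
    # L2 distance between the two embeddings, floored at epsilon for a stable sqrt
    x, y = vects
    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))

def eucl_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return (shape1[0], 1)

def contrastive_loss(y_true, y_pred):
    # pull matching pairs (y_true = 1) toward distance 0, push others past the margin
    margin = 1
    return K.mean(y_true * K.square(y_pred) +
                  (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))

siamese_net.compile(loss=contrastive_loss, optimizer='adam', metrics=['accuracy'])
Note that with this setup y_pred is a distance, not a probability, so the built-in 'accuracy' metric (which rounds the output at 0.5) is largely meaningless here and can sit at 0.5 even while the loss moves.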
I have tried different optimizers, different learning rates, and regularization (dropout), but there is no change in the validation accuracy/loss. How can I improve it?
I'm doing a species classification task from kaggle (https://www.kaggle.com/competitions/yum-or-yuck-butterfly-mimics-2022/overview). I decided to use transfer learning to tackle this problem since there aren't that many images. The model is as follows:
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
base_model = tf.keras.applications.resnet50.ResNet50(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet")
for layer in base_model.layers:
    layer.trainable = False
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(1024, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(512, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
output = tf.keras.layers.Dense(6, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=output)
As per the transfer-learning guidelines (https://keras.io/guides/transfer_learning/), I'm freezing the ResNet layers and running the base model in inference mode only (training=False). However, the results show that the model is not learning properly, and convergence doesn't look possible even after nearly 200 epochs:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
stop_early = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=0.0001,
    patience=20,
    restore_best_weights=True
)
history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=200,
                    callbacks=[stop_early])
Epoch 1/200
22/22 [==============================] - 19s 442ms/step - loss: 1.9317 - accuracy: 0.1794 - val_loss: 1.8272 - val_accuracy: 0.1618
Epoch 2/200
22/22 [==============================] - 9s 398ms/step - loss: 1.8250 - accuracy: 0.1882 - val_loss: 1.7681 - val_accuracy: 0.2197
Epoch 3/200
22/22 [==============================] - 9s 402ms/step - loss: 1.7927 - accuracy: 0.2294 - val_loss: 1.7612 - val_accuracy: 0.2139
Epoch 4/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7930 - accuracy: 0.2000 - val_loss: 1.7640 - val_accuracy: 0.2139
Epoch 5/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7872 - accuracy: 0.2132 - val_loss: 1.7489 - val_accuracy: 0.3121
Epoch 6/200
22/22 [==============================] - 9s 389ms/step - loss: 1.7700 - accuracy: 0.2574 - val_loss: 1.7378 - val_accuracy: 0.2543
Epoch 7/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7676 - accuracy: 0.2353 - val_loss: 1.7229 - val_accuracy: 0.3064
Epoch 8/200
22/22 [==============================] - 9s 427ms/step - loss: 1.7721 - accuracy: 0.2353 - val_loss: 1.7225 - val_accuracy: 0.2948
Epoch 9/200
22/22 [==============================] - 9s 399ms/step - loss: 1.7522 - accuracy: 0.2588 - val_loss: 1.7267 - val_accuracy: 0.2948
Epoch 10/200
22/22 [==============================] - 9s 395ms/step - loss: 1.7434 - accuracy: 0.2735 - val_loss: 1.7151 - val_accuracy: 0.2948
Epoch 11/200
22/22 [==============================] - 9s 391ms/step - loss: 1.7500 - accuracy: 0.2632 - val_loss: 1.7083 - val_accuracy: 0.3064
Epoch 12/200
22/22 [==============================] - 9s 425ms/step - loss: 1.7307 - accuracy: 0.2721 - val_loss: 1.6899 - val_accuracy: 0.3179
Epoch 13/200
22/22 [==============================] - 9s 407ms/step - loss: 1.7439 - accuracy: 0.2794 - val_loss: 1.7045 - val_accuracy: 0.2948
Epoch 14/200
22/22 [==============================] - 9s 404ms/step - loss: 1.7376 - accuracy: 0.2706 - val_loss: 1.7118 - val_accuracy: 0.2659
Epoch 15/200
22/22 [==============================] - 9s 419ms/step - loss: 1.7588 - accuracy: 0.2647 - val_loss: 1.6684 - val_accuracy: 0.3237
Epoch 16/200
22/22 [==============================] - 9s 394ms/step - loss: 1.7289 - accuracy: 0.2824 - val_loss: 1.6733 - val_accuracy: 0.3064
Epoch 17/200
22/22 [==============================] - 9s 387ms/step - loss: 1.7184 - accuracy: 0.2809 - val_loss: 1.7185 - val_accuracy: 0.2659
Epoch 18/200
22/22 [==============================] - 9s 408ms/step - loss: 1.7242 - accuracy: 0.2765 - val_loss: 1.6961 - val_accuracy: 0.2717
Epoch 19/200
22/22 [==============================] - 9s 424ms/step - loss: 1.7218 - accuracy: 0.2853 - val_loss: 1.6757 - val_accuracy: 0.3006
Epoch 20/200
22/22 [==============================] - 9s 396ms/step - loss: 1.7248 - accuracy: 0.2882 - val_loss: 1.6716 - val_accuracy: 0.3064
Epoch 21/200
22/22 [==============================] - 9s 401ms/step - loss: 1.7134 - accuracy: 0.2838 - val_loss: 1.6666 - val_accuracy: 0.2948
Epoch 22/200
22/22 [==============================] - 9s 393ms/step - loss: 1.7140 - accuracy: 0.2941 - val_loss: 1.6427 - val_accuracy: 0.3064
I need to unfreeze the layers and turn inference off in order for the model to learn. I tested the same scenario with EfficientNet and the same thing happened. Finally, I also tried Xception, and there freezing the layers and running in inference mode was fine. So it seems they behave differently, even though they all contain batchnorm layers.
I don't understand what is going on here. Why would I need to turn inference off? Does anyone have a clue?
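In sketch form, the variant that does learn (as described above) is simply:
for layer in base_model.layers:
    layer.trainable = True            # unfreeze the whole backbone
x = base_model(inputs)                # no training=False: BatchNorm updates its moving statistics
x = tf.keras.layers.GlobalAveragePooling2D()(x)
# ... rest of the head unchanged
(One assumption worth flagging, since my input pipeline isn't shown: ResNet50's ImageNet weights expect tf.keras.applications.resnet50.preprocess_input to be applied to the images, which matters most when the BatchNorm statistics are frozen.)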
EDIT:
(image: training results from ResNet50)
(image: training results from Xception)
I am writing a neural network for translating text from Russian to English, but I ran into the problem that my network produces a large loss as well as predictions that are very far from the correct answer.
Below is the LSTM model that I build using Keras:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, RepeatVector, Dense
from tensorflow.keras import optimizers

def make_model(in_vocab, out_vocab, in_timesteps, out_timesteps, n):
    model = Sequential()
    model.add(Embedding(in_vocab, n, input_length=in_timesteps, mask_zero=True))
    model.add(LSTM(n))
    model.add(Dropout(0.3))
    model.add(RepeatVector(out_timesteps))
    model.add(LSTM(n, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(Dense(out_vocab, activation='softmax'))
    model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-3),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
The training process is shown below:
Epoch 1/10
3/3 [==============================] - 5s 1s/step - loss: 8.3635 - accuracy: 0.0197 - val_loss: 8.0575 - val_accuracy: 0.0563
Epoch 2/10
3/3 [==============================] - 2s 806ms/step - loss: 7.9505 - accuracy: 0.0334 - val_loss: 8.2927 - val_accuracy: 0.0743
Epoch 3/10
3/3 [==============================] - 2s 812ms/step - loss: 7.7977 - accuracy: 0.0349 - val_loss: 8.2959 - val_accuracy: 0.0571
Epoch 4/10
3/3 [==============================] - 3s 825ms/step - loss: 7.6700 - accuracy: 0.0389 - val_loss: 8.5628 - val_accuracy: 0.0751
Epoch 5/10
3/3 [==============================] - 3s 829ms/step - loss: 7.5595 - accuracy: 0.0411 - val_loss: 8.5854 - val_accuracy: 0.0743
Epoch 6/10
3/3 [==============================] - 3s 807ms/step - loss: 7.4604 - accuracy: 0.0406 - val_loss: 8.7633 - val_accuracy: 0.0743
Epoch 7/10
3/3 [==============================] - 2s 815ms/step - loss: 7.3475 - accuracy: 0.0436 - val_loss: 8.9103 - val_accuracy: 0.0743
Epoch 8/10
3/3 [==============================] - 3s 825ms/step - loss: 7.2548 - accuracy: 0.0455 - val_loss: 9.0493 - val_accuracy: 0.0721
Epoch 9/10
3/3 [==============================] - 2s 814ms/step - loss: 7.1751 - accuracy: 0.0449 - val_loss: 9.0740 - val_accuracy: 0.0788
Epoch 10/10
3/3 [==============================] - 3s 831ms/step - loss: 7.1132 - accuracy: 0.0479 - val_loss: 9.2443 - val_accuracy: 0.0773
And the arguments I pass for training:
model = make_model(russian_vocab_size,            # tokenized vocabulary sizes
                   english_vocab_size,
                   max_russian_sequence_length,   # maximum sentence lengths
                   max_english_sequence_length,
                   512)
model.fit(preproc_russian_sentences,   # tokenized Russian sentences, shape (X, Y)
          preproc_english_sentences,   # tokenized English sentences, shape (X, Y, 1)
          epochs=10,
          batch_size=1024,
          validation_split=0.2,
          callbacks=None,
          verbose=1)
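For completeness, a minimal smoke test of the shape contract in the comments above (all sizes here are made-up placeholders, not my real data):
import numpy as np

russian_vocab_size, english_vocab_size = 5000, 4000           # hypothetical sizes
max_russian_sequence_length, max_english_sequence_length = 20, 18
model = make_model(russian_vocab_size, english_vocab_size,
                   max_russian_sequence_length, max_english_sequence_length, 512)
# inputs: (num_pairs, in_timesteps) token ids; targets: (num_pairs, out_timesteps, 1)
x = np.random.randint(1, russian_vocab_size, size=(64, max_russian_sequence_length))
y = np.random.randint(1, english_vocab_size, size=(64, max_english_sequence_length, 1))
model.fit(x, y, epochs=1, batch_size=32, validation_split=0.2, verbose=1)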
Thank you in advance.
I am using TensorFlow and Keras to build a classification model. When running the code below, the output does not seem to converge after each epoch: the loss steadily increases and the accuracy is constantly 0.0000e+00. I am new to machine learning and am not too sure why this is happening.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
import numpy as np
import time
import tensorflow as tf
from google.colab import drive
drive.mount('/content/drive')
import pandas as pd
data = pd.read_csv("hmnist_28_28_RGB.csv")
X = data.iloc[:, 0:-1]
y = data.iloc[:, -1]
X = X / 255.0
X = X.values.reshape(-1,28,28,3)
print(X.shape)
model = Sequential()
model.add(Conv2D(256, (3, 3), input_shape=X.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X, y, batch_size=32, epochs=10, validation_split=0.3)
Output
(378, 28, 28, 3)
Epoch 1/10
9/9 [==============================] - 4s 429ms/step - loss: -34.6735 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 2/10
9/9 [==============================] - 4s 400ms/step - loss: -1074.2162 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 3/10
9/9 [==============================] - 4s 399ms/step - loss: -7446.1872 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 4/10
9/9 [==============================] - 4s 396ms/step - loss: -30012.9553 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 5/10
9/9 [==============================] - 4s 406ms/step - loss: -89006.4180 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 6/10
9/9 [==============================] - 4s 400ms/step - loss: -221087.9078 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 7/10
9/9 [==============================] - 4s 399ms/step - loss: -480032.9313 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 8/10
9/9 [==============================] - 4s 403ms/step - loss: -956052.3375 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 9/10
9/9 [==============================] - 4s 396ms/step - loss: -1733128.9000 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 10/10
9/9 [==============================] - 4s 401ms/step - loss: -2953626.5750 - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
You need to make several changes to your model to make it work.
There are 7 different labels in the dataset, so your last layer needs 7 output neurons.
For your last layer you are currently using sigmoid activation. This is not suitable for multi-class classification. Instead you should use the softmax activation.
As the loss function you are using loss='binary_crossentropy', which is only to be used for binary classification. In your case, since your labels are integers, loss='sparse_categorical_crossentropy' should be used. You can find more information here.
With the following changes to the last lines of your code:
model.add(Dense(7))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X, y, batch_size=32, epochs=10, validation_split=0.3)
You'll get this training history:
(10015, 28, 28, 3)
Epoch 1/10
220/220 [==============================] - 89s 403ms/step - loss: 1.0345 - accuracy: 0.6193 - val_loss: 1.7980 - val_accuracy: 0.4353
Epoch 2/10
220/220 [==============================] - 88s 398ms/step - loss: 0.8282 - accuracy: 0.6851 - val_loss: 3.3646 - val_accuracy: 0.0676
Epoch 3/10
220/220 [==============================] - 88s 399ms/step - loss: 0.6944 - accuracy: 0.7502 - val_loss: 2.9686 - val_accuracy: 0.1228
Epoch 4/10
220/220 [==============================] - 87s 395ms/step - loss: 0.6630 - accuracy: 0.7611 - val_loss: 3.3777 - val_accuracy: 0.0646
Epoch 5/10
220/220 [==============================] - 87s 396ms/step - loss: 0.5976 - accuracy: 0.7812 - val_loss: 2.3929 - val_accuracy: 0.2532
Epoch 6/10
220/220 [==============================] - 87s 396ms/step - loss: 0.5577 - accuracy: 0.7935 - val_loss: 2.9879 - val_accuracy: 0.2592
Epoch 7/10
220/220 [==============================] - 88s 398ms/step - loss: 0.7644 - accuracy: 0.7215 - val_loss: 2.5258 - val_accuracy: 0.2852
Epoch 8/10
220/220 [==============================] - 87s 395ms/step - loss: 0.5629 - accuracy: 0.7879 - val_loss: 2.6053 - val_accuracy: 0.3055
Epoch 9/10
220/220 [==============================] - 89s 404ms/step - loss: 0.5380 - accuracy: 0.8008 - val_loss: 2.7401 - val_accuracy: 0.1694
Epoch 10/10
220/220 [==============================] - 92s 419ms/step - loss: 0.5296 - accuracy: 0.8065 - val_loss: 3.7208 - val_accuracy: 0.0529
The model still needs to be optimized to achieve better results, but in general it works.
I was using this file for the training.
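As a side note, an equivalent option (not needed for the fix above) is to one-hot encode the integer labels and switch the loss accordingly:
from tensorflow.keras.utils import to_categorical

y_onehot = to_categorical(y, num_classes=7)     # integer labels -> one-hot vectors
model.compile(loss='categorical_crossentropy',  # pairs with one-hot targets
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X, y_onehot, batch_size=32, epochs=10, validation_split=0.3)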
I should say upfront that I am not at all familiar with neural networks; this is the first time I have tried to develop one.
The problem is to predict a week's pollution forecast based on the previous month of data.
The raw data has 15 features:
(image: the starting data)
The value to be predicted is 'gas'; the target covers 168 hours, i.e. the number of hours in the following week.
MinMaxScaler(feature_range=(0, 1)) is applied to the data, and then the data is split into train and test sets. Since only one year of hourly measurements is available, the data is resampled into series of 672 hourly samples (four weeks), each starting at midnight of a day of the year. Therefore, from about 8000 hourly readings, about 600 series of 672 samples are obtained.
The 'date' column is removed from the initial data, and the shapes of train_x and train_y are:
TRAIN X SHAPE = (631, 672, 14)
TRAIN Y SHAPE = (631, 168, 1)
In train_x[0] there are the 672 hourly readings for the first 4 weeks of the data set, consisting of all features including 'gas'. In train_y[0], on the other hand, there are the 168 hourly readings for the week that begins where train_x[0] ends.
(image: train_x[0], where column 0 is 'gas', and train_y[0] with only the gas column for the week after train_x[0])
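In sketch form, the resampling I described looks roughly like this (my own rough reconstruction; df is assumed to be the scaled DataFrame after dropping 'date', with 'gas' in column 0):
import numpy as np

n_input, n_out = 672, 168                 # 4 weeks in, 1 week out
values = df.values                        # about one year of hourly rows
train_x, train_y = [], []
for start in range(0, len(values) - n_input - n_out + 1, 24):   # one window per midnight
    train_x.append(values[start:start + n_input])               # all 14 features
    train_y.append(values[start + n_input:start + n_input + n_out, 0:1])  # 'gas' only
train_x, train_y = np.array(train_x), np.array(train_y)
# yields train_x of shape (n_series, 672, 14) and train_y of shape (n_series, 168, 1)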
After organizing the data in this way (if it's wrong, please let me know), I built the neural network as follows:
train_x, train_y = to_supervised(train, n_input)
train_x = train_x.astype(float)
train_y = train_y.astype(float)

# define parameters
verbose, epochs, batch_size = 1, 200, 50
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]

# define model
model = Sequential()
opt = optimizers.RMSprop(learning_rate=1e-3)
model.add(layers.GRU(14, activation='relu', input_shape=(n_timesteps, n_features),
                     return_sequences=False, stateful=False))
model.add(layers.Dense(1, activation='relu'))
#model.add(layers.Dense(14, activation='linear'))
model.add(layers.Dense(n_outputs, activation='sigmoid'))
model.summary()
model.compile(loss='mse', optimizer=opt, metrics=['accuracy'])

train_y = np.concatenate(train_y).reshape(len(train_y), 168)

callback_early_stopping = EarlyStopping(monitor='val_loss',
                                        patience=5, verbose=1)
callback_tensorboard = TensorBoard(log_dir='./23_logs/',
                                   histogram_freq=0,
                                   write_graph=False)
callback_reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                       factor=0.1,
                                       min_lr=1e-4,
                                       patience=0,
                                       verbose=1)
callbacks = [callback_early_stopping,
             callback_tensorboard,
             callback_reduce_lr]

history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                    verbose=verbose, shuffle=False,
                    validation_split=0.2, callbacks=callbacks)
When I fit the network I get:
Epoch 1/200
11/11 [==============================] - 5s 305ms/step - loss: 0.1625 - accuracy: 0.0207 - val_loss: 0.1905 - val_accuracy: 0.0157
Epoch 2/200
11/11 [==============================] - 2s 179ms/step - loss: 0.1594 - accuracy: 0.0037 - val_loss: 0.1879 - val_accuracy: 0.0157
Epoch 3/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1571 - accuracy: 0.0040 - val_loss: 0.1855 - val_accuracy: 0.0079
Epoch 4/200
11/11 [==============================] - 2s 165ms/step - loss: 0.1550 - accuracy: 0.0092 - val_loss: 0.1832 - val_accuracy: 0.0079
Epoch 5/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1529 - accuracy: 0.0102 - val_loss: 0.1809 - val_accuracy: 0.0079
Epoch 6/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1508 - accuracy: 0.0085 - val_loss: 0.1786 - val_accuracy: 0.0079
Epoch 7/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1487 - accuracy: 0.0023 - val_loss: 0.1763 - val_accuracy: 0.0079
Epoch 8/200
11/11 [==============================] - 2s 158ms/step - loss: 0.1467 - accuracy: 0.0023 - val_loss: 0.1740 - val_accuracy: 0.0079
Epoch 9/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1446 - accuracy: 0.0034 - val_loss: 0.1718 - val_accuracy: 0.0000e+00
Epoch 10/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1426 - accuracy: 0.0034 - val_loss: 0.1695 - val_accuracy: 0.0000e+00
Epoch 11/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1406 - accuracy: 0.0034 - val_loss: 0.1673 - val_accuracy: 0.0000e+00
Epoch 12/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1387 - accuracy: 0.0034 - val_loss: 0.1651 - val_accuracy: 0.0000e+00
Epoch 13/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1367 - accuracy: 0.0052 - val_loss: 0.1629 - val_accuracy: 0.0000e+00
Epoch 14/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1348 - accuracy: 0.0052 - val_loss: 0.1608 - val_accuracy: 0.0000e+00
Epoch 15/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1328 - accuracy: 0.0052 - val_loss: 0.1586 - val_accuracy: 0.0000e+00
Epoch 16/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1309 - accuracy: 0.0052 - val_loss: 0.1565 - val_accuracy: 0.0000e+00
Epoch 17/200
11/11 [==============================] - 2s 171ms/step - loss: 0.1290 - accuracy: 0.0052 - val_loss: 0.1544 - val_accuracy: 0.0000e+00
Epoch 18/200
11/11 [==============================] - 2s 174ms/step - loss: 0.1271 - accuracy: 0.0052 - val_loss: 0.1523 - val_accuracy: 0.0000e+00
Epoch 19/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1253 - accuracy: 0.0052 - val_loss: 0.1502 - val_accuracy: 0.0000e+00
Epoch 20/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1234 - accuracy: 0.0052 - val_loss: 0.1482 - val_accuracy: 0.0000e+00
Epoch 21/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1216 - accuracy: 0.0052 - val_loss: 0.1461 - val_accuracy: 0.0000e+00
Epoch 22/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1198 - accuracy: 0.0052 - val_loss: 0.1441 - val_accuracy: 0.0000e+00
Epoch 23/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1180 - accuracy: 0.0052 - val_loss: 0.1421 - val_accuracy: 0.0000e+00
Epoch 24/200
11/11 [==============================] - 2s 163ms/step - loss: 0.1162 - accuracy: 0.0052 - val_loss: 0.1401 - val_accuracy: 0.0000e+00
Epoch 25/200
11/11 [==============================] - 2s 167ms/step - loss: 0.1145 - accuracy: 0.0052 - val_loss: 0.1381 - val_accuracy: 0.0000e+00
Epoch 26/200
11/11 [==============================] - 2s 188ms/step - loss: 0.1127 - accuracy: 0.0052 - val_loss: 0.1361 - val_accuracy: 0.0000e+00
Epoch 27/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1110 - accuracy: 0.0052 - val_loss: 0.1342 - val_accuracy: 0.0000e+00
Epoch 28/200
11/11 [==============================] - 2s 189ms/step - loss: 0.1093 - accuracy: 0.0052 - val_loss: 0.1323 - val_accuracy: 0.0000e+00
Epoch 29/200
11/11 [==============================] - 2s 183ms/step - loss: 0.1076 - accuracy: 0.0079 - val_loss: 0.1304 - val_accuracy: 0.0000e+00
Epoch 30/200
11/11 [==============================] - 2s 172ms/step - loss: 0.1059 - accuracy: 0.0079 - val_loss: 0.1285 - val_accuracy: 0.0000e+00
Epoch 31/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1042 - accuracy: 0.0079 - val_loss: 0.1266 - val_accuracy: 0.0000e+00
Epoch 32/200
Accuracy always remains very low, and sometimes (as in this case) val_accuracy drops to 0 and never changes, while loss and val_loss decrease but do not converge well. I realize that I am certainly doing many things wrong, and I cannot figure out how to fix them. I have of course tried other hyperparameters, and also other networks such as LSTMs, but I didn't get satisfactory results.
How can I improve the model so that the accuracy is at least decent? Any advice is welcome, thank you very much!
I am building a DNN with Keras to classify between background and signal events (HEP). Nevertheless, the loss and the accuracy are not changing.
I have already tried changing the optimizer's parameters, normalizing the data, changing the number of layers, neurons, and epochs, initializing the weights, etc.
Here's the model:
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

epochs = 20
num_features = 2
num_classes = 2
batch_size = 32

# model
print("\n Building model...")
model = Sequential()
model.add(Dropout(0.2, input_shape=(num_features,)))  # the first layer carries the input shape
model.add(Dense(128, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation=tf.nn.softmax))

print("\n Compiling model...")
opt = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0,
           amsgrad=False)

# compile model
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy'])

print("\n Fitting model...")
history = model.fit(x_train, y_train, epochs=epochs,
                    batch_size=batch_size, validation_data=(x_test, y_test))
I'm expecting a change in the loss but it won't decrease from 0.69-ish.
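For what it's worth, 0.69 is itself a telling number: with two balanced classes, a model that always predicts a uniform 1/2 scores a cross-entropy of -ln(1/2) ≈ 0.6931, so the network appears stuck at the constant-prediction baseline.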
The epoch-by-epoch report:
Building model...
Compiling model...
Fitting model...
Train on 18400 samples, validate on 4600 samples
Epoch 1/20
18400/18400 [==============================] - 1s 71us/step - loss: 0.6939 - acc: 0.4965 - val_loss: 0.6933 - val_acc: 0.5000
Epoch 2/20
18400/18400 [==============================] - 1s 60us/step - loss: 0.6935 - acc: 0.5045 - val_loss: 0.6933 - val_acc: 0.5000
Epoch 3/20
18400/18400 [==============================] - 1s 69us/step - loss: 0.6937 - acc: 0.4993 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 4/20
18400/18400 [==============================] - 1s 65us/step - loss: 0.6939 - acc: 0.4984 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 5/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6936 - acc: 0.5000 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 6/20
18400/18400 [==============================] - 1s 57us/step - loss: 0.6937 - acc: 0.4913 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 7/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6935 - acc: 0.5008 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 8/20
18400/18400 [==============================] - 1s 63us/step - loss: 0.6936 - acc: 0.5013 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 9/20
18400/18400 [==============================] - 1s 67us/step - loss: 0.6936 - acc: 0.4924 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 10/20
18400/18400 [==============================] - 1s 61us/step - loss: 0.6933 - acc: 0.5067 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 11/20
18400/18400 [==============================] - 1s 64us/step - loss: 0.6938 - acc: 0.4972 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 12/20
18400/18400 [==============================] - 1s 64us/step - loss: 0.6936 - acc: 0.4991 - val_loss: 0.6934 - val_acc: 0.5000
Epoch 13/20
18400/18400 [==============================] - 1s 70us/step - loss: 0.6937 - acc: 0.4960 - val_loss: 0.6935 - val_acc: 0.5000
Epoch 14/20
18400/18400 [==============================] - 1s 63us/step - loss: 0.6935 - acc: 0.4992 - val_loss: 0.6932 - val_acc: 0.5000
Epoch 15/20
18400/18400 [==============================] - 1s 61us/step - loss: 0.6937 - acc: 0.4940 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 16/20
18400/18400 [==============================] - 1s 68us/step - loss: 0.6933 - acc: 0.5067 - val_loss: 0.6936 - val_acc: 0.5000
Epoch 17/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6938 - acc: 0.4997 - val_loss: 0.6935 - val_acc: 0.5000
Epoch 18/20
18400/18400 [==============================] - 1s 56us/step - loss: 0.6936 - acc: 0.4972 - val_loss: 0.6941 - val_acc: 0.5000
Epoch 19/20
18400/18400 [==============================] - 1s 57us/step - loss: 0.6934 - acc: 0.5061 - val_loss: 0.6954 - val_acc: 0.5000
Epoch 20/20
18400/18400 [==============================] - 1s 58us/step - loss: 0.6936 - acc: 0.5037 - val_loss: 0.6939 - val_acc: 0.5000
Update: my data preparation contains this:
np.random.shuffle(x_train)
np.random.shuffle(y_train)
np.random.shuffle(x_test)
np.random.shuffle(y_test)
And I'm thinking this mismatches the class label for each data point, because each array is shuffled independently.
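If that is the cause, the fix is a shared permutation per split instead of four independent shuffles, e.g.:
import numpy as np

# one permutation per split keeps every (x, y) pair aligned
perm = np.random.permutation(len(x_train))
x_train, y_train = x_train[perm], y_train[perm]
perm = np.random.permutation(len(x_test))
x_test, y_test = x_test[perm], y_test[perm]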