Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
I'm training a deep learning model and get a very low accuracy. I used L2 regularization to stop overfitting and to have high accuracy but it didn't solve the problem. what would be the cause of this very low accuracy and how can I stop it ?
The model accuracy is almost perfect (>90%) whereas the validation accuracy is very low (<51%) (shown bellow)
Epoch 1/15
2601/2601 - 38s - loss: 1.6510 - accuracy: 0.5125 - val_loss: 1.6108 - val_accuracy: 0.4706
Epoch 2/15
2601/2601 - 38s - loss: 1.1733 - accuracy: 0.7009 - val_loss: 1.5660 - val_accuracy: 0.4971
Epoch 3/15
2601/2601 - 38s - loss: 0.9169 - accuracy: 0.8147 - val_loss: 1.6223 - val_accuracy: 0.4948
Epoch 4/15
2601/2601 - 38s - loss: 0.7820 - accuracy: 0.8551 - val_loss: 1.7773 - val_accuracy: 0.4683
Epoch 5/15
2601/2601 - 38s - loss: 0.6539 - accuracy: 0.8989 - val_loss: 1.7968 - val_accuracy: 0.4937
Epoch 6/15
2601/2601 - 38s - loss: 0.5691 - accuracy: 0.9204 - val_loss: 1.8743 - val_accuracy: 0.4844
Epoch 7/15
2601/2601 - 38s - loss: 0.5090 - accuracy: 0.9327 - val_loss: 1.9348 - val_accuracy: 0.5029
Epoch 8/15
2601/2601 - 40s - loss: 0.4465 - accuracy: 0.9500 - val_loss: 1.9566 - val_accuracy: 0.4787
Epoch 9/15
2601/2601 - 38s - loss: 0.3931 - accuracy: 0.9596 - val_loss: 2.0824 - val_accuracy: 0.4764
Epoch 10/15
2601/2601 - 41s - loss: 0.3786 - accuracy: 0.9596 - val_loss: 2.1185 - val_accuracy: 0.4925
Epoch 11/15
2601/2601 - 38s - loss: 0.3471 - accuracy: 0.9604 - val_loss: 2.1972 - val_accuracy: 0.4879
Epoch 12/15
2601/2601 - 38s - loss: 0.3169 - accuracy: 0.9669 - val_loss: 2.1091 - val_accuracy: 0.4948
Epoch 13/15
2601/2601 - 38s - loss: 0.3018 - accuracy: 0.9685 - val_loss: 2.2073 - val_accuracy: 0.5006
Epoch 14/15
2601/2601 - 38s - loss: 0.2629 - accuracy: 0.9746 - val_loss: 2.2086 - val_accuracy: 0.4971
Epoch 15/15
2601/2601 - 38s - loss: 0.2700 - accuracy: 0.9650 - val_loss: 2.2178 - val_accuracy: 0.4879
I tried to increase the number of epoch, and it only increases the model accuracy and lowers the validation accuracy.
Any advice on how to overcome this issue?
My code:
def createModel():
input_shape=(11, 3840,1)
model = Sequential()
#C1
model.add(Conv2D(16, (5, 5), strides=( 2, 2), padding='same',activation='relu', input_shape=input_shape))
model.add(keras.layers.MaxPooling2D(pool_size=( 2, 2), padding='same'))
model.add(BatchNormalization())
#C2
model.add(Conv2D(32, ( 3, 3), strides=(1,1), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(BatchNormalization())
#C3
model.add(Conv2D(64, (3, 3), strides=( 1,1), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(BatchNormalization())
model.add(Dense(64, input_dim=64,kernel_regularizer=regularizers.l2(0.01)))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(256, activation='sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))
opt_adam = keras.optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy', optimizer=opt_adam, metrics=['accuracy'])
return model
def getFilesPathWithoutSeizure(indexSeizure, indexPat):
filesPath=[]
print(indexSeizure)
print(indexPat)
for i in range(0, nSeizure):
if(i==indexSeizure):
filesPath.extend(interictalSpectograms[i])
filesPath.extend(preictalSpectograms[i])
shuffle(filesPath)
return filesPath
def generate_arrays_for_training(indexPat, paths, start=0, end=100):
while True:
from_=int(len(paths)/100*start)
to_=int(len(paths)/100*end)
for i in range(from_, int(to_)):
f=paths[i]
x = np.load(PathSpectogramFolder+f)
x = np.expand_dims(np.expand_dims(x, axis=0), axis = 0)
x = x.transpose(0, 2, 3, 1)
if('P' in f):
y = np.repeat([[0,1]],x.shape[0], axis=0)
else:
y =np.repeat([[1,0]],x.shape[0], axis=0)
yield(x,y)
filesPath=getFilesPathWithoutSeizure(i, indexPat)
history=model.fit_generator(generate_arrays_for_training(indexPat, filesPath, end=75),#It take the first 75%
validation_data=generate_arrays_for_training(indexPat, filesPath, start=75), #It take the last 25%
steps_per_epoch=int((len(filesPath)-int(len(filesPath)/100*25))),
validation_steps=int((len(filesPath)-int(len(filesPath)/100*75))),
verbose=2,class_weight = {0:1, 1:1},
epochs=15, max_queue_size=2, shuffle=True)
You seem to have implemented shuffling in the function getFilesPathWithoutSeizure(), though you could verify whether the shuffling is actually working or not by printing out the filenames multiple times.
filesPath=getFilesPathWithoutSeizure(i, indexPat) - is the i getting updated?
As per your code if(i==indexSeizure): in the method getFilesPathWithoutSeizure, only 1 file would return when indexSeizure is equal to the counter (i of the for loop)
If you are not changing i argument being passed during the function call, it could mean that only 1 file is being returned to the filePath variable and your whole training is done on 1 input data instead of the 75% of 3467 files.
--
After confirming that shuffling works and that your function call is inserting all the data in your filePath variable, it still doesn't solve your problem, then try the following:
Data Augmentation could help solve over-fitting by increasing the diversity of your dataset by applying random but realistic transformations such as image rotation, shearing, hortizontal & vertical flips, zooming, de-centering etc.
But more importantly you would need to manually look into your data and understand the similarity in your training data.
Another option would be to just get more and diverse data to train on.
Your model is overfitting and not generalizing properly. If your training set is completely different to your validation set (you are splitting 75% and 25% but the 75% could be completely different to the 25% ), your model will have a hard time generalizing.
Shuffle your data before you split into training and validation. That should improve your results.
Related
Developing a neural network for the Spaceship Titanic comp binary classification problem. However, I keep getting a score of 0.0000 for train and val data, and can't figure out why. Models have worked for knn, lightxgb and random forest, so I don't think it's a data issue.
Code as below
print(X_train_scaled.shape)
print(y_train2.shape)
(6085, 23)
(6085, 1)
# Create model
model1 = Sequential()
model1.add(Dense(18, activation = 'relu', kernel_initializer='he_uniform', input_dim = X_train_scaled.shape[1]))
model1.add(Dense(9, activation='relu', kernel_initializer='he_uniform'))
model1.add(Dense(1, activation = 'sigmoid'))
optimizer = Adam(learning_rate=0.001)
model1.compile(loss='binary_crossentropy',
optimizer=optimizer,
metrics=[tf.keras.metrics.Accuracy()])
history = model1.fit(X_train_scaled, y_train2, batch_size=100, epochs=30, validation_split = 0.3)
Epoch 1/30
43/43 [==============================] - 1s 7ms/step - loss: 0.7348 - accuracy: 0.0000e+00 - val_loss: 0.6989 - val_accuracy: 0.0000e+00
Epoch 2/30
43/43 [==============================] - 0s 4ms/step - loss: 0.6603 - accuracy: 0.0000e+00 - val_loss: 0.6324 - val_accuracy: 0.0000e+00
Epoch 3/30
43/43 [==============================] - 0s 3ms/step - loss: 0.5994 - accuracy: 0.0000e+00 - val_loss: 0.5784 - val_accuracy: 0.0000e+00
Epoch 4/30
43/43 [==============================] - 0s 3ms/step - loss: 0.5539 - accuracy: 0.0000e+00 - val_loss: 0.5401 - val_accuracy: 0.0000e+00
in place of:
metrics=[tf.keras.metrics.Accuracy()]
try:
metrics=['accuracy']
Im trying to implement a Cnn using Keras on a Sklearn dataset for handwritten digits recognition (load_digits). I have got the model to run but it is not improving the accuracy for each 'epochs' cycle, Im guessing its because my labels are incorrect, I have tried encoding my Y values with use of 'to_categorical' but it displays the following error:
C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\backend.py:4979 binary_crossentropy
return nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)
C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\util\dispatch.py:201 wrapper
return target(*args, **kwargs)
C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\ops\nn_impl.py:173 sigmoid_cross_entropy_with_logits
raise ValueError("logits and labels must have the same shape (%s vs %s)" %
ValueError: logits and labels must have the same shape ((None, 1) vs (None, 10))
When i run my code without trying to encode the Y values it seems to go through the Cnn Model however it isn't accurate and it doesn't increase, this is my code:
import tensorflow as tf
from sklearn import datasets
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
#from keras.utils.np_utils import to_categorical
X,y = datasets.load_digits(return_X_y = True)
X = X/16
#X = X.reshape(1797,8,8,1)
train_x, test_x, train_y, test_y = train_test_split(X, y)
train_x = train_x.reshape(1347,8,8,1)
#test_x = test_x.reshape()
#train_y = to_categorical(train_y, num_classes = 10)
model = Sequential()
model.add(Conv2D(32, (2, 2), input_shape=( 8, 8, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(train_x, train_y, batch_size=32, epochs=6, validation_split=0.3)
print(train_x[0])
And this gives me the following output:
Epoch 1/6
1/30 [>.............................] - ETA: 13s - loss: 1.1026 - accuracy: 0.0938
6/30 [=====>........................] - ETA: 0s - loss: 0.2949 - accuracy: 0.0652
30/30 [==============================] - 1s 33ms/step - loss: -5.4832 - accuracy: 0.0893 - val_loss: -49.9462 - val_accuracy: 0.1012
Epoch 2/6
1/30 [>.............................] - ETA: 0s - loss: -52.2145 - accuracy: 0.0625
30/30 [==============================] - 0s 3ms/step - loss: -120.6972 - accuracy: 0.0961 - val_loss: -513.0211 - val_accuracy: 0.1012
Epoch 3/6
1/30 [>.............................] - ETA: 0s - loss: -638.2873 - accuracy: 0.1250
30/30 [==============================] - 0s 3ms/step - loss: -968.3621 - accuracy: 0.1006 - val_loss: -2804.1062 - val_accuracy: 0.1012
Epoch 4/6
1/30 [>.............................] - ETA: 0s - loss: -3427.3135 - accuracy: 0.0000e+00
30/30 [==============================] - 0s 3ms/step - loss: -4571.7894 - accuracy: 0.0934 - val_loss: -10332.9727 - val_accuracy: 0.1012
Epoch 5/6
1/30 [>.............................] - ETA: 0s - loss: -12963.2559 - accuracy: 0.0625
30/30 [==============================] - 0s 3ms/step - loss: -15268.3010 - accuracy: 0.0887 - val_loss: -29262.1191 - val_accuracy: 0.1012
Epoch 6/6
1/30 [>.............................] - ETA: 0s - loss: -30990.6758 - accuracy: 0.1562
30/30 [==============================] - 0s 3ms/step - loss: -40321.9540 - accuracy: 0.0960 - val_loss: -68548.6094 - val_accuracy: 0.1012
Any guidance is greatly appricated, Thanks!
When you have a CNN you want the last layer to have as many nodes as the labels. So if you have 10 digits you want the last layer to have an output size 10. It usually has the activation function "softmax", which makes every value go to 0, except on value which is 1.
model.add(Dense(10))
model.add(Activation('softmax'))
I tried to train a CNN to classify 9 class of image. Each class has 1000 image for training. I tried training on VGG16 and VGG19, both can achieve validation accuracy of 90%. But when I tried to train on InceptionResNetV2 model, the model seems to stuck around 20% and 30%. Below is my code for InceptionResNetV2 and the training. What can I do to improve the training?
base_model = tf.keras.applications.InceptionResNetV2(input_shape=(IMG_HEIGHT, IMG_WIDTH ,3),weights = 'imagenet',include_top=False)
base_model.trainable = False
model = tf.keras.Sequential([
base_model,
Flatten(),
Dense(1024, activation = 'relu', kernel_regularizer=regularizers.l2(0.001)),
LeakyReLU(alpha=0.4),
Dropout(0.5),
BatchNormalization(),
Dense(1024, activation = 'relu', kernel_regularizer=regularizers.l2(0.001)),
LeakyReLU(alpha=0.4),
Dense(9, activation = 'softmax')])
optimizer_model = tf.keras.optimizers.Adam(learning_rate=0.0001, name='Adam', decay=0.00001)
loss_model = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
model.compile(optimizer_model, loss="categorical_crossentropy", metrics=['accuracy'])
Epoch 1/10
899/899 [==============================] - 255s 283ms/step - loss: 4.3396 - acc: 0.3548 - val_loss: 4.2744 - val_acc: 0.3874
Epoch 2/10
899/899 [==============================] - 231s 257ms/step - loss: 3.5856 - acc: 0.4695 - val_loss: 3.9151 - val_acc: 0.3816
Epoch 3/10
899/899 [==============================] - 225s 250ms/step - loss: 3.1451 - acc: 0.4959 - val_loss: 4.8801 - val_acc: 0.2425
Epoch 4/10
899/899 [==============================] - 227s 252ms/step - loss: 2.7771 - acc: 0.5124 - val_loss: 3.7167 - val_acc: 0.3023
Epoch 5/10
899/899 [==============================] - 231s 257ms/step - loss: 2.4993 - acc: 0.5260 - val_loss: 3.7276 - val_acc: 0.3770
Epoch 6/10
899/899 [==============================] - 227s 252ms/step - loss: 2.3148 - acc: 0.5251 - val_loss: 3.7677 - val_acc: 0.3115
Epoch 7/10
899/899 [==============================] - 234s 260ms/step - loss: 2.1381 - acc: 0.5379 - val_loss: 3.4867 - val_acc: 0.2862
Epoch 8/10
899/899 [==============================] - 230s 256ms/step - loss: 2.0091 - acc: 0.5367 - val_loss: 4.1032 - val_acc: 0.3080
Epoch 9/10
899/899 [==============================] - 225s 251ms/step - loss: 1.9155 - acc: 0.5399 - val_loss: 4.1270 - val_acc: 0.2954
Epoch 10/10
899/899 [==============================] - 232s 258ms/step - loss: 1.8349 - acc: 0.5508 - val_loss: 4.3918 - val_acc: 0.2276
VGG-16/19 has a depth of 23/26 layers, whereas, InceptionResNetV2 has a depth of 572 layers. Now, there is minimal domain similarity between medical images and imagenet dataset. In VGG, due to low depth the features you're getting are not that complex and network is able to classify it on the basis of Dense layer features. However, in IRV2 network, as it's too much deep, the output of the fc layer is more complex (visualize it something object like but for imagenet dataset), and, then the features obtained from these layers are unable to connect to the Dense layer features, and, hence overfitting. I think you were able to get my point.
Check out my answer to very similar question of yours on this link: Link. It will help improve your accuracy.
This is my CNN model structure.
def make_dcnn_model():
model = models.Sequential()
model.add(layers.Conv2D(5, (5, 5), input_shape=(9, 128,1), padding='same', strides = (1,2), activity_regularizer=tf.keras.regularizers.l1(0.001)))
model.add(layers.LeakyReLU())
model.add(BatchNormalization())
model.add(layers.AveragePooling2D((4, 4), strides = (2,4)))
model.add(layers.Conv2D(10, (5, 5), padding='same', activity_regularizer=tf.keras.regularizers.l1(0.001)))
model.add(layers.LeakyReLU())
model.add(BatchNormalization())
model.add(layers.AveragePooling2D((2, 2), strides = (1,2)))
model.add(layers.Flatten())
model.add(layers.Dense(50, activity_regularizer=tf.keras.regularizers.l1(0.001)))
model.add(layers.LeakyReLU())
model.add(BatchNormalization())
model.add(layers.Dense(6, activation='softmax'))
return model
The result shows that this model fit well the training data and for the validation data the great fluctuation of validation accuracy occurred.
Train on 7352 samples, validate on 2947 samples
Epoch 1/3000 7352/7352
[==============================] - 3s 397us/sample - loss: 0.1016 -
accuracy: 0.9698 - val_loss: 4.0896 - val_accuracy: 0.5816
Epoch
2/3000 7352/7352 [==============================] - 2s 214us/sample -
loss: 0.0965 - accuracy: 0.9727 - val_loss: 1.2296 - val_accuracy:
0.7384 Epoch 3/3000 7352/7352 [==============================] - 1s 198us/sample - loss: 0.0930 - accuracy: 0.9727 - val_loss: 0.9901 -
val_accuracy: 0.7855 Epoch 4/3000 7352/7352
[==============================] - 2s 211us/sample - loss: 0.1013 -
accuracy: 0.9701 - val_loss: 0.5319 - val_accuracy: 0.9114 Epoch
5/3000 7352/7352 [==============================] - 1s 201us/sample -
loss: 0.0958 - accuracy: 0.9721 - val_loss: 0.6938 - val_accuracy:
0.8388 Epoch 6/3000 7352/7352 [==============================] - 2s 205us/sample - loss: 0.0925 - accuracy: 0.9743 - val_loss: 1.4033 -
val_accuracy: 0.7472 Epoch 7/3000 7352/7352
[==============================] - 1s 203us/sample - loss: 0.0948 -
accuracy: 0.9740 - val_loss: 0.8375 - val_accuracy: 0.7998
Reducing overfitting is a matter of trial and error. There are many ways to deal with it.
Try to add more data to the model or maybe augmenting your data if you're dealing with images. (very helpful)
Try reducing the complexity of the model by tweaking the parameters of the layers.
Try stopping the training earlier.
Regularization and batch normalization are very helpful but it may be the case that your model is already performing much worse without them in terms of overfitting. Try different types of regularization. (maybe Dropout)
My guess is that by adding more variety in the data you're model will overfit less.
I am new to keras and deep learnin.When i crate a sample basic model,i fit it and my model's log loss is same always.
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same', init='he_normal',
input_shape=(color_type, img_rows, img_cols)))
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th"))
model.add(Dropout(0.5))
model.add(Convolution2D(64, 3, 3, border_mode='same', init='he_normal'))
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th")) #this part is wrong
model.add(Dropout(0.5))
model.add(Convolution2D(128, 3, 3, border_mode='same', init='he_normal'))
model.add(MaxPooling2D(pool_size=(2, 2), dim_ordering="th"))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(Adam(lr=1e-3), loss='categorical_crossentropy')
model.fit(x_train, y_train, batch_size=64, nb_epoch=200,
verbose=1, validation_data=(x_valid,y_valid))
Train on 17939 samples, validate on 4485 samples
Epoch 1/200
17939/17939 [==============================] - 8s - loss: 99.8137 - acc: 0.3096 - val_loss: 99.9626 - val_acc: 0.0000e+00
Epoch 2/200
17939/17939 [==============================] - 8s - loss: 99.8135 - acc: 0.2864 - val_loss: 99.9626 - val_acc: 0.0000e+00
Epoch 3/200
17939/17939 [==============================] - 8s - loss: 99.8135 - acc: 0.3120 - val_loss: 99.9626 - val_acc: 1.0000
Epoch 4/200
17939/17939 [==============================] - 10s - loss: 99.8135 - acc: 0.3315 - val_loss: 99.9626 - val_acc: 1.0000
Epoch 5/200
17939/17939 [==============================] - 10s - loss: 99.8138 - acc: 0.3435 - val_loss: 99.9626 - val_acc: 0.4620
..
...
it's going like this
Do you know whicc part i made wrong ?
One reason for such behavior might be a too small learning rate. Try to increase your learning rate by using Adam(lr=1e-2) or Adam(lr=1e-1). Also, wait couple of more iterations (epochs) and see whether it improves. If not, you may try to decrease the dropout. In addition, I would suggest to normalize your input data if you haven't done it yet.