I wanted to learn more about machine learning / deep learning, so I have been attempting to solve the Kaggle Diabetic Retinopathy competition as a learning experience. However, my Keras model's accuracy and loss do not seem to improve.
I downloaded the Diabetic Retinopathy dataset, balanced the classes, and created equally distributed batches of 100 images each. I have tried many combinations of parameters: more epochs, different learning rates, more complex models, and so on. None of them seem to have any effect. So here is my code.
My imports:
from tqdm import tqdm
import os
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Dropout, Flatten
from keras import optimizers
from keras.callbacks import ModelCheckpoint
from keras import backend as K
K.tensorflow_backend._get_available_gpus()
My parameters:
HEIGHT = 512
WIDTH = 512
DEPTH = 3
inputShape = (HEIGHT, WIDTH, DEPTH)
NUM_CLASSES = 5
EPOCHS = 15
INIT_LR = 0.001
BS = 1
I check the batches in a given directory:
''' read batches '''
train_dir = '/DATA/npy_data/train_dir/'
batch_path_list = []
for batch in tqdm(os.listdir(train_dir)):
    batch_full_path = os.path.join(train_dir, batch)
    batch_path_list.append(str(batch_full_path))
AMOUNT_OF_BATCHES = len(batch_path_list)
if AMOUNT_OF_BATCHES == 0:
    print('We found no batches. Either no data or wrong directory...')
else:
    print('We found ' + str(AMOUNT_OF_BATCHES) + ' batches.')
I read the CSV file to obtain the labels:
''' read csv labels '''
csv_dir = '/DATA/data/trainLabels_normalised.csv'
dataframe = pd.read_csv(csv_dir, sep=',')
patientIDList = []
for index, row in dataframe.iterrows():
    patientID = str(row[0])
    patientID = patientID.replace('_right', '')
    patientID = patientID.replace('_left', '')
    dataframe.at[index, 'PatientID'] = patientID
    patientIDList.append(patientID)
I create and compile my model:
model = Sequential(name='test')
model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=inputShape))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(activation='softmax', units=5))
opt = optimizers.SGD(decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy", "mse"])
checkpointer = ModelCheckpoint(filepath="/home/user/Desktop/code/model/best_weights.hdf5",
verbose=1,
save_best_only=True)
I load a batch and join the labels with the 100 images from the batch.
''' load batches '''
for item in batch_path_list:
    batch_data = np.load(item).tolist()
    df1 = dataframe[dataframe['image'].isin(batch_data)]
    imageNameArr = []
    dataArr = []
    for index, row in df1.iterrows():
        key = str(row[0])
        if key in batch_data:
            imageNameArr.append(key)
            dataArr.append(batch_data[key])
    df2 = pd.DataFrame({'image': imageNameArr, 'data': dataArr})
    for idx in range(0, len(df1)):
        if (df1.loc[df1.index[idx], 'image'] != df2.loc[df2.index[idx], 'image']):
            print("Error " + df1.loc[df1.index[idx], 'image'] + "==" + df2.loc[df2.index[idx], 'image'])
    merged_files = pd.merge(df2, df1, left_on='image', right_on='image', how='outer')
I generate splits:
train_ids, valid_ids = train_test_split(patientIDList, test_size=0.25, random_state=10)
traindf = merged_files[merged_files.PatientID.isin(train_ids)] #data (data) image (img name) level (fase)
valSet = merged_files[merged_files.PatientID.isin(valid_ids)]
trainX = traindf['data']
trainY = traindf['level']
valX = valSet['data']
valY = valSet['level']
trainY = to_categorical(trainY, num_classes=NUM_CLASSES)
valY = to_categorical(valY, num_classes=NUM_CLASSES)
Xtrain = np.zeros([trainX.shape[0], HEIGHT, WIDTH, DEPTH])
Xval = np.zeros([valX.shape[0], HEIGHT, WIDTH, DEPTH])
I use a generator and call the fit function.
aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
horizontal_flip=True, fill_mode="nearest")
model.fit_generator(aug.flow(Xtrain, trainY,
batch_size=BS),
validation_data=(Xval, valY),
steps_per_epoch=(len(trainX) // BS),
epochs=EPOCHS,
verbose=1,
callbacks=[checkpointer])
However, this results in very low accuracy, which does not seem to improve over 29 batches.
Result:
54/87 [=================>............] - ETA: 0s - loss: 1.6092 - acc: 0.2037 - mean_squared_error: 0.1599
56/87 [==================>...........] - ETA: 0s - loss: 1.6089 - acc: 0.2143 - mean_squared_error: 0.1598
58/87 [===================>..........] - ETA: 0s - loss: 1.6169 - acc: 0.2069 - mean_squared_error: 0.1605
60/87 [===================>..........] - ETA: 0s - loss: 1.6146 - acc: 0.2167 - mean_squared_error: 0.1602
62/87 [====================>.........] - ETA: 0s - loss: 1.6172 - acc: 0.2097 - mean_squared_error: 0.1605
64/87 [=====================>........] - ETA: 0s - loss: 1.6196 - acc: 0.2031 - mean_squared_error: 0.1607
66/87 [=====================>........] - ETA: 0s - loss: 1.6180 - acc: 0.2121 - mean_squared_error: 0.1605
68/87 [======================>.......] - ETA: 0s - loss: 1.6164 - acc: 0.2206 - mean_squared_error: 0.1604
70/87 [=======================>......] - ETA: 0s - loss: 1.6144 - acc: 0.2286 - mean_squared_error: 0.1602
72/87 [=======================>......] - ETA: 0s - loss: 1.6163 - acc: 0.2222 - mean_squared_error: 0.1604
74/87 [========================>.....] - ETA: 0s - loss: 1.6134 - acc: 0.2297 - mean_squared_error: 0.1601
76/87 [=========================>....] - ETA: 0s - loss: 1.6102 - acc: 0.2368 - mean_squared_error: 0.1598
78/87 [=========================>....] - ETA: 0s - loss: 1.6119 - acc: 0.2308 - mean_squared_error: 0.1600
80/87 [==========================>...] - ETA: 0s - loss: 1.6159 - acc: 0.2250 - mean_squared_error: 0.1604
82/87 [===========================>..] - ETA: 0s - loss: 1.6150 - acc: 0.2195 - mean_squared_error: 0.1603
84/87 [===========================>..] - ETA: 0s - loss: 1.6206 - acc: 0.2143 - mean_squared_error: 0.1608
86/87 [============================>.] - ETA: 0s - loss: 1.6230 - acc: 0.2093 - mean_squared_error: 0.1610
87/87 [==============================] - 3s 31ms/step - loss: 1.6234 - acc: 0.2069 - mean_squared_error: 0.1610 - val_loss: 1.6435 - val_acc: 0.1282 - val_mean_squared_error: 0.1629
Epoch 00015: val_loss did not improve from 1.57533
Suggestions and feedback to improve my model are highly appreciated!
Related
import numpy as np
import pandas as pd
from numpy.random import seed
import tensorflow as tf
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense, Conv1D, MaxPooling2D, Activation
from sklearn.model_selection import train_test_split
seed(1)
tf.random.set_seed(2)
droprate = 0.5
dataset = pd.read_csv('filecounts.csv')
data = np.array(pd.get_dummies(dataset['counts']))
model = Sequential()
model.add(Conv1D(8, kernel_size=3, padding="same", activation="relu",input_shape=(12, 12, 10)))
model.add(MaxPooling2D(pool_size=2))
...
model.add(Conv1D(4, kernel_size=3, padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=2))
...
model.add(Conv1D(1, kernel_size=3, padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=2))
...
model.add(Activation("softmax"))
sgd = keras.optimizers.SGD(learning_rate=1)
train, test = train_test_split(data, test_size=0.5)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train, epochs=100, batch_size=10)
_, accuracy = model.evaluate(test, verbose=0, steps=1)
print('Accuracy: %.2f' % (accuracy*100))
Conv1D expects a 3+D input tensor with shape batch_shape + (steps, input_dim) and outputs a 3+D tensor with shape batch_shape + (new_steps, filters), with or without padding='same'.
The error occurs because MaxPooling2D expects a 4D tensor with shape (batch_size, rows, cols, channels).
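A minimal sketch (with shapes assumed from the snippet above) makes the mismatch visible:
import tensorflow as tf
x = tf.keras.Input(shape=(12, 10))  # batch_shape + (steps, input_dim)
y = tf.keras.layers.Conv1D(8, kernel_size=3, padding="same")(x)
print(y.shape)  # (None, 12, 8): a 3D tensor, so a following MaxPooling2D (which needs 4D) raises an error
Using MaxPool1D after Conv1D avoids this, as in the working sample below.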
Working sample code
import tensorflow as tf
import numpy as np
import tensorflow.keras as keras
X_train = np.random.random((12,12,10))
y_train = np.random.random((12, 1))
model = tf.keras.Sequential()
model.add(keras.layers.Conv1D(8, kernel_size=3, padding="same", activation="relu",input_shape=(12, 10)))
model.add(keras.layers.MaxPool1D(pool_size=2))
model.add(keras.layers.Conv1D(4, kernel_size=3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool1D(pool_size=2))
model.add(keras.layers.Conv1D(1, kernel_size=3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool1D(pool_size=2))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(units = 128, activation = 'relu'))
model.add(keras.layers.Dense(units = 1, activation = 'softmax'))
model.compile(optimizer = 'adam',loss = 'binary_crossentropy',metrics = ['accuracy'])
model.fit(X_train, y_train, epochs = 15)
Output
Epoch 1/10
1/1 [==============================] - 1s 944ms/step - loss: 0.6914 - accuracy: 0.0000e+00
Epoch 2/10
1/1 [==============================] - 0s 9ms/step - loss: 0.6900 - accuracy: 0.0000e+00
Epoch 3/10
1/1 [==============================] - 0s 9ms/step - loss: 0.6885 - accuracy: 0.0000e+00
Epoch 4/10
1/1 [==============================] - 0s 7ms/step - loss: 0.6870 - accuracy: 0.0000e+00
Epoch 5/10
1/1 [==============================] - 0s 8ms/step - loss: 0.6856 - accuracy: 0.0000e+00
Epoch 6/10
1/1 [==============================] - 0s 9ms/step - loss: 0.6841 - accuracy: 0.0000e+00
Epoch 7/10
1/1 [==============================] - 0s 8ms/step - loss: 0.6828 - accuracy: 0.0000e+00
Epoch 8/10
1/1 [==============================] - 0s 14ms/step - loss: 0.6814 - accuracy: 0.0000e+00
Epoch 9/10
1/1 [==============================] - 0s 7ms/step - loss: 0.6801 - accuracy: 0.0000e+00
Epoch 10/10
1/1 [==============================] - 0s 11ms/step - loss: 0.6789 - accuracy: 0.0000e+00
<keras.callbacks.History at 0x7eff56169810>
I'm trying to implement a CNN using Keras on a scikit-learn dataset for handwritten digit recognition (load_digits). I have got the model to run, but the accuracy does not improve with each epoch. I'm guessing it's because my labels are incorrect. I have tried encoding my Y values with 'to_categorical', but it displays the following error:
C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\backend.py:4979 binary_crossentropy
return nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)
C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\util\dispatch.py:201 wrapper
return target(*args, **kwargs)
C:\Users\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\ops\nn_impl.py:173 sigmoid_cross_entropy_with_logits
raise ValueError("logits and labels must have the same shape (%s vs %s)" %
ValueError: logits and labels must have the same shape ((None, 1) vs (None, 10))
When I run my code without trying to encode the Y values, it seems to go through the CNN model; however, it isn't accurate and doesn't improve. This is my code:
import tensorflow as tf
from sklearn import datasets
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
#from keras.utils.np_utils import to_categorical
X,y = datasets.load_digits(return_X_y = True)
X = X/16
#X = X.reshape(1797,8,8,1)
train_x, test_x, train_y, test_y = train_test_split(X, y)
train_x = train_x.reshape(1347,8,8,1)
#test_x = test_x.reshape()
#train_y = to_categorical(train_y, num_classes = 10)
model = Sequential()
model.add(Conv2D(32, (2, 2), input_shape=( 8, 8, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(train_x, train_y, batch_size=32, epochs=6, validation_split=0.3)
print(train_x[0])
And this gives me the following output:
Epoch 1/6
1/30 [>.............................] - ETA: 13s - loss: 1.1026 - accuracy: 0.0938
6/30 [=====>........................] - ETA: 0s - loss: 0.2949 - accuracy: 0.0652
30/30 [==============================] - 1s 33ms/step - loss: -5.4832 - accuracy: 0.0893 - val_loss: -49.9462 - val_accuracy: 0.1012
Epoch 2/6
1/30 [>.............................] - ETA: 0s - loss: -52.2145 - accuracy: 0.0625
30/30 [==============================] - 0s 3ms/step - loss: -120.6972 - accuracy: 0.0961 - val_loss: -513.0211 - val_accuracy: 0.1012
Epoch 3/6
1/30 [>.............................] - ETA: 0s - loss: -638.2873 - accuracy: 0.1250
30/30 [==============================] - 0s 3ms/step - loss: -968.3621 - accuracy: 0.1006 - val_loss: -2804.1062 - val_accuracy: 0.1012
Epoch 4/6
1/30 [>.............................] - ETA: 0s - loss: -3427.3135 - accuracy: 0.0000e+00
30/30 [==============================] - 0s 3ms/step - loss: -4571.7894 - accuracy: 0.0934 - val_loss: -10332.9727 - val_accuracy: 0.1012
Epoch 5/6
1/30 [>.............................] - ETA: 0s - loss: -12963.2559 - accuracy: 0.0625
30/30 [==============================] - 0s 3ms/step - loss: -15268.3010 - accuracy: 0.0887 - val_loss: -29262.1191 - val_accuracy: 0.1012
Epoch 6/6
1/30 [>.............................] - ETA: 0s - loss: -30990.6758 - accuracy: 0.1562
30/30 [==============================] - 0s 3ms/step - loss: -40321.9540 - accuracy: 0.0960 - val_loss: -68548.6094 - val_accuracy: 0.1012
Any guidance is greatly appreciated, thanks!
When you have a CNN you want the last layer to have as many nodes as there are labels. So if you have 10 digits, you want the last layer to have an output size of 10. It usually has the activation function "softmax", which turns the outputs into a probability distribution over the classes.
model.add(Dense(10))
model.add(Activation('softmax'))
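With a 10-way softmax output, the loss has to match the labels as well. A minimal sketch, assuming the integer train_y labels from the code above (this replaces the last layers and the compile/fit calls):
model.add(Dense(10))
model.add(Activation('softmax'))
# integer labels (0-9) pair with sparse_categorical_crossentropy;
# one-hot labels from to_categorical would pair with categorical_crossentropy instead
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(train_x, train_y, batch_size=32, epochs=6, validation_split=0.3)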
I've learned some deep learning with TensorFlow and Keras, so I wanted to do some practical experiments.
I want to train a model with the CAISAV5 fingerprint dataset (20,000 fingerprint images in total), but during training the training accuracy reaches 97% after 120 epochs while the validation accuracy stays at about 45%.
Here are the results:
Epoch 109/200
150/150 [==============================] - 23s 156ms/step - loss: 0.6971 - accuracy: 0.9418 - val_loss: 4.1766 - val_accuracy: 0.4171
Epoch 110/200
150/150 [==============================] - 23s 155ms/step - loss: 0.6719 - accuracy: 0.9492 - val_loss: 4.1447 - val_accuracy: 0.4379
Epoch 111/200
150/150 [==============================] - 24s 162ms/step - loss: 0.7003 - accuracy: 0.9388 - val_loss: 4.1439 - val_accuracy: 0.4396
Epoch 112/200
150/150 [==============================] - 24s 157ms/step - loss: 0.7010 - accuracy: 0.9377 - val_loss: 4.1577 - val_accuracy: 0.4425
Epoch 113/200
150/150 [==============================] - 24s 160ms/step - loss: 0.6699 - accuracy: 0.9494 - val_loss: 4.1242 - val_accuracy: 0.4371
Epoch 114/200
150/150 [==============================] - 25s 167ms/step - loss: 0.6814 - accuracy: 0.9456 - val_loss: 4.1966 - val_accuracy: 0.4288
Epoch 115/200
150/150 [==============================] - 24s 160ms/step - loss: 0.6440 - accuracy: 0.9590 - val_loss: 4.1586 - val_accuracy: 0.4354
Epoch 116/200
150/150 [==============================] - 23s 157ms/step - loss: 0.7877 - accuracy: 0.9212 - val_loss: 4.0408 - val_accuracy: 0.4246
Epoch 117/200
150/150 [==============================] - 23s 156ms/step - loss: 0.6728 - accuracy: 0.9504 - val_loss: 3.9317 - val_accuracy: 0.4567
Epoch 118/200
150/150 [==============================] - 25s 167ms/step - loss: 0.5710 - accuracy: 0.9874 - val_loss: 3.9505 - val_accuracy: 0.4483
Epoch 119/200
150/150 [==============================] - 24s 158ms/step - loss: 0.5616 - accuracy: 0.9873 - val_loss: 4.0607 - val_accuracy: 0.4542
Epoch 120/200
150/150 [==============================] - 23s 156ms/step - loss: 0.5948 - accuracy: 0.9716 - val_loss: 4.1531 - val_accuracy: 0.4238
Epoch 121/200
150/150 [==============================] - 23s 155ms/step - loss: 0.7453 - accuracy: 0.9150 - val_loss: 4.0798 - val_accuracy: 0.4154
Epoch 122/200
150/150 [==============================] - 26s 172ms/step - loss: 0.7232 - accuracy: 0.9256 - val_loss: 3.9307 - val_accuracy: 0.4425
Epoch 123/200
150/150 [==============================] - 24s 158ms/step - loss: 0.6277 - accuracy: 0.9632 - val_loss: 3.9988 - val_accuracy: 0.4408
Epoch 124/200
150/150 [==============================] - 23s 156ms/step - loss: 0.6367 - accuracy: 0.9581 - val_loss: 4.0837 - val_accuracy: 0.4358
I searched the Internet and found that overfitting may explain this, so I tried to simplify the layers, add dropout and regularizers, and use batch normalization. But those methods contributed very little to the accuracy.
I have also normalized the data, shuffled it, and scaled its float values to between 0.0 and 1.0. The original resolution of the images is 328 * 356, which was resized to 400 * 400 before being fed into the autoencoder.
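Roughly, the preprocessing looks like this (a simplified sketch, not the exact code; grayscale loading is an assumption):
import cv2
import numpy as np

def load_fingerprint(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # assumed single-channel input
    img = cv2.resize(img, (400, 400))             # original 328 * 356 resized to 400 * 400
    return img.astype(np.float32) / 255.0         # float values scaled into [0.0, 1.0]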
Here is part of my code:
def encoder(input_img):
    # encoder
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    conv1 = BatchNormalization()(conv1)
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv1)
    conv1 = BatchNormalization()(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = BatchNormalization()(conv2)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv2)
    conv2 = BatchNormalization()(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool2)
    conv3 = BatchNormalization()(conv3)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv3)
    conv3 = BatchNormalization()(conv3)
    return conv3
def fc(enco):
    pool = keras.layers.MaxPooling2D(pool_size=(2, 2))(enco)
    keras.layers.BatchNormalization()
    den1 = keras.layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l2(1e-3))(pool)
    keras.layers.BatchNormalization()
    pool1 = keras.layers.MaxPooling2D(pool_size=(2, 2))(den1)
    keras.layers.Dropout(0.4)
    den2 = keras.layers.Dense(256, activation='relu', kernel_regularizer=regularizers.l2(1e-3))(pool1)
    keras.layers.BatchNormalization()
    pool2 = keras.layers.MaxPooling2D(pool_size=(2, 2))(den2)
    keras.layers.Dropout(0.4)
    den3 = keras.layers.Dense(512, activation='relu', kernel_regularizer=regularizers.l2(1e-4))(pool2)
    keras.layers.BatchNormalization()
    pool3 = keras.layers.AveragePooling2D(pool_size=(2, 2))(den3)
    keras.layers.Dropout(0.4)
    flat = keras.layers.Flatten()(pool3)
    keras.layers.Dropout(0.4)
    keras.layers.BatchNormalization()
    den4 = keras.layers.Dense(256, activation='relu', kernel_regularizer=regularizers.l2(1e-3))(flat)
    keras.layers.Dropout(0.4)
    keras.layers.BatchNormalization()
    out = keras.layers.Dense(num, activation='softmax', kernel_regularizer=regularizers.l2(1e-4))(den4)
    return out
encode = encoder(input_img)
full_model = Model(input_img,fc(encode))
for l1, l2 in zip(full_model.layers[0:15], autoencoder_model.layers[0:15]):
    l1.set_weights(l2.get_weights())
for layer in full_model.layers[0:15]:
    layer.trainable = False
full_model.summary()
full_model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Nadam(),metrics=['accuracy'])
batch_size = 64
The autoencoder_model has already been trained and performs well, with a loss lower than 3e-4.
So I'm wondering: what causes the low validation accuracy, and what can I do to improve it?
The most obvious conclusion would be overfitting, but given that you tried the standard methods to correct this (model simplification, dropout, and regularization) without any improvement, it may be a different problem. For validation accuracy to be high, the probability distribution of the validation data must mirror that of the data the model was trained on. So the question is: how was the validation data selected?
One thing I would try as a test is to make the validation data an identical subset of the training data. In that case the validation accuracy should approach 100%. If it does not get high, then it may be pointing to something in how you process the validation data.
I also noticed you elected not to train some layers in the model. Try making all layers trainable and see if that helps. I have seen cases where freezing the weights in a model results in lower validation accuracy. I am not sure why, but I believe that if the non-trainable layers include dropout, then with the weights frozen dropout has no effect and thus leads to overfitting.
I am not a great fan of early stopping. It is a crutch for not effectively addressing overfitting issues.
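As a concrete version of that sanity check (a sketch only; x_train, y_train and the epoch count are assumed, not taken from your post):
# use an identical slice of the training data as the "validation" set
val_x, val_y = x_train[:1000], y_train[:1000]
full_model.fit(x_train, y_train, batch_size=64, epochs=5,
               validation_data=(val_x, val_y))
# if val_accuracy does not track the training accuracy here, the problem is in
# how the real validation data is processed, not in overfitting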
I'm trying to build handwritten word recognition using the IAM dataset, and while training I'm facing an overfitting problem. Would you please help me figure out what mistake I have made in the code below?
I have tried every solution I could find to resolve the problem, but the same overfitting problem persists.
import os
import fnmatch
import cv2
import numpy as np
import string
import time
import random
from keras import regularizers, optimizers
from keras.regularizers import l2
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Dense, LSTM, Reshape, BatchNormalization, Input, Conv2D, MaxPool2D, Lambda, Bidirectional, Dropout
from keras.models import Model
from keras.activations import relu, sigmoid, softmax
import keras.backend as K
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint,ReduceLROnPlateau
import matplotlib.pyplot as plt
imgSize = (128,32)
def preprocess(img, imgSize, dataAugmentation=False):
    "put img into target img of size imgSize, transpose for TF and normalize gray-values"
    # there are damaged files in IAM dataset - just use black image instead
    if img is None:
        img = np.zeros([imgSize[1], imgSize[0]])
    # increase dataset size by applying random stretches to the images
    if dataAugmentation:
        stretch = (random.random() - 0.5)  # -0.5 .. +0.5
        wStretched = max(int(img.shape[1] * (1 + stretch)), 1)  # random width, but at least 1
        img = cv2.resize(img, (wStretched, img.shape[0]))  # stretch horizontally by factor 0.5 .. 1.5
        img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
        # print('Data Augmented')
    # create target image and copy sample image into it
    (wt, ht) = imgSize
    (h, w) = img.shape
    fx = w / wt
    fy = h / ht
    f = max(fx, fy)
    newSize = (max(min(wt, int(w / f)), 1), max(min(ht, int(h / f)), 1))  # scale according to f (result at least 1 and at most wt or ht)
    img = cv2.resize(img, newSize)
    target = np.ones([ht, wt]) * 255
    target[0:newSize[1], 0:newSize[0]] = img
    # transpose for TF
    img = cv2.transpose(target)
    # normalize
    (m, s) = cv2.meanStdDev(img)
    m = m[0][0]
    s = s[0][0]
    img = img - m
    img = img / s if s > 0 else img
    img = np.expand_dims(img, axis=2)
    return img
def truncateLabel(text, maxTextLen):  # A,32
    cost = 0
    for i in range(len(text)):
        if i != 0 and text[i] == text[i-1]:
            cost += 2
        else:
            cost += 1
        if cost > maxTextLen:
            return text[:i]  # returns words with repeated chars
    return text
path = 'iam_dataset_words/'
maxTextLen = 32
samples = []
bad_samples = []
fileName = ''
dataAugmentation = False
chars = set()
f=open(path+ 'words.txt', "r")
cou = 0
bad_samples = []
bad_samples_reference = ['a01-117-05-02.png',
'r06-022-03-05.png']
for line in f:
    cou += 1
    # ignore comment line
    if not line or line[0] == '#':
        continue
    lineSplit = line.strip().split(' ')
    assert len(lineSplit) >= 9
    fileNameSplit = lineSplit[0].split('-')  # a01-000u-00-00 splits
    # ../data/words/a01/a01-000u/a01-000u-00-00.png
    fileName = path + 'words/' \
        + fileNameSplit[0] + '/' \
        + fileNameSplit[0] + '-' \
        + fileNameSplit[1] \
        + '/' + lineSplit[0] + '.png'
    # GT text are columns starting at 9
    gtText = truncateLabel(' '.join(lineSplit[8:]), maxTextLen)  # A,32
    # chars = chars.union(gtText) # unique chars only
    chars = chars.union(set(list(gtText)))
    # check if image is not empty
    if not os.path.getsize(fileName):
        bad_samples.append(lineSplit[0] + '.png')
        continue
    # put sample into list
    # 'A','../data/words/a01/a01-000u/a01-000u-00-00.png'
    samples.append([gtText, fileName])
print(cou)
print(len(samples))
print(samples[:2])
if set(bad_samples) != set(bad_samples_reference):
    print("Warning, damaged images found:", bad_samples)
    print("Damaged images expected:", bad_samples_reference)
trainSamples = []
validationSamples = []
testSamples = []
valid_testSamples = []
# split into training and validation set: 90% - 10%
# dataAugmentation = True
random.shuffle(samples)
splitIdx = int(0.75 * len(samples))
train_samples = samples[:splitIdx]
valid_testSamples = samples[splitIdx:]
print('vv:', len(valid_testSamples))
validationSamples = valid_testSamples[:15000]
testSamples = valid_testSamples[15000:]
print('valid: ',len(validationSamples))
print('test: ',len(testSamples))
print('train_before: ',len(train_samples))
# # start with train set
trainSamples = train_samples[:25000] #tran data 25000
print('train_ after: ',len(trainSamples))
# # list of all unique chars in dataset
charList = sorted(list(chars))
char_list = str().join(charList)
# print('test samples: ',testSamples)
print('char list : ',char_list)
# # save characters of model for inference mode
# open(FilePaths.fnCharList, 'w').write(str().join(charList))
# # save words contained in dataset into file
# open(FilePaths.fnCorpus, 'w').write(str(' ').join(loader.trainWords + validationWords))
def encode_to_labels(txt):
    # encoding each output word into digits
    chars = []
    for index, char in enumerate(txt):
        try:
            chars.append(char_list.index(char))
        except:
            print(char)
    return chars
print(trainSamples[:2])
# lists for training dataset
train_img = []
train_txt = []
train_input_length = []
train_label_length = []
train_orig_txt = []
max_label_len = 0
b = 0
for words, imgPath in trainSamples:
    img = preprocess(cv2.imread(imgPath, cv2.IMREAD_GRAYSCALE), imgSize, dataAugmentation=True)
    # compute maximum length of the text
    if len(words) > max_label_len:
        max_label_len = len(words)
    train_orig_txt.append(words)
    train_label_length.append(len(words))
    train_input_length.append(31)
    train_img.append(img)
    train_txt.append(encode_to_labels(words))
    b += 1
# print(train_img[1])
print(len(train_txt))
train_txt[:5]
a = 0
#lists for validation dataset
valid_img = []
valid_txt = []
valid_input_length = []
valid_label_length = []
valid_orig_txt = []
for words, imgPath in validationSamples:
    img = preprocess(cv2.imread(imgPath, cv2.IMREAD_GRAYSCALE), imgSize, dataAugmentation=False)
    valid_orig_txt.append(words)
    valid_label_length.append(len(words))
    valid_input_length.append(31)
    valid_img.append(img)
    valid_txt.append(encode_to_labels(words))
    a += 1
print(len(valid_txt))
valid_txt[:5]
# lists for test dataset
test_img = []
test_txt = []
test_input_length = []
test_label_length = []
test_orig_txt = []
c = 0
for words, imgPath in testSamples:
    img = preprocess(cv2.imread(imgPath, cv2.IMREAD_GRAYSCALE), imgSize, dataAugmentation=False)
    test_orig_txt.append(words)
    test_label_length.append(len(words))
    test_input_length.append(31)
    test_img.append(img)
    test_txt.append(encode_to_labels(words))
    c += 1
# print(c)
print(test_img[0].shape)
print('Train: {}\nValid: {}\nTest: {}'.format(b,a,c))
print(max_label_len)
# pad each output label to maximum text length
train_padded_txt = pad_sequences(train_txt, maxlen=max_label_len, padding='post', value = len(char_list))
valid_padded_txt = pad_sequences(valid_txt, maxlen=max_label_len, padding='post', value = len(char_list))
test_padded_txt = pad_sequences(test_txt, maxlen=max_label_len, padding='post', value = len(char_list))
print(len(train_padded_txt))
print(len(test_padded_txt))
print(valid_padded_txt[1])
# input with shape of height=32 and width=128
inputs = Input(shape=(128,32,1))
print(inputs.shape)
# convolution layer with kernel size (3,3)
conv_1 = Conv2D(32, (3,3), activation = 'relu', padding='same')(inputs)
batch_norm_1 = BatchNormalization()(conv_1)
# poolig layer with kernel size (2,2)
pool_1 = Conv2D(32, kernel_size=(1, 1), strides=2, padding='valid')(batch_norm_1)
conv_2 = Conv2D(64, (3,3), activation = 'relu', padding='same')(pool_1)
batch_norm_2 = BatchNormalization()(conv_2)
pool_2 = Conv2D(64, kernel_size=(1, 1), strides=2, padding='valid')(batch_norm_2)
conv_3 = Conv2D(128, (3,3), activation = 'relu', padding='same')(pool_2)
batch_norm_3 = BatchNormalization()(conv_3)
conv_4 = Conv2D(128, (3,3), activation = 'relu', padding='same')(batch_norm_3)
batch_norm_4 = BatchNormalization()(conv_4)
# poolig layer with kernel size (1,2)
pool_4 = MaxPool2D(pool_size=(1,2))(batch_norm_4)
conv_5 = Conv2D(256, (3,3), activation = 'relu', padding='same')(pool_4)
# Batch normalization layer
batch_norm_5 = BatchNormalization()(conv_5)
conv_6 = Conv2D(256, (3,3), activation = 'relu', padding='same')(batch_norm_5)
batch_norm_6 = BatchNormalization()(conv_6)
pool_6 = MaxPool2D(pool_size=(1,2))(batch_norm_6)
conv_7 = Conv2D(256, (2,2), activation = 'relu')(pool_6)
batch_norm_7 = BatchNormalization()(conv_7)
# print(conv_7.shape)
# map-to-sequence-- dropping 1 dimension
squeezed = Lambda(lambda x: K.squeeze(x, 2))(batch_norm_7)
# print('squeezed',squeezed.shape)
# bidirectional LSTM layers with units=128
blstm_1 = Bidirectional(LSTM(128, return_sequences=True, dropout = 0.3))(squeezed)
blstm_2 = Bidirectional(LSTM(128, return_sequences=True, dropout = 0.3))(blstm_1)
outputs = Dense(len(char_list)+1, activation = 'softmax')(blstm_2)
# model to be used at test time
word_model = Model(inputs, outputs)
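# the `model` compiled below is not defined in this excerpt; presumably it is the
# standard Keras CTC training wrapper around `outputs`, roughly (an assumed
# reconstruction, matching the four inputs passed to fit() further down):
labels = Input(name='the_labels', shape=[max_label_len], dtype='float32')
input_length = Input(name='input_length', shape=[1], dtype='int64')
label_length = Input(name='label_length', shape=[1], dtype='int64')

def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')(
    [outputs, labels, input_length, label_length])
model = Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)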
adam = optimizers.Adamax(lr=0.01, decay = 1e-5)
model.compile(loss= {'ctc': lambda y_true, y_pred: y_pred}, optimizer = adam, metrics = ['accuracy'])
filepath="best_model.hdf5"
checkpoint1 = ReduceLROnPlateau(monitor='val_loss', verbose=1,
mode='auto',factor=0.2,patience=4, min_lr=0.0001)
checkpoint2 = ModelCheckpoint(filepath=filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='auto')
callbacks_list = [checkpoint1, checkpoint2]
train_img = np.array(train_img)
train_input_length = np.array(train_input_length)
train_label_length = np.array(train_label_length)
valid_img = np.array(valid_img)
valid_input_length = np.array(valid_input_length)
valid_label_length = np.array(valid_label_length)
test_img = np.array(test_img)
test_input_length = np.array(test_input_length)
test_label_length = np.array(test_label_length)
test_img.shape
batch_size = 50
epochs = 30
train_history = model.fit(x=[train_img, train_padded_txt, train_input_length, train_label_length],
y=np.zeros(len(train_img)), batch_size=batch_size, epochs = epochs,
validation_data = ([valid_img, valid_padded_txt, valid_input_length,
valid_label_length], [np.zeros(len(valid_img))]),
verbose = 1, callbacks = callbacks_list)
Train on 25000 samples, validate on 15000 samples
Epoch 1/30
25000/25000 [==============================] - 159s 6ms/step - loss: 13.6510 - acc: 0.0199 - val_loss: 11.4910 - val_acc: 0.0651
Epoch 00001: val_loss improved from inf to 11.49100, saving model to best_model.hdf5
Epoch 2/30
25000/25000 [==============================] - 146s 6ms/step - loss: 10.9559 - acc: 0.0603 - val_loss: 9.7359 - val_acc: 0.0904
Epoch 00002: val_loss improved from 11.49100 to 9.73587, saving model to best_model.hdf5
Epoch 3/30
25000/25000 [==============================] - 146s 6ms/step - loss: 9.0720 - acc: 0.0943 - val_loss: 7.3571 - val_acc: 0.1565
Epoch 00003: val_loss improved from 9.73587 to 7.35715, saving model to best_model.hdf5
Epoch 4/30
25000/25000 [==============================] - 145s 6ms/step - loss: 6.9501 - acc: 0.1520 - val_loss: 5.5228 - val_acc: 0.2303
Epoch 00004: val_loss improved from 7.35715 to 5.52277, saving model to best_model.hdf5
Epoch 5/30
25000/25000 [==============================] - 144s 6ms/step - loss: 5.4893 - acc: 0.2129 - val_loss: 4.3179 - val_acc: 0.2895
Epoch 00005: val_loss improved from 5.52277 to 4.31793, saving model to best_model.hdf5
Epoch 6/30
25000/25000 [==============================] - 143s 6ms/step - loss: 4.7053 - acc: 0.2612 - val_loss: 3.7490 - val_acc: 0.3449
Epoch 00006: val_loss improved from 4.31793 to 3.74896, saving model to best_model.hdf5
Epoch 7/30
25000/25000 [==============================] - 143s 6ms/step - loss: 4.1183 - acc: 0.3096 - val_loss: 3.5902 - val_acc: 0.3805
Epoch 00007: val_loss improved from 3.74896 to 3.59015, saving model to best_model.hdf5
Epoch 8/30
25000/25000 [==============================] - 143s 6ms/step - loss: 3.6662 - acc: 0.3462 - val_loss: 3.7923 - val_acc: 0.3350
Epoch 00008: val_loss did not improve from 3.59015
Epoch 9/30
25000/25000 [==============================] - 143s 6ms/step - loss: 3.3398 - acc: 0.3809 - val_loss: 3.1352 - val_acc: 0.4344
Epoch 00009: val_loss improved from 3.59015 to 3.13516, saving model to best_model.hdf5
Epoch 10/30
25000/25000 [==============================] - 143s 6ms/step - loss: 3.0199 - acc: 0.4129 - val_loss: 2.9798 - val_acc: 0.4541
Epoch 00010: val_loss improved from 3.13516 to 2.97978, saving model to best_model.hdf5
Epoch 11/30
25000/25000 [==============================] - 143s 6ms/step - loss: 2.7361 - acc: 0.4447 - val_loss: 3.3836 - val_acc: 0.3780
Epoch 00011: val_loss did not improve from 2.97978
Epoch 12/30
25000/25000 [==============================] - 143s 6ms/step - loss: 2.5127 - acc: 0.4695 - val_loss: 2.9266 - val_acc: 0.5041
Epoch 00012: val_loss improved from 2.97978 to 2.92656, saving model to best_model.hdf5
Epoch 13/30
25000/25000 [==============================] - 142s 6ms/step - loss: 2.3045 - acc: 0.4974 - val_loss: 2.7329 - val_acc: 0.5174
Epoch 00013: val_loss improved from 2.92656 to 2.73294, saving model to best_model.hdf5
Epoch 14/30
25000/25000 [==============================] - 141s 6ms/step - loss: 2.1245 - acc: 0.5237 - val_loss: 2.8624 - val_acc: 0.5339
Epoch 00014: val_loss did not improve from 2.73294
Epoch 15/30
25000/25000 [==============================] - 142s 6ms/step - loss: 1.9091 - acc: 0.5524 - val_loss: 2.6933 - val_acc: 0.5506
Epoch 00015: val_loss improved from 2.73294 to 2.69333, saving model to best_model.hdf5
Epoch 16/30
25000/25000 [==============================] - 141s 6ms/step - loss: 1.7565 - acc: 0.5705 - val_loss: 2.7697 - val_acc: 0.5461
Epoch 00016: val_loss did not improve from 2.69333
Epoch 17/30
25000/25000 [==============================] - 145s 6ms/step - loss: 1.6273 - acc: 0.5892 - val_loss: 2.8992 - val_acc: 0.5361
Epoch 00017: val_loss did not improve from 2.69333
Epoch 18/30
25000/25000 [==============================] - 145s 6ms/step - loss: 1.5007 - acc: 0.6182 - val_loss: 2.9558 - val_acc: 0.5345
Epoch 00018: val_loss did not improve from 2.69333
Epoch 19/30
25000/25000 [==============================] - 143s 6ms/step - loss: 1.3775 - acc: 0.6311 - val_loss: 2.8437 - val_acc: 0.5744
Epoch 00019: ReduceLROnPlateau reducing learning rate to 0.0019999999552965165.
Epoch 00019: val_loss did not improve from 2.69333
Epoch 20/30
25000/25000 [==============================] - 144s 6ms/step - loss: 0.9636 - acc: 0.7115 - val_loss: 2.6072 - val_acc: 0.6083
Epoch 00020: val_loss improved from 2.69333 to 2.60724, saving model to best_model.hdf5
Epoch 21/30
25000/25000 [==============================] - 146s 6ms/step - loss: 0.7940 - acc: 0.7583 - val_loss: 2.6613 - val_acc: 0.6167
Epoch 00021: val_loss did not improve from 2.60724
Epoch 22/30
25000/25000 [==============================] - 146s 6ms/step - loss: 0.6995 - acc: 0.7797 - val_loss: 2.7180 - val_acc: 0.6220
Epoch 00022: val_loss did not improve from 2.60724
Epoch 23/30
25000/25000 [==============================] - 144s 6ms/step - loss: 0.6197 - acc: 0.8046 - val_loss: 2.7504 - val_acc: 0.6226
Epoch 00023: val_loss did not improve from 2.60724
Epoch 24/30
25000/25000 [==============================] - 143s 6ms/step - loss: 0.5668 - acc: 0.8167 - val_loss: 2.8238 - val_acc: 0.6255
Epoch 00024: ReduceLROnPlateau reducing learning rate to 0.0003999999724328518.
Epoch 00024: val_loss did not improve from 2.60724
Epoch 25/30
25000/25000 [==============================] - 144s 6ms/step - loss: 0.5136 - acc: 0.8316 - val_loss: 2.8167 - val_acc: 0.6283
Epoch 00025: val_loss did not improve from 2.60724
Epoch 26/30
25000/25000 [==============================] - 143s 6ms/step - loss: 0.5012 - acc: 0.8370 - val_loss: 2.8244 - val_acc: 0.6299
Epoch 00026: val_loss did not improve from 2.60724
Epoch 27/30
25000/25000 [==============================] - 143s 6ms/step - loss: 0.4886 - acc: 0.8425 - val_loss: 2.8366 - val_acc: 0.6282
Epoch 00027: val_loss did not improve from 2.60724
Epoch 28/30
25000/25000 [==============================] - 143s 6ms/step - loss: 0.4820 - acc: 0.8432 - val_loss: 2.8447 - val_acc: 0.6271
Epoch 00028: ReduceLROnPlateau reducing learning rate to 0.0001.
Epoch 00028: val_loss did not improve from 2.60724
Epoch 29/30
25000/25000 [==============================] - 141s 6ms/step - loss: 0.4643 - acc: 0.8452 - val_loss: 2.8538 - val_acc: 0.6278
Epoch 00029: val_loss did not improve from 2.60724
Epoch 30/30
25000/25000 [==============================] - 141s 6ms/step - loss: 0.4576 - acc: 0.8496 - val_loss: 2.8555 - val_acc: 0.6277
Epoch 00030: val_loss did not improve from 2.60724
Evaluation of the model
test_history = model.evaluate([test_img, test_padded_txt,
test_input_length, test_label_length],
y=np.zeros(len(test_img)), verbose = 1)
test_history
Output
13830/13830 [==============================] - 42s 3ms/step
[2.855567638786134, 0.6288503253882292]
Some predicted output: (images omitted)
Not sure what you have already tried, but did you check whether your training and validation samples are balanced? That is, whether they have roughly the same percentage of examples in each category.
You could shuffle samples using random.shuffle(samples) before executing the following code:
splitIdx = int(0.75 * len(samples))
train_samples = samples[:splitIdx]
That way, you can be more certain that your training and validation sets are balanced.
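A quick way to check that balance (a sketch; it assumes the [gtText, fileName] sample pairs from your code and treats each ground-truth word as a category):
from collections import Counter

def label_fractions(sample_list):
    counts = Counter(gt for gt, _ in sample_list)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

train_frac = label_fractions(trainSamples)
valid_frac = label_fractions(validationSamples)
# compare the most frequent categories in each split
print(sorted(train_frac.items(), key=lambda kv: -kv[1])[:10])
print(sorted(valid_frac.items(), key=lambda kv: -kv[1])[:10])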
There is a lot you can do.
Add batch normalization after every conv2d layer
Replace maxpooling with conv2d valid padding so it becomes a learnable layer
from: pool_1 = MaxPool2D(pool_size=(2, 2), strides=2)(conv_1)
to: pool_1 = Conv2D(filters, kernel_size=(1, 1), strides=2, padding='valid')(conv_1)
Add l2 regularization to your layers (see the sketch after this list for an example)
Try weight decay
Increase the dropout values you already have
Modify your learning rate; too small and it might fall into a local minimum
And there is a lot more; the only way to know is to try them out
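For the l2 regularization item above, a minimal sketch applied to the first conv layer from your code (the 1e-4 factor is just an assumed starting point):
from keras import regularizers

conv_1 = Conv2D(32, (3, 3), activation='relu', padding='same',
                kernel_regularizer=regularizers.l2(1e-4))(inputs)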
I'm trying to build a model that can predict emotions using 7 concatenated models.
Each of the 7 models represents a part of the face: mouth, left_eye, right_eye, etc.
The problem is that the model doesn't learn at all: from the 2nd epoch to the last one (100), I have 15% accuracy, with no changes in accuracy or loss during all the epochs.
I think maybe the problem is in my concatenated model or in my fit function (the train and labels data).
There are 7 emotions: sad, angry, happy, etc.
Here are my model, my compile and train steps, and my datasets.
Model
from keras.layers import Conv2D, MaxPooling2D, Input, concatenate
from keras.models import Sequential, Model
from keras.layers.core import Dense, Dropout, Flatten
def build_all_faceparts_model(input_shape, batch_shape, num_classes):
    input1 = Input(input_shape)
    input2 = Input(input_shape)
    input3 = Input(input_shape)
    input4 = Input(input_shape)
    input5 = Input(input_shape)
    input6 = Input(input_shape)
    input7 = Input(input_shape)
    # Create the model for the right eye
    right_eye = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input1, batch_input_shape=batch_shape)(input1)
    right_eye = MaxPooling2D(pool_size=(2, 2))(right_eye)
    right_eye = Dropout(0.25)(right_eye)
    right_eye = Flatten()(right_eye)
    # Create the model for the left eye
    left_eye = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input2, batch_input_shape=batch_shape)(input2)
    left_eye = MaxPooling2D(pool_size=(2, 2))(left_eye)
    left_eye = Dropout(0.25)(left_eye)
    left_eye = Flatten()(left_eye)
    # Create the model for the right eyebrow
    right_eyebrow = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input3, batch_input_shape=batch_shape)(input3)
    right_eyebrow = MaxPooling2D(pool_size=(2, 2))(right_eyebrow)
    right_eyebrow = Dropout(0.25)(right_eyebrow)
    right_eyebrow = Flatten()(right_eyebrow)
    # Create the model for the left eyebrow
    left_eyebrow = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input4, batch_input_shape=batch_shape)(input4)
    left_eyebrow = MaxPooling2D(pool_size=(2, 2))(left_eyebrow)
    left_eyebrow = Dropout(0.25)(left_eyebrow)
    left_eyebrow = Flatten()(left_eyebrow)
    # Create the model for the mouth
    mouth = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input5, batch_input_shape=batch_shape)(input5)
    mouth = MaxPooling2D(pool_size=(2, 2))(mouth)
    mouth = Dropout(0.25)(mouth)
    mouth = Flatten()(mouth)
    # Create the model for the nose
    nose = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input6, batch_input_shape=batch_shape)(input6)
    nose = MaxPooling2D(pool_size=(2, 2))(nose)
    nose = Dropout(0.25)(nose)
    nose = Flatten()(nose)
    # Create the model for the jaw
    jaw = Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input7, batch_input_shape=batch_shape)(input7)
    jaw = MaxPooling2D(pool_size=(2, 2))(jaw)
    jaw = Dropout(0.25)(jaw)
    jaw = Flatten()(jaw)
    concatenated = concatenate([right_eye, left_eye, right_eyebrow, left_eyebrow, mouth, nose, jaw], axis=-1)
    out = Dense(num_classes, activation='softmax')(concatenated)
    model = Model([input1, input2, input3, input4, input5, input6, input7], out)
    return model
Train and test datasets. Here X_train_all is a list of datasets, unlike y_train_all:
X_train_all=[X_train_mouth,X_train_right_eyebrow,X_train_left_eyebrow,X_train_right_eye,X_train_left_eye,X_train_nose,X_train_jaw]
X_test_all=[X_test_mouth,X_test_right_eyebrow,X_test_left_eyebrow,X_test_right_eye,X_test_left_eye,X_test_nose,X_test_jaw]
y_train_all=y_train_mouth+y_train_right_eyebrow+y_train_left_eyebrow+y_train_right_eye+y_train_left_eye+y_train_nose+y_train_jaw
y_test_all=y_test_mouth+y_test_right_eyebrow+y_test_left_eyebrow+y_test_right_eye+y_test_left_eye+y_test_nose+y_test_jaw
Compile
from keras.optimizers import Adam
input_shape =X_train_mouth[0].shape
batch_shape = X_train_mouth[0].shape
model_all_faceparts=build_all_faceparts_model(input_shape,batch_shape,7)
#Compile Model
model_all_faceparts.compile(loss='categorical_crossentropy', optimizer=Adam(lr=1e-3),metrics=["accuracy"])
lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=0.9, patience=3)
early_stopper = EarlyStopping(monitor='val_acc', min_delta=0, patience=15, mode='auto')
checkpointer = ModelCheckpoint(current_dir+'/weights_jaffe.hd5', monitor='val_loss', verbose=1, save_best_only=True)
Train
history=model_all_faceparts.fit(
X_train_all, y_train_all, batch_size=7, epochs=100, verbose=1,callbacks=[lr_reducer, checkpointer, early_stopper])
Output
Epoch 1/100
181/181 [==============================] - 19s 107ms/step - loss: 94.6603 - acc: 0.1271
Epoch 2/100
/usr/local/lib/python3.6/dist-packages/keras/callbacks.py:1109: RuntimeWarning: Reduce LR on plateau conditioned on metric `val_loss` which is not available. Available metrics are: loss,acc,lr
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
/usr/local/lib/python3.6/dist-packages/keras/callbacks.py:434: RuntimeWarning: Can save best model only with val_loss available, skipping.
'skipping.' % (self.monitor), RuntimeWarning)
/usr/local/lib/python3.6/dist-packages/keras/callbacks.py:569: RuntimeWarning: Early stopping conditioned on metric `val_acc` which is not available. Available metrics are: loss,acc,lr
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
181/181 [==============================] - 15s 81ms/step - loss: 95.9962 - acc: 0.1492
Epoch 3/100
181/181 [==============================] - 15s 81ms/step - loss: 95.9962 - acc: 0.1492
Epoch 4/100
181/181 [==============================] - 15s 83ms/step - loss: 95.9962 - acc: 0.1492
Epoch 5/100
181/181 [==============================] - 15s 84ms/step - loss: 95.9962 - acc: 0.1492
Epoch 6/100
181/181 [==============================] - 15s 85ms/step - loss: 95.9962 - acc: 0.1492
Epoch 7/100
181/181 [==============================] - 16s 86ms/step - loss: 95.9962 - acc: 0.1492
Epoch 8/100
181/181 [==============================] - 16s 87ms/step - loss: 95.9962 - acc: 0.1492
Epoch 9/100
181/181 [==============================] - 16s 86ms/step - loss: 95.9962 - acc: 0.1492
Epoch 10/100
(I completely forgot this post.)
The problem was in the model itself. I just changed the model (added some layers) and everything was fine, ending up at 93% accuracy!
PS: thanks to the TensorFlow support guy who reminded me to post an answer.
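The changed model isn't shown above; purely as an illustration of "added some layers", each branch could be deepened with an extra conv block, e.g. for the right eye (layer sizes are assumed, not the actual fix):
right_eye = Conv2D(32, kernel_size=(3, 3), activation='relu')(input1)
right_eye = MaxPooling2D(pool_size=(2, 2))(right_eye)
right_eye = Conv2D(64, kernel_size=(3, 3), activation='relu')(right_eye)  # extra conv block
right_eye = MaxPooling2D(pool_size=(2, 2))(right_eye)
right_eye = Dropout(0.25)(right_eye)
right_eye = Flatten()(right_eye)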