CNN accuracy doesn't change through multiple epochs - python

I'm trying to create a CNN which differentiates between pictures of eyes with and without symptoms of diabetic retinopathy. When I try to run my model, the accuracy doesn't improve at all. I tried using different learning rates, but it hasn't worked. Since this is my first time making a CNN, I think I might have made a mistake elsewhere. If you can see the problems in my code, please let me know; I would really appreciate the help.
Train on 980 samples, validate on 327 samples
Epoch 1/5
980/980 [==============================] - 777s 792ms/step - loss: 8.1986 - accuracy: 0.4653 - val_loss: 8.8154 - val_accuracy: 0.4251
Epoch 2/5
980/980 [==============================] - 666s 679ms/step - loss: 8.1986 - accuracy: 0.4653 - val_loss: 8.8154 - val_accuracy: 0.4251
Epoch 3/5
980/980 [==============================] - 672s 686ms/step - loss: 8.1986 - accuracy: 0.4653 - val_loss: 8.8154 - val_accuracy: 0.4251
Here is my code:
DATADIR = "C:\\Users.."
CATEGORIES = ["nosymptoms", "symptoms"]
training_data = []
IMG_SIZE = 512
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)  # brings us to the folder for this category
    class_num = CATEGORIES.index(category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
        new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))  # image resized and converted to an array
        training_data.append([new_array, class_num])  # classification label is appended alongside the image
import random
random.shuffle(training_data)
for sample in training_data[:10]:
    print(sample[1])  # 0 is the image array, 1 is the label
X = []  # feature set
y = []  # labels
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)  # -1 means any number of samples; 1 channel because the images are grayscale
#Save data
import pickle
pickle_out = open("X.pickle","wb")
pickle.dump(X, pickle_out)
pickle_out.close()
pickle_out = open("y.pickle","wb")
pickle.dump(y, pickle_out)
pickle_out.close()
X = pickle.load(open("X.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))
X = X/255.0
model = Sequential()
model.add(Convolution2D(32, (3,3),input_shape=(X.shape[1:]),activation='relu'))
model.add(Convolution2D(32, (3,3),activation='relu'))
model.add(Convolution2D(32, (3,3),activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(16, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(1, activation='softmax'))
from keras.optimizers import SGD
opt = SGD(lr=0.01)
model.compile(loss = "binary_crossentropy", optimizer = opt, metrics=['accuracy'])
print(model.summary())
model.fit(X, y, batch_size = 16, epochs = 5, validation_split=.25)

Since you have 2 classes, change your last layer to 2 neurons and the loss to categorical_crossentropy. With Dense(1, activation='softmax'), the softmax over a single unit always outputs 1.0 regardless of the input, so the gradients carry no signal and the weights never update; that is why your loss and accuracy are frozen across epochs.
model = Sequential()
model.add(Convolution2D(32, (3,3),input_shape=(X.shape[1:]),activation='relu'))
model.add(Convolution2D(32, (3,3),activation='relu'))
model.add(Convolution2D(32, (3,3),activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(16, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(2, activation='softmax'))
from keras.optimizers import SGD
opt = SGD(lr=0.01)
model.compile(loss = "categorical_crossentropy", optimizer = opt, metrics=['accuracy'])
print(model.summary())
model.fit(X, y, batch_size = 16, epochs = 5, validation_split=.25)
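One caveat: categorical_crossentropy expects one-hot targets, so if y is a plain array of 0/1 labels you also need to encode it before calling fit (a minimal sketch, assuming y is the integer label array built above):
from keras.utils import to_categorical
y = to_categorical(y, num_classes=2)  # [0, 1, ...] -> [[1, 0], [0, 1], ...]
Alternatively, keep the single output neuron but give it a 'sigmoid' activation and stay with binary_crossentropy; the frozen training comes from applying softmax to one unit, not from the loss choice.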

Related

Setting the shape of tensorflow sequential model input layer

I'm trying to build a model for multi class classification, but I don't understand how to set the correct input shape. I have a training set with shape (5420, 212) and this is the model I built:
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape = (5420,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(X_train, y_train, epochs=20, batch_size=512)
When I run it I get the error:
ValueError: Input 0 of layer sequential_9 is incompatible with the layer: expected axis -1 of input shape to have value 5420 but received input with shape (None, 212)
Why? Isn't the input value correct?
The input shape should equal the length of the second dimension of the input X, while the output shape should equal the length of the second dimension of the output Y (assuming both X and Y are 2-dimensional, i.e. they don't have higher dimensions).
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import make_classification
from sklearn.preprocessing import OneHotEncoder
tf.random.set_seed(0)
# generate some data
X, y = make_classification(n_classes=5, n_samples=5420, n_features=212, n_informative=212, n_redundant=0, random_state=42)
print(X.shape, y.shape)
# (5420, 212) (5420,)
# one-hot encode the target
Y = OneHotEncoder(sparse=False).fit_transform(y.reshape(-1, 1))
print(X.shape, Y.shape)
# (5420, 212) (5420, 5)
# extract the input and output shapes
input_shape = X.shape[1]
output_shape = Y.shape[1]
print(input_shape, output_shape)
# 212 5
# define the model
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(input_shape,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(output_shape, activation='softmax'))
# compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# fit the model
history = model.fit(X, Y, epochs=3, batch_size=512)
# Epoch 1/3
# 11/11 [==============================] - 0s 1ms/step - loss: 4.8206 - accuracy: 0.2208
# Epoch 2/3
# 11/11 [==============================] - 0s 1ms/step - loss: 2.8060 - accuracy: 0.3229
# Epoch 3/3
# 11/11 [==============================] - 0s 1ms/step - loss: 2.0705 - accuracy: 0.3989
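A small follow-up: if you'd rather skip the one-hot encoding step, Keras also accepts the integer labels directly when you use the sparse variant of the loss (same model otherwise):
# sparse variant: y stays as integer class labels, no OneHotEncoder needed
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(X, y, epochs=3, batch_size=512)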

Neural network errors don't change

I am training a model using TensorFlow. I was getting weird results when looking at my model performance. I built two models to classify images, one using a CNN and the other using a traditional ANN. Below is the code setup for each of them.
#CNN model
model = Sequential()
model.add(Reshape((20, 60, 3)))
#model.add(Conv2D(128, (5, 5), (2, 2), activation='elu'))
#model.add(Conv2D(64, (4, 4), (2, 2), activation='elu'))
#model.add(Flatten())
#model.add(Dense(1, activation = 'elu'))
#model.add(Dense(25, activation = 'elu'))
#model.add(Dense(10, activation = 'elu'))
#model.add(Dense(1))
opt = keras.optimizers.RMSprop(lr=0.0009, decay=1e-6)
model.compile(Adam(lr = 0.0001), loss='mse', metrics = ['mae'])
history = model.fit(X_train, y_train, epochs = 20, validation_data=(X_val, y_val), batch_size= 32)
#ANN model
model = Sequential()
model.add(Reshape((20, 60, 3)))
#model.add(Flatten())
#model.add(Dense(10, activation = 'elu'))
#model.add(Dense(1))
opt = keras.optimizers.RMSprop(lr=0.0009, decay=1e-6)
model.compile(Adam(lr = 0.0001), loss='mse', metrics = ['mae'])
history = model.fit(X_train, y_train, epochs = 20, validation_data=(X_val, y_val), batch_size= 32)
However, the problem is that I am getting nearly identical loss and mean absolute error metrics with both of these models, when I am expecting the MAE to be MUCH higher for the 2nd model. Does anyone know why this is happening? Could it be something wrong with my input data?
P.S. This network is trying to do regression to predict the steering angle for a self-driving RC car from an image.
EDIT:
Here is the ending error with the CNN:
Epoch 20/20
113/113 [==============================] - 1s 5ms/step - loss: 0.0382 - mae: 0.1582 - val_loss: 0.0454 - val_mae: 0.1727
dict_keys(['loss', 'mae', 'val_loss', 'val_mae'])
Here is the ending error with the ANN:
Epoch 20/20
113/113 [==============================] - 0s 3ms/step - loss: 0.0789 - mae: 0.2187 - val_loss: 0.0854 - val_mae: 0.2300
dict_keys(['loss', 'mae', 'val_loss', 'val_mae'])
I think the issue comes from your training data; try another dataset and check whether the results change.
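A quick way to check that (a sketch, assuming y_train and y_val hold the steering angles passed to fit above): compare both models against a constant-mean baseline. If predicting the mean of y_train gives a similar MAE, neither model is extracting signal from the images.
import numpy as np
y_val_arr = np.asarray(y_val, dtype=float)
baseline = float(np.mean(np.asarray(y_train, dtype=float)))  # constant prediction: the mean steering angle
print("constant-mean baseline MAE:", np.mean(np.abs(y_val_arr - baseline)))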

Using CNN to get value from dartboard where dart landed

I want to use a CNN in Python to read values from a dartboard (the value of the field where the dart landed) from pictures.
I took 208 photos of a dartboard; in each one the dart is in a specific location. I want to predict which field the dart in the next image landed in (the 208 pictures represent 4 classes, 52 each; single, double and triple of the same field represent the same number, i.e. the same class).
[image: sample dart in a field]
Then I use similar pictures to test the model.
When I try to fit the model I get something like this:
208/208 [==============================] - 3s 15ms/sample - loss: 0.0010 - accuracy: 1.0000 - val_loss: 8.1726 - val_accuracy: 0.2500
Epoch 29/100
208/208 [==============================] - 3s 15ms/sample - loss: 9.8222e-04 - accuracy: 1.0000 - val_loss: 8.6713 - val_accuracy: 0.2500
Epoch 30/100
208/208 [==============================] - 3s 15ms/sample - loss: 8.5902e-04 - accuracy: 1.0000 - val_loss: 9.2214 - val_accuracy: 0.2500
Epoch 31/100
208/208 [==============================] - 3s 15ms/sample - loss: 7.9463e-04 - accuracy: 1.0000 - val_loss: 9.6584 - val_accuracy: 0.2500
As the accuracy hits 1.0, the val_accuracy stays the same. Some previous models got me a slightly better result, but only slightly.
As I am new to the field, I need some advice to improve my model or the whole program.
Here is my current model:
model = Sequential()
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(640, 480, 3)))
model.add(MaxPooling2D(2, 2))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2, 2))
model.add(Conv2D(128, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2, 2))
model.add(Conv2D(256, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(2, 2))
model.add(Flatten())
model.add(Dense(512, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(X, y, batch_size=16, epochs=100, validation_data=(Xtest,ytest))
And here is my sample program:
training_data = []
DATADIR = 'C:/PikadaNew'
dir = sorted(os.listdir(DATADIR), key=len)
def create_training_data():
    for category in dir:  # iterate over the class folders
        path = os.path.join(DATADIR, category)
        class_num = dir.index(category)
        for img in tqdm(os.listdir(path)):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                training_data.append([img_array, class_num])
            except Exception as e:
                pass

create_training_data()
DATATESTDIR = 'C:/PikadaNewTest'
dir1 = sorted(os.listdir(DATATESTDIR), key=len)
test_data = []
def create_test_data():
    for category in dir1:
        path = os.path.join(DATATESTDIR, category)
        class_num = dir1.index(category)
        for img in tqdm(os.listdir(path)):
            try:
                img_array = cv2.imread(os.path.join(path, img))  # convert to array
                test_data.append([img_array, class_num])
            except Exception as e:
                pass

create_test_data()
#print(len(training_data))
#print(len(test_data))
X = []
y = []
Xtest = []
ytest = []
for features, label in training_data:
    X.append(features)
    y.append(label)
for features, label in test_data:
    Xtest.append(features)
    ytest.append(label)
X = np.array(X).reshape(-1, 640, 480, 3)
Xtest= np.array(Xtest).reshape(-1, 640, 480, 3)
y = np.array(y)
ytest = np.array(ytest)
y = to_categorical(y)
ytest = to_categorical(ytest)
X = X/255.0
Xtest = Xtest/255.0
X,y = shuffle(X,y)
Xtest,ytest = shuffle(Xtest,ytest)
Thanks, and sorry for any mistakes; I hope it's understandable what I want to achieve.
Any advice is much appreciated.
Samo
You are facing an overfitting problem: your dataset is very small and the model is more complex than needed. You can try the following (see the sketch after this list):
Add more data if you can.
Try to simplify the model by removing some layers.
Add dropout to the model and use regularizers.
Use a smaller number of epochs.
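A minimal sketch of points 2 and 3, reusing the input shape from the question; the dropout rate and l2 factor here are assumptions to tune, not recommendations:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.regularizers import l2

model = Sequential()
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(640, 480, 3)))
model.add(MaxPooling2D(2, 2))
model.add(Conv2D(64, kernel_size=3, activation='relu', kernel_regularizer=l2(1e-4)))
model.add(MaxPooling2D(2, 2))
model.add(Flatten())
model.add(Dropout(0.5))  # drop half the activations to reduce co-adaptation
model.add(Dense(128, activation='relu', kernel_regularizer=l2(1e-4)))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])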

Classification model produces extremely low test accuracy, although training and validation accuracies are good for multiclass classification

I'm trying to do alphabet classification for American Sign Language, so it's a multiclass classification task with 26 classes. My CNN model gave 84% training accuracy and 91% validation accuracy, yet the test accuracy is hilariously low: only 7.7%!
I used ImageDataGenerator to produce training and validation data:
datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=0.2,
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2)
img_height = img_width = 256
batch_size = 16
source = '/home/hp/asl_detection/train'
train_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='training',  # set as training data
    color_mode='grayscale',
    seed=42,
)
validation_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='validation',  # set as validation data
    color_mode='grayscale',
    seed=42,
)
This is my model code:
img_rows = 256
img_cols = 256

def get_net():
    inputs = Input((img_rows, img_cols, 1))
    print("inputs shape:", inputs.shape)
    # Convolution layers
    conv1 = Conv2D(24, 3, strides=(2, 2), activation='relu', padding='same', kernel_initializer='he_normal')(inputs)
    print("conv1 shape:", conv1.shape)
    conv2 = Conv2D(24, 3, strides=(2, 2), activation='relu', padding='same', kernel_initializer='he_normal')(conv1)
    print("conv2 shape:", conv2.shape)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv2)
    print("pool1 shape:", pool1.shape)
    drop1 = Dropout(0.25)(pool1)
    conv3 = Conv2D(36, 3, strides=(2, 2), activation='relu', padding='same', kernel_initializer='he_normal')(drop1)
    print("conv3 shape:", conv3.shape)
    conv4 = Conv2D(36, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv3)
    print("conv4 shape:", conv4.shape)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv4)
    print("pool2 shape:", pool2.shape)
    drop2 = Dropout(0.25)(pool2)
    conv5 = Conv2D(48, 3, activation='relu', padding='same', kernel_initializer='he_normal')(drop2)
    print("conv5 shape:", conv5.shape)
    conv6 = Conv2D(48, 3, activation='relu', padding='same', kernel_initializer='he_normal')(conv5)
    print("conv6 shape:", conv6.shape)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv6)
    print("pool3 shape:", pool3.shape)
    drop3 = Dropout(0.25)(pool3)
    # Flattening
    flat = Flatten()(drop3)
    # Fully connected layers
    dense1 = Dense(128, activation='relu', use_bias=True, kernel_initializer='he_normal')(flat)
    print("dense1 shape:", dense1.shape)
    drop4 = Dropout(0.5)(dense1)
    dense2 = Dense(128, activation='relu', use_bias=True, kernel_initializer='he_normal')(drop4)
    print("dense2 shape:", dense2.shape)
    drop5 = Dropout(0.5)(dense2)
    dense4 = Dense(26, activation='softmax', use_bias=True, kernel_initializer='he_normal')(drop5)
    print("dense4 shape:", dense4.shape)
    model = Model(input=inputs, output=dense4)
    optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=0.00000001, decay=0.0)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
This is the training code:
def train():
    model = get_net()
    print("got model")
    model.summary()
    model_checkpoint = ModelCheckpoint('seqnet.hdf5', monitor='loss', verbose=1, save_best_only=True)
    print('Fitting model...')
    history = model.fit_generator(
        train_generator,
        steps_per_epoch=train_generator.samples // batch_size,
        validation_data=validation_generator,
        validation_steps=validation_generator.samples // batch_size,
        epochs=100)
    # list all data in history
    print(history.history.keys())
    # summarize history for accuracy
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
    # summarize history for loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
    return model

model = train()
This is the training log for the last few epochs:
Epoch 95/100
72/72 [==============================] - 74s 1s/step - loss: 0.4326 - acc: 0.8523 - val_loss: 0.2198 - val_acc: 0.9118
Epoch 96/100
72/72 [==============================] - 89s 1s/step - loss: 0.4591 - acc: 0.8418 - val_loss: 0.1944 - val_acc: 0.9412
Epoch 97/100
72/72 [==============================] - 90s 1s/step - loss: 0.4387 - acc: 0.8533 - val_loss: 0.2802 - val_acc: 0.8971
Epoch 98/100
72/72 [==============================] - 106s 1s/step - loss: 0.4680 - acc: 0.8349 - val_loss: 0.2206 - val_acc: 0.9228
Epoch 99/100
72/72 [==============================] - 85s 1s/step - loss: 0.4459 - acc: 0.8427 - val_loss: 0.2861 - val_acc: 0.9081
Epoch 100/100
72/72 [==============================] - 74s 1s/step - loss: 0.4639 - acc: 0.8472 - val_loss: 0.2866 - val_acc: 0.9191
dict_keys(['val_loss', 'loss', 'acc', 'val_acc'])
These are the curves for model accuracy and loss: [plots not shown]
Unlike the training and validation data, I didn't use ImageDataGenerator to prepare the test data. For the test data, I used OpenCV to convert the images to grayscale and then normalized them. In the same loop I generated the corresponding label for each image, to prevent any order mismatch, and saved the image file names and labels in a csv file. Here's the code:
source = '/home/hp/asl_detection/test/unknown'
files = os.listdir(source)
test_data = []
rows = []
for file in files:
    row = []
    row.append(file)
    row.append(file[6])
    print(file)
    row.append(ord(file[6]) - 97)
    rows.append(row)
    img = cv2.imread(os.path.join(source, file))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img, (256, 256))
    test_data.append(img)
test_data = np.array(test_data, dtype="float") / 255.0
print(test_data)
print(test_data.shape)
with open("/home/hp/asl_detection/test/alpha_class.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
Here are a few rows of the csv: [table not shown]
Further I reshaped the test image array to give channel information:
test_data = test_data.reshape((test_data.shape[0], img_rows, img_cols, 1))
Finally predicted classes and calculated accuracy on test data by fetching labels from csv:
y_proba = model.predict(test_data)
y_classes = y_proba.argmax(axis=-1)
data = pd.read_csv('/home/hp/asl_detection/test/alpha_class.csv', header=None)
original_classes = data.iloc[:, 2]
original_classes = original_classes.tolist()
y_classes = y_classes.tolist()
acc = accuracy_score(original_classes, y_classes) * 100
Could you please help me find the reason behind such a low test accuracy? If any further information is needed, please let me know.
I think you are facing an overfitting problem, and the validation set is misleading you. For the validation not to be misleading, it has to have the same distribution as the test set, so try to generate the test and validation sets from the same distribution; also, don't apply data augmentation to the validation set (see the sketch below).
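One way to do the latter (a sketch reusing the parameters from the question): keep the augmentation in a generator used only for the training subset, and build a second, rescale-only generator for the validation subset, so validation passes through the same plain preprocessing as the OpenCV test pipeline. With the same directory, validation_split, and seed, flow_from_directory splits the files the same way in both generators.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=0.2,
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2)
val_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)  # no augmentation
train_generator = train_datagen.flow_from_directory(
    source, target_size=(256, 256), batch_size=16, shuffle=True,
    class_mode='categorical', subset='training', color_mode='grayscale', seed=42)
validation_generator = val_datagen.flow_from_directory(
    source, target_size=(256, 256), batch_size=16, shuffle=True,
    class_mode='categorical', subset='validation', color_mode='grayscale', seed=42)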

val_acc doesn't seem to be improving, so my model is never saved to my file path. What am I doing wrong?

I've been playing around with some deep learning code I found on GitHub, incorporating my own dataset. My code so far:
from keras.layers import Input, Dense, concatenate, Activation, Embedding, Conv1D, GlobalMaxPooling1D, Dropout
from keras.models import Model
tweet_input = Input(shape=(45,), dtype='int32')
tweet_encoder = Embedding(100000, 200, weights=[embedding_matrix], input_length=45, trainable=True)(tweet_input)
bigram_branch = Conv1D(filters=100, kernel_size=2, padding='valid', activation='relu', strides=1)(tweet_encoder)
bigram_branch = GlobalMaxPooling1D()(bigram_branch)
trigram_branch = Conv1D(filters=100, kernel_size=3, padding='valid', activation='relu', strides=1)(tweet_encoder)
trigram_branch = GlobalMaxPooling1D()(trigram_branch)
fourgram_branch = Conv1D(filters=100, kernel_size=4, padding='valid', activation='relu', strides=1)(tweet_encoder)
fourgram_branch = GlobalMaxPooling1D()(fourgram_branch)
merged = concatenate([bigram_branch, trigram_branch, fourgram_branch], axis=1)
merged = Dense(256, activation='relu')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(1)(merged)
output = Activation('sigmoid')(merged)
model = Model(inputs=[tweet_input], outputs=[output])
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()
from keras.callbacks import ModelCheckpoint
filepath = "CNN.{epoch:02d}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
model.fit(x_train_seq, y_train, batch_size=32, epochs=5,
          validation_data=(x_val_seq, y_validation), callbacks=[checkpoint])
Anyway, the results I got looked a bit like this:
Train on 19926 samples, validate on 203 samples
Epoch 1/5
11840/19926 [================>.............] - ETA: 1:19 - loss: 0.0385 - accuracy: 0.97 ...
(the in-place progress-bar updates repeat like this for ages)
<keras.callbacks.callbacks.History at 0x28e6065bb08>
My val_acc doesn't seem to be improving, so nothing gets saved to my file path and I can't continue to load the model. What I want to do next is:
from keras.models import load_model
loaded_CNN_model = load_model(filepath)
loaded_CNN_model.evaluate(x=x_val_seq, y=y_validation)
Honestly, I have been working from someone else's code to see how well I understand what's going on, but I'm completely stuck. I have no idea what I'm doing wrong, as val_acc doesn't seem to be improving.
One thing worth checking first: your training log prints the metric as accuracy, so depending on your Keras version the validation metric may be logged as val_accuracy rather than val_acc, in which case the checkpoint never finds the quantity it monitors and never saves. Apart from that, if you want to save your model for future use you can serialize it to JSON:
# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")
For importing your model, you can use the following code:
# Import the following method
from keras.models import model_from_json
# load json and create model
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
# evaluate loaded model on test data
loaded_model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
score = loaded_model.evaluate(X, Y, verbose=0)
print("%s: %.2f%%" % (loaded_model.metrics_names[1], score[1]*100))
source : https://machinelearningmastery.com/save-load-keras-deep-learning-models/
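As a side note (plain Keras, not from the linked article): model.save stores the architecture, weights, and optimizer state in a single HDF5 file, which is often simpler than keeping the JSON and weights files in sync:
model.save("model_full.h5")  # one file holds architecture + weights + optimizer state
from keras.models import load_model
loaded_model = load_model("model_full.h5")  # no separate re-compile step needed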
