Notebook Implementation: https://colab.research.google.com/drive/1MoSnUlnUyWo5A15gEuFPEwNCfFl62YcW?usp=sharing
So I've been debugging a CNN model on classifying people based on ECG and I just keep getting really high accuracy from first epoch.
Background
The data is sourced from Physionet MIT-BIH, I only extracted normal beats for each individual, particularly control classes. I have segmented and converted the signals into images.
I experimented with both types of image inputs:
Normal representation VS Time series recurrent representation
I have 5 classes, each with -+2800 samples (definitely sufficient), meaning 13806 total samples. Also no class imbalance. No need for augmentation because the signals are already long and all beats really slightly appear different.
Training
Training (9664, 256, 256, 3)
Validation (3727, 256, 256, 3)
Test (415, 256, 256, 3)
My data is shuffled, in np.array() format, and normalized to 0-1. I'm using a LabelBinarizer() for classes.
Network
def block(model, fs, c):
for _ in range(c):
model.add(Conv2D(filters=fs, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.25))
return model
# Model
model = Sequential()
model.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation='relu', input_shape=IMAGE_DIMS))
model = block(model, 64, 1)
model = block(model, 128, 2)
model = block(model, 256, 3)
# Fully Connected Layer
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
# softmax classifier
model.add(Dense(len(lb.classes_), activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
STEPS_PER_EPOCH = len(x_train) // BS
VAL_STEPS_PER_EPOCH = len(x_valid) // BS
# train the network
H = model.fit(x_train, y_train, batch_size=BS,
validation_data=(x_valid, y_valid),
steps_per_epoch=STEPS_PER_EPOCH,
validation_steps=VAL_STEPS_PER_EPOCH,
epochs=EPOCHS, verbose=1)
History
Just for 10 epochs??
Related
My cnn model is not performing well on my test set. I have trained the images on dark and white background, the image is cropped to eliminate other objects in the picture. My goal is to determine the position a person is facing on the bed.
ImageDataGenerator was used for splitting and augmenting the data.The dataset for training contains 4800 images while the validation has 1500 images.
I have 3 classes:
Facing upward
Facing left
Facing Right
The testing results gives me an accuracy of below 50% while the loss is 1.0 and above. This was evaluated using the model.evaluate
INPUT_SHAPE = (250,150,1)
traindata = ImageDataGenerator(rescale=1./255, shear_range=0.2,width_shift_range=0.1, height_shift_range=0.1, zoom_range=0.2,rotation_range=45, horizontal_flip=False, vertical_flip=False, brightness_range=[0.3,2.0])
valdata = ImageDataGenerator(rescale=1./255)
training_set = traindata.flow_from_directory(TRAIN_DIR, target_size=INPUT_SHAPE[:-1],
shuffle=True,batch_size=BATCH_SIZE, color_mode='grayscale',
class_mode='categorical')
validation_set = valdata.flow_from_directory(VAL_DIR, target_size=INPUT_SHAPE[:-1],
shuffle=False,batch_size=BATCH_SIZE, color_mode='grayscale',
class_mode='categorical')
This is the code for the model:
model = Sequential()
model.add(Conv2D(64, (3,3), activation='relu', padding='same', input_shape=INPUT_SHAPE))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(MaxPooling2D((2,2),strides=1))
model.add(Dropout(0.5))
model.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model.add(MaxPooling2D((2,2),strides=1))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
# model.add(Dense(512, activation="relu"))
# model.add(Dropout(0.5))
model.add(Dense(units=3, activation="softmax"))
model.compile(optimizer=Adam(lr=0.001),loss='categorical_crossentropy',metrics=['accuracy'])
history = model.fit(training_set,
epochs = 100,
validation_data = validation_set,
callbacks=[tensorboard, earlyStop]
)
P.S. I have tried most of the solutions that I searched online. Posting here was my last resort since I really can't fix this problem. I am not allowed to use pretrained models.
different combination of neural network
adding batchnormalization and regularization
changing image size
increasing the data count
different optimizers with different learning rate
You have overfitting problem, try to balance the images between the test and train data and have more layers in the model because it's and reduce dropout value.
one more thing is you could try pretrained model on the same split you have now to check out the data integrity.
I am training a CNN for image classification. Specifically, I am trying to create a lip reader that is able to classify an image of a segmented mouth with its associated phoneme. The images have a dimension of 64x64 and are flattened into a 1D array of length 4096. I have inserted the code for my current model below with its performance graphs and metrics. Does anyone have any advice for how I can continue to modify this model in order to raise the accuracy?
df = pd.read_csv("/kaggle/input/labeled-frames-resized/labeled_frames.csv", error_bad_lines=False)
labelencoder = LabelEncoder()
df['Phoneme'] = labelencoder.fit_transform(df['Phoneme'])
labels = np.asarray(df[['Phoneme']].copy())
df = df.drop(df.columns[0], axis = 1)
X_train, X_test, y_train, y_test = train_test_split(df, labels, random_state = 42, test_size = 0.2, stratify = labels)
X_train = tf.reshape(X_train, (8113, 4096, 1))
X_test = tf.reshape(X_test, (2029, 4096, 1))
model = Sequential()
model.add(Conv1D(filters= 128, kernel_size=3, activation ='relu',strides = 2, padding = 'valid', input_shape= (4096, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(filters= 128, kernel_size=3, activation ='relu',strides = 2, padding = 'valid'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(filters= 128, kernel_size=3, activation ='relu',strides = 2, padding = 'valid'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.2))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(filters= 128, kernel_size=3, activation ='relu',strides = 2, padding = 'valid'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.2))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(39))
model.add(Activation('softmax'))
optimizer = keras.optimizers.Adam(lr=0.4)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train,y_train, epochs = 500, batch_size = 2048, validation_data = (X_test, y_test), shuffle = True)
You can easily convert it into 2D Convolution:
model.add(Conv2D(filters= 128, kernel_size=(3,3), activation ='relu',strides = (2,2),
padding = 'valid', input_shape= (64,64,1)))
model.add(MaxPooling2D(pool_size=(2,2))
...
model.add(Flatten())
model.add(Dense(39))
model.add(Activation('softmax'))
I've only worked with Conv1d so far because it seemed easier.
Can 1D Convolution be used on images?
Yes you can, but not recommended, unless you have a very specific case and know what you are doing. Assume your images as 1024x1024, what happens when you flatten them? The information that you extract with 2D Convolutions is more than 1D Convolutions.
Explanation:
You can use 1D convolution on images indeed, but not in every situation. (I might be wrong) When you flatten them, then every pixel will be a feature. If we wanted every pixel to be a feature, then we could use normal Dense layers after flattening also. But there would be a lot parameters to train. What I mean by this (total parameters size not included):
model= tf.keras.models.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(...)
...
])
When you flatten them you might break the spatial coherence of the images. Using 2D convolutions might gain you accuracy. What we do with 2D convolutions is we visit the image and see what we can extract as an important feature, with max or average pooling.
You will not be able catch that much information with 1D convolutions.
We can feed the pooled feature maps into Fully Connected Layers before making predictions.
I am learning Keras using audio classification, Actually, I am implementing the code with modification from https://github.com/deepsound-project/genre-recognition/blob/master/train_model.py using Keras.
The shape of the dataset is
X_train shape = (800, 32, 1)
y_train shape = (800, 10)
X_test shape = (200, 32, 1)
y_test shape = (200, 10)
The model
model = Sequential()
model.add(Conv1D(filters=256, kernel_size=5, input_shape=(32,1), activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Conv1D(filters=256, kernel_size=5, activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation="relu", ))
model.add(Dense(10, activation='softmax'))
model.compile(
loss='categorical_crossentropy',
optimizer = Adam(lr=0.001),
metrics = ['accuracy'],
)
model.summary()
red_lr= ReduceLROnPlateau(monitor='val_loss',patience=2,verbose=2,factor=0.5,min_delta=0.01)
check=ModelCheckpoint(filepath=r'/content/drive/My Drive/Colab Notebooks/gen/cnn.hdf5', verbose=1, save_best_only = True)
History = model.fit(X_train,
y_train,
epochs=100,
#batch_size=512,
validation_data = (X_test, y_test),
verbose = 2,
callbacks=[check, red_lr],
shuffle=True )
The accuracy graph
Loss graph
I do not understand, Why the val_acc is in the range of 70%. I tried to modify the model architecture including optimizer, but no improvement.
And, Is it good to have a lot of difference between loss and val_loss.
how to improve the accuracy above 80... any help...
Thank you
I found it, I use concatenate function from Keras to concatenate all convolution layers and, it gives the best performance.
I have started with Machine Learning recently, I am learning CNN, I planned to write an application for Car Damage severity detection, with the help of this Keras blog and this github repo.
This is how car data-set looks like:
F:\WORKSPACE\ML\CAR_DAMAGE_DETECTOR\DATASET\DATA3A
├───training (979 Images for all 3 categories of training set)
│ ├───01-minor
│ ├───02-moderate
│ └───03-severe
└───validation (171 Images for all 3 categories of validation set)
├───01-minor
├───02-moderate
└───03-severe
Following code gives me only 32% of accuracy.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
# dimensions of our images.
img_width, img_height = 150, 150
train_data_dir = 'dataset/data3a/training'
validation_data_dir = 'dataset/data3a/validation'
nb_train_samples = 979
nb_validation_samples = 171
epochs = 10
batch_size = 16
if K.image_data_format() == 'channels_first':
input_shape = (3, img_width, img_height)
else:
input_shape = (img_width, img_height, 3)
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
validation_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='binary')
model.fit_generator(
train_generator,
steps_per_epoch=nb_train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples // batch_size)
model.save_weights('first_try.h5')
I tried:
By increasing the epochs to 10, 20,50.
By increasing images in the dataset (all validation images added to training set).
By updating the filter size in the Conv2D layer
Tried to add couple of Conv2D layer, MaxPooling layers
Also tried with different optimizers such as adam, Sgd, etc
Also Tried by updating the filter strides to (1,1) and (5,5) instead of (3,3)
Also tried by updating the changing image dimensions to (256, 256), (64, 64) from (150, 150)
But no luck, every-time I'm getting accuracy up to 32% or less than that but not more.
Any idea what I'm missing.
As in the github repo we can see, it gives 72% accuracy for the same dataset (Training -979, Validation -171). Why its not working for me.
I tried his code from the github link on my machine but it hanged up while training the dataset(I waited for more than 8 hours), so changed the approach, but still no luck so far.
Here's the Pastebin containing output of my training epochs.
The issue is caused by a mis-match between the number of output classes (three) and your choice of final layer activation (sigmoid) and loss-function (binary cross entropy).
The sigmoid function 'squashes' real values into a value between [0, 1] but it is designed for binary (two class) problems only. For multiple classes you need to use something like the softmax function. Softmax is a generalised version of sigmoid (the two should be equivalent when you have two classes).
The loss value also needs to be updated to one that can handle multiple classes - categorical cross entropy will work in this case.
In terms of code, if you modify the model definition and compilation code to the version below it should work.
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(3))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
Finally you need to specify class_mode='categorical' in your data generators. That will ensure that the output targets are formatted as a categorical 3-column matrix that has a one in the column corresponding to the correct value and zeroes elsewhere. This response format is needed by the categorical_cross_entropy loss function.
Minor correction:
model.add(Dense(1))
Should be:
model.add(Dense(3))
It has to comply with number of classes in the output.
What I want to do:
I want to train a convolutional neural network on the cifar10 dataset on just two classes. Then once I get my fitted model, I want to take all of the layers and reproduce the input image. So I want to get an image back from the network instead of a classification.
What I have done so far:
def copy_freeze_model(model, nlayers = 1):
new_model = Sequential()
for l in model.layers[:nlayers]:
l.trainable = False
new_model.add(l)
return new_model
numClasses = 2
(X_train, Y_train, X_test, Y_test) = load_data(numClasses)
#Part 1
rms = RMSprop()
model = Sequential()
#input shape: channels, rows, columns
model.add(Convolution2D(32, 3, 3, border_mode='same',
input_shape=(3, 32, 32)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation("relu"))
model.add(Dropout(0.5))
#output layer
model.add(Dense(numClasses))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=rms,metrics=["accuracy"])
model.fit(X_train,Y_train, batch_size=32, nb_epoch=25,
verbose=1, validation_split=0.2,
callbacks=[EarlyStopping(monitor='val_loss', patience=2)])
print('Classifcation rate %02.3f' % model.evaluate(X_test, Y_test)[1])
##pull the layers and try to get an output from the network that is image.
newModel = copy_freeze_model(model, nlayers = 8)
newModel.add(Dense(1024))
newModel.compile(loss='mean_squared_error', optimizer=rms,metrics=["accuracy"])
newModel.fit(X_train,X_train, batch_size=32, nb_epoch=25,
verbose=1, validation_split=0.2,
callbacks=[EarlyStopping(monitor='val_loss', patience=2)])
preds = newModel.predict(X_test)
Also when I do:
input_shape=(3, 32, 32)
Does this means a 3 channel (RGB) 32 x 32 image?
What I suggest you is a stacked convolutional autoencoder. This makes unpooling layers and deconvolution compulsory. Here you can find the general idea and code in Theano (on which Keras is built):
https://swarbrickjones.wordpress.com/2015/04/29/convolutional-autoencoders-in-pythontheanolasagne/
An example definition of layers needed can be found here :
https://github.com/fchollet/keras/issues/378