Different model performance in MLP and CNN - python

I'm experimenting with geometric shape classification. My datasets are 100x100 px thresholded black and white images of squares, circles and triangles in total 3000 and 1000 for each shape. They look like these:
But I got them as a csv file, where each row is the one dimensional representation of the image and last column is label.
I used MLP from sklearn to make a classifier. It performed well. Almost 99%.
df = pd.read_csv("img_data.csv", sep=";")
df = df.sample(frac=1) # shuffling the whole dataset
X = df.drop('label', axis=1) # Because 'label' is the column of label
y = df['label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
clf = MLPClassifier(solver='adam', activation="relu",alpha=1e- 5,hidden_layer_sizes=(1000,), random_state=1, verbose=True)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print('accuracy',accuracy_score(y_test, y_pred))
Then I wanted to try with CNN. For that I used keras with tensorflow backend. But accuracy here couldn't cross above 92% even after 20 epochs. Here's my code:
df = pd.read_csv("img_data.csv", sep=";")
df = df.sample(frac=1) # shuffling the whole dataset
X = df.drop('label', axis=1) # Because 'label' is the column of label
y = df['label']
X=X.as_matrix()
X = np.reshape(X, (-1, 100, 100, 1)) #made 1d to 2d
a = list(y)
label_binarizer = sklearn.preprocessing.LabelBinarizer()
label_binarizer.fit(range(max(a)))
y = label_binarizer.transform(a) # encoding one hot for labels
X_train, X_test, y_train, y_test = train_test_split(all_images, y, test_size=0.20)
model = Sequential()
model.add(Conv2D(32, 3, activation='relu', input_shape=[100, 100, 1]))
model.add(MaxPool2D())
model.add(BatchNormalization())
model.add(Conv2D(64, 3, activation='relu'))
model.add(MaxPool2D())
model.add(BatchNormalization())
model.add(Conv2D(128, 3, activation='relu'))
model.add(MaxPool2D())
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
epochs = 20
model.fit(X_train, y_train,
validation_data=(X_test, y_test),
epochs=epochs, batch_size=64, verbose=1)

This seems to be a very simple problem. There is very little structure inside the data, so I think you could try to reduce the depth of the neural network by removing the last two convolution and max pooling layers. Instead increase the number of nodes in the fully-connected layer, like this:
model = Sequential()
model.add(Conv2D(32, 3, activation='relu', input_shape=[100, 100, 1]))
model.add(MaxPool2D())
model.add(BatchNormalization())
model.add(Conv2D(64, 3, activation='relu'))
model.add(MaxPool2D())
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(3, activation='softmax'))
You could also try to use some image augmentation techniques like shifting and rotating to increase your dataset. Then I expect the convnet to outperform the standard mlp.
Best

Related

High accuracy in mode.fit but low precision and recall

I've been training a CNN with keras. A binary classificator where it says if a depth image has a manhole or not. I've manually labeled the datasets with 0 (no manhole) and 1 (it has a manhole). I have 2 datasets 1 with 45k images to train the CNN and one with 26k images to test the CNN.
Both datasets are unbalanced double of negatives images than positives.
This is the code:
# dimensions of our images.
img_width, img_height = 80, 60
n_positives_img, n_negatives_img = 17874, 26308
n_total_img = 44182
#Labeled arrays for datasets
arrayceros = np.zeros(n_negatives_img)
arrayunos = np.ones(n_positives_img)
#Reshaping of datasets to convert separate them
arraynegativos= ds_negatives.reshape(( n_negatives_img, img_height, img_width,1))
arraypositivos= ds_positives.reshape((n_positives_img, img_height, img_width,1))
#Labeling datasets with the arrays
ds_negatives_target = tf.data.Dataset.from_tensor_slices((arraynegativos, arrayceros))
ds_positives_target = tf.data.Dataset.from_tensor_slices((arraypositivos, arrayunos))
#Concatenate 2 datasets and shuffle them
ds_concatenate = ds_negatives_target.concatenate(ds_positives_target)
datasetfinal = ds_concatenate.shuffle(n_total_img)
Then I have the same for the second dataset for testing.
#Adding batch dimension to datasets 4dim
valid_ds = datasetfinal2.batch(12)
train_ds = datasetfinal.batch(12)
#Defining model
model = Sequential()
model.add(Conv2D(5, kernel_size=(5, 5),activation='relu',input_shape=(60,80,1),padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D((5, 5),padding='same'))
model.add(Dropout(0.3))
model.add(Conv2D(5, (5, 5), activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Dropout(0.3))
model.add(Conv2D(5, (5, 5), activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Dropout(0.3))
model.add(Conv2D(5, (5, 5), activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(50, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
#Compiling model
model.summary()
initial_learning_rate = 0.001
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
initial_learning_rate, decay_steps=100000, decay_rate=0.96, staircase=True
)
model.compile(
loss="binary_crossentropy",
optimizer=keras.optimizers.Adam(learning_rate=lr_schedule),
metrics=["acc"],
)
# Define callbacks.
checkpoint_cb = keras.callbacks.ModelCheckpoint(
"2d_image_classification.h5", save_best_only=True
)
early_stopping_cb = keras.callbacks.EarlyStopping(monitor="val_acc", patience=15)
#Fitting the model
history= model.fit(train_ds, validation_data=valid_ds, batch_size=100, epochs=5,callbacks=[checkpoint_cb, early_stopping_cb])
This gives me 99% of acc in train dataset and 95% in test dataset.
But when i do this it gives me 60% precision for negatives images and 45% for positives:
#Get the real labels of valid dataset
valid_labels = list(valid_ds.flat_map(lambda x, y: tf.data.Dataset.from_tensor_slices((x, y))).as_numpy_iterator())
valid_labels = [y for x, y in valid_labels]
y_pred = model.predict(valid_ds)
y_pred = (y_pred > 0.5).astype(float)
from sklearn.metrics import classification_report
print(classification_report(valid_labels, y_pred))
Why this? I have printed both predicted labels and true labels and it look likes its random. It has no sense.
https://colab.research.google.com/drive/1bhrntDItqoeT0KLb-aKp0W8cV6LOQOtP?usp=sharing
If u need more information, just ask me.
Thanks!!!!

How to predict from new data set?

I have created a model to predict emotion from a voice sample, the model is made from the code below:
there are total 8 emotions:
neutral, calm, happy, sad, angry, disgust, surprised
i first extracted the features of each and every voice sample and put them in a dataframe, then loaded
them one by one to both X and (labels to Y) then split the data as shown below:
x_train, x_test, y_train, y_test = train_test_split(X, Y, random_state=0, shuffle=True)
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
x_train = np.expand_dims(x_train, axis=2)
x_test = np.expand_dims(x_test, axis=2)
model=Sequential()
model.add(Conv1D(256, kernel_size=5, strides=1, padding='same', activation='relu', input_shape=(x_train.shape[1], 1)))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Conv1D(256, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Conv1D(128, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Dropout(0.2))
model.add(Conv1D(64, kernel_size=5, strides=1, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=5, strides = 2, padding = 'same'))
model.add(Flatten())
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(units=8, activation='softmax'))
model.compile(optimizer = 'adam' , loss = 'categorical_crossentropy' , metrics = ['accuracy'])
model.summary()
rlrp = ReduceLROnPlateau(monitor='loss', factor=0.4, verbose=0, patience=2, min_lr=0.0000001)
history=model.fit(x_train, y_train, batch_size=64, epochs=75, validation_data=(x_test, y_test), callbacks=[rlrp])
got total 89% accuracy
Now i want to predict with a new dataset. What do i need to do?
If new_data_x_test and new_data_y_true are your new data set, then all you need to do after training the model as follows:
scaler = StandardScaler()
new_data_x_test = scaler.transform(new_data_x_test )
new_data_x_test= np.expand_dims(new_data_x_test, axis=2)
model.load_weight(h5)
new_data_y_pred = model.predict(new_data_x_test )
The thing is, you should transform it according to the model requirements. Next, evaluate on new_data_y_true and new_data_y_pred with appropriate evaluation metrics.

Replace MLP with CNN

I have built up a NN with following architecture:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)
print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
(1901, 456, 3) (476, 456, 3) (1901, 3, 3) (476, 3, 3)
model = Sequential()
model.add(Flatten(input_shape=(456,3)))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(3 * 3))
model.add(Reshape((3, 3)))
model.compile('adam', 'mse')
history = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=100)
Now I want to replace this architecture with a analogue CNN which does the same; but when trying to implement this I always get problems with the dimensions of the different layers. And my error is always like this
ValueError: Error when checking input: expected conv2d_3_input to have 4 dimensions, but got array with shape (x, x, x)
the dataset remains the same, just the NN architecture changes and this is my first approach:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=(1901,456,3)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
Can someone help me out to replace my first NN into a CNN?
Your network is well defined, the error you're getting is during the fit operation. And why is that the case.
Well Conv2D is looking for data with 4D shape as you can see here : doc
X_train shape must then be (samples, channels, rows, cols)
When you gave input_shape=(1901,456,3), you didn't have to specify the number of samples.
But during the fit operation you need to have a data shaped as (samples, channels, rows, cols) .
And now you see that you have a problem. Why is your X_train shaped like that, it seems that you only have one image. You can feed it by reshaping it using :
X_train = X_train.reshape((1, 1901, 456, 3))
But that seems odd, you're only feeding one image to your network.
Edit : after clarification on the comments, conv1D will be better in this type of case, here is how to do it:
model = Sequential()
model.add(Conv1D(32, kernel_size=3,
activation='relu',
input_shape=(456,3)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3 * 3, activation='softmax'))
model.add(Reshape((3, 3))
now everything worked with the architecture and there is also no problem when compiling the NN;
batch_size = 128
epochs = 12
model.compile(
optimizer='rmsprop',
loss=tf.keras.losses.MeanSquaredError(),
metrics=['mse'],
)
model.fit(X_test, Y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
but when trying to fit I get following error :
ValueError: Input arrays should have the same number of samples as target
arrays. Found 476 input samples and 1901 target samples.
what am I missing here?

Audio processing Conv1D keras

I am learning Keras using audio classification, Actually, I am implementing the code with modification from https://github.com/deepsound-project/genre-recognition/blob/master/train_model.py using Keras.
The shape of the dataset is
X_train shape = (800, 32, 1)
y_train shape = (800, 10)
X_test shape = (200, 32, 1)
y_test shape = (200, 10)
The model
model = Sequential()
model.add(Conv1D(filters=256, kernel_size=5, input_shape=(32,1), activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Conv1D(filters=256, kernel_size=5, activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation="relu", ))
model.add(Dense(10, activation='softmax'))
model.compile(
loss='categorical_crossentropy',
optimizer = Adam(lr=0.001),
metrics = ['accuracy'],
)
model.summary()
red_lr= ReduceLROnPlateau(monitor='val_loss',patience=2,verbose=2,factor=0.5,min_delta=0.01)
check=ModelCheckpoint(filepath=r'/content/drive/My Drive/Colab Notebooks/gen/cnn.hdf5', verbose=1, save_best_only = True)
History = model.fit(X_train,
y_train,
epochs=100,
#batch_size=512,
validation_data = (X_test, y_test),
verbose = 2,
callbacks=[check, red_lr],
shuffle=True )
The accuracy graph
Loss graph
I do not understand, Why the val_acc is in the range of 70%. I tried to modify the model architecture including optimizer, but no improvement.
And, Is it good to have a lot of difference between loss and val_loss.
how to improve the accuracy above 80... any help...
Thank you
I found it, I use concatenate function from Keras to concatenate all convolution layers and, it gives the best performance.

Keras - Train convolution network, get auto-encoder output

What I want to do:
I want to train a convolutional neural network on the cifar10 dataset on just two classes. Then once I get my fitted model, I want to take all of the layers and reproduce the input image. So I want to get an image back from the network instead of a classification.
What I have done so far:
def copy_freeze_model(model, nlayers = 1):
new_model = Sequential()
for l in model.layers[:nlayers]:
l.trainable = False
new_model.add(l)
return new_model
numClasses = 2
(X_train, Y_train, X_test, Y_test) = load_data(numClasses)
#Part 1
rms = RMSprop()
model = Sequential()
#input shape: channels, rows, columns
model.add(Convolution2D(32, 3, 3, border_mode='same',
input_shape=(3, 32, 32)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation("relu"))
model.add(Dropout(0.5))
#output layer
model.add(Dense(numClasses))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=rms,metrics=["accuracy"])
model.fit(X_train,Y_train, batch_size=32, nb_epoch=25,
verbose=1, validation_split=0.2,
callbacks=[EarlyStopping(monitor='val_loss', patience=2)])
print('Classifcation rate %02.3f' % model.evaluate(X_test, Y_test)[1])
##pull the layers and try to get an output from the network that is image.
newModel = copy_freeze_model(model, nlayers = 8)
newModel.add(Dense(1024))
newModel.compile(loss='mean_squared_error', optimizer=rms,metrics=["accuracy"])
newModel.fit(X_train,X_train, batch_size=32, nb_epoch=25,
verbose=1, validation_split=0.2,
callbacks=[EarlyStopping(monitor='val_loss', patience=2)])
preds = newModel.predict(X_test)
Also when I do:
input_shape=(3, 32, 32)
Does this means a 3 channel (RGB) 32 x 32 image?
What I suggest you is a stacked convolutional autoencoder. This makes unpooling layers and deconvolution compulsory. Here you can find the general idea and code in Theano (on which Keras is built):
https://swarbrickjones.wordpress.com/2015/04/29/convolutional-autoencoders-in-pythontheanolasagne/
An example definition of layers needed can be found here :
https://github.com/fchollet/keras/issues/378

Categories