Why does a binary Keras CNN always predict 1? - python

I want to build a binary classifier using a Keras CNN.
I have about 6000 rows of input data which looks like this:
>>> print(X_train[0])
[[[-1.06405307 -1.06685851 -1.05989663 -1.06273152]
[-1.06295958 -1.06655996 -1.05969803 -1.06382503]
[-1.06415248 -1.06735609 -1.05999593 -1.06302975]
[-1.06295958 -1.06755513 -1.05949944 -1.06362621]
[-1.06355603 -1.06636092 -1.05959873 -1.06173742]
[-1.0619655 -1.06655996 -1.06039312 -1.06412326]
[-1.06415248 -1.06725658 -1.05940014 -1.06322857]
[-1.06345662 -1.06377347 -1.05890365 -1.06034568]
[-1.06027557 -1.06019084 -1.05592469 -1.05537518]
[-1.05550398 -1.06038988 -1.05225064 -1.05676692]]]
>>> print(y_train[0])
[1]
And then I've built a CNN this way:
model = Sequential()
model.add(Convolution1D(input_shape=(10, 4),
                        nb_filter=16,
                        filter_length=4,
                        border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Convolution1D(nb_filter=8,
                        filter_length=4,
                        border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dense(1))
model.add(Activation('softmax'))
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=30, min_lr=0.000001, verbose=0)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(X_train, y_train,
                    nb_epoch=100,
                    batch_size=128,
                    verbose=0,
                    validation_data=(X_test, y_test),
                    callbacks=[reduce_lr],
                    shuffle=True)
y_pred = model.predict(X_test)
But it returns the following:
>>> print(confusion_matrix(y_test, y_pred))
[[ 0 362]
[ 0 608]]
Why are all predictions ones? Why does the CNN perform so badly?
Here are the loss and acc charts:

It always predicts one because of the output in your network. You have a Dense layer with one neuron, with a Softmax activation. Softmax normalizes by the sum of exponential of each output. Since there is one output, the only possible output is 1.0.
For a binary classifier you can either use a sigmoid activation with the "binary_crossentropy" loss, or put two output units at the last layer, keep using softmax and change the loss to categorical_crossentropy.
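For example, a minimal sketch of both options against the question's code (the 0.5 threshold and the one-hot encoding step are my additions, and only one of the two heads would be used at a time):
# Option 1: single output unit with sigmoid + binary_crossentropy
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# predict() now returns probabilities, so threshold them before building the confusion matrix
y_pred = (model.predict(X_test) > 0.5).astype(int)

# Option 2: two output units with softmax + categorical_crossentropy
# (y_train / y_test must be one-hot encoded, e.g. with keras.utils.to_categorical)
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
y_pred = model.predict(X_test).argmax(axis=1)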

Related

Why is my multiclass neural model not training (accuracy and loss stay the same)?

I am learning neural networks. I get 98% accuracy with classical ML methods, so I think I made a coding error. The neural network model is not learning.
Things I tried:
Changing X and y to float64 or float32
Normalizing data
Changing the activation to "linear" or "relu"
Removing Flatten()
Adding hidden layers
Using stochastic gradient descent as optimizer, instead of "adam".
Changing the y label with another label
There are 9 feature columns in X_train and 8 different classes in y_train.
X_train:
y_train:
Code:
model = keras.models.Sequential()
model.add(keras.layers.Input(shape=(9,)))
model.add(keras.layers.Dense(8, activation='softmax'))
model.add(layers.Flatten())
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
Fitting:
I tried the following variants, also changing the target label. None of them helps the model train. Some give "nan" loss, some go slightly up and down, but all of them stay below 0.1% accuracy:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)
or this:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(3, activation='relu', name='relu1'))
model.add(layers.Dense(16, activation='relu', name='relu2'))
model.add(layers.Dense(16, activation='relu', name='relu3'))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = model.fit(x=X_train, y=y_train, epochs=20)
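For comparison, a minimal sketch of a setup that matches the shapes described above (9 input features, 8 integer-labelled classes) could look like this; the hidden-layer size and epoch count are assumptions:
model = keras.models.Sequential()
model.add(keras.layers.Input(shape=(9,)))
model.add(keras.layers.Dense(32, activation='relu'))       # hidden layer (size is an assumption)
model.add(keras.layers.Dense(8, activation='softmax'))     # one unit per class, softmax on the last layer
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',      # integer labels 0..7
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)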

Keras and sigmoid do not predict the right class

I'm trying to fit a model that predicts a target class that can be 0, 1, 2, or 3.
During fitting its val_accuracy is 1.0,
but its prediction looks like:
array([[1.2150223e-09]], dtype=float32)
X_train.shape
#(1992, 1, 68)
model = Sequential()
model.add(LSTM(128, input_shape=(1,X_train.shape[2])))
model.add(Dense(128, activation="relu",kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)))
model.add(Dropout(0.4))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer='adam',loss='mae', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=16, validation_split=0.1, shuffle=True)
X_test = np.expand_dims(X_test,1)
y_test = np.expand_dims(y_test,1)
model.evaluate(X_test,y_test)
#[0.0010176461655646563, 1.0]
data = np.expand_dims(data, 1)
model.predict(data) #array([[1.2150223e-09]], dtype=float32) <---- here expected was 0, 1, 2 or 3
data.shape #(1, 1, 68)
I can't understand what is wrong.
Your model has only one output, but you have four classes, so you need to change the last Dense layer to model.add(Dense(4, activation="softmax")). Sigmoid is usually for binary classification; in your case there are 4 classes, so softmax needs to be used. You then get class probabilities with probabilities = model.predict(data), and
CATEGORIES[np.argmax(probabilities)] gives the predicted class. Also, when computing the loss, use the multi-class cross-entropy loss: model.compile(..., loss='categorical_crossentropy', ...)
model.add(Dense(4, activation="softmax"))
probabilities = model.predict(data)
print(CATEGORIES[np.argmax(probabilities)])
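One detail to add: categorical_crossentropy expects one-hot labels, so the targets need encoding (or sparse_categorical_crossentropy can be used with the integer labels directly). A rough sketch of the full change, assuming y_train holds the integer labels 0 to 3:
from tensorflow.keras.utils import to_categorical

y_train_onehot = to_categorical(y_train, num_classes=4)   # one-hot encode for categorical_crossentropy
model.add(Dense(4, activation="softmax"))                  # replaces Dense(1, activation="sigmoid")
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train_onehot, epochs=100, batch_size=16, validation_split=0.1, shuffle=True)

probabilities = model.predict(data)                        # shape (1, 4)
print(np.argmax(probabilities, axis=-1))                   # predicted class: 0, 1, 2 or 3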

Audio processing Conv1D keras

I am learning Keras through audio classification. I am implementing the code, with modifications, from https://github.com/deepsound-project/genre-recognition/blob/master/train_model.py.
The shape of the dataset is
X_train shape = (800, 32, 1)
y_train shape = (800, 10)
X_test shape = (200, 32, 1)
y_test shape = (200, 10)
The model
model = Sequential()
model.add(Conv1D(filters=256, kernel_size=5, input_shape=(32,1), activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Conv1D(filters=256, kernel_size=5, activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dense(10, activation='softmax'))
model.compile(
    loss='categorical_crossentropy',
    optimizer=Adam(lr=0.001),
    metrics=['accuracy'],
)
model.summary()
red_lr= ReduceLROnPlateau(monitor='val_loss',patience=2,verbose=2,factor=0.5,min_delta=0.01)
check=ModelCheckpoint(filepath=r'/content/drive/My Drive/Colab Notebooks/gen/cnn.hdf5', verbose=1, save_best_only = True)
History = model.fit(X_train,
                    y_train,
                    epochs=100,
                    #batch_size=512,
                    validation_data=(X_test, y_test),
                    verbose=2,
                    callbacks=[check, red_lr],
                    shuffle=True)
The accuracy graph
Loss graph
I do not understand why the val_acc stays around 70%. I tried modifying the model architecture, including the optimizer, but there was no improvement.
Also, is it a problem to have a large difference between loss and val_loss?
How can I improve the accuracy above 80%? Any help is appreciated.
Thank you
I found it: I used the Concatenate function from Keras to concatenate all the convolution layers, and it gives the best performance.
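The answer does not include code; a rough sketch of what concatenating parallel convolution branches could look like with the Keras functional API (the kernel sizes, filter counts, and number of branches here are my assumptions, not necessarily what the poster used):
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense, Concatenate
from tensorflow.keras.models import Model

inp = Input(shape=(32, 1))
branches = []
for k in (3, 5, 7):                                        # parallel branches with different kernel sizes
    b = Conv1D(64, kernel_size=k, padding='same', activation='relu')(inp)
    b = MaxPooling1D(2)(b)
    branches.append(b)
x = Concatenate()(branches)                                 # merge the branch outputs along the channel axis
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
out = Dense(10, activation='softmax')(x)

model = Model(inp, out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])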

ValueError: Error when checking input: expected cu_dnnlstm_22_input to have 3 dimensions, but got array with shape (2101, 17)

I am new to machine learning. I am having trouble getting my data into my network.
This is the error that I am receiving:
ValueError: Error when checking input: expected cu_dnnlstm_22_input to have 3 dimensions, but got array with shape (2101, 17)
I have tried adding model.add(Flatten()) before the dense layer. I would really appreciate your help!
BATCH_SIZE = 64
test_size_length = int(len(main_df)*TESTING_SIZE)
training_df = main_df[:test_size_length]
validation_df = main_df[test_size_length:]
train_x, train_y = training_df.drop('target',1).to_numpy(), training_df['target'].tolist()
validation_x, validation_y = validation_df.drop('target',1).to_numpy(), validation_df['target'].tolist()
#train_x.shape is (2101, 17)
model = Sequential()
# model.add(Flatten())
model.add(CuDNNLSTM(128, input_shape=(train_x.shape), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(CuDNNLSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(CuDNNLSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(2, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
# Compile model
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
)
tensorboard = TensorBoard(log_dir="logs/{}".format(NAME))
filepath = "RNN_Final-{epoch:02d}-{val_acc:.3f}" # unique file name that will include the epoch and the validation acc for that epoch
checkpoint = ModelCheckpoint("models/{}.model".format(filepath), monitor='val_acc', verbose=1, save_best_only=True, mode='max')  # saves only the best ones
# Train model
history = model.fit(
    train_x, train_y,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=(validation_x, validation_y),
    callbacks=[tensorboard, checkpoint],
)
# Score model
score = model.evaluate(validation_x, validation_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
# Save model
model.save("models/{}".format(NAME))
The input to your LSTM layer (CuDNNLSTM) should have shape: (batch_size, timesteps, input_dim).
It looks like you are missing one of these dimensions.
Often we can overlook the last dimension in the case where the input dimension is 1. If this is the case with your model (if you are predicting from a sequence of single numbers), then you might consider expanding the dimensions before the CuDNNLSTM layer with something like this:
model.add(Lambda(lambda t: tf.expand_dims(t, axis=-1)))
model.add(CuDNNLSTM(128))
Without knowing the problem you are working on, it's hard to say whether this is a valid way forward, but you should certainly keep in mind the required input shape of an LSTM layer and reshape/expand the dimensions accordingly.
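Concretely, if the 17 columns are meant to be read as a sequence of 17 single-value timesteps (an assumption about the data, not a fact from the question), the arrays can be expanded before fitting and the input_shape corrected accordingly:
import numpy as np

# (2101, 17) -> (2101, 17, 1): (batch_size, timesteps, input_dim)
train_x = np.expand_dims(train_x, axis=-1)
validation_x = np.expand_dims(validation_x, axis=-1)

# input_shape excludes the batch dimension
model.add(CuDNNLSTM(128, input_shape=(train_x.shape[1], train_x.shape[2]), return_sequences=True))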

Keras - Train convolution network, get auto-encoder output

What I want to do:
I want to train a convolutional neural network on the cifar10 dataset on just two classes. Then once I get my fitted model, I want to take all of the layers and reproduce the input image. So I want to get an image back from the network instead of a classification.
What I have done so far:
def copy_freeze_model(model, nlayers=1):
    new_model = Sequential()
    for l in model.layers[:nlayers]:
        l.trainable = False
        new_model.add(l)
    return new_model
numClasses = 2
(X_train, Y_train, X_test, Y_test) = load_data(numClasses)
#Part 1
rms = RMSprop()
model = Sequential()
#input shape: channels, rows, columns
model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=(3, 32, 32)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation("relu"))
model.add(Dropout(0.5))
#output layer
model.add(Dense(numClasses))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=rms,metrics=["accuracy"])
model.fit(X_train, Y_train, batch_size=32, nb_epoch=25,
          verbose=1, validation_split=0.2,
          callbacks=[EarlyStopping(monitor='val_loss', patience=2)])
print('Classification rate %02.3f' % model.evaluate(X_test, Y_test)[1])
## pull the layers and try to get an output from the network that is an image
newModel = copy_freeze_model(model, nlayers = 8)
newModel.add(Dense(1024))
newModel.compile(loss='mean_squared_error', optimizer=rms,metrics=["accuracy"])
newModel.fit(X_train, X_train, batch_size=32, nb_epoch=25,
             verbose=1, validation_split=0.2,
             callbacks=[EarlyStopping(monitor='val_loss', patience=2)])
preds = newModel.predict(X_test)
Also when I do:
input_shape=(3, 32, 32)
Does this mean a 3-channel (RGB) 32 x 32 image?
What I suggest is a stacked convolutional autoencoder. This makes unpooling layers and deconvolution compulsory. Here you can find the general idea and code in Theano (on which Keras can run):
https://swarbrickjones.wordpress.com/2015/04/29/convolutional-autoencoders-in-pythontheanolasagne/
An example definition of layers needed can be found here :
https://github.com/fchollet/keras/issues/378
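For reference, a bare-bones convolutional autoencoder in Keras could look roughly like the sketch below (the layer sizes are illustrative assumptions, and the input is written channels-last, i.e. (32, 32, 3), rather than the channels-first (3, 32, 32) used in the question):
from tensorflow.keras import layers, models

inp = layers.Input(shape=(32, 32, 3))
# encoder
x = layers.Conv2D(32, 3, padding='same', activation='relu')(inp)
x = layers.MaxPooling2D(2)(x)                          # 32x32 -> 16x16
x = layers.Conv2D(16, 3, padding='same', activation='relu')(x)
x = layers.MaxPooling2D(2)(x)                          # 16x16 -> 8x8
# decoder
x = layers.Conv2D(16, 3, padding='same', activation='relu')(x)
x = layers.UpSampling2D(2)(x)                          # 8x8 -> 16x16
x = layers.Conv2D(32, 3, padding='same', activation='relu')(x)
x = layers.UpSampling2D(2)(x)                          # 16x16 -> 32x32
out = layers.Conv2D(3, 3, padding='same', activation='sigmoid')(x)

autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer='rmsprop', loss='mean_squared_error')
autoencoder.fit(X_train, X_train, batch_size=32, epochs=25, validation_split=0.2)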
