Keras model always predicting the same class - python

I am working on a project where I have to classify genomic sequences as either positive or negative. The sequences have the form 'accccttttttgggg...'
They are 150 characters long and consist of the characters a, c, t and g. I one-hot encode the data and create a dataframe with 600 columns (150 x 4) for the sequences plus one column for the label (either 0 or 1). I then pass this data through a CNN. The model performs well on the training and validation data, but on the test data it always predicts the single label 0. Can anyone help me understand why this happens and what I am doing wrong? This is the model I have been using, with epochs=50, learning_rate=0.00001 and batch_size=64:
model = Sequential()
model.add(Conv1D(32, kernel_size=5, input_shape=(600, 1), activation='relu', padding='same'))
model.add(MaxPooling1D())
model.add(Conv1D(16, kernel_size=5, activation='relu', padding='same'))
model.add(MaxPooling1D())
model.add(Conv1D(8, kernel_size=5, activation='relu', padding='same'))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
adam = Adam(learning_rate=learning_rate)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
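For reference, the encoding step described in the question can be sketched in plain Python. This is only an illustration, not the asker's actual preprocessing: the helper names `one_hot_sequence` and `flatten` and the a/c/g/t column order are assumptions. It produces one row of 4 values per base, which is then flattened to the 600 values per sequence mentioned above.

```python
# Hypothetical helpers: one-hot encode a DNA string.
# The column order a, c, g, t is an assumption, not taken from the question.
BASES = "acgt"

def one_hot_sequence(seq):
    """Return a list of 4-element rows, one per character of seq."""
    rows = []
    for ch in seq.lower():
        if ch not in BASES:
            raise ValueError(f"unexpected character: {ch!r}")
        rows.append([1 if ch == base else 0 for base in BASES])
    return rows

def flatten(rows):
    """Flatten the (length, 4) rows into one list of length * 4 values."""
    return [value for row in rows for value in row]

encoded = one_hot_sequence("acgt" * 37 + "ac")   # a 150-character example sequence
flat = flatten(encoded)
print(len(encoded), len(encoded[0]), len(flat))  # 150 4 600
```

Keeping the (150, 4) layout instead of the flat 600 values would let a Conv1D layer treat the four bases as channels; whether that changes the behaviour described in the question is untested here.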

Related

CNN is not getting good accuracy using unseen data

My CNN model is not performing well on my test set. I trained it on images with dark and white backgrounds; each image is cropped to eliminate other objects in the picture. My goal is to determine which way a person is facing on the bed.
ImageDataGenerator was used for splitting and augmenting the data. The training dataset contains 4800 images, while the validation set has 1500 images.
I have 3 classes:
Facing upward
Facing left
Facing Right
The testing results give me an accuracy below 50%, while the loss is 1.0 or above. This was evaluated using model.evaluate.
INPUT_SHAPE = (250,150,1)
traindata = ImageDataGenerator(rescale=1./255, shear_range=0.2,width_shift_range=0.1, height_shift_range=0.1, zoom_range=0.2,rotation_range=45, horizontal_flip=False, vertical_flip=False, brightness_range=[0.3,2.0])
valdata = ImageDataGenerator(rescale=1./255)
training_set = traindata.flow_from_directory(TRAIN_DIR, target_size=INPUT_SHAPE[:-1],
shuffle=True,batch_size=BATCH_SIZE, color_mode='grayscale',
class_mode='categorical')
validation_set = valdata.flow_from_directory(VAL_DIR, target_size=INPUT_SHAPE[:-1],
shuffle=False,batch_size=BATCH_SIZE, color_mode='grayscale',
class_mode='categorical')
This is the code for the model:
model = Sequential()
model.add(Conv2D(64, (3,3), activation='relu', padding='same', input_shape=INPUT_SHAPE))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(MaxPooling2D((2,2),strides=1))
model.add(Dropout(0.5))
model.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model.add(MaxPooling2D((2,2),strides=1))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
# model.add(Dense(512, activation="relu"))
# model.add(Dropout(0.5))
model.add(Dense(units=3, activation="softmax"))
model.compile(optimizer=Adam(lr=0.001),loss='categorical_crossentropy',metrics=['accuracy'])
history = model.fit(training_set,
epochs = 100,
validation_data = validation_set,
callbacks=[tensorboard, earlyStop]
)
P.S. I have tried most of the solutions that I found online. Posting here was my last resort since I really can't fix this problem. I am not allowed to use pretrained models. What I have tried so far:
different combinations of neural network layers
adding batch normalization and regularization
changing the image size
increasing the data count
different optimizers with different learning rates
You have an overfitting problem. Try to balance the images between the test and training data, add more layers to the model, and reduce the dropout value.
One more thing: you could try a pretrained model on the same split you have now to check the data integrity.

How to get weights from keras model?

I'm trying to build a 2 layered neural network for MNIST dataset and I want to get weights from my model.
I found a similar question here on SO and I tried this:
model.get_weights()
But len(model.get_weights()) returned 11. Isn't it supposed to return 3 weight arrays? I have even disabled the bias.
model = Sequential()
model.add(Flatten(input_shape = (28, 28)))
model.add(Dense(512, activation='relu', kernel_initializer='he_normal', use_bias=False,))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Dense(128, activation='relu', kernel_initializer='he_normal', use_bias=False,))
model.add(BatchNormalization())
model.add(Dropout(0.1))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', use_bias=False,))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
result = model.fit(x_train, y_train, validation_split=0.25, epochs=10,
batch_size=128, verbose=1)
To get the weights of a particular layer, you can retrieve that layer by its name and call get_weights on it (as shubham-panchal said in his comment).
For example:
model.get_layer('dense').get_weights()
or
model.get_layer('dense_2').get_weights()
You could also go through the layers of your model and retrieve each layer's name and weights:
{layer.name: layer.get_weights() for layer in model.layers}
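One way to account for the 11 arrays, offered as a sketch based on how these Keras layer types normally store their weights (this reasoning is not part of the original answer): a Dense layer with use_bias=False holds one array (the kernel), a BatchNormalization layer with default settings holds four (gamma, beta, moving mean, moving variance), and Flatten and Dropout hold none. The tally below is plain Python and does not call Keras; the dictionary keys are just labels chosen for this illustration.

```python
# Number of weight arrays each layer type contributes to model.get_weights(),
# assuming default BatchNormalization settings (scale=True, center=True).
ARRAYS_PER_LAYER = {
    "Flatten": 0,
    "Dense(use_bias=False)": 1,   # kernel only
    "BatchNormalization": 4,      # gamma, beta, moving_mean, moving_variance
    "Dropout": 0,
}

# Layer order copied from the model in the question.
layers = [
    "Flatten",
    "Dense(use_bias=False)", "BatchNormalization", "Dropout",
    "Dense(use_bias=False)", "BatchNormalization", "Dropout",
    "Dense(use_bias=False)",
]

total = sum(ARRAYS_PER_LAYER[name] for name in layers)
print(total)  # 11
```

Under that assumption, the three kernels the asker expected are the single arrays returned by the three Dense layers; the remaining eight come from the two BatchNormalization layers.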

Why is this Keras OneHot layer implementation performing differently than OneHot training data?

I want to train a convnet to classify more than 240,000 documents into approximately 2000 classes. For this I selected the first 60 words of each document and converted them to indices.
I tried to implement a OneHot layer in Keras to avoid memory issues, but the model performs much worse than the model trained on data already prepared as one-hot vectors. What is the real difference?
The model summary reports are similar in shape and parameters except for the additional One_hot Lambda layer. I used the OneHot function described here: https://fdalvi.github.io/blog/2018-04-07-keras-sequential-onehot/
def OneHot(input_dim=None, input_length=None):
    # input_dim refers to the eventual length of the one-hot vector (e.g. vocab size)
    # input_length refers to the length of the input sequence
    # Check if inputs were supplied correctly
    if input_dim is None or input_length is None:
        raise TypeError("input_dim or input_length is not set")
    # Helper method (not inlined for clarity)
    def _one_hot(x, num_classes):
        return K.one_hot(K.cast(x, 'uint8'),
                         num_classes=num_classes)
    # Final layer representation as a Lambda layer
    return Lambda(_one_hot,
                  arguments={'num_classes': input_dim},
                  input_shape=(input_length,))
# Model A : This is the Keras model I use with the OneHot function:
model = Sequential()
model.add(OneHot(input_dim=model_max,
input_length=input_length))
model.add(Conv1D(256, 6, activation='relu'))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(labels_max, activation='softmax'))
checkpoint = ModelCheckpoint('model-best.h5', verbose=1,
monitor='val_loss',save_best_only=True, mode='auto')
model.compile(optimizer=Adam(),
loss='categorical_crossentropy',
metrics=['accuracy'])
#Model B: And this model I use with the data already converted to OneHot:
model = Sequential()
model.add(Conv1D(256, 6, activation='relu', input_shape=(input_length,
model_max)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(labels_max, activation='softmax'))
checkpoint = ModelCheckpoint('model-best.h5', verbose=1,
monitor='val_loss',save_best_only=True, mode='auto')
model.compile(optimizer=Adam(),
loss='categorical_crossentropy',
metrics=['accuracy'])
Model B is performing much better with validation accuracy up to 60% but it runs easily into memory errors.
Model A is much faster but only reaches a maximum validation accuracy of 25%.
I would expect them to perform similarly. What am I missing here? Thanks!
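One difference worth checking, offered as a hypothesis and not as a confirmed diagnosis: the OneHot helper casts the word indices to 'uint8', and an unsigned 8-bit integer can only hold values from 0 to 255. If the vocabulary size (model_max) is larger than 256, which is an assumption since the question does not state it, distinct words could end up sharing a one-hot position in Model A but not in Model B. The sketch below uses ctypes to illustrate the 8-bit range only; it does not reproduce the exact behaviour of the Keras backend cast.

```python
import ctypes

def as_uint8(index):
    """Store an integer in an unsigned 8-bit slot, as a C-style cast would."""
    return ctypes.c_uint8(index).value

# Hypothetical word indices; anything above 255 no longer maps to itself.
word_indices = [3, 255, 256, 300, 1000]
print([as_uint8(i) for i in word_indices])  # [3, 255, 0, 44, 232]
```

If this turns out to be the cause, casting to a wider integer type such as 'int32' in the helper would be the natural thing to try.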

LSTM input shape for multivariate time series?

I know this question is asked many times, but I truly can't fix this input shape issue for my case.
My x_train shape == (5523000, 13) // (13 timeseries of length 5523000)
My y_train shape == (5523000, 1)
number of classes == 2
To reshape the x_train and y_train:
x_train= x_train.values.reshape(27615,200,13) # 5523000/200 = 27615
y_train= y_train.values.reshape((5523000,1)) # I know I have a problem here but I dont know how to fix it
Here is my lstm network :
def lstm_baseline(x_train, y_train):
    batch_size=200
    model = Sequential()
    model.add(LSTM(batch_size, input_shape=(27615,200,13),
                   activation='relu', return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(128, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(32, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='softmax'))
    model.compile(
        loss='categorical_crossentropy',
        optimizer='rmsprop',
        metrics=['accuracy'])
    model.fit(x_train,y_train, epochs= 15)
    return model
Whenever I run the code I get this error :
ValueError: Input 0 is incompatible with layer lstm_10: expected ndim=3, found ndim=4
My question is: what am I missing here?
PS: The idea of the project is that I have 13 signals coming from 13 points on the human body, and I want to use them to detect a certain type of disorder (an arousal). Using the LSTM, I want my model to locate the regions where that arousal occurs based on these 13 signals.
The whole dataset covers 993 patients; for each one I use 13 signals to detect the disorder regions.
If you want me to put the data in 3D:
(500000 ,13, 993) # (nb_records, nb_signals, nb_patient)
For each patient I have 500000 observations of 13 signals.
nb_patient is 993.
It is worth noting that the 500000 size doesn't matter, as I can have patients with more or fewer observations than that.
Update: here is sample data for one patient.
Here is a chunk of my data (first 2000 rows).
OK, I made some changes to your code. First, I still don't know what the "200" in your attempt to reshape your data means, so I'm going to give you working code; let's see whether you can use it as is or adapt it to make your own code work. The sizes of your input data and your targets have to match. You cannot have an input x_train with 27615 rows (that is, x_train.shape[0] = 27615) and a target set y_train with 5523000 values.
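To make the size-matching point concrete, here is a minimal sketch in plain Python. It assumes that one label is wanted per window and that a window counts as positive if any of its rows is positive; the helper name `make_windows` and that labelling rule are assumptions for illustration, not something stated in the question.

```python
def make_windows(rows, labels, window):
    """Split rows into consecutive windows and derive one label per window."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")
    n_windows = len(rows) // window          # an incomplete trailing window is dropped
    x, y = [], []
    for w in range(n_windows):
        start, stop = w * window, (w + 1) * window
        x.append(rows[start:stop])
        # Assumed labelling rule: positive if any time step in the window is positive.
        y.append(1 if any(labels[start:stop]) else 0)
    return x, y

# Toy data: 1000 time steps, 13 signals each, with a short positive region.
rows = [[0.0] * 13 for _ in range(1000)]
labels = [1 if 450 <= t < 470 else 0 for t in range(1000)]

x, y = make_windows(rows, labels, window=200)
print(len(x), len(x[0]), len(x[0][0]), len(y))  # 5 200 13 5
print(y)  # [0, 0, 1, 0, 0]
```

The inputs end up with shape (windows, 200, 13) and the targets with one entry per window, so the first dimensions agree.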
I took the first two rows from the data example that you provided for this example:
x_sample = [[-17, -7, -7, 0, -5, -18, 73, 9, -282, 28550, 67],
[-21, -16, -7, -6, -8, 15, 60, 6, -239, 28550, 94]]
y_sample = [0, 0]
Let's reshape x_sample:
import numpy as np
x_train = np.array(x_sample)
# Here x_train.shape = (2, 11); we want to reshape it to (2, 11, 1) to
# fit the network's input dimension
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], 1)
You are using a categorical loss, so you have to convert your targets to categorical (check https://keras.io/utils/):
from keras.utils import to_categorical
y_train = np.array(y_sample)
y_train = to_categorical(y_train, 2)
This gives you two categories. I assumed two because all the target values in the data you provided are 0, so I don't know how many values your target can take. If your target can take 4 possible values, then the number of categories passed to to_categorical will be 4. Each output of your last dense layer represents a category, and the value of that output is the probability that your input belongs to that category.
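As a rough picture of what the categorical conversion produces, here is a plain-Python stand-in. The helper `one_hot_labels` is hypothetical and only mimics the shape of the result; it is not the Keras implementation.

```python
def one_hot_labels(labels, num_classes):
    """Turn integer class labels into one row of num_classes floats per label."""
    encoded = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[label] = 1.0          # mark the position of the true class
        encoded.append(row)
    return encoded

print(one_hot_labels([0, 1, 1, 0], 2))
# [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
```

Each row has as many entries as there are categories, which is why the last dense layer needs the same number of outputs.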
Now, we just have to slightly modify your LSTM model:
def lstm_baseline(x_train, y_train):
    batch_size = 200
    model = Sequential()
    # We are going to replace input_shape with input_dim
    model.add(LSTM(batch_size, input_dim=1,
                   activation='relu', return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(128, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(32, activation='relu'))
    model.add(Dropout(0.2))
    # We are going to set the number of outputs to 2, to match the
    # number of categories
    model.add(Dense(2, activation='softmax'))
    model.compile(
        loss='categorical_crossentropy',
        optimizer='rmsprop',
        metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=15)
    return model
You may try some modifications like this below:
x_train = x_train.reshape(1999, 1, 13)
# double-check dimensions
x_train.shape
def lstm_baseline(x_train, y_train, batch_size):
    model = Sequential()
    model.add(LSTM(batch_size, input_shape=(None, 13),
                   activation='relu', return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(128, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(32, activation='relu'))
    model.add(Dropout(0.2))
    # A single output unit paired with binary_crossentropy needs a sigmoid;
    # softmax over one unit would always output 1.0
    model.add(Dense(1, activation='sigmoid'))
    model.compile(
        loss='binary_crossentropy',
        optimizer='adam',
        metrics=['accuracy'])
    return model
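A note on the single-unit output layer: softmax normalizes over all units of the layer, so with only one unit it returns 1.0 whatever the input is, which is why a sigmoid is the usual pairing with binary_crossentropy. The plain-Python check below illustrates this; the two functions are written out by hand for the example and are not the Keras implementations.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(v):
    """Logistic function mapping a single logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

for logit in (-3.0, 0.0, 3.0):
    # softmax over a single value is always [1.0]; sigmoid varies with the logit
    print(softmax([logit]), round(sigmoid(logit), 3))
```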

Keras top_k_categorical_accuracy metric is extremely low compared to accuracy

I created a CNN model using the cifar100 dataset from Keras. When I add the top_k_categorical_accuracy metric, I should see how often one of the top 5 predicted classes is the correct class. However, during training, top_k_categorical_accuracy stays very small, around 4-5%, while accuracy and validation accuracy increase all the way to 40-50%. Top-5 accuracy should be much higher than normal accuracy; instead it's giving very odd results. I wrote my own metric using different k values but still see the same issue. Even when I use k=1, which should give the same value as accuracy, the same issue occurs.
Model code:
cnn = Sequential()
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu', input_shape=(train_images.shape[1:])))
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu'))
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu'))
cnn.add(MaxPooling2D(pool_size=2, padding='same'))
cnn.add(Dropout(0.4))
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu'))
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu'))
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu'))
cnn.add(Conv2D(filters=200, kernel_size=2, padding='same', activation='relu'))
cnn.add(Dropout(0.4))
cnn.add(MaxPooling2D(pool_size=2, padding='same'))
cnn.add(Dropout(0.5))
cnn.add(Flatten())
cnn.add(Dense(550, activation='relu'))
cnn.add(Dropout(0.4))
cnn.add(Dense(100, activation='softmax'))
Compile code:
cnn.compile(loss='sparse_categorical_crossentropy', optimizer=opt.Adam(lr=learn_rate), metrics=['accuracy', 'top_k_categorical_accuracy'])
It turns out that, since I am using the sparse_categorical_crossentropy loss function, I need to use the sparse_top_k_categorical_accuracy metric. This metric also requires that your labels be flattened to one dimension. After that change, the metric is correct and the model is training!
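A plain-Python sketch of why the two metrics can differ so much. It assumes that the non-sparse metric recovers the target class by taking the argmax of each y_true row, as it would for one-hot labels. Under that assumption, an integer label stored as a length-1 row always has argmax 0, so the metric ends up checking whether class 0 is among the top k predictions. The helper names here are hypothetical and not part of Keras.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def sparse_top_k_hit(label, scores, k):
    """Integer label: compare it directly against the top-k predictions."""
    return label in top_k_indices(scores, k)

def one_hot_top_k_hit(y_true_row, scores, k):
    """One-hot style target: recover the class with argmax first."""
    return argmax(y_true_row) in top_k_indices(scores, k)

scores = [0.05, 0.10, 0.60, 0.25]   # predicted probabilities for 4 classes
label = 2                           # the true class

print(sparse_top_k_hit(label, scores, k=2))           # True
print(one_hot_top_k_hit([0, 0, 1, 0], scores, k=2))   # True
print(one_hot_top_k_hit([label], scores, k=2))        # False: argmax of [2] is 0
```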
