Fluctuating Accuracy/loss curves - LSTM keras - python

I am creating a LSTM model for human activity recognition and I keep always getting fluctuating but increasing Train and loss curves.
The following architecture gave these curves Train and loss curves :
model = Sequential()
model.add(LSTM(units = 128, return_sequences=True , input_shape=(3500, 11)))
model.add(Dropout(0.5))
model.add(Dense(units= 64, activation='relu'))
model.add(LSTM(units = 128, return_sequences=False , input_shape=(3500, 11)))
model.add(Dropout(0.5))
model.add(Dense(units= 64, activation='relu'))
model.add(Dense(4, activation='softmax'))
adam = tf.keras.optimizers.Adam(learning_rate=0.0020, beta_1=0.9, beta_2=0.999,
epsilon=None, decay=0.0, amsgrad=False, clipnorm=1.)
model.compile(optimizer=adam ,loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(Gen, validation_data=val_Gen, epochs=30, callbacks=[tensorboard_callback],
verbose=1).history
I tried changing the models architecture with different hyperparameters but nothing improved.
I am using TimeSeriesGenerator from keras to generate batches.
Does anyone have a suggestion ?

Related

Why is my multiclass neural model not training (accuracy and loss staying same)?

I am learning neural networks. I get 98% accuracy with classical ML methods, so I think I made a coding error. The neural networks model is not learning.
Things I tried:
Changing X and y to float64 or float32
Normalizing data
Changing the activation to "linear" or "relu"
Removing Flatten()
Adding hidden layers
Using stochastic gradient descent as optimizer, instead of "adam".
Changing the y label with another label
There are 9 labels in X_train and 8 different classes in y_train.
X_train:
y_train:
Code:
model = keras.models.Sequential()
model.add(keras.layers.Input(shape=(9,)))
model.add(keras.layers.Dense(8, activation='softmax'))
model.add(layers.Flatten())
model.compile(optimizer= 'adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
Fitting:
I tried these lines by changing the target label. None of them help training the model. Some give "nan" loss, some go slightly up and down, but all of them are below 0.1% accuracy:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)
or this:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(3, activation='relu', name='relu1'))
model.add(layers.Dense(16, activation='relu', name='relu2'))
model.add(layers.Dense(16, activation='relu', name='relu3'))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = model.fit(x=X_train, y=y_train, epochs=20)

Validation loss increase and constant training accuracy 1D cnn

I'm implementing a CNN for speech recognition. The input is MEL frequencies with shape (85314, 99, 1) and the labels are one-hot encoded with 35 output classes (shape: (85314, 35)). When I run the model the training accuracy (image 2) starts high and stays the same over the number of epochs, while the loss on validation (image 1) increases. Hence, it is probably overfitting but I cannot find the origin of the issue. I already decreased the learning rate and played with batch sizes but the results stays the same. Also the amount of training data should be sufficient. Is there another issue with my hyper-parameter settings somewhere?
My model and hyper-parameters are defined as follows:
#hyperparameters
input_dimension = 85314
learning_rate = 0.0000025
momentum = 0.85
hidden_initializer = random_uniform(seed=1)
dropout_rate = 0.2
# create model
model = Sequential()
model.add(Convolution1D(nb_filter=32, filter_length=3, input_shape=(99, 1), activation='relu'))
model.add(Convolution1D(nb_filter=16, filter_length=1, activation='relu'))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dense(35, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['acc'])
history = model.fit(frequencies_train, labels_hot, validation_split=0.2, epochs=10, batch_size=50)
You are using "binary_crossentropy" for a problem of multiple classes. Change it to "categorical_crossentrop".
The accuracy computed with Keras using the binary_crossentropy with a model of more than 2 labels is just wrong.

Why does a binary Keras CNN always predict 1?

I want to build a binary classifier using a Keras CNN.
I have about 6000 rows of input data which looks like this:
>> print(X_train[0])
[[[-1.06405307 -1.06685851 -1.05989663 -1.06273152]
[-1.06295958 -1.06655996 -1.05969803 -1.06382503]
[-1.06415248 -1.06735609 -1.05999593 -1.06302975]
[-1.06295958 -1.06755513 -1.05949944 -1.06362621]
[-1.06355603 -1.06636092 -1.05959873 -1.06173742]
[-1.0619655 -1.06655996 -1.06039312 -1.06412326]
[-1.06415248 -1.06725658 -1.05940014 -1.06322857]
[-1.06345662 -1.06377347 -1.05890365 -1.06034568]
[-1.06027557 -1.06019084 -1.05592469 -1.05537518]
[-1.05550398 -1.06038988 -1.05225064 -1.05676692]]]
>>> print(y_train[0])
[1]
And then I've build a CNN by this way:
model = Sequential()
model.add(Convolution1D(input_shape = (10, 4),
nb_filter=16,
filter_length=4,
border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Convolution1D(nb_filter=8,
filter_length=4,
border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dense(1))
model.add(Activation('softmax'))
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=30, min_lr=0.000001, verbose=0)
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
history = model.fit(X_train, y_train,
nb_epoch = 100,
batch_size = 128,
verbose=0,
validation_data=(X_test, y_test),
callbacks=[reduce_lr],
shuffle=True)
y_pred = model.predict(X_test)
But it returns the following:
>> print(confusion_matrix(y_test, y_pred))
[[ 0 362]
[ 0 608]]
Why all predictions are ones? Why does the CNN perform so bad?
Here are the loss and acc charts:
It always predicts one because of the output in your network. You have a Dense layer with one neuron, with a Softmax activation. Softmax normalizes by the sum of exponential of each output. Since there is one output, the only possible output is 1.0.
For a binary classifier you can either use a sigmoid activation with the "binary_crossentropy" loss, or put two output units at the last layer, keep using softmax and change the loss to categorical_crossentropy.

Training with Adam gets slower each epoch

I am training a simple neural network in Keras with Theano backend consisting of 4 dense layers connected to a Merge layer and then to a softmax classifier layer. Using Adam for training, the first few epochs train in about 60s each (in the CPU) but, after that, the training time per epoch starts increasing, taking more than 400s by epoch 70, making it unusable.
Is there anything wrong with my code or is this suppose to happen?
This only happens when using Adam, not with sgd, adadelta, rmsprop or adagrad. I'd use any of the other methods but Adam produces far better results.
The code:
modela = Sequential()
modela.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelb = Sequential()
modelb.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelc = Sequential()
modelc.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modeld = Sequential()
modeld.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
model = Sequential()
model.add(Merge([modela, modelb, modelc, modeld], mode='concat', concat_axis=1))
model.add(Dense(258, init='uniform', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
hist = model.fit([Xa, Xb, Xc, Xd], Ycat, validation_split=.25, nb_epoch=80, batch_size=100, verbose=2)

NN Accuracy Saturates After the Very First Epoch with Keras

I'm trying to fit a simple Neural Network to predict a binary target using keras-1.0.6. The output saturates after the very first epoch. I try playing around with the learning rate (from 0.1 to 1e-6), decay and momentum of the SGD optimizer and with the layers (10-512 hidden neurons and 1-2 hidden layers) and their activation functions of the network, but nothing worked - the prediction accuracy was the same.
My training set has shape (13602, 115) and my validation set has shape (3400,115). The target variable y_train and y_test have values encoded as 1 and 0 (60% are 1's and 40% are 0's). At first, the data was not normalized though when I normalized it I got the same results.
Verifying the output, I see that the model is predicting only 1 class. Sometimes it predicts only 1's and other times only 0's (when I tweak the model).
I also tried to encode the target variable in the shape (n_sample, 2) but the output was the same.
I followed some questions here and googling that suggests tunning the learning rate and not using 'softmax' activation but couldn't improve the results.
Some of the models I tried is below:
The simplest model:
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
Model 2:
model = Sequential()
model.add(Dense(512, input_dim=X_train.shape[1]))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))
Model 3
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
Model 4
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='sigmoid'))
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
and to compile and fit the model:
sgd = SGD(lr=0.01, decay=0.1, momentum=0.0, nesterov=True)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train2, nb_epoch=5, batch_size=50, validation_split=0.2)
model.predict(X_test)
The output gives either [0,0,0,0,0,0,0,...] or [1,1,1,1,1,1,1,1,...]
Does anybody have a clue on what's going on here?

Categories