Keras - Validation accuracy not matching self-measured accuracy in LSTM? - python

After two epochs the validation accuracy of my model shows 0.30, but when I return the predicted classes using model.predict_generator and measure the accuracy myself, the accuracy is much lower, at about 0.18.
Why do these methods return different accuracies? I believe it may be related to my implementation or understanding of TimeseriesGenerator.
data_gen_train = sequence.TimeseriesGenerator(X, y_ct, timesteps, sampling_rate=1, stride=1, start_index=0, end_index=len(y), batch_size=batch_size)
data_gen_test = sequence.TimeseriesGenerator(X_ho, y_ho_ct, timesteps, sampling_rate=1, stride=1, start_index=0, end_index=len(y), batch_size=batch_size)
model = Sequential()
model.add(LSTM(20, stateful=True, batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(Dense(9, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='Nadam', metrics=['accuracy'])
model.fit_generator(data_gen_train, validation_data=data_gen_test, epochs=epochs, shuffle=False, validation_steps=len(y_ho) // batch_size)
y_pred = model.predict_generator(data_gen_test, steps=len(y_ho) // batch_size)
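For reference, a minimal sketch of the kind of manual check meant above (it assumes y_ho_ct is one-hot encoded and naively compares against the hold-out labels from index 0):
import numpy as np

# predicted class per generated window vs. the hold-out labels taken from index 0,
# i.e. without accounting for the window offset introduced by TimeseriesGenerator
pred_classes = np.argmax(y_pred, axis=1)
true_classes = np.argmax(y_ho_ct[:len(pred_classes)], axis=1)
print(np.mean(pred_classes == true_classes))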

I found the answer to my question above. The difference in accuracy measurements was due to the predicted classes and the ground-truth labels being shifted relative to each other. Keras' pad_sequences should have been used here to avoid this issue.
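Concretely, one way to line the two up without padding (a sketch; it assumes the default TimeseriesGenerator settings above, where the first generated target corresponds to y_ho_ct[timesteps]):
import numpy as np

# shift the ground truth by the window length before comparing, so that
# prediction i is matched with the target of window i
pred_classes = np.argmax(y_pred, axis=1)
true_classes = np.argmax(y_ho_ct[timesteps:timesteps + len(pred_classes)], axis=1)
print(np.mean(pred_classes == true_classes))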

Related

Overfitting in LSTM even after using regularizers

I have a time series prediction problem and am building an LSTM like the one below:
def create_model():
    model = Sequential()
    model.add(LSTM(50, kernel_regularizer=l2(0.01), recurrent_regularizer=l2(0.01), bias_regularizer=l2(0.01), input_shape=(train_X.shape[1], train_X.shape[2])))
    model.add(Dropout(0.591))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
When I train the model on 5 splits like below:
tss = TimeSeriesSplit(n_splits=5)
X = data.drop(labels=['target_prediction'], axis=1)
y = data['target_prediction']
for train_index, test_index in tss.split(X):
    train_X, test_X = X.iloc[train_index, :].values, X.iloc[test_index, :].values
    train_y, test_y = y.iloc[train_index].values, y.iloc[test_index].values
    model = create_model()
    history = model.fit(train_X, train_y, epochs=10, batch_size=64, validation_data=(test_X, test_y), verbose=0, shuffle=False)
I get an overfitting problem. The loss graph is attached.
I am not sure why there is overfitting when I use regularizers in my Keras model. Any help is appreciated.
EDIT:
I tried the following architectures:
def create_model():
    model = Sequential()
    model.add(LSTM(20, input_shape=(train_X.shape[1], train_X.shape[2])))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

def create_model(x, y):
    # define a bidirectional LSTM
    model = Sequential()
    model.add(Bidirectional(LSTM(20, return_sequences=True), input_shape=(x, y)))
    model.add(TimeDistributed(Dense(1, activation='sigmoid')))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
but it still overfits.
First of all, remove all your regularizers and dropout. You are throwing every trick out there at the problem at once, and a dropout rate of roughly 0.6 is too high.
Reduce the number of units in your LSTM. Start from there and reach a point where your model stops overfitting (see the sketch below).
Then add dropout, if required.
After that, the next step is to add tf.keras.layers.Bidirectional. If you are still not satisfied, increase the number of layers. Remember to keep return_sequences=True for every LSTM layer except the last one.
I seldom come across networks that use layer regularization, despite its availability, because dropout and layer regularization have a similar effect and people usually go with dropout (at most I have seen 0.3 being used).
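For what it's worth, a minimal sketch of the stripped-down baseline I mean (the unit count is a placeholder; train_X is the array from the question):
from keras.models import Sequential
from keras.layers import LSTM, Dense

def create_baseline_model():
    # no regularizers and no dropout: shrink capacity first, and only add
    # regularization back once this baseline itself starts to overfit
    model = Sequential()
    model.add(LSTM(8, input_shape=(train_X.shape[1], train_X.shape[2])))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model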

LSTM Model accuracy is stuck in a certain level and doesn't change

I'm working on a multivariate time-series prediction using LSTM. I'm trying to get a better match between my actual and predicted values, but no matter what my hyperparameters are, the accuracy won't change. I was wondering if you can give me a few insights on how to increase my model accuracy.
I have 3 inputs (time, two rates) and one output (pressure).
This is the LSTM section of my code:
model = Sequential()
model.add(LSTM(units=4,
               activation='tanh',
               recurrent_activation='hard_sigmoid',
               use_bias=True,
               unit_forget_bias=True,
               dropout=0,
               recurrent_dropout=0.3,
               input_shape=(look_back, 3)))
model.add(Dense(units=1,
                activation='linear',
                use_bias=True))
model.compile(loss='mean_squared_error', optimizer='Adam', metrics=['mae', 'accuracy'])
hist = model.fit(x_train, y_train,
                 epochs=50,
                 batch_size=20,
                 validation_split=0.0,
                 verbose=2,
                 shuffle=False)
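As an aside, 'accuracy' is not an informative metric for a continuous target such as pressure; a minimal sketch of checking the fit with regression metrics instead (it assumes scikit-learn is available and that x_train/y_train are the arrays used above):
from sklearn.metrics import mean_absolute_error, r2_score

# evaluate the regression fit directly instead of relying on 'accuracy'
y_pred = model.predict(x_train, batch_size=20)
print('MAE:', mean_absolute_error(y_train, y_pred))  # average absolute error in pressure units
print('R^2:', r2_score(y_train, y_pred))             # fraction of variance explained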

Why is my validation accuracy stuck around 65% and how do I increase it?

I'm making an image classification CNN with 5 classes, each having 693 images with a width and height of 224px, using VGG16, but my validation accuracy is stuck around 60-65% after 15-20 epochs.
I'm already using some data augmentation, batch normalization, and dropout, and I have frozen the first 5 layers, but I can't seem to push the accuracy above 65%.
These are my own layers:
img_rows, img_cols, img_channel = 224, 224, 3
base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_rows, img_cols, img_channel))
for layer in base_model.layers[:5]:
    layer.trainable = False
add_model = Sequential()
add_model.add(Flatten(input_shape=base_model.output_shape[1:]))
add_model.add(Dropout(0.5))
add_model.add(Dense(512, activation='relu'))
add_model.add(BatchNormalization())
add_model.add(Dropout(0.5))
add_model.add(Dense(5, activation='softmax'))
model = Model(inputs=base_model.input, outputs=add_model(base_model.output))
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizers.Adam(lr=0.0001),
              metrics=['accuracy'])
model.summary()
And this is my dataset with my model:
batch_size = 64
epochs = 25
train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=.1,
    height_shift_range=.1,
    horizontal_flip=True)
train_datagen.fit(x_train)
history = model.fit_generator(
    train_datagen.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=x_train.shape[0] // batch_size,
    epochs=epochs,
    validation_data=(x_test, y_test),
    callbacks=[ModelCheckpoint('VGG16-transferlearning.model', monitor='val_acc', save_best_only=True)]
)
I want to get a higher accuracy because what I get now is just not enough, so any help or suggestions would be appreciated.
A few things you can try:
Reduce your batch size.
Choose another optimizer: RMSprop, SGD...
Increase the default learning rate and then use the ReduceLROnPlateau callback, as sketched below.
But, as usual, it depends on the data you are using. Are the classes well balanced?
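A rough sketch of the last suggestion (the optimizer choice and callback values are placeholders, not tuned for this dataset):
from keras import optimizers
from keras.callbacks import ReduceLROnPlateau

# compile with a larger starting learning rate and let the callback shrink it
# whenever the validation loss stops improving
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizers.RMSprop(lr=0.001),
              metrics=['accuracy'])
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=3, min_lr=1e-6)
# then pass callbacks=[reduce_lr, ...] alongside the existing ModelCheckpoint in fit_generator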

Tensorflow training: why two successive trainings are better than one training

I'm working on a regression problem using Keras+Tensorflow. And I've found something interesting.
1) Here are the two models, which are actually the same except that the first one uses a globally defined optimizer.
optimizer = Adam()  # as a global variable

def OneHiddenLayer_Model():
    model = Sequential()
    model.add(Dense(300 * inputDim, input_dim=inputDim, kernel_initializer='normal', activation=activationFunc))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model

def OneHiddenLayer_Model2():
    model = Sequential()
    model.add(Dense(300 * inputDim, input_dim=inputDim, kernel_initializer='normal', activation=activationFunc))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer=Adam())
    return model
2) Then I use two schemes to train on the datasets (training set (scaleX, Y); test set (scaleTestX, testY)).
2.1) Scheme 1: two successive fits with the first model
numpy.random.seed(seed)
model = OneHiddenLayer_Model()
model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=250, batch_size=numBatch, verbose=0)
numpy.random.seed(seed)
model = OneHiddenLayer_Model()
history = model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=500, batch_size=numBatch, verbose=0)
predictY = model.predict(scaleX)
predictTestY = model.predict(scaleTestX)
2.2) Scheme 2: one fit with the second model
numpy.random.seed(seed)
model = OneHiddenLayer_Model2()
history = model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=500, batch_size=numBatch, verbose=0)
predictY = model.predict(scaleX)
predictTestY = model.predict(scaleTestX)
3) Finally, the results are plotted for each scheme, as shown below (model loss history --> prediction on scaleX --> prediction on scaleTestX):
3.1) Scheme 1
3.2) Scheme 2 (with 500 epochs)
3.3) One more test with Scheme 2, with epochs = 1000
From the images above, I've found that Scheme 1 is better than Scheme 2, even though Scheme 2 is trained for more epochs.
Can anyone help explain why Scheme 1 is better? Thanks a lot!

Does LSTM-Keras take into account dependencies between time series?

I have:
Multiple time series as INPUT
Forecast time series point in OUTPUT
How can I be sure that the model makes its predictions using the dependencies between all the time series in the input?
Edit 1
My current model:
model = Sequential()
model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem"))
model.add(keras.layers.Dense(num_features, activation='sigmoid'))
optimizer = keras.optimizers.SGD(lr=learning_rate, decay=1e-6, momentum=0.9, nesterov=True)
By default, the LSTM layer in Keras (and any other type of recurrent layer) is not stateful, and hence the states are reset every time a new input is fed into the network. Your code uses this default version. If you want, you can make it stateful by specifying stateful=True inside the LSTM layer, and then the states will not be reset. You can read more about the relevant syntax here, and this blog post provides more information regarding the stateful mode.
Here is an example of the corresponding syntax, taken from here:
trainX = numpy.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = numpy.reshape(testX, (testX.shape[0], testX.shape[1], 1))
# create and fit the LSTM network
batch_size = 1
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(100):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
# make predictions
trainPredict = model.predict(trainX, batch_size=batch_size)
model.reset_states()
testPredict = model.predict(testX, batch_size=batch_size)
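Separately, regarding the dependencies the question asks about: with an input of shape (samples, timesteps, features), every timestep already feeds the current value of all series into the LSTM at once, so interactions between the series can be learned through the input weights. A self-contained sketch with made-up shapes (all names and sizes here are hypothetical):
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

window, num_features = 10, 3                    # hypothetical window length and number of series
X = np.random.rand(200, window, num_features)   # (samples, timesteps, features)
y = np.random.rand(200, num_features)           # next value of every series

model = Sequential()
model.add(LSTM(32, input_shape=(window, num_features)))  # each timestep sees all series together
model.add(Dense(num_features))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=5, batch_size=32, shuffle=False)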
