Need to increase the accuracy of my LSTM model - python

This is my model:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.layers import Dropout
from keras.layers import Bidirectional

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(df)

# split into train and test sets
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train = dataset[0:train_size, :]
test = dataset[train_size:len(dataset), :]

# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)

# reshape into X=t and Y=t+1
look_back = 15
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
print(trainX.shape)

# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = np.reshape(testX, (testX.shape[0], testX.shape[1], 1))

model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(look_back, 1)))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(LSTM(50, activation='sigmoid', return_sequences=False))
model.add(Dense(50))
model.add(Dense(50))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
model.optimizer.learning_rate = 0.0001

Xdata_train, Ydata_train = create_dataset(train, look_back)
Xdata_train = np.reshape(Xdata_train, (Xdata_train.shape[0], Xdata_train.shape[1], 1))

# training on all data
history = model.fit(Xdata_train, Ydata_train, batch_size=1, epochs=10, shuffle=False)
The RMSE value is around 35 and the accuracy is very low. When I increase the number of epochs there is no variation. What changes should I make to get higher accuracy?
I have attached the graphical results to give an idea.
How can I fix this?

Just from a once-over of your code, I can think of a few changes. Try using a Bidirectional LSTM, using binary_crossentropy as the loss (assuming it's a binary classification problem), and setting shuffle=True during training. Also, try adding Dropout between the LSTM layers.
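For reference, a minimal sketch of that kind of stack, assuming the same univariate (look_back, 1) input and mean-squared-error loss as in the question (layer sizes and the dropout rate are only illustrative):

from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, Dropout, Dense

# illustrative Bidirectional LSTM stack with Dropout between the recurrent layers
model = Sequential()
model.add(Bidirectional(LSTM(50, return_sequences=True), input_shape=(look_back, 1)))
model.add(Dropout(0.2))
model.add(Bidirectional(LSTM(50)))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
history = model.fit(trainX, trainY, epochs=10, batch_size=32, shuffle=True)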

Here are a couple of suggestions:
First of all, never fit the normalizer on the entire dataset. Partition your data into train/test parts first, fit the scaler on the training data, and then transform both train and test using that scaler. Otherwise you are leaking information from your test data into training when doing the normalization (such as the min/max values, or the mean/std when using a standard scaler). A sketch of this workflow follows below.
You also seem to be normalizing your y data but never reverting the normalization, so you end up with output on a lower scale (as we can see on the plots). You can undo the normalization with scaler.inverse_transform().
Finally, you may want to remove the sigmoid activation from the LSTM layer; it's generally not a good idea to use sigmoid anywhere other than the output layer, as it can cause vanishing gradients.
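A rough sketch of the split-then-scale workflow (illustrative only, assuming df is a single-column series as in the question and the rest of the pipeline stays the same):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# split first, then fit the scaler on the training portion only
train_size = int(len(df) * 0.8)
train_raw, test_raw = df[:train_size], df[train_size:]

scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train_raw)   # fit on train data only
test = scaler.transform(test_raw)         # reuse the statistics learned on train

# ... build trainX/trainY/testX/testY and fit the model as before ...

# revert predictions (and targets) to the original scale before computing RMSE or plotting
pred = scaler.inverse_transform(model.predict(testX))
true = scaler.inverse_transform(testY.reshape(-1, 1))
rmse = np.sqrt(np.mean((true - pred) ** 2))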

Related

LSTM model does not predict the result accurately

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras.layers import Dropout
from keras.layers import Bidirectional

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(df)

# split into train and test sets
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size
train = dataset[0:train_size, :]
test = dataset[train_size:len(dataset), :]

# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)

# reshape into X=t and Y=t+1
look_back = 50
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
print(trainX.shape)

# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = np.reshape(testX, (testX.shape[0], testX.shape[1], 1))

model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(look_back, 1)))
model.add(Dense(50))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(Dense(50))
model.add(LSTM(50, activation='sigmoid', return_sequences=False))
model.add(Dense(50))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

Xdata_train, Ydata_train = create_dataset(train, look_back)
Xdata_train = np.reshape(Xdata_train, (Xdata_train.shape[0], Xdata_train.shape[1], 1))

# training on all data
history = model.fit(Xdata_train, Ydata_train, batch_size=10, epochs=100, shuffle=True)
Here is the model I used, but the output does not fit the data, and the predicted values have low accuracy according to the graph.
What changes should I make to fix this issue?

Keras LSTM model overfitting

I am using an LSTM model in Keras. During the fitting stage, I added the validation_data parameter. When I plot my training vs. validation loss, there seem to be major overfitting issues: my validation loss just won't decrease.
My full data is a sequence with shape [50,]. The first 20 records are used for training and the remaining are used as the test data.
I have tried adding dropout and reducing the model complexity as much as I can, and still no luck.
# transform data to be stationary
raw_values = series.values
diff_values = difference_series(raw_values, 1)

# transform data to be supervised learning using a sliding window
supervised = timeseries_to_supervised(diff_values, 1)
supervised_values = supervised.values

# split data into train and test sets
train, test = supervised_values[:20], supervised_values[20:]

# transform the scale of the data
# scale() uses MinMaxScaler(feature_range=(-1, 1)), fits it on the training set
# and applies it to both train and test
scaler, train_scaled, test_scaled = scale(train, test)

batch_size = 1
nb_epoch = 1000
neurons = 1

X, y = train_scaled[:, 0:-1], train_scaled[:, -1]
X = X.reshape(X.shape[0], 1, X.shape[1])
testX, testY = test_scaled[:, 0:-1].reshape(-1, 1, 1), test_scaled[:, -1]

model = Sequential()
model.add(LSTM(units=neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]),
               stateful=True))
model.add(Dropout(0.1))
model.add(Dense(1, activation="linear"))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X, y, epochs=nb_epoch, batch_size=batch_size, verbose=0, shuffle=False,
                    validation_data=(testX, testY))
This is what it looks like when changing the number of neurons. I even tried using Keras Tuner (Hyperband) to find the optimal parameters.
def fit_model(hp):
    batch_size = 1
    model = Sequential()
    model.add(LSTM(units=hp.Int("units", min_value=1, max_value=20, step=1),
                   batch_input_shape=(batch_size, X.shape[1], X.shape[2]),
                   stateful=True))
    model.add(Dense(units=hp.Int("units", min_value=1, max_value=10),
                    activation="linear"))
    model.compile(loss='mse', metrics=["mse"],
                  optimizer=keras.optimizers.Adam(
                      hp.Choice("learning_rate", values=[1e-2, 1e-3, 1e-4])))
    return model

X, y = train_scaled[:, 0:-1], train_scaled[:, -1]
X = X.reshape(X.shape[0], 1, X.shape[1])

tuner = kt.Hyperband(
    fit_model,
    objective='mse',
    max_epochs=100,
    hyperband_iterations=2,
    overwrite=True)
tuner.search(X, y, epochs=100, validation_split=0.2)
When evaluating the model against X_test and y_test, I get the same loss and accuracy score. But when fitting the "best model", I get this:
However, my predictions look very reasonable against my true values. What should I do to get a better fit?
20 records of training data is too small. There won't be enough variation in the training data for the model to approximate a function accurately, and your validation data, which is likely much smaller than 20 records, will probably contain examples wildly different from those 20 in the training set (i.e. the model hasn't seen an example of that nature during training), resulting in a much higher loss.

Is this the correct way to predict the next value in Keras?

Here is my code:
...
look_back = 20
train_size = int(len(data) * 0.80)
test_size = len(data) - train_size
train = data[0:train_size]
test = data[train_size:len(data)]

x_train, y_train = create_dataset(train, look_back)
x_test, y_test = create_dataset(test, look_back)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
y_train = np.repeat(y_train.reshape(-1, 1), 20, axis=1).reshape(-1, 20, 1)
y_test = np.repeat(y_test.reshape(-1, 1), 20, axis=1).reshape(-1, 20, 1)
...
model = Sequential()
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(1, return_sequences=True))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])
model.summary()

model.fit(x_train, y_train, epochs=10, batch_size=64)
p = model.predict(x_test)
I want to predict the next value, so I ran
predictions = model.predict(x_train), whose shape is (62796, 20, 1),
and I followed the code from this post: how to use the Keras model to forecast for future dates or events?
future = []
currentStep = predictions[-20:, :, :]   # -20 is the last look_back number
for i in range(10):
    currentStep = model.predict(currentStep)
    future.append(currentStep)
With this code, the result stored in future looks very different from the first 4000 values of p = model.predict(x_test) (see the attached plots); the difference between the two results is very large.
Is this the right way to predict the next value?
I don't know where the code went wrong. I would appreciate your opinion.
full source is https://gist.github.com/Lay4U/654f70bd1fb9c4f7d5bdb21ddcb588ab
According to your code, you are trying to predict the next value using an LSTM. You have to reshape your input data correctly to reflect the time steps and features. Instead of
model.add(LSTM(512, return_sequences=True))
you have to write
model.add(LSTM(512, input_shape=(look_back, x)))
where x is the number of input features in your training data.
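A minimal sketch of that change, keeping the rest of the question's model unchanged (n_features is 1 here, assuming the univariate series used in the question):

from keras.models import Sequential
from keras.layers import LSTM, Dropout

n_features = 1  # one feature per time step for a univariate series

model = Sequential()
# declare the time steps and features explicitly on the first layer
model.add(LSTM(512, return_sequences=True, input_shape=(look_back, n_features)))
model.add(Dropout(0.3))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(1, return_sequences=True))
model.compile(loss='mean_squared_error', optimizer='rmsprop')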
Thank you
There are multiple methods you can try; there is no one right way at the moment. You can train a separate model for predicting t+1, t+2, ..., t+n: one LSTM model predicts t+1 while another predicts t+n. That is called a DIRMO strategy.
Your strategy (the recursive strategy) is particularly risky because the model can propagate its error through multiple time horizons.
You can find a good comparison of the alternative strategies in this paper:
https://www.sciencedirect.com/science/article/pii/S0957417412000528?via%3Dihub
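One way to implement the recursive strategy is to roll a window forward one step at a time, feeding the model's own output back in. A rough sketch against the question's model (which outputs a sequence of shape (1, look_back, 1); variable names are only illustrative):

import numpy as np

# recursive forecast: keep a rolling window of the last look_back observations;
# note that errors can compound across steps, which is the risk mentioned above
window = x_test[-1]                                # shape (look_back, 1), last known window
future = []
for _ in range(10):
    pred = model.predict(window[np.newaxis, ...])  # input (1, look_back, 1), output (1, look_back, 1)
    next_value = pred[0, -1, 0]                    # take the last time step as the next value
    future.append(next_value)
    # drop the oldest step and append the new prediction
    window = np.vstack([window[1:], [[next_value]]])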

How can I get multiple outputs in an LSTM network in Python with Keras and Tensorflow?

I am working with LSTMs in Keras and TensorFlow in Python for the first time, and I want to create a neural network with some layers that gives 10 output values. I generated multiple layers in a neural network, and I created an output Dense layer of 10 elements. I have the following code:
from pandas import DataFrame
from pandas import Series
from pandas import concat
from pandas import read_csv
from pandas import datetime
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sqrt
from matplotlib import pyplot
import numpy
from numpy import array
import math

# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

look_back = 10
epochs = 1000
batch_size = 50

data = data.astype('float32')
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(data)

# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(100, activation='tanh', inner_activation='hard_sigmoid', return_sequences=True))  # , input_shape=(1, look_back)))
model.add(LSTM(50, activation='tanh', inner_activation='hard_sigmoid', return_sequences=True))
model.add(LSTM(25, activation='tanh', inner_activation='hard_sigmoid'))
# I want 10 outputs
model.add(Dense(10))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, verbose=2)
But when I execute the code I get the following error message:
ValueError: Error when checking target: expected dense_1 to have shape (10,) but got array with shape (1,)
What can I do to solve the problem? I want predictions for the next 10 elements; that is why I put a final layer of 10 units.
From what you have said, the error ValueError: Error when checking target: expected dense_1 to have shape (10,) but got array with shape (1,) is due to a problem with your target: your targets are a list of single values, so you are trying to predict ten values while having only one to compare against. You need to rework the trainY matrix so that each element includes every value you wish to predict. For example, if you wish to predict the 5 values in the nearest future, each target row (i.e. each element) needs to have size 5 and contain all 5 values; that way you train the network to predict the 5 future values. It is just a reshaping with a roll to collect the future values: to be precise, for one input X you need a target y = [v1, v2, v3, v4, v5], so if your training inputs are [X1, X2, ...], then Y = [[v1, v2, v3, v4, v5], [v2, v3, v4, v5, v6], ...].
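A sketch of what that reshaping could look like, as a hypothetical multi-step variant of the question's create_dataset (the helper name create_dataset_multi is illustrative; n_out is chosen to match the Dense(10) output layer):

import numpy

# return n_out future values per sample so the targets match a Dense(n_out) layer
def create_dataset_multi(dataset, look_back, n_out=10):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - n_out + 1):
        dataX.append(dataset[i:i + look_back, 0])
        dataY.append(dataset[i + look_back:i + look_back + n_out, 0])
    return numpy.array(dataX), numpy.array(dataY)

trainX, trainY = create_dataset_multi(train, look_back, n_out=10)
# trainY now has shape (samples, 10), matching the Dense(10) output layer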

What is difference between setting number of epochs vs calling it repeatedly in a for loop?

# -*- coding: utf-8 -*-
import numpy as np
"""
from numpy.random import seed
seed(10)
from tensorflow import set_random_seed
import tensorflow as tf
set_random_seed(1)
import os
os.environ['PYTHONHASHSEED'] = '0'
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
from keras import backend as K
K.set_session(sess)
"""
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout, CuDNNGRU, CuDNNLSTM
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back):
        dataX.append(dataset[i:(i+look_back), 0])
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)

dataset = np.cos(np.arange(1000)*(20*np.pi/1000))[:, None]
plt.plot(dataset)
plt.show()

look_back = 30
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = np.reshape(testX, (testX.shape[0], testX.shape[1], 1))

batch_size = 1
model = Sequential()
model.add(CuDNNLSTM(16, batch_input_shape=(batch_size, look_back, 1), stateful=True, return_sequences=True))
model.add(Dropout(0.3))
model.add(CuDNNLSTM(16, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dropout(0.3))
model.add(Dense(16, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

for i in range(5):
    print(i)
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=1, shuffle=False)
    model.reset_states()

trainScore = model.evaluate(trainX, trainY, batch_size=batch_size, verbose=0)
print('Train Score: ', trainScore)
testScore = model.evaluate(testX[:252], testY[:252], batch_size=batch_size, verbose=0)
print('Test Score: ', testScore)

look_ahead = 250
trainPredict = [np.vstack([trainX[-1][1:], trainY[-1]])]
predictions = np.zeros((look_ahead, 1))
for i in range(look_ahead):
    prediction = model.predict(np.array([trainPredict[-1]]), batch_size=batch_size)
    predictions[i] = prediction
    trainPredict.append(np.vstack([trainPredict[-1][1:], prediction]))

plt.figure(figsize=(12, 5))
# plt.plot(np.arange(len(trainX)), np.squeeze(trainX))
# plt.plot(np.arange(200), scaler.inverse_transform(np.squeeze(trainPredict)[:, None][1:]))
# plt.plot(np.arange(200), scaler.inverse_transform(np.squeeze(testY)[:, None][:200]), 'r')
plt.plot(np.arange(look_ahead), predictions, 'r', label="prediction")
plt.plot(np.arange(look_ahead), dataset[train_size:(train_size+look_ahead)], label="test function")
plt.legend()
plt.show()
This code is based on this example:
https://github.com/sachinruk/PyData_Keras_Talk/blob/master/cosine_LSTM.ipynb
Instead of setting the number of epochs, the author uses a for loop. Can you just set epochs in the fit command?
Secondly, without setting a seed my results vary wildly between runs. I understand you need to set a seed to get reproducible results, but should results vary this much? In one case I get something that looks like a sine wave; sometimes I end up with a straight line; sometimes I end up with a sine wave of the wrong frequency. Is this much variability normal?
Since in this case we are using the LSTM to predict multiple points into the future, I understand that we can have compounding errors. I am wondering if the way fit is being called is maybe causing that. I have tried both methods and they seem to yield similar results, so I am lost as to why this is happening.
Here is an example of the various outputs I got:
https://imgur.com/a/esEaVf9
There are some differences between running single epochs in a for loop and specifying multiple epochs in a single fit call. For example, learning rate decay usually works / gets modified after each epoch. The following post has more specifics on this:
https://datascience.stackexchange.com/questions/26112/decay-parameter-in-keras-optimizers?rq=1
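To make the comparison concrete, the two patterns look roughly like this (a sketch using the question's model and data; callbacks and schedules keyed to the epoch counter, such as LearningRateScheduler, are one place the behaviour can differ):

# (a) one fit call with multiple epochs: epoch-keyed callbacks and schedules
#     see epochs 0..4 within a single call
model.fit(trainX, trainY, epochs=5, batch_size=batch_size, verbose=1, shuffle=False)

# (b) a loop of single-epoch fits: every call restarts at epoch 0, so anything
#     keyed to the epoch number starts over; here the loop also lets the
#     stateful LSTM's state be reset between passes over the data
for i in range(5):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=1, shuffle=False)
    model.reset_states()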
