Keras Deep Learning and Financial Returns - python

I am experimenting with TensorFlow via the Keras library, and before diving into predicting uncertainty I thought it might be a good idea to predict something certain. Therefore, I tried to predict weekly returns using daily price-level data. My input shape looks like this: (1000, 5, 2), i.e. 1000 matrices of the form:
Stock A   Stock B
    110       100
     95       101
     90       100
     89        99
    100       110
For Stock A the price at day t=0 is 110, 95 at t-1, and 100 at t-4 (the earliest of the five days). Thus the weekly return for Stock A would be 110/100 = 10%, and -10% for Stock B. Because I focus on predicting only Stock A's return for now, my y for this input matrix would just be the scalar 0.1. Furthermore, I want to make it a classification problem, so I build a one-hot encoded vector via to_categorical: label 1 if y is above 5%, 2 if it is below -5%, and 0 if it is in between. Hence my classification output for the matrix above would be:
0 1 0
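A short sketch of how such labels could be built (the variable names are mine, not from the post): threshold the weekly return, then one-hot encode with to_categorical.

import numpy as np
from tensorflow.keras.utils import to_categorical

returns = np.array([0.10, -0.12, 0.02])              # example weekly returns
labels = np.where(returns > 0.05, 1,
                  np.where(returns < -0.05, 2, 0))   # 1 = above 5%, 2 = below -5%, 0 = in between
y = to_categorical(labels, num_classes=3)            # 0.10 -> [0., 1., 0.]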
To simplify: I want my model to learn to calculate returns, i.e. divide the first value in the input matrix by the last value of the input matrix for Stock A and ignore the input for Stock B. That would give y. It is just a practice task for me before I get to more difficult ones, and the model should achieve a loss of zero because there is no uncertainty. What model do you propose for that? I tried the following and it does not converge at all. Training and validation weights are calculated via compute_sample_weight('balanced', ).
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import Input, LocallyConnected2D, LeakyReLU, Dense, Flatten
from tensorflow.keras.models import Model

# Callbacks: early stopping, checkpointing, and learning-rate reduction on plateau
Earlystop = EarlyStopping(monitor='val_loss', patience=150, mode='min', verbose=1, min_delta=0.0002, restore_best_weights=True)
checkpoint = ModelCheckpoint('nn', monitor='val_loss', verbose=1, save_best_only=True, mode='min', save_weights_only=False)
Plateau = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=30, verbose=1)

optimizer = optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, amsgrad=True)

# One 5x2 price matrix per sample, with a leading channel dimension
input_ = Input(batch_shape=(batch_size, 1, 5, 2))
model = LocallyConnected2D(16, kernel_size=(5, 1), padding='valid', data_format="channels_first")(input_)
model = LeakyReLU(alpha=0.01)(model)
model = Dense(128)(model)
model = LeakyReLU(alpha=0.01)(model)
model = Flatten()(model)
x1 = Dense(3, activation='softmax', name='0')(model)

final_model = Model(inputs=input_, outputs=[x1])
final_model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
history = final_model.fit(X_train, y_train, epochs=1000, batch_size=batch_size, verbose=2, shuffle=False,
                          validation_data=(X_valid, y_valid, valid_weight), sample_weight=train_weight,
                          callbacks=[Earlystop, checkpoint, Plateau])
I thought convolution might be good for this, and because every return is calculated individually I decided to go for a LocallyConnected layer. Do I need to add more layers for such a simple task?
EDIT: I transformed my input matrix to returns and the model converges successfully. So the input must be correct, but the model fails to find the division function. Are there any layers that would be suited to do that?
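One idea worth noting (an assumption on my part, not from the post): layers built from affine maps and pointwise nonlinearities struggle to represent division directly, but in log space the ratio becomes a difference, which a single Dense layer can express exactly, since log(p_first / p_last) = log(p_first) - log(p_last). A minimal sketch of a regression variant of the toy task:

import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda, Flatten, Dense
from tensorflow.keras.models import Model

inp = Input(shape=(5, 2))                 # 5 days of prices for 2 stocks
log_prices = Lambda(tf.math.log)(inp)     # division -> subtraction in log space
flat = Flatten()(log_prices)
out = Dense(1)(flat)                      # can hit the log-return of Stock A exactly
model = Model(inp, out)
model.compile(optimizer='adam', loss='mse')

With weights +1 on the first Stock A entry, -1 on the last, and 0 elsewhere, this network computes the log weekly return exactly, which is consistent with the observation that the model converges once the inputs are pre-transformed to returns.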

Related

Getting unreasonably good results when using a simple neural network for Price prediction

I am trying to predict GPU prices. For this I have prepared a custom dataset of shape (135, 39). I am using a simple multi-layer neural network.
I give my network a feature vector of size (39, 1) which includes today's crypto prices and GPU prices. My y vector consists only of GPU prices one month from now.
When I train the network on the shuffled data, the predictions I get turn out to be really good. When I start predicting prices 12 months from now, I still get great results. This does not make any sense at all.
(The original post included screenshots showing some of the results, the dataset, the input matrix X, and the output y.)
And now here's the code:
import tensorflow as tf
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split

# Targets are the GPU price columns
y_data = df.iloc[:, 23:39]
y_data_mean = y_data.mean()
y_data_std = y_data.std()

# Normalise the whole frame with its own mean/std
norm_df = (df - df.mean()) / df.std()

future_months = 1
X = norm_df[:len(df) - (5 * future_months)]        # features today
y = norm_df.iloc[(5 * future_months):, 23:39]      # GPU prices one month ahead
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=True)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(39,)))
model.add(Dense(25, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(tf.keras.layers.Dense(16))

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt, loss=[tf.keras.losses.MeanSquaredError()])
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=2000)
Here's the train/test MSE loss (shown as a plot in the original post). As per this graph I am not overfitting on the data. I should have used a separate validation set, but due to the small dataset I only kept train and test sets.
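For what it's worth, a hedged guess at the unreasonably good results (not stated in the post): norm_df is computed with statistics of the whole frame, and the shuffled split mixes future rows into training, so test information can leak into the model. A sketch of a leakage-free setup, reusing the df from the code above:

# Split chronologically first, then normalise with training statistics only
split = int(len(df) * 0.7)
train_df, test_df = df.iloc[:split], df.iloc[split:]

mean, std = train_df.mean(), train_df.std()   # computed on the training slice only
train_norm = (train_df - mean) / std
test_norm = (test_df - mean) / std            # the test set reuses training stats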

Poor LSTM Performance with Keras on Time Series Data

I'm trying to construct a network that will predict a Boolean target.
The data provided to the network contains both categorical and numerical entries but has already been properly processed. The data I am working with is time series data with 84 fields and 310,033 rows. All the data has been scaled to lie between 0 and 1. Every row represents one second of data.
I created a dataset, data, with shape (310033, 60, 500), and the target vector has shape (1000, 1). The time-step dimension was set to 60 because that is the maximum number of full 60-minute hours possible with the amount of data available.
Then I split data (X_train, X_test, y_train, y_test).
Is it okay to give a matrix like this to the LSTM model and expect a good prediction (if the relationships are there)? I ask because I get very poor performance. From what I have seen, people supply only 1D or 2D data and then reshape it into the 3D input the LSTM layer expects, which is what I have done here.
Below is the transformation code from 2D to 3D:
import math
import numpy as np
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(scaled, target, train_size=.7, shuffle=False)

# Generate lagged time steps: 3D framework for LSTM - currently in 2D
# As required for LSTM networks, reshape the input into N_samples x TimeSteps x Variables
hours = len(X_train) / 3600
hours = math.floor(hours)  # most full 60-min hours available in this subset of data
temp = []
# Pull hours into the three-dimensional field
for hr in range(hours, len(X_train) + hours):
    temp.append(scaled[hr - hours:hr, 0:scaled.shape[1]])
X_train = np.array(temp)  # train features in (70% x Hours x Variables)

hours = len(X_test) / 3600
hours = math.floor(hours)  # most full 60-min hours available in this subset of data
temp = []
# Pull hours into the three-dimensional field
for hr in range(hours, len(X_test) + hours):
    temp.append(scaled[hr - hours:hr, 0:scaled.shape[1]])
X_test = np.array(temp)  # test features in (30% x Hours x Variables)
Below is the Framework of the Model:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential()
# Layer 1 - returns a sequence of vectors
model.add(LSTM(128, return_sequences=True,
               input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dropout(0.15))  # 15% dropout layer
# model.add(BatchNormalization())
# Layer 2
model.add(LSTM(256, return_sequences=False))
model.add(Dropout(0.15))  # 15% dropout layer
# model.add(BatchNormalization())
# Layer 3 - returns a single vector
model.add(Dense(32))
# Output of 2 because we have 2 classes
model.add(Dense(2, activation='sigmoid'))

# Define optimiser
opt = tf.keras.optimizers.Adam(learning_rate=1e-5, decay=1e-6)
# Compile model
model.compile(loss='sparse_categorical_crossentropy',  # MSE = 'mse'; MAE = 'mae'; sparse_categorical_crossentropy
              optimizer=opt,
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=epoch, batch_size=batch,
                    validation_data=(X_test, y_test), verbose=3, shuffle=False)
I have experimented with many different architectures for the LSTM: single layer, multilayer, a double LSTM layer with two narrowing Dense layers (LSTM -> LSTM -> Dense(32) -> Dense(2)), batch normalization, etc.
Is there a suggested architecture for this type of time series data to improve performance? I was getting better results when the data had only a single time step (TimeStep = 1).
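For a Boolean target, one conventional baseline (a sketch under my own assumptions, not a known fix for this dataset) is a single sigmoid unit trained with binary cross-entropy, rather than two sigmoid units with sparse categorical cross-entropy:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(60, 84)),  # assuming windows of 60 steps over the 84 scaled fields
    Dropout(0.15),
    LSTM(64),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')        # one probability for the Boolean target
])
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              metrics=['accuracy'])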

How do I specify what column/feature I want to predict in an RNN?

I'm trying to use a time series dataset with 30 different features, and I want to predict future values for 3 of those features. Is there any way I can specify which features should be used for output, and how many outputs, using TensorFlow and scikit-learn? Or is that just done when I am creating the x_train, y_train, etc. sets? I want to predict the heat index, temperature, and humidity based on various meteorological factors (air pressure, HDD, CDD, pollution, etc.). The 3 factors I wish to predict are among the 30 total features.
I am using TensorFlow's RNN tutorial: https://www.tensorflow.org/tutorials/structured_data/time_series
univariate_past_history = 30
univariate_future_target = 0
x_train_uni, y_train_uni = univariate_data(uni_data, 0, 1930,
                                           univariate_past_history,
                                           univariate_future_target)
x_val_uni, y_val_uni = univariate_data(uni_data, 1930, None,
                                       univariate_past_history,
                                       univariate_future_target)
My data is given daily, so here, for example, I want to predict the next day using the last 30 days.
And this is my implementation of the training of the model:
import tensorflow as tf

BATCH_SIZE = 256
BUFFER_SIZE = 10000

train_univariate = tf.data.Dataset.from_tensor_slices((x_train_uni, y_train_uni))
train_univariate = train_univariate.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()
val_univariate = tf.data.Dataset.from_tensor_slices((x_val_uni, y_val_uni))
val_univariate = val_univariate.batch(BATCH_SIZE).repeat()

simple_lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(8, input_shape=x_train_uni.shape[-2:]),
    tf.keras.layers.Dense(1)
])
simple_lstm_model.compile(optimizer='adam', loss='mae')

for x, y in val_univariate.take(1):
    print(simple_lstm_model.predict(x).shape)

EVALUATION_INTERVAL = 200
EPOCHS = 30
simple_lstm_model.fit(train_univariate, epochs=EPOCHS,
                      steps_per_epoch=EVALUATION_INTERVAL,
                      validation_data=val_univariate, validation_steps=50)
EDIT: I understand that to increase the number of outputs I have to increase the Dense(1) value; what I want to understand is how to specify which features to output/predict.
You need to give the model.fit call the variables you want to learn from, in a shape compatible with an LSTM layer.
So, for example, without any code, a model like yours might take as input:
[batchsize, n_timestamps, n_features]
and output:
[batchsize, n_timestamps, m_features]
where n is the number of input features and m the number of output features.
You then need to give the model truth data of the same shape as the model output in order for it to calculate a loss.
So the model.fit call should be model.fit(x_train, y_train, ...), where y_train holds the truth vectors, with the same shape as the model output.
You have to design a model architecture that fits your needs and matches the outputs you expect. I made a toy example, but I have never really worked with this type of NN, so no idea if it makes sense for the problem.
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense, InputLayer, Reshape

ni_feats = 10
no_feats = 3
ndays = 30

model = tf.keras.Sequential([
    InputLayer((ndays, ni_feats)),
    LSTM(10),
    Dense(int(no_feats * ndays)),
    Reshape((ndays, no_feats))
])
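As for picking which of the 30 features to predict: that choice happens when you build y, not inside the model. A sketch, assuming data is an array of shape (n_samples, n_timestamps, 30) and using hypothetical column indices for heat index, temperature, and humidity:

import numpy as np

target_cols = [4, 7, 12]        # hypothetical indices of heat index, temperature, humidity
y = data[:, :, target_cols]     # shape (n_samples, n_timestamps, 3) - matches the model output above
x = data                        # all 30 features stay as input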

Regression with LSTM - python and Keras

I am trying to use an LSTM network in Keras to make predictions of time series data one step into the future. The data I have has 5 dimensions, and I am trying to use the previous 3 periods of readings to predict a value in the next period. I have normalised the data, removed all NaNs, etc., and this is the code I am trying to use to train the network:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

def Network_ii(IN, OUT, TIME_PERIOD, EPOCHS, BATCH_SIZE, LTSM_SHAPE):
    length = len(OUT)
    train_x = IN[:int(0.9 * length)]
    validation_x = IN[int(0.9 * length):]
    train_y = OUT[:int(0.9 * length)]
    validation_y = OUT[int(0.9 * length):]

    # Define network & callback:
    train_x = train_x.reshape(train_x.shape[0], 3, 5)
    validation_x = validation_x.reshape(validation_x.shape[0], 3, 5)

    model = Sequential()
    model.add(LSTM(units=128, return_sequences=True, input_shape=(train_x.shape[1], 3)))
    model.add(LSTM(units=128))
    model.add(Dense(units=1))
    model.compile(optimizer='adam', loss='mean_squared_error')

    train_y = np.asarray(train_y)
    validation_y = np.asarray(validation_y)
    history = model.fit(train_x, train_y, batch_size=BATCH_SIZE, epochs=EPOCHS,
                        validation_data=(validation_x, validation_y))

    # Score model
    score = model.evaluate(validation_x, validation_y, verbose=0)
    print('Test loss:', score)

    # Save model
    model.save(f"models/new_model")
I am attempting to roughly follow the steps outlined here: https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/
However, no matter what adjustments I make, whether changing the number of dimensions used to train the network or the length of the time period, I cannot get the model to give predictions that are not either 1 or 0, even though the target data in the array 'OUT' is continuous on [0, 1].
I think there may be something wrong with how I am setting up the Sequential() model, but I cannot see what to adjust. I am relatively new to this, so any help would be greatly appreciated.
You are probably using a prediction function that is not the standard one. Maybe you are using predict_classes?
The well-documented, standard one is model.predict.
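A short illustration of the difference (assuming the older Keras Sequential API, where predict_classes still exists):

continuous = model.predict(validation_x)       # regression outputs, floats on [0, 1]
hard = model.predict_classes(validation_x)     # thresholded 0/1 labels - not what you want here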

My Neural Network isn't learning (negative R_squared, always the same loss, categorical input data, regression)

I'm trying to get my neural network to work, but unfortunately it looks like I am missing something.
I have input data from different categories, for example the type of a machine ('abc', 'bcd', 'dca'). So one line of my input contains words from different distinct word categories. At the moment I have ~70,000 samples with 12 features.
First I use sklearn's LabelEncoder to transform every word into a number. The vocabulary size goes up to 17,903.
My simple network looks like this:
import numpy as np
import tensorflow as tf

# Start with the NN
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(np.amax(ml_input) + 1, 300, input_length=x_train.shape[1]),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation=tf.keras.activations.softmax),
    tf.keras.layers.Dense(1, activation=tf.keras.activations.linear)
])
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.01),
              loss=tf.keras.losses.mean_absolute_error,
              metrics=[R_squared])  # R_squared is a custom metric defined elsewhere

model.summary()

# Train the model
callback = [tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=5.0, patience=15),
            tf.keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.1, patience=5, min_delta=5.00, min_lr=0)]
history = model.fit(x_train, y_train, epochs=50, batch_size=64, verbose=2, callbacks=callback)
The loss in the first epoch is about 120 and after two epochs about 70, but then it doesn't change anymore. So after two epochs my net isn't learning anymore.
I have already tried other loss functions, standardizing my labels (they range from 3 to 500 minutes), more neurons, another Dense layer, and another activation function, but after two epochs the loss is always 70. My R_squared is something like -0.02; it changes, but always stays negative, near 0.
It seems like my network isn't learning at all.
Does anyone have an idea of what I am doing wrong?
Thanks for your help!
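One hedged guess (my own reading, not confirmed by the post): the softmax on the hidden Dense(500) layer forces its 500 activations to sum to 1, which starves the linear regression head of signal; ReLU is the usual choice for a hidden layer. A sketch with that single change, reusing ml_input and x_train from the code above:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(np.amax(ml_input) + 1, 300, input_length=x_train.shape[1]),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation='relu'),   # relu instead of softmax on the hidden layer
    tf.keras.layers.Dense(1, activation='linear')
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss='mean_absolute_error')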
