TensorFlow LSTM-based RNN - Incorrect and Constant Prediction - Python

I hope someone can point out where I am going wrong with my RNN. The long and short of my problem is that no matter the structure of my network, the predictions always come out essentially constant (the plot illustrating this is not reproduced here).
I have tried 1, 2, 3, and 4 layers of LSTMs, each with varying neuron counts and either relu or tanh activation functions. For the run shown, the network was set up as:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout

# Stacked LSTMs; the final LSTM returns a single vector per sequence.
model = Sequential()
model.add(LSTM(128, activation='relu', return_sequences=True, input_shape=(length, scaled_train_data.shape[1])))
model.add(LSTM(256, activation='relu', return_sequences=True))
model.add(LSTM(256, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, activation='relu'))
model.add(Dense(scaled_train_data.shape[1]))
model.compile(optimizer='adam', loss='mse')
The actual training of the model passes without incident (training log not reproduced here).
My data is financial data: there are around 70k rows, with an approximate 70/30 train/test split.
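(The windowing step isn't shown in the question. For what it's worth, one common way to build such (length, n_features) windows in Keras is TimeseriesGenerator — a sketch, assuming scaled_train_data is a 2D NumPy array and length is the window size:)

from keras.preprocessing.sequence import TimeseriesGenerator

# Sliding windows of `length` timesteps over the scaled features;
# the target for each window is the next row of the data.
generator = TimeseriesGenerator(scaled_train_data, scaled_train_data,
                                length=length, batch_size=32)
# model.fit_generator(generator, epochs=20)  # Keras 2.x API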
Where am I going wrong? Thanks!

From asking around and reading up, it seems RNNs might not be the best solution for financial / random-walk data, at least with the setup I am using. I wonder whether using averages might produce better results?
Anyway, moving on to Reinforcement Learning.
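(For reference, a minimal moving-average baseline of the sort alluded to above might look like this — a sketch, where the prices series is a made-up stand-in:)

import pandas as pd

# Made-up stand-in for a price series.
prices = pd.Series([100.0, 101.2, 100.8, 102.5, 103.1, 102.9])

# Naive baseline: predict each next value as the mean of the previous k observations.
k = 3
predictions = prices.rolling(window=k).mean().shift(1)
print(predictions)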

Related

Any suggestions to improve my CNN model (always the same low test accuracy)?

I am working on a project to detect the presence of a person in a painting. I have 4000 training images and 1000 test images, resized to (256, 256, 3).
I tried a CNN model with three (Conv, MaxPool, BatchNormalization) blocks and two fully connected layers.
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                          Flatten, Dense, Dropout)

model = Sequential()
model.add(Conv2D(32, kernel_size=(7, 7), activation='relu', input_shape=shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(7, 7), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(96, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
The train accuracy always converges to 1 (within just 20-50 epochs) and the test accuracy always remains constant at around 0.67.
I tried the following:
I tried changing the size of the layers and adding more layers.
I tried data augmentation.
I tried smaller images (128x128x3).
But I always get the same results.
I don't know if this is due to the small number of images I have, or if the architecture isn't big enough to learn from complex paintings.
I thought of trying transfer learning (but I don't know whether it will help, as it is my first time trying it). Also, do you have any idea where I can find trained models?
So, I am asking for some suggestions to improve my model.
It might be that you are overfitting on your training data; in that case you can use dropout.
The other thing: if you have not already normalized your data, you can do that. I am not sure whether it would be much help, but give it a try with something like:
X_training = X_training / X_training.max()
I tried using VGG16 (frozen) with 4 fully connected layers and the validation accuracy went up to 0.83. Also, I am using ImageDataGenerator.
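(For anyone following along, a frozen-VGG16 setup along those lines might look roughly like this — a sketch, not the poster's exact code; the dense-layer sizes are guesses:)

from keras.applications import VGG16
from keras.models import Sequential
from keras.layers import Flatten, Dense, Dropout

# Load the VGG16 convolutional base pre-trained on ImageNet, without its classifier head.
base = VGG16(weights='imagenet', include_top=False, input_shape=(256, 256, 3))
base.trainable = False  # freeze the convolutional base

model = Sequential()
model.add(base)
model.add(Flatten())
# Four fully connected layers, as the poster describes; the sizes here are guesses.
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])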

keras fit time/step difference

Building a DQN agent, and trying to understand why calling fit in my code is orders of magnitude slower (over 1 s) than in another example I found (about 1 ms). The neural nets are almost the same; the example has more connections, but that's the only difference (my alpha is set to the same value as the example NN's learning rate).
I have no idea what would cause such a difference in performance time. I thought maybe it was the way the data was formatted before calling fit, but it looks like everything is the same.
My results vs. the example's results (timing logs not reproduced here): over 1 s per fit call vs. roughly 1 ms.
My NN:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

q = Sequential()
q.add(Dense(24, input_dim=n_states, activation='relu'))
q.add(Dense(24, activation='relu'))
q.add(Dense(n_actions, activation='linear'))
q.compile(loss='mse', optimizer=Adam(lr=alpha))
Example NN:
model = Sequential()
model.add(Dense(32, input_dim=nS, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(nA, activation='linear'))
model.compile(loss='mse', optimizer=Adam(lr=0.01))
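(One way to narrow this down is to time a single fit call on an identical minibatch for both networks — a minimal sketch, reusing the q model and n_states / n_actions from above; the arrays are random stand-ins:)

import time
import numpy as np

# Random stand-ins for one DQN minibatch.
batch = 32
states = np.random.rand(batch, n_states)
targets = np.random.rand(batch, n_actions)

start = time.perf_counter()
q.fit(states, targets, epochs=1, verbose=0)
print(f"fit call took {(time.perf_counter() - start) * 1000:.1f} ms")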

Keras LSTM fit underfitting

I have time series training data of about 5000 numbers. For each 100 numbers, I am trying to predict the 101st. At the end of the series, I feed the predicted numbers back into the model to predict further ahead of the time series.
The attached graph (not reproduced here) shows the training data, the test data and the prediction output. Currently, the model seems to be under-fitting. I would like to know which hyperparameters should be changed, or whether I need to restructure my input and output data.
I am using the following LSTM network.
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Flatten, Dense
from keras.optimizers import Adam

model = Sequential()
model.add(LSTM(128, input_shape=(bl, 1), activation='relu', return_sequences=True))
model.add(Dropout(0.1))
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(20, activation='relu'))
model.add(Dense(1))
# Note: 'accuracy' is not a meaningful metric for a regression loss.
model.compile(optimizer=Adam(lr=0.0001), loss='mean_squared_error', metrics=['accuracy'])
model.fit(y_ba_tr_in, y_ba_tr_out,
          epochs=20,
          batch_size=5, shuffle=False, verbose=2)
y_ba_tr_in.shape = (4961, 100, 1)
y_ba_tr_out.shape = (4961, 1)
Something you could try is taking return_sequences=True out of your last LSTM layer. I believe this is generally the approach when you intend to predict only the next timestep.
After that modification, you also shouldn't need the subsequent Flatten() and Dense() layers, as shown in the sketch below.
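(Applied to the model above, that change might look like this — a sketch; I have also dropped the intermediate Dense(20), which may or may not be what you want:)

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.optimizers import Adam

model = Sequential()
model.add(LSTM(128, input_shape=(bl, 1), activation='relu', return_sequences=True))
model.add(Dropout(0.1))
# The last LSTM returns only its final output, so Flatten() is unnecessary.
model.add(LSTM(128))
model.add(Dropout(0.1))
model.add(Dense(1))
model.compile(optimizer=Adam(lr=0.0001), loss='mean_squared_error')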

Neural Network works worse than Random Forest

I have a classification problem whose target contains 5 classes, with 15 features (all continuous), 1 million rows of training data, and 0.5 million rows of validation data. E.g.:
shape of X_train = (1000000, 15)
shape of X_validation = (500000, 15)
First, I used a Random Forest, which gets 88% average accuracy.
After that I tried many neural network architectures; the best one got ~80% average accuracy on both training and validation data, which is worse than the Random Forest.
(I don't know much about designing neural network architectures.)
The following is the best of my NN architectures (~80% avg. accuracy):
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adadelta

model = Sequential()
model.add(Dense(1000, input_dim=15, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(900, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(800, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(700, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(600, activation='relu'))
model.add(Dense(5, activation='softmax'))  # output layer
adadelta = Adadelta()
model.compile(loss='categorical_crossentropy', optimizer=adadelta, metrics=['accuracy'])
Batch size = 128 and epochs = 100.
I have read this question. The answer points out that a NN needs a large amount of data and some regularization. I think my data size is good enough, and I have also tried a higher dropout rate and L2 regularization, but it is still not working.
What could the problem be?
This is biological data about which I have no domain knowledge, so I'm sorry that I can't explain it. I've plotted the feature distributions (plot not reproduced here); all features lie between 0 and 3.
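(For reference, the L2 regularization mentioned above looks like this on a Keras Dense layer — a minimal sketch; the 0.01 coefficient is an arbitrary example value:)

from keras import regularizers
from keras.layers import Dense

# L2 weight penalty on a hidden layer; the coefficient is a hyperparameter to tune.
layer = Dense(1000, activation='relu',
              kernel_regularizer=regularizers.l2(0.01))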

CNN with Python and Keras

I'm new to machine learning and Keras. I made a neural network with Keras for regression that looks like this:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(57, input_dim=44, kernel_initializer='normal', activation='relu'))
model.add(Dense(45, activation='relu'))
model.add(Dense(35, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(Dense(18, activation='relu'))
model.add(Dense(15, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='linear'))
My data after preprocessing has 44 dimensions, so could you please give me an example of how I could make a CNN?
Originally it looks like this: https://scontent.fskp1-1.fna.fbcdn.net/v/t1.0-9/40159383_10204721730878434_598395145989128192_n.jpg?_nc_cat=0&_nc_eui2=AeEYA4Nb3gomElC9qt0kF6Ou86P7jidco_LeHxEkmCB0-oVA9YKVe9VAh41SF25YomKTqKdkS96E18-sTCBidxJdbml4OV7FvFuAOWxI4mRafQ&oh=e81f4f56ebdf15e9c6eefbb078b8a982&oe=5BFD4157
A convolutional neural network is not the best choice in this case. That said, you can do this easily with Conv1D:
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Embedding(44, 100))
model.add(keras.layers.Conv1D(50, kernel_size=1, strides=1))
model.add(keras.layers.GlobalAveragePooling1D())
# model.add(keras.layers.Dense(10, activation=tf.nn.relu))
model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))
To answer your question up front: I don't think you can use CNNs for your problem. Generally, when people say they are using CNNs they mean 2D convolution, which operates on 2D spatial data (images). In NLP there is 1D convolution, which people use to find local patterns in sequential data; I don't think 1D convolution is relevant in your case either. If you are from an ML background, you can think of regression using feed-forward neural networks as analogous to polynomial regression: intuitively, you let the network decide which polynomial degree best fits the data.
You can add 2D convnet layers like this:
model.add(Conv2D(32, (3, 3), input_shape=(3, 150, 150)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
where
model.add(Conv2D(<feature maps>, (<kernel size>), input_shape=(<input-tensor-shape>)))
But be careful: 2D convnet layers are mathematically different from dense layers, so you can't stack them directly. To stack 2D convnet layers with dense layers, you'll have to flatten the feature maps (you'll normally do this at the end to get your "fully-connected layer"):
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
You'll find a lot of good tutorials on creating convnets with Keras; this one, for example, focuses on image recognition, and the examples above are taken from that article.
To find out what a convolutional network does, I'd recommend this article.
Edit:
But I share the opinion that it might not be useful to use 2D convnet layers for your example. Your data structure seems rather "flat", and 2D convnets only make sense when you have multidimensional tensors as input.
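(For completeness, if one did want to try a 1D convolution over the 44 flat features anyway, the input would first need a channel axis — a sketch under that assumption, with random stand-in data:)

import numpy as np
from keras.models import Sequential
from keras.layers import Conv1D, GlobalAveragePooling1D, Dense

# Hypothetical data: N samples of 44 continuous features, reshaped to (44, 1)
# so Conv1D can slide a kernel across the feature axis.
X = np.random.rand(100, 44).reshape(-1, 44, 1)
y = np.random.rand(100)

model = Sequential()
model.add(Conv1D(32, kernel_size=3, activation='relu', input_shape=(44, 1)))
model.add(GlobalAveragePooling1D())
model.add(Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=1, verbose=0)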
