I am new to deep learning and LSTMs (with Keras). I am trying to solve a multi-step-ahead time series prediction problem. I have 3 time series: A, B and C, and I want to predict the values of C. I am training an LSTM that is fed the 3 most recent time steps to predict the next 3 steps into the future.
The input data looks like:
X = [[[A0, B0, C0],[A1, B1, C1],[A2, B2, C2]],[[ ...]]]
with dimensions: (1000, 3, 3). The output is:
y = [[C3, C4, C5],[C4, C5, C6],...]
with dimensions: (1000, 3).
I am using a simple LSTM with 1 hidden layer (50 neurons).
I set up the LSTM with Keras as:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation

n_features = 3   # A, B, C
lag = 3          # time steps fed to the model
neurons = 50
ahead = 3        # time steps predicted

model = Sequential()
model.add(LSTM(neurons, input_shape=(lag, n_features)))
model.add(Dropout(0.2))
model.add(Dense(ahead))
model.add(Activation('linear'))
model.compile(loss='mae', optimizer='adam')
model.fit(X, y, epochs=50)
This model works fine. Now, I'd like to predict the values of B as well (using the same input). So I tried to reshape the target in a similar way to the input, which has multiple features:
y = [[[B3, C3],[B4, C4],[B5, C5]],[[ ...]]]
so that it has dimensions: (1000, 3, 2). However, this gives me an error:
Error when checking target: expected activation_5 to have 2 dimensions,
but got array with shape (1000, 3, 2)
I guess the structure of the network needs to change. I tried to modify model.add(Dense(ahead)) with no success. Should I reshape y differently? Is the structure of the network wrong?
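Not a confirmed answer from this thread, just a sketch of my own showing two shapes of network that would accept a (1000, 3, 2) target: either predict ahead * 2 values with one Dense layer and reshape them, or return a sequence and apply a Dense layer per time step.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Reshape, TimeDistributed

lag, ahead, n_features, neurons = 3, 3, 3, 50

# Option 1: flat Dense output, reshaped to (3 steps, 2 series)
model = Sequential()
model.add(LSTM(neurons, input_shape=(lag, n_features)))
model.add(Dropout(0.2))
model.add(Dense(ahead * 2))
model.add(Reshape((ahead, 2)))           # output shape: (batch, 3, 2)
model.compile(loss='mae', optimizer='adam')

# Option 2: return a sequence and apply the same Dense layer at every step
model2 = Sequential()
model2.add(LSTM(neurons, input_shape=(lag, n_features), return_sequences=True))
model2.add(TimeDistributed(Dense(2)))    # output shape: (batch, 3, 2)
model2.compile(loss='mae', optimizer='adam')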
I have 9 features and one output variable to be predicted; the window size is 5.
The code works very well without the TimeDistributed wrapper.
Model input shape: feature_tensor.shape = (1649, 5, 9); model output shape: y_train.shape = (1649,).
This is my code:
#Build the network model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import TimeDistributed, Conv1D, AveragePooling1D, Flatten, LSTM, Dense

act_fn = 'relu'
modelq = Sequential()
modelq.add(TimeDistributed(Conv1D(filters=105, kernel_size=2, activation=act_fn, input_shape=(None, feature_tensor.shape[1], feature_tensor.shape[2]))))
modelq.add(TimeDistributed(AveragePooling1D(pool_size=1)))
modelq.add(TimeDistributed(Flatten()))
modelq.add(LSTM(50))
modelq.add(Dense(64, activation=act_fn))
modelq.add(Dense(1))
#Compile the model
modelq.compile(optimizer='adam', loss='mean_squared_error')
modelq.fit(feature_tensor, y_train, batch_size=1, epochs=epoch_count)
The error statement is:
ValueError: Input 0 of layer conv1d is incompatible with the layer: : expected min_ndim=3, found ndim=2. Full shape received: (5, 9)
I feel like there is something wrong with the dimensionality of feature_tensor during model fitting (the last command), but I don't know what's wrong with it.
Your intuition is right: the problem is the dimensionality of feature_tensor. If you take a look at the documentation of TimeDistributed, you see an example with images and Conv2D layers. There the input has to have the shape (batch_size, time steps, x_dim, y_dim, channels). Since you use time series, you need (batch_size, time steps, 1, features). E.g. you can reshape your data with NumPy:
feature_tensor = np.reshape(feature_tensor, (-1, 5, 1, 9))
However, I am not sure it is useful to combine Conv1D with TimeDistributed here, since in that case you apply the convolution only across the features and not across temporally contiguous values, which is where a 1D convolution should be applied.
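To illustrate that last point, here is a minimal sketch of my own (not part of the original answer): dropping TimeDistributed lets Conv1D consume the original (1649, 5, 9) feature_tensor directly and convolve along the 5 time steps.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, AveragePooling1D, LSTM, Dense

act_fn = 'relu'
modelq = Sequential()
# Conv1D takes 3-D input (batch, timesteps, features), so the kernel slides over time.
modelq.add(Conv1D(filters=105, kernel_size=2, activation=act_fn, input_shape=(5, 9)))
modelq.add(AveragePooling1D(pool_size=1))
modelq.add(LSTM(50))
modelq.add(Dense(64, activation=act_fn))
modelq.add(Dense(1))
modelq.compile(optimizer='adam', loss='mean_squared_error')
# feature_tensor can then stay in its original (1649, 5, 9) shape for fitting.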
I would like to create a 'Sequential' model (a time series model, as you might have guessed) that takes 20 days of past data with a feature size of 2 and predicts 1 day into the future with the same feature size of 2.
I found out you need to specify the batch size for a stateful LSTM model, so if I specify a batch size of 32, for example, the final output shape of the model is (32, 2), which I think means the model is predicting 32 days into the future rather than 1.
How would I go about fixing this?
Also, asking before I actually run into the problem: if I specify a batch size of 32, but I want to predict on an input of shape (1, 20, 2), would the model still predict correctly, since I changed the batch size from 32 to 1? Thank you.
You don't need to specify batch_size, but you should feed a 3-D tensor:
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras import Sequential

features = 2
dim = 128

new_model = Sequential([
    LSTM(dim, stateful=True, return_sequences=True),
    Dense(2)
])

number_of_sequences = 1000
sequence_length = 20

input = tf.random.uniform([number_of_sequences, sequence_length, features], dtype=tf.float32)
output = new_model(input)  # shape is (number_of_sequences, sequence_length, features)
predicted = output[:, -1]  # shape is (number_of_sequences, features): the last time step
A shape of (32, 2) means that your sequence length is 32.
Batch size is a training parameter (how many sequences are fed to the model before backpropagating the error - see stochastic gradient descent). It doesn't affect your data, which should be 3-D: (number of sequences, length of sequence, features).
If you need to predict for only one sequence, just feed a tensor of shape (1, 20, 2) to the model.
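For the original goal (20 past days in, 1 day out, feature size 2), here is a minimal non-stateful sketch of my own, not taken from the answer above; without stateful=True the batch dimension stays flexible, so an input of shape (1, 20, 2) works directly.

import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras import Sequential

one_step_model = Sequential([
    LSTM(128, input_shape=(20, 2)),  # return_sequences=False: keep only the last state
    Dense(2)                         # one predicted day with 2 features
])
one_step_model.compile(optimizer='adam', loss='mae')

single_window = tf.random.uniform([1, 20, 2])  # a batch of one 20-day sequence
next_day = one_step_model(single_window)       # shape (1, 2)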
I have a data set with the shape (3340, 6). I want to use a CNN-LSTM to read a sequence of 30 rows and predict the next row's 6 elements. From what I have read, this is considered a multi-parallel time series. I have been primarily following this Machine Learning Mastery tutorial and am having trouble implementing the CNN-LSTM architecture for a multi-parallel time series.
I have used this function to split the data into frames of 30 time steps:
# split a multivariate sequence into samples
from numpy import array

def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
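For reference (my own addition, assuming the raw data sits in a NumPy array called dataset of shape (3340, 6)), the frames shown below come from a call like:

X, y = split_sequences(dataset, n_steps=30)
# X.shape == (3310, 30, 6), y.shape == (3310, 6)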
Here is a sample of the data frames produced by the function above.
# 30 Time Step Input Frame X[0], X.shape = (3310, 30, 6)
[4.951e-02, 8.585e-02, 5.941e-02, 8.584e-02, 8.584e-02, 5.000e+00],
[8.584e-02, 9.307e-02, 7.723e-02, 8.080e-02, 8.080e-02, 4.900e+01],
[8.080e-02, 8.181e-02, 7.426e-02, 7.474e-02, 7.474e-02, 2.000e+01],
[7.474e-02, 7.921e-02, 6.634e-02, 7.921e-02, 7.921e-02, 4.200e+01],
...
# 1 Time Step Output Array y[0], y.shape = (3310, 6)
[6.550e-02, 7.690e-02, 6.243e-02, 7.000e-02, 7.000e-02, 9.150e+02]
Here is the model that I am using:
from keras.models import Sequential
from keras.layers import TimeDistributed, Conv1D, MaxPooling1D, Flatten, LSTM, Dense

model = Sequential()
model.add(TimeDistributed(Conv1D(64, 1, activation='relu'), input_shape=(None, 30, 6)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(Dense(6))
model.compile(optimizer='adam', loss='mse')
When I run model.fit, I receive the following error:
ValueError: Error when checking input: expected time_distributed_59_input to have
4 dimensions, but got array with shape (3310, 30, 6)
I am at a loss as to how to properly shape my input so that I can get this model learning. I have built several Conv2D nets in the past, but this is my first time series model, so I apologize if there's an obvious answer here that I am missing.
Remove TimeDistributed from Conv1D and MaxPooling1D; they support 3D inputs directly.
Remove Flatten(), as it destroys the timesteps-channels relationship.
Add TimeDistributed to the last Dense layer, since Dense does not support the 3D input returned by LSTM(return_sequences=True); alternatively, use return_sequences=False (see the sketch below).
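Here is a sketch of my own (not the answerer's code) applying those points via the return_sequences=False alternative, since the question's target y has shape (3310, 6):

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
# Conv1D and MaxPooling1D accept 3D input (samples, timesteps, features) directly,
# so no TimeDistributed wrapper and no Flatten are needed.
model.add(Conv1D(64, 1, activation='relu', input_shape=(30, 6)))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(50, activation='relu'))   # return_sequences=False: last state only
model.add(Dense(6))                      # one 6-element row, matching y.shape (3310, 6)
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=50)               # X.shape (3310, 30, 6), y.shape (3310, 6)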
I'm building a few simple models in Keras to improve my knowledge of deep learning, and encountering some issues I don't quite understand how to debug.
I want to use a 1D CNN to perform regression on some time-series data. My input feature tensor is of shape N x T x D, where N is the number of data points, T is the sequence length (number of time steps), and D is the number of dimensions. My target tensor is of shape N x T x 1 (1 because I am trying to output a scalar value at each time step).
I've set up my model architecture like this:
feature_tensor.shape
# (75584, 40, 38)
target_tensor.shape
# (75584, 40, 1)
from keras.models import Model
from keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense
from keras.optimizers import Adam

SEQUENCE_LENGTH = 40   # T, from the shapes above
DIMENSIONS = 38        # D, from the shapes above

inputs = Input(shape=(SEQUENCE_LENGTH, DIMENSIONS))
conv1 = Conv1D(filters=64, kernel_size=3, activation='relu')
x = conv1(inputs)
x = MaxPooling1D(pool_size=2)(x)
x = Flatten()(x)
x = Dense(100, activation='relu')(x)
predictions = Dense(1, activation="linear")(x)

model = Model(inputs, predictions)
opt = Adam(lr=1e-5, decay=1e-4 / 200)
model.compile(loss="mean_absolute_error", optimizer=opt)
When I attempt to train my model, however, I get the following output:
r = model.fit(cleaned_tensor, target_tensor, epochs=100, batch_size=2058)
ValueError: Error when checking target: expected dense_164 to have 2
dimensions, but got array with shape (75584, 40, 1).
The first two numbers are familiar: 75584 is the # of samples, 40 is the sequence length.
When I inspect the model summary, I see that the output of the Flatten layer has size 1216.
However, my colleague and I stared at the code for a long time and could not understand why the shape (75584, 40, 1) was turning up at the dense layer, given the architecture.
Could someone point me in the direction of what I am doing wrong?
Try reshaping your target variable to N x T, and it looks like your final dense layer should output 40 units rather than 1 (I think).
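A sketch of that suggestion (my own addition, equally hedged): squeeze the trailing 1 out of the target and widen the final Dense layer to the sequence length, reusing the inputs, x, opt and cleaned_tensor defined in the question.

import numpy as np

# (75584, 40, 1) -> (75584, 40): one scalar target per time step
target_2d = np.squeeze(target_tensor, axis=-1)

# Same architecture as above, but the final layer outputs T = 40 values per sample
predictions = Dense(SEQUENCE_LENGTH, activation="linear")(x)
model = Model(inputs, predictions)
model.compile(loss="mean_absolute_error", optimizer=opt)

r = model.fit(cleaned_tensor, target_2d, epochs=100, batch_size=2058)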
I have been trying to perform regression using tflearn and my own dataset.
Using tflearn I have been trying to implement a convolutional network based on an example using the MNIST dataset. Instead of using the MNIST dataset I have tried replacing the training and test data with my own. My data is read in from a CSV file and has a different shape from the MNIST data. I have 225 features, which represent a 15*15 grid, and a target value. In the example I replaced lines 24-30 with the following (and included import numpy as np):
import numpy as np

# read in train and test CSVs where there are 225 features (15*15) and a target
csvTrain = np.genfromtxt('train.csv', delimiter=",")
X = np.array(csvTrain[:, :225])   # first 225 columns are the features
Y = csvTrain[:, 225]              # last column is the target

csvTest = np.genfromtxt('test.csv', delimiter=",")
testX = np.array(csvTest[:, :225])
testY = csvTest[:, 225]

# reshape features for each instance into 15*15; targets are just a single number
X = X.reshape([-1, 15, 15, 1])
testX = testX.reshape([-1, 15, 15, 1])

## Building convolutional network
network = input_data(shape=[None, 15, 15, 1], name='input')
I get the following error:
ValueError: Cannot feed value of shape (64,) for Tensor u'target/Y:0',
which has shape '(?, 10)'
I have tried various combinations and have seen a similar question on Stack Overflow, but have not had success. The example on that page does not work for me and throws a similar error, and I do not understand the answer provided there or the answers to similar questions.
How do I use my own data?
Short answer
In line 41 of the MNIST example, you also have to change the output size from 10 to 1, i.e. replace network = fully_connected(network, 10, activation='softmax') with network = fully_connected(network, 1, activation='linear'). Note that this also removes the final softmax.
Looking at your code, it seems you have a continuous target value Y, which means using the L2 loss mean_square (see the tflearn documentation for all the available losses):
network = regression(network, optimizer='adam', learning_rate=0.01,
                     loss='mean_square', name='target')
Also, reshape Y and testY to have shape (batch_size, 1).
Long answer: How to analyse the error and find the bug
Here is how to analyse the error:
The error is Cannot feed value ... for Tensor 'target/Y', which means it comes from the feed_dict argument Y.
Again, according to the error, you are trying to feed a Y value of shape (64,), whereas the network expects a shape of (?, 10).
It expects a shape of (batch_size, 10), because originally it is a network for MNIST (10 classes).
We now want to change what the network expects for Y.
In the code, we see that the last layer, fully_connected(network, 10, activation='softmax'), returns an output of size 10.
We change that to an output of size 1 without a softmax: fully_connected(network, 1, activation='linear').
In the end, it was not a bug, but a wrong model architecture.
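Putting the short answer together, here is a sketch of my own using only the calls already quoted above and assuming the variable names of the tflearn MNIST example:

# Replace the MNIST output layer (10 classes, softmax) with a single linear unit.
network = fully_connected(network, 1, activation='linear')

# Use an L2 regression loss instead of categorical cross-entropy.
network = regression(network, optimizer='adam', learning_rate=0.01,
                     loss='mean_square', name='target')

# Targets must be column vectors of shape (n_samples, 1).
Y = Y.reshape(-1, 1)
testY = testY.reshape(-1, 1)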