Understanding Keras LSTM NN input & output for binary classification - python

I am trying to create a simple LSTM network that would, based on the last 16 time frames, provide some output. Let's say I have a dataset with 112000 rows (measurements) and 7 columns (6 features + class). What I understand is that I have to "pack" the dataset into some number of 16-element-long batches. With 112000 rows that would mean 112000/16 = 7000 batches, hence a NumPy 3D array with shape (7000, 16, 7). Splitting this array into train and test data, I get these shapes:
xtrain.shape == (5000, 16, 6)
ytrain.shape == (5000, 16)
xtest.shape == (2000, 16, 6)
ytest.shape == (2000, 16)
My model looks like this:
model = keras.Sequential()
model.add(keras.layers.LSTM(8, input_shape=(16, 6), stateful=True, batch_size=16, name="input"))
model.add(keras.layers.Dense(5, activation="relu", name="hidden1"))
model.add(keras.layers.Dense(1, activation="sigmoid", name="output"))
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(xtrain, ytrain, batch_size=16, epochs=10)
However after trying to fit the model I get this error:
ValueError: Error when checking target: expected output to have shape (1,) but got array with shape (16,)
What I guess is wrong is that the model expects a single output per sequence (so the ytrain shape should be (5000,)) instead of 16 outputs per sequence (one for every entry in it - (5000, 16)).
If that is the case, should I, instead of packing the data like this, create a 16-element-long window for every output? That would mean having
xtrain.shape == (80000, 16, 6)
ytrain.shape == (80000,)
xtest.shape == (32000, 16, 6)
ytest.shape == (32000,)

You are close with the last comments of the question. Since it's a binary classification problem, you should have 1 output per input, so you need to get rid of the 16 in your ys and replace it with a 1.
Besides, because the network is stateful, the number of training samples must be divisible by the batch size, so you can use 5008 samples, for example.
In fact:
ytrain.shape == (5000, 1)
gets past the error you mention, but raises a new one:
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 5000 samples
Which is addressed by ensuring that:
xtrain.shape == (5008, 16, 6)
ytrain.shape == (5008, 1)
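Since the accepted fix means one label per 16-step window, here is a minimal sketch of that packing (not from the thread; it assumes the class is the last of the 7 columns and uses dummy data):

import numpy as np

dataset = np.random.rand(112000, 7)  # dummy stand-in: 6 features + class

def make_windows(data, window=16):
    X, y = [], []
    for end in range(window, len(data) + 1):
        X.append(data[end - window:end, :6])  # a 16 x 6 feature window
        y.append(data[end - 1, 6])            # class of the window's last row
    return np.array(X), np.array(y)

X, y = make_windows(dataset)   # X: (111985, 16, 6), y: (111985,)
# with stateful=True, also trim the splits to a multiple of the batch size,
# e.g. xtrain, ytrain = X[:5008], y[:5008]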

Related

How can I constrain the output of a keras.Sequential() model based on the final row of the input data?

Assuming I have a model that is doing time-series predictions, with input shape (n_samples, n_steps, n_features), where n_steps is the number of historical data rows we use to predict the output (a single row of data with two features), how would I constrain one of these output features based on the final input row?
For example, let's say we feed the following into the model:
input = (n_samples, n_steps, n_features) = (1, 5, 3), and we expect the output to have the shape output = (1, 1, 2) after the Dense(2) layer, how would I constrain the value of output(1, 1, 1) to be less than input(1, 5, 1) when training the model with model.fit()?
I hope this makes sense.
I have looked into layers.Lambda functions but am unsure how to apply them here.
A simple example of the code would be
model = keras.Sequential([
    layers.LSTM(units=64, input_shape=(n_steps, n_features)),
    layers.Dense(2)
])
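No answer is shown for this thread, but one possible approach (a sketch, not from the thread) is to build the model with the functional API and add a Lambda head that subtracts a positive softplus offset from the final input value, which guarantees the constrained output feature stays strictly below it. The choice of feature index 0 for the constrained feature is an assumption:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

n_steps, n_features = 5, 3

inputs = keras.Input(shape=(n_steps, n_features))
h = layers.LSTM(64)(inputs)
raw = layers.Dense(2)(h)

# the final input row's first feature, shape (batch, 1) -- assumed to be
# the value the first output feature must stay below
last_val = layers.Lambda(lambda t: t[:, -1, 0:1])(inputs)

def constrain(args):
    raw, last = args
    # softplus(raw) > 0, so bounded < last always holds
    bounded = last - tf.nn.softplus(raw[:, 0:1])
    return tf.concat([bounded, raw[:, 1:2]], axis=-1)

outputs = layers.Lambda(constrain)([raw, last_val])
model = keras.Model(inputs, outputs)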

How to correctly shape my CNN-LSTM input layer

I have a data set with the shape (3340, 6). I want to use a CNN-LSTM to read a sequence of 30 rows and predict the next row's (6) elements. From what I have read, this is considered a multi-parallel time series. I have been primarily following this Machine Learning Mastery tutorial and am having trouble implementing the CNN-LSTM architecture for a multi-parallel time series.
I have used this function to split the data into 30-step frames:
from numpy import array

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
Here is a sample of the data frames produced by the function above.
# 30 Time Step Input Frame X[0], X.shape = (3310, 30, 6)
[4.951e-02, 8.585e-02, 5.941e-02, 8.584e-02, 8.584e-02, 5.000e+00],
[8.584e-02, 9.307e-02, 7.723e-02, 8.080e-02, 8.080e-02, 4.900e+01],
[8.080e-02, 8.181e-02, 7.426e-02, 7.474e-02, 7.474e-02, 2.000e+01],
[7.474e-02, 7.921e-02, 6.634e-02, 7.921e-02, 7.921e-02, 4.200e+01],
...
# 1 Time Step Output Array y[0], y.shape = (3310, 6)
[6.550e-02, 7.690e-02, 6.243e-02, 7.000e-02, 7.000e-02, 9.150e+02]
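For reference, the frames above come from a call like this (a sketch, assuming the raw data sits in a (3340, 6) NumPy array named dataset):

X, y = split_sequences(dataset, n_steps=30)
print(X.shape, y.shape)  # (3310, 30, 6) (3310, 6)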
Here is the model that I am using:
model = Sequential()
model.add(TimeDistributed(Conv1D(64, 1, activation='relu'), input_shape=(None, 30, 6)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu', return_sequences=True))
model.add(Dense(6))
model.compile(optimizer='adam', loss='mse')
When I run model.fit, I receive the following error:
ValueError: Error when checking input: expected time_distributed_59_input to have
4 dimensions, but got array with shape (3310, 30, 6)
I am at a loss as to how to properly shape my input layer so that I can get this model learning. I have done several Conv2D nets in the past, but this is my first time-series model, so I apologize if there's an obvious answer here that I am missing.
- Remove TimeDistributed from Conv1D and MaxPooling1D; these layers support 3D inputs directly.
- Remove Flatten(), as it destroys the timesteps-channels relationship.
- Add TimeDistributed to the last Dense layer, as Dense does not support the 3D input returned by LSTM(return_sequences=True); alternatively, use return_sequences=False. A sketch of the corrected model follows below.
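Putting those three points together (taking the simpler return_sequences=False route), a sketch of the corrected model would look like this:

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
# 3D input (samples, 30, 6) feeds Conv1D directly -- no TimeDistributed
model.add(Conv1D(64, 1, activation='relu', input_shape=(30, 6)))
model.add(MaxPooling1D(pool_size=2))
# no Flatten; the LSTM consumes the (15, 64) sequence and returns a vector
model.add(LSTM(50, activation='relu'))
model.add(Dense(6))
model.compile(optimizer='adam', loss='mse')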

Mismatch in expected Keras shapes after pooling

I'm building a few simple models in Keras to improve my knowledge of deep learning, and encountering some issues I don't quite understand how to debug.
I want to use a 1D CNN to perform regression on some time-series data. My input feature tensor has shape N x T x D, where N is the number of data points, T is the sequence length (number of timesteps), and D is the number of dimensions. My target tensor has shape N x T x 1 (1 because I am trying to output a scalar value).
I've set up my model architecture like this:
feature_tensor.shape
# (75584, 40, 38)
target_tensor.shape
# (75584, 40, 1)
from keras.models import Model
from keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense
from keras.optimizers import Adam

inputs = Input(shape=(SEQUENCE_LENGTH, DIMENSIONS))
conv1 = Conv1D(filters=64, kernel_size=3, activation='relu')
x = conv1(inputs)
x = MaxPooling1D(pool_size=2)(x)
x = Flatten()(x)
x = Dense(100, activation='relu')(x)
predictions = Dense(1, activation="linear")(x)
model = Model(inputs, predictions)
opt = Adam(lr=1e-5, decay=1e-4 / 200)
model.compile(loss="mean_absolute_error", optimizer=opt)
When I attempt to train my model, however, I get the following output:
r = model.fit(cleaned_tensor, target_tensor, epochs=100, batch_size=2058)
ValueError: Error when checking target: expected dense_164 to have 2
dimensions, but got array with shape (75584, 40, 1).
The first two numbers are familiar: 75584 is the # of samples, 40 is the sequence length.
When I inspect the model summary, I see that the expected output from the Flatten layer should be 1216 (which checks out: Conv1D with kernel size 3 reduces the 40 timesteps to 38, pooling halves that to 19, and 19 x 64 filters = 1216).
However, my colleague and I stared at the code for a long time and could not understand why a target of shape (75584, 40, 1) was incompatible with this architecture by the time it reached the dense layer.
Could someone point me in the direction of what I am doing wrong?
Try reshaping your target variable to N x T, and it looks like your final dense layer should be 40 rather than 1 (I think).
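A sketch of that suggestion applied to the code above (everything else unchanged):

# one 40-element target vector per sample instead of (40, 1)
target_2d = target_tensor.reshape(-1, 40)        # (75584, 40)
# ...and the head widens to match
predictions = Dense(40, activation="linear")(x)
model = Model(inputs, predictions)
model.compile(loss="mean_absolute_error", optimizer=opt)
r = model.fit(cleaned_tensor, target_2d, epochs=100, batch_size=2058)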

keras model predict without fit, what does it mean?

I see the following example code in the TensorFlow 2.0 API docs:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# the model will take as input an integer matrix of size (batch, input_length).
# the largest integer (i.e. word index) in the input should be no larger
# than 999 (vocabulary size).
# now model.output_shape == (None, 10, 64), where None is the batch dimension.
input_array = np.random.randint(1000, size=(32, 10))
model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
assert output_array.shape == (32, 10, 64)
I have used the Keras API for a few days; compile, fit and then predict is my usual workflow.
What does the example above mean without the fit step?
It runs the model with its freshly initialized (random) parameters, without any fit() step. The example is just there to illustrate the output shape of the Embedding layer.
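As a quick check (not part of the original example), you can see that the weights already exist, randomly initialized, before any training:

weights = model.get_weights()
print(weights[0].shape)  # (1000, 64): the untrained embedding matrix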

How should we classify a full sequence?

I want to classify full sequences into two categories. I searched a lot on the web but found no result for this. My preferred way is to use an LSTM model from Keras to classify a "full" sequence of varying rows into two categories. The problem with this approach is the different shapes of X and y. This is a sample code I wrote to explain my problem.
import numpy as np
from keras.layers import Dense, LSTM
from keras.models import Sequential

# (no of samples, no of rows, step, feature)
feat = np.random.randint(0, 100, (150, 250, 25, 10))
# (no of samples, binary_output)
target = np.random.randint(0, 2, (150, 1))

# model
model = Sequential()
model.add(LSTM(10, input_shape=(25, 10), return_sequences=True))
model.add(LSTM(10, return_sequences=False))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
print(model.summary())

for i in range(target.shape[0]):
    X = feat[i]
    y = target[i]
    model.fit(X, y)
Here I have 150 sample sequences that I want to classify as 0 or 1. The problem is this error:
ValueError: Input arrays should have the same number of samples as target arrays. Found 250 input samples and 1 target samples.
If there is no way to do this with deep learning methods, can you suggest any other machine learning algorithms?
EDIT
Several people have asked for clarification about this, so:
#(no of samples, no of rows,step, feature)
feat= np.random.randint(0, 100, (150, 250, 25,10))
150 is the number of samples (think of it as 150 separate time-series datasets).
250 and 10 describe a time series with 250 rows and 10 columns.
(250, 25, 10) adds a 25-step window dimension so that the data can be passed to the Keras LSTM input.
The problem is that when you do
X=feat[i]
y=target[i]
This removes the first axis, which makes X.shape == (250, 25, 10) and y.shape == (1,). When you call model.fit(X, y), Keras then assumes that X has 250 samples and y has only one sample, which is why you get that error.
You can solve this by extracting slices of feat and target, for example by calling
X=feat[i:i+batch_size]
y=target[i:i+batch_size]
Where batch_size is how many samples you want to use per iteration. If you set batch_size = 1, you should get the behavior you intended in your code.
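A quick NumPy check (illustrative only) of why slicing fixes the shapes where integer indexing does not:

import numpy as np

feat = np.random.randint(0, 100, (150, 250, 25, 10))
target = np.random.randint(0, 2, (150, 1))

print(feat[0].shape)      # (250, 25, 10) -- integer indexing drops the sample axis
print(feat[0:1].shape)    # (1, 250, 25, 10) -- slicing keeps it
print(target[0].shape)    # (1,)
print(target[0:1].shape)  # (1, 1)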
