Keras LSTM - Input shape for time series prediction

Keras LSTM - Input shape for time series prediction - python

I am trying to predict the output of a function. (Eventually it will be multi input multi output) but for now just to get the mechanics right I am trying to predict the output of sin function. My dataset is as follows,
t0 t1
0 0.000000 0.125333
1 0.125333 0.248690
2 0.248690 0.368125
3 0.368125 0.481754
4 0.481754 0.587785
5 0.587785 0.684547
6 0.684547 0.770513
7 0.770513 0.844328
8 0.844328 0.904827
9 0.904827 0.951057
.....
Total of 100 values. t0 is the current input t1 is the next output I want to predict. Then data is split into train/test via scikit,
x_train, x_test, y_train, y_test = train_test_split(wave["t0"].values, wave["t1"].values, test_size=0.20)
Problem happens in fit, I get an error that says input wrong dimensions.
model = Sequential()
model.add(LSTM(128, input_shape=??? ,stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
batch_size=10, epochs=100,
validation_data=(x_test, y_test))
I've tried other questions on the site to fix the problem but no matter what i try i can not get keras to recognize correct input.

The LSTM expects the input data to be of shape (batch_size, time_steps, num_features). In sine-wave prediction, the num_features is 1, the time_steps is how many previous time-points the LSTM should use for prediction. In the example below, batch size is 1, time_steps is 2 and num_features is 1.
x_train = np.ones((1,2,1))
y_train = np.ones((1,1))
x_test = np.ones((1,2,1))
y_test = np.ones((1,1))
model = Sequential()
model.add(LSTM(128, input_shape=(2,1)))
#for stateful
#model.add(LSTM(128, batch_input_shape=(1,2,1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
batch_size=1, epochs=100,
validation_data=(x_test, y_test))

Related

Conv1D for classify non-image dataset show error ValueError : `logits` and `labels` must have the same shape

I found this paper they present Convolutional Neural Network can get the best accuracy for non-image classify. So, I want to use CNN with non-image dataset. I download Early Stage Diabetes Risk Prediction Dataset form kaggle. I create CNN moldel like this code.
dataset = loadtxt('diabetes_data_upload.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:16]
Y = dataset[:,16]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = Sequential()
model.add(Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=10)
It show error like this.
ValueError: `logits` and `labels` must have the same shape, received ((None, 15, 1) vs (None,)).
How to fix it ?

You can use tf.keras.layers.Flatten(). Something like below can solve youe problem.
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np
X = np.random.rand(100, 16)
Y = np.random.randint(0,2, size = 100) # <- Because you have two labels, I generate ranom 0,1
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, batch_size=10)
Update by thanks Ameya, we can solve this problem by only using tf.keras.layers.GlobalAveragePooling1D() too.
(by thanks Djinn and his_comment, but consider: these are two different approaches that do different things. Flatten() preserves all data, and just converts input tensors to a 1D tensor BUT GlobalAveragePooling1D() tries to generalize and loses data. Pooling layers with non-image data can significantly affect performance but I've noticed AveragePooling does the least "damage,")
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.GlobalAveragePooling1D())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
7/7 [==============================] - 0s 2ms/step - loss: 0.6954 - accuracy: 0.0000e+00

CNN for classification of numerical dataset in CSV file

I am trying to apply a CNN on my numerical dataset from a CSV file, but I have problems with the dimensions. My Dataset consists of 26 Features/Columns and 1200 rows/samples. The dataset has 3 labels.
Dataset = pd.read_csv("...", header=0)
features = ['...']
x = Dataset [features]
y = Dataset .Classifier
sc = PowerTransformer()
x = sc.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.75)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(LSTM(64))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=100, batch_size=8, verbose=1)
accuracy = model.evaluate(x_test, y_test)
print(accuracy)
I get the following error:
ValueError: Error when checking input: expected conv1d_1_input to have 3 dimensions, but got array with shape (900, 26)
I am not sure how to reshape the data. As far as I know I only need a vector.

You are partly correct. You do need a vector, but it has to be of different dimensions.
Conv1D layer takes as input:
Input shape:
3+D tensor with shape: batch_shape + (steps, input_dim)
In the model.fit function you set your batch size to 8. This means that you have to give sets of 8 samples per step (step = iteration before the network weights are updated).
What you have to do is generate sets (or batches) of 8 samples and then feed them to your network.

(ValueError) How to set data shape in RNN?

I have a problem with data shape in RNN model.
y_pred = model.predict(X_test_re) # X_test_re.shape (35,1,1)
It returned an error like below.
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 35 samples. Batch size: 32.
first Question
I can't understand because I defined batch_size=10, but why error msg says batch size:32?
Second Question
when I modified the code as below
model.predict(X_test_re[:32])
I also got an error msg but I don't know what it means.
InvalidArgumentError: Incompatible shapes: [32,20] vs. [10,20]
[[{{node lstm_1/while/add_1}}]]
I built a model and fit it as below.
features = 1
timesteps = 1
batch_size = 10
K.clear_session()
model=Sequential()
model.add(LSTM(20, return_sequences=True, stateful=True,
batch_input_shape=(batch_size, timesteps, features)))
model.add(LSTM(20, stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
earyl_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)
hist = model.fit(X_train_re, y_train, # X_train_re.shape (70,1,1), y_train(70,)
batch_size=batch_size,
epochs=100,
verbose=1,
shuffle=False,
callbacks=[earyl_stop])
Until fit model, it works without any problem.
+) source code
first, df looks like,
# split_train_test from dataframe
train,test = df[:-35],df[-35:]
# print(train.shape, test.shape) (70, 2) (35, 2)
# scaling
sc = MinMaxScaler(feature_range=(-1,1))
train_sc = sc.fit_transform(train)
test_sc = sc.transform(test)
# Split X,y (column t-1 is X)
X_train, X_test, y_train, y_test = train_sc[:,1], test_sc[:,1], train_sc[:,0], test_sc[:,0]
# reshape X_train
X_train_re = X_train.reshape(X_train.shape[0],1,1)
X_test_re = X_test.reshape(X_test.shape[0],1,1)

Regarding loss weighting in Keras regression problem for multiple outputs

I am running a Hyperas optimization for regression problem, with 3 predictors (X) and 2 targets (Y).
I did this, after ingesting the raw data:
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=111)
# Input layers and Hidden Layers
model = Sequential()
model.add(Dense({{choice([np.power(2,1),np.power(2,2),np.power(2,3),np.power(2,4),np.power(2,5)])}}, input_dim = X_train.shape[1]))
model.add(Activation({{choice(['tanh','relu', 'sigmoid'])}}))
model.add(Dropout({{uniform(0, 1)}}))
model.add(Dense({{choice([np.power(2,1),np.power(2,2),np.power(2,3),np.power(2,4),np.power(2,5)])}}))
model.add(Activation({{choice(['tanh','relu', 'sigmoid'])}}))
model.add(Dropout({{uniform(0, 1)}}))
# Output layer
model.add(Dense(Y_train.shape[1]))
model.add(Activation('linear'))
model.compile(loss='mae', metrics=['mae'],optimizer=optimizer, loss_weights=[0.6,0.4])
history = model.fit(X_train, Y_train,
batch_size={{choice([16,32,64,128])}},
epochs={{choice([20000])}},
verbose=2,
validation_data=(X_val, Y_val),
callbacks=callbacks_list)
However, when running this, it says:
ValueError: When passing a list as loss_weights, it should have one entry per model output. The model has 1 outputs, but you passed loss_weights=[1, 1]
I'm guessing its due to the format of my inputs and outputs. However, I can't figure out the proper format for which I am supposed to feed it into the model.
Appreciate your advice please, thank you.

LSTM output Dense expects 2d input

I have features in shape of (size,2) and labels in shape of (size,1) i.e. for [x,y] in feature the label will be z. I want to build an LSTM in keras that can do such job since the feature is linked somehow with the previous inputs i.e. 1 or multiple(I believe its a hyperparameter).
Sample dataset values are:-
features labels
[1,2] [5]
[3,4] [84]
Here is what I have done so far:-
print(labels.shape) #prints (1414,2)
print(features.shape) #prints(1414,1)
look_back=2
# reshape input to be [samples, time steps, features]
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
labels = np.reshape(labels, (labels.shape[0], 1, 1))
X_train, X_test, y_train, y_test = train_test_split(features,labels,test_size=0.2)
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back))) #executing correctly
model.add(Dense(1)) #error here is "ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (1131, 1, 1)"
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)
So can anyone please help me build a minimal LSTM example to run my code? Thank you. I don't know how can dense layer have 2 dimensions I mean it is an integer telling how many units to use in the dense layer.

You must not reshape your labels.
Try this:
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
model = Sequential()
model.add(LSTM(4, input_shape=(1, features.shape[1])))
model.add(Dense(1))
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Keras LSTM - Input shape for time series prediction - python

Related

Conv1D for classify non-image dataset show error ValueError : `logits` and `labels` must have the same shape

CNN for classification of numerical dataset in CSV file

(ValueError) How to set data shape in RNN?

Regarding loss weighting in Keras regression problem for multiple outputs

LSTM output Dense expects 2d input

Categories

Resources