Keras LSTM - Input shape for time series prediction - python

I am trying to predict the output of a function. Eventually it will be multi-input, multi-output, but for now, just to get the mechanics right, I am trying to predict the output of the sine function. My dataset is as follows:
t0 t1
0 0.000000 0.125333
1 0.125333 0.248690
2 0.248690 0.368125
3 0.368125 0.481754
4 0.481754 0.587785
5 0.587785 0.684547
6 0.684547 0.770513
7 0.770513 0.844328
8 0.844328 0.904827
9 0.904827 0.951057
.....
There are 100 values in total. t0 is the current input and t1 is the next output I want to predict. The data is then split into train and test sets with scikit-learn:
x_train, x_test, y_train, y_test = train_test_split(wave["t0"].values, wave["t1"].values, test_size=0.20)
The problem happens in fit: I get an error saying the input has the wrong dimensions.
model = Sequential()
model.add(LSTM(128, input_shape=??? ,stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
batch_size=10, epochs=100,
validation_data=(x_test, y_test))
I've tried the answers to other questions on the site, but no matter what I try I cannot get Keras to accept the input shape.

The LSTM expects input data of shape (batch_size, time_steps, num_features). For sine-wave prediction, num_features is 1 and time_steps is the number of previous time points the LSTM should use for each prediction. In the example below, batch_size is 1, time_steps is 2 and num_features is 1.
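import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# dummy data: inputs are (batch_size, time_steps, num_features), targets are (batch_size, 1)
x_train = np.ones((1,2,1))
y_train = np.ones((1,1))
x_test = np.ones((1,2,1))
y_test = np.ones((1,1))
model = Sequential()
# input_shape leaves out the batch dimension: (time_steps, num_features)
model.add(LSTM(128, input_shape=(2,1)))
#for stateful
#model.add(LSTM(128, batch_input_shape=(1,2,1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
          batch_size=1, epochs=100,
          validation_data=(x_test, y_test))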
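To connect this to the sine data in the question, here is a minimal sketch of how a 1-D series could be cut into windows of the required shape. The helper name make_windows and the sampling grid for the sine wave are assumptions made for illustration; they are not from the question.

```python
import numpy as np

def make_windows(series, time_steps):
    """Slice a 1-D series into overlapping windows for an LSTM.

    Returns x with shape (samples, time_steps, 1) and y with shape
    (samples, 1), where y[i] is the value that follows window x[i].
    """
    series = np.asarray(series, dtype=np.float32)
    n_samples = len(series) - time_steps
    if n_samples <= 0:
        raise ValueError("series must be longer than time_steps")
    x = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    y = series[time_steps:]
    # add the trailing num_features axis (size 1 for a single sine wave)
    return x[..., np.newaxis], y[:, np.newaxis]

# 100 sine samples; the exact sampling grid is an assumption
wave = np.sin(np.linspace(0.0, 4.0 * np.pi, 100))
x, y = make_windows(wave, time_steps=2)
print(x.shape, y.shape)  # (98, 2, 1) (98, 1)
```

With windows like these, input_shape=(2, 1) in the model above matches the data. Note that shuffling the windows, as train_test_split does by default, is fine for a stateless LSTM but breaks the sample ordering that a stateful LSTM relies on.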

Related

Conv1D for classify non-image dataset show error ValueError : `logits` and `labels` must have the same shape

I found this paper, which reports that a convolutional neural network can achieve the best accuracy for non-image classification, so I want to use a CNN with a non-image dataset. I downloaded the Early Stage Diabetes Risk Prediction Dataset from Kaggle and created a CNN model with this code:
dataset = loadtxt('diabetes_data_upload.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:16]
Y = dataset[:,16]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = Sequential()
model.add(Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=10)
It shows this error:
ValueError: `logits` and `labels` must have the same shape, received ((None, 15, 1) vs (None,)).
How can I fix it?
You can use tf.keras.layers.Flatten(). Something like the code below can solve your problem.
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np
X = np.random.rand(100, 16)
Y = np.random.randint(0,2, size = 100) # <- Because you have two labels, I generate random 0,1
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, batch_size=10)
Update: thanks to Ameya, we can also solve this problem using only tf.keras.layers.GlobalAveragePooling1D().
(Thanks to Djinn and his comment, but consider: these are two different approaches that do different things. Flatten() preserves all the data and simply reshapes each sample's feature map into a 1D vector, whereas GlobalAveragePooling1D() averages over the steps axis and therefore discards information. Pooling layers with non-image data can significantly affect performance, but I've noticed that average pooling does the least "damage".)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.GlobalAveragePooling1D())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
7/7 [==============================] - 0s 2ms/step - loss: 0.6954 - accuracy: 0.0000e+00
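The difference between the two fixes is easiest to see in the shapes. The sketch below uses plain NumPy as a stand-in for the two layers; the random feature_map is an assumption that only mimics the shape of the Conv1D output (16 input steps with kernel_size=2 give 15 steps, each with 16 filters). Without either layer, Dense(1) is applied to the last axis only and returns (None, 15, 1), which is the shape in the error message.

```python
import numpy as np

batch, steps, filters = 4, 15, 16
# stand-in for the Conv1D output of shape (batch, steps, filters)
feature_map = np.random.rand(batch, steps, filters)

# What Flatten() does: keep every value, merge steps and filters into one axis
flattened = feature_map.reshape(batch, -1)

# What GlobalAveragePooling1D() does: average over the steps axis
pooled = feature_map.mean(axis=1)

print(flattened.shape, pooled.shape)  # (4, 240) (4, 16)
```

Either result is 2D, so the following Dense(1) produces (None, 1), which Keras can match against labels of shape (None,).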

CNN for classification of numerical dataset in CSV file

I am trying to apply a CNN on my numerical dataset from a CSV file, but I have problems with the dimensions. My Dataset consists of 26 Features/Columns and 1200 rows/samples. The dataset has 3 labels.
Dataset = pd.read_csv("...", header=0)
features = ['...']
x = Dataset [features]
y = Dataset .Classifier
sc = PowerTransformer()
x = sc.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.75)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(LSTM(64))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=100, batch_size=8, verbose=1)
accuracy = model.evaluate(x_test, y_test)
print(accuracy)
I get the following error:
ValueError: Error when checking input: expected conv1d_1_input to have 3 dimensions, but got array with shape (900, 26)
I am not sure how to reshape the data. As far as I know I only need a vector.
You are partly correct. You do need a vector, but it has to be of different dimensions.
Conv1D layer takes as input:
Input shape:
3+D tensor with shape: batch_shape + (steps, input_dim)
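The batch size of 8 in model.fit is not the problem: Keras splits the data into batches of 8 samples for you, and the network weights are updated once per batch. What is missing is the last axis. Each sample must have shape (steps, input_dim), so the whole array must be 3D.
What you have to do is reshape x_train from (900, 26) to (900, 26, 1), and x_test in the same way, so that every sample is a sequence of 26 steps with 1 value per step, and then feed that to your network.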
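As a minimal sketch of the shape that Conv1D needs here, the snippet below adds a trailing axis of size 1 to a 2D feature matrix. The random arrays are placeholders for the scaled data in the question (900 training rows and 300 test rows, from the 75/25 split of 1200 samples).

```python
import numpy as np

# placeholders for the scaled feature matrices from the question
x_train = np.random.rand(900, 26)
x_test = np.random.rand(300, 26)

# two equivalent ways to add the input_dim axis: (samples, steps) -> (samples, steps, 1)
x_train_3d = x_train[..., np.newaxis]
x_test_3d = x_test.reshape(x_test.shape[0], x_test.shape[1], 1)

print(x_train_3d.shape, x_test_3d.shape)  # (900, 26, 1) (300, 26, 1)
```

The reshaped arrays are what would be passed to model.fit and model.evaluate. The first Conv1D layer can then also be given input_shape=(26, 1) so the model is built before training.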

(ValueError) How to set data shape in RNN?

I have a problem with data shape in RNN model.
y_pred = model.predict(X_test_re) # X_test_re.shape (35,1,1)
It returned an error like below.
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 35 samples. Batch size: 32.
First question
I defined batch_size=10, so I can't understand why the error message says the batch size is 32.
Second question
When I modified the code as below,
model.predict(X_test_re[:32])
I got another error message, but I don't know what it means.
InvalidArgumentError: Incompatible shapes: [32,20] vs. [10,20]
[[{{node lstm_1/while/add_1}}]]
I built a model and fit it as below.
features = 1
timesteps = 1
batch_size = 10
K.clear_session()
model=Sequential()
model.add(LSTM(20, return_sequences=True, stateful=True,
batch_input_shape=(batch_size, timesteps, features)))
model.add(LSTM(20, stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
earyl_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)
hist = model.fit(X_train_re, y_train, # X_train_re.shape (70,1,1), y_train(70,)
batch_size=batch_size,
epochs=100,
verbose=1,
shuffle=False,
callbacks=[earyl_stop])
Up to and including model.fit, everything works without any problem.
+) source code
first, df looks like,
# split_train_test from dataframe
train,test = df[:-35],df[-35:]
# print(train.shape, test.shape) (70, 2) (35, 2)
# scaling
sc = MinMaxScaler(feature_range=(-1,1))
train_sc = sc.fit_transform(train)
test_sc = sc.transform(test)
# Split X,y (column t-1 is X)
X_train, X_test, y_train, y_test = train_sc[:,1], test_sc[:,1], train_sc[:,0], test_sc[:,0]
# reshape X_train
X_train_re = X_train.reshape(X_train.shape[0],1,1)
X_test_re = X_test.reshape(X_test.shape[0],1,1)
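Both errors are consistent with model.predict falling back to its default batch size of 32 when no batch_size is passed, while the stateful model was built with batch_input_shape=(10, 1, 1). One possible workaround, sketched below under that assumption, is to pad the test set to a multiple of 10 and pass batch_size=10 explicitly. The helper pad_to_batch_multiple is hypothetical and not part of Keras.

```python
import numpy as np

def pad_to_batch_multiple(x, batch_size):
    """Pad x along axis 0 by repeating its last sample until the number of
    samples is a multiple of batch_size.

    Returns the padded array and the original number of samples.
    """
    n = x.shape[0]
    remainder = n % batch_size
    if remainder == 0:
        return x, n
    pad = np.repeat(x[-1:], batch_size - remainder, axis=0)
    return np.concatenate([x, pad], axis=0), n

# same shape as X_test_re in the question; the values are placeholders
X_test_re = np.random.rand(35, 1, 1)
padded, n_real = pad_to_batch_multiple(X_test_re, batch_size=10)
print(padded.shape, n_real)  # (40, 1, 1) 35

# With the model from the question, prediction could then look like this:
# y_pred = model.predict(padded, batch_size=10)[:n_real]
```

The padded rows are dropped again by the [:n_real] slice. In a stateful model they still pass through the network, so it may be worth resetting the states before any further prediction.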

Regarding loss weighting in Keras regression problem for multiple outputs

I am running a Hyperas optimization for regression problem, with 3 predictors (X) and 2 targets (Y).
I did this, after ingesting the raw data:
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=111)
# Input layers and Hidden Layers
model = Sequential()
model.add(Dense({{choice([np.power(2,1),np.power(2,2),np.power(2,3),np.power(2,4),np.power(2,5)])}}, input_dim = X_train.shape[1]))
model.add(Activation({{choice(['tanh','relu', 'sigmoid'])}}))
model.add(Dropout({{uniform(0, 1)}}))
model.add(Dense({{choice([np.power(2,1),np.power(2,2),np.power(2,3),np.power(2,4),np.power(2,5)])}}))
model.add(Activation({{choice(['tanh','relu', 'sigmoid'])}}))
model.add(Dropout({{uniform(0, 1)}}))
# Output layer
model.add(Dense(Y_train.shape[1]))
model.add(Activation('linear'))
model.compile(loss='mae', metrics=['mae'],optimizer=optimizer, loss_weights=[0.6,0.4])
history = model.fit(X_train, Y_train,
batch_size={{choice([16,32,64,128])}},
epochs={{choice([20000])}},
verbose=2,
validation_data=(X_val, Y_val),
callbacks=callbacks_list)
However, when running this, it says:
ValueError: When passing a list as loss_weights, it should have one entry per model output. The model has 1 outputs, but you passed loss_weights=[1, 1]
I'm guessing it's due to the format of my inputs and outputs. However, I can't figure out the proper format in which I am supposed to feed them into the model.
Appreciate your advice please, thank you.
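A Sequential model ending in Dense(Y_train.shape[1]) has a single output tensor with two columns, so Keras counts it as one output, while a loss_weights list needs one entry per output. One possible way to keep a single output and still weight the two targets differently is to fold the weights into the loss itself. The NumPy sketch below only illustrates the arithmetic; weighted_mae is a hypothetical helper, and turning it into a custom Keras loss is left as an assumption.

```python
import numpy as np

def weighted_mae(y_true, y_pred, weights):
    """Mean absolute error with one weight per target column.

    y_true and y_pred have shape (samples, targets); weights has shape (targets,).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.asarray(weights, dtype=float)
    abs_err = np.abs(y_true - y_pred)             # shape (samples, targets)
    per_sample = (abs_err * weights).sum(axis=1)  # weighted sum over the targets
    return per_sample.mean()                      # average over the samples

# made-up values for two samples and two targets
y_true = np.array([[1.0, 10.0], [2.0, 20.0]])
y_pred = np.array([[2.0, 10.0], [2.0, 18.0]])
loss = weighted_mae(y_true, y_pred, weights=[0.6, 0.4])
print(loss)
```

With weights of 0.5 and 0.5 this reduces to the ordinary MAE averaged over both targets. The alternative is a model with two separate output layers, in which case a two-entry loss_weights list is accepted.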

LSTM output Dense expects 2d input

I have features of shape (size, 2) and labels of shape (size, 1), i.e. for each [x, y] in the features the label is z. I want to build an LSTM in Keras for this, since each label also depends on previous inputs: one or several of them (I believe that number is a hyperparameter).
Sample dataset values are:-
features labels
[1,2] [5]
[3,4] [84]
Here is what I have done so far:-
print(labels.shape) #prints (1414,1)
print(features.shape) #prints (1414,2)
look_back=2
# reshape input to be [samples, time steps, features]
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
labels = np.reshape(labels, (labels.shape[0], 1, 1))
X_train, X_test, y_train, y_test = train_test_split(features,labels,test_size=0.2)
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back))) #executing correctly
model.add(Dense(1)) #error here is "ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (1131, 1, 1)"
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)
So can anyone please help me build a minimal LSTM example that runs with my code? Thank you. I don't understand how a Dense layer can have 2 dimensions; I thought its argument was just an integer giving the number of units in the layer.
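You must not reshape your labels: Dense(1) produces output of shape (batch_size, 1), so the labels have to stay 2D with shape (samples, 1).
Try this: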
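# reshape only the features to [samples, time steps, features]; labels stay (samples, 1)
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)
model = Sequential()
# after the reshape, the number of features is on axis 2
model.add(LSTM(4, input_shape=(1, features.shape[2])))
model.add(Dense(1))
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)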
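The code above uses a single time step per sample. If each label should depend on several previous rows, as the question suggests, the features could instead be cut into windows of look_back rows. The sketch below assumes that the label of the last row in each window is the target; make_lookback_windows is a hypothetical helper and the data is made up.

```python
import numpy as np

def make_lookback_windows(features, labels, look_back):
    """Build LSTM inputs from 2D features and their labels.

    Returns x with shape (samples, look_back, n_features) and y with shape
    (samples, n_labels), where y[i] is the label of the last row in x[i].
    """
    features = np.asarray(features, dtype=np.float32)
    labels = np.asarray(labels, dtype=np.float32).reshape(len(features), -1)
    n_samples = len(features) - look_back + 1
    if n_samples <= 0:
        raise ValueError("look_back is larger than the number of rows")
    x = np.stack([features[i:i + look_back] for i in range(n_samples)])
    y = labels[look_back - 1:]
    return x, y

# made-up data: 10 rows with 2 features each and 1 label per row
features = np.arange(20, dtype=np.float32).reshape(10, 2)
labels = features.sum(axis=1, keepdims=True)
x, y = make_lookback_windows(features, labels, look_back=3)
print(x.shape, y.shape)  # (8, 3, 2) (8, 1)
```

The matching first layer would then use input_shape=(look_back, 2). The labels remain 2D, as in the answer above.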
