I have a problem with the data shape in my RNN model.
y_pred = model.predict(X_test_re) # X_test_re.shape (35,1,1)
It returned the error below:
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 35 samples. Batch size: 32.
First question
I defined batch_size=10, so why does the error message say batch size 32?
Second question
When I modified the code as below,
model.predict(X_test_re[:32])
I got another error message, and I don't know what it means:
InvalidArgumentError: Incompatible shapes: [32,20] vs. [10,20]
[[{{node lstm_1/while/add_1}}]]
I built the model and fit it as shown below.
features = 1
timesteps = 1
batch_size = 10
K.clear_session()
model=Sequential()
model.add(LSTM(20, return_sequences=True, stateful=True,
batch_input_shape=(batch_size, timesteps, features)))
model.add(LSTM(20, stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
# note: monitor='val_loss' requires validation data in fit(); without it, use monitor='loss'
early_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)
hist = model.fit(X_train_re, y_train,  # X_train_re.shape (70,1,1), y_train (70,)
                 batch_size=batch_size,
                 epochs=100,
                 verbose=1,
                 shuffle=False,
                 callbacks=[early_stop])
Up to and through model.fit, everything works without any problem.
Update: source code
First, here is how the data is prepared from df:
# split_train_test from dataframe
train,test = df[:-35],df[-35:]
# print(train.shape, test.shape) (70, 2) (35, 2)
# scaling
sc = MinMaxScaler(feature_range=(-1,1))
train_sc = sc.fit_transform(train)
test_sc = sc.transform(test)
# Split X,y (column t-1 is X)
X_train, X_test, y_train, y_test = train_sc[:,1], test_sc[:,1], train_sc[:,0], test_sc[:,0]
# reshape X_train
X_train_re = X_train.reshape(X_train.shape[0],1,1)
X_test_re = X_test.reshape(X_test.shape[0],1,1)
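For context: model.predict defaults to batch_size=32 when none is given (hence "Batch size: 32"), and a stateful network built with batch_input_shape=(10, 1, 1) only accepts batches of exactly 10, which is why predict(X_test_re[:32]) still fails with [32,20] vs [10,20]. Below is a minimal sketch of a prediction call consistent with both constraints; trimming 35 samples down to 30 is just one possible workaround (padding to 40 would be another):
# Stateful prediction sketch: use the model's fixed batch size and a sample
# count divisible by it (35 -> 30 here by trimming).
n = (X_test_re.shape[0] // batch_size) * batch_size
y_pred = model.predict(X_test_re[:n], batch_size=batch_size)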
Related
I am creating an LSTM model with the following parameters:
embed_dim = 128
lstm_out = 200
batch_size = 32
model = Sequential()
model.add(Embedding(2500, embed_dim,input_length = X.shape[1]))
model.add(Dropout(0.2))
model.add(LSTM(lstm_out))
model.add(Dense(2,activation='sigmoid'))
model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])
print(model.summary())
Xtrain, Xtest, ytrain, ytest = train_test_split(X, train['target'], test_size = 0.2, shuffle=True)
print(Xtrain.shape, ytrain.shape)
print(Xtest.shape, ytest.shape)
model.fit(Xtrain, ytrain, batch_size =batch_size, epochs = 1, verbose = 5)
but I am receiving the following error:
ValueError: Shapes (32, 1) and (32, 2) are incompatible
Can you help me with this error?
Your y_train comes from a single column of a pandas DataFrame, so each label is a single value. That is suitable if your classification problem is a binary 0/1 problem; in that case you only need a single neuron in the output layer.
model = Sequential()
model.add(Embedding(2500, embed_dim, input_length=X.shape[1]))
model.add(Dropout(0.2))
model.add(LSTM(lstm_out))
# Only one neuron in the output layer
model.add(Dense(1, activation='sigmoid'))
# A single sigmoid output pairs with binary_crossentropy, not categorical_crossentropy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
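Alternatively (a sketch of my own, not from the original answer), if you would rather keep two output neurons, switch the activation to softmax and one-hot encode the labels so their shape matches (batch, 2):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Dropout, LSTM, Dense
from tensorflow.keras.utils import to_categorical

model = Sequential()
model.add(Embedding(2500, embed_dim, input_length=X.shape[1]))
model.add(Dropout(0.2))
model.add(LSTM(lstm_out))
model.add(Dense(2, activation='softmax'))  # two neurons need one-hot targets
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

ytrain_cat = to_categorical(ytrain, num_classes=2)  # (samples,) -> (samples, 2)
model.fit(Xtrain, ytrain_cat, batch_size=batch_size, epochs=1, verbose=5)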
I am trying to apply a CNN to my numerical dataset from a CSV file, but I have problems with the dimensions. My dataset consists of 26 features/columns and 1200 rows/samples, with 3 classes.
Dataset = pd.read_csv("...", header=0)
features = ['...']
x = Dataset[features]
y = Dataset.Classifier
sc = PowerTransformer()
x = sc.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.75)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(LSTM(64))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=100, batch_size=8, verbose=1)
accuracy = model.evaluate(x_test, y_test)
print(accuracy)
I get the following error:
ValueError: Error when checking input: expected conv1d_1_input to have 3 dimensions, but got array with shape (900, 26)
I am not sure how to reshape the data. As far as I know I only need a vector.
You are partly correct: you do need to reshape, but each sample needs an extra dimension rather than staying a flat vector. The Conv1D layer takes as input:
Input shape:
3+D tensor with shape: batch_shape + (steps, input_dim)
Your x_train has shape (900, 26), which is only 2-D. You have to add a feature (channel) dimension so each sample has shape (26, 1), i.e. 26 steps with 1 input feature each. The batch_size=8 you set in model.fit only controls how many samples are processed per step (a step being one iteration before the network weights are updated); it is unrelated to this error.
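Concretely (a sketch, assuming x_train and x_test come from the train_test_split above), adding a trailing channel axis gives every sample the (steps, input_dim) = (26, 1) layout Conv1D expects:
import numpy as np

x_train = np.expand_dims(x_train, axis=-1)  # (900, 26) -> (900, 26, 1)
x_test = np.expand_dims(x_test, axis=-1)    # (300, 26) -> (300, 26, 1)

# Declaring the input shape on the first layer also surfaces shape problems
# at build time instead of at fit time:
# model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(26, 1)))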
I am trying to train an LSTM autoencoder with multivariate time-series data. The shape of the data is:
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)
(160573, 4, 4)
(160573, 4)
(17838, 4, 4)
(17838, 4)
I want to adapt my model for univariate time series into a model for multivariate time series, but I do not know what to change:
model = keras.Sequential()
model.add(keras.layers.LSTM(
units=64,
input_shape=(X_train.shape[1], X_train.shape[2])
))
model.add(keras.layers.Dropout(rate=0.2))
model.add(keras.layers.RepeatVector(n=X_train.shape[1]))
model.add(keras.layers.LSTM(units=64, return_sequences=True))
model.add(keras.layers.Dropout(rate=0.2))
model.add(
keras.layers.TimeDistributed(
keras.layers.Dense(units=X_train.shape[2])
)
)
model.compile(loss='mse', optimizer='adam')
training = model.fit(
X_train, y_train,
epochs=10,
batch_size=64,
validation_split=0.1,
shuffle=False
)
The error is:
InvalidArgumentError: Incompatible shapes: [64,4,4] vs. [64,4]
[[node gradient_tape/mean_squared_error/BroadcastGradientArgs (defined at <ipython-input-83-6205cceab3d0>:1) ]] [Op:__inference_train_function_96522]
Function call stack:
train_function
Is there any solution for multivariate time series?
Assuming you're not trying to predict a sequence, you need to remove return_sequences=True from the second LSTM so its output drops the time-step dimension. The same goes for the TimeDistributed wrapper: remove it.
Corrected, minimal example:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
from tensorflow import keras
import numpy as np
X_train = np.random.rand(160, 4, 4)
y_train = np.random.rand(160, 4)
X_test = np.random.rand(17, 4, 4)
y_test = np.random.rand(17, 4)
model = keras.Sequential()
model.add(keras.layers.LSTM(
units=4,
input_shape=(X_train.shape[1], X_train.shape[2])
))
model.add(keras.layers.Dropout(rate=0.2))
model.add(keras.layers.RepeatVector(n=X_train.shape[1]))
model.add(keras.layers.LSTM(units=4))
model.add(keras.layers.Dropout(rate=0.2))
model.add(keras.layers.Dense(units=X_train.shape[2]))
model.compile(loss='mse', optimizer='adam')
training = model.fit(
X_train, y_train,
epochs=1,
batch_size=8,
validation_split=0.1,
shuffle=False)
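With both changes, the second LSTM emits a single vector of shape (batch, 4) and the final Dense maps it to (batch, 4), which matches y_train's (samples, 4), so the MSE loss no longer sees incompatible shapes.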
I am trying to predict the output of a function. (Eventually it will be multi-input, multi-output, but for now, just to get the mechanics right, I am trying to predict the output of the sin function.) My dataset is as follows:
t0 t1
0 0.000000 0.125333
1 0.125333 0.248690
2 0.248690 0.368125
3 0.368125 0.481754
4 0.481754 0.587785
5 0.587785 0.684547
6 0.684547 0.770513
7 0.770513 0.844328
8 0.844328 0.904827
9 0.904827 0.951057
.....
There are 100 values in total; t0 is the current input and t1 is the next output I want to predict. The data is then split into train/test via scikit-learn:
x_train, x_test, y_train, y_test = train_test_split(wave["t0"].values, wave["t1"].values, test_size=0.20)
The problem happens in fit: I get an error saying the input has the wrong dimensions.
model = Sequential()
model.add(LSTM(128, input_shape=??? ,stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
batch_size=10, epochs=100,
validation_data=(x_test, y_test))
I've tried the fixes from other questions on this site, but no matter what I try I cannot get Keras to accept the input.
The LSTM expects the input data to be of shape (batch_size, time_steps, num_features). For sine-wave prediction, num_features is 1, and time_steps is how many previous time points the LSTM should use for each prediction. In the example below, batch_size is 1, time_steps is 2, and num_features is 1.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

x_train = np.ones((1, 2, 1))
y_train = np.ones((1, 1))
x_test = np.ones((1, 2, 1))
y_test = np.ones((1, 1))

model = Sequential()
model.add(LSTM(128, input_shape=(2, 1)))
# for a stateful LSTM, fix the batch size instead:
# model.add(LSTM(128, batch_input_shape=(1, 2, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
          batch_size=1, epochs=100,
          validation_data=(x_test, y_test))
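Applied to the question's own data (a sketch, assuming the x_train/x_test arrays from the train_test_split above), the simplest setup uses time_steps=1, so each scalar input just needs reshaping to (samples, 1, 1):
# The splits are flat arrays of scalars; add time-step and feature axes.
x_train_re = x_train.reshape(-1, 1, 1)  # (80,) -> (80, 1, 1)
x_test_re = x_test.reshape(-1, 1, 1)    # (20,) -> (20, 1, 1)

model = Sequential()
model.add(LSTM(128, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train_re, y_train,
          batch_size=10, epochs=100,
          validation_data=(x_test_re, y_test))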
I have features of shape (size, 2) and labels of shape (size, 1), i.e. for [x, y] in the features the label is z. I want to build an LSTM in Keras that can do this job, since each feature is linked to one or more previous inputs (how many, I believe, is a hyperparameter).
Sample dataset values are:
features labels
[1,2] [5]
[3,4] [84]
Here is what I have done so far:
print(features.shape)  # prints (1414, 2)
print(labels.shape)    # prints (1414, 1)
look_back=2
# reshape input to be [samples, time steps, features]
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
labels = np.reshape(labels, (labels.shape[0], 1, 1))
X_train, X_test, y_train, y_test = train_test_split(features,labels,test_size=0.2)
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back))) #executing correctly
model.add(Dense(1)) #error here is "ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (1131, 1, 1)"
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)
So can anyone please help me build a minimal LSTM example that runs on this data? Thank you. Also, I don't understand how a Dense layer can "have 2 dimensions"; I thought its argument was just an integer telling how many units to use.
You must not reshape your labels: Dense(1) produces output of shape (batch, 1), so the targets must stay 2-D with shape (samples, 1), not the 3-D (samples, 1, 1) your reshape created.
Try this:
n_features = features.shape[1]  # 2 input features per step, captured before reshaping
features = np.reshape(features, (features.shape[0], 1, n_features))
# labels stay 2-D with shape (samples, 1); do not reshape them
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)

model = Sequential()
model.add(LSTM(4, input_shape=(1, n_features)))
model.add(Dense(1))
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)
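As a quick sanity check (a sketch of my own; the pair below is illustrative, not from the original), a new [x, y] input has to be reshaped the same way before prediction:
import numpy as np

# One new [x, y] pair, shaped as (samples, time_steps, features) = (1, 1, 2)
new_pair = np.array([[1.0, 2.0]]).reshape(1, 1, 2)
print(model.predict(new_pair))  # predicted z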