Keras LSTM model - python

I cannot find a hands-on tutorial on how to structure the data for use with a Keras LSTM.
Data
x_train = 7300 rows where each vector is length 64.
y_train = array of 7300 items either 0's or 1's (the class).
Model
model = Sequential()
model.add(LSTM(200, dropout=0.2, recurrent_dropout=0.2, input_shape = (1, 64)))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs = 5,
          batch_size = 32,
          validation_split = 0.1,
          callbacks=[EarlyStopping(monitor='val_loss', patience=3, min_delta=0.0001)])
My question is simply: why doesn't this work? Why isn't it as simple as passing a 2D array of vectors and a y array of matching length to fit?

Keras LSTM expects input of shape [batch_size, timesteps, features]. Your data is of shape [batch_size, features].
To add the timestep dimension (where number of timesteps is 1), do the following:
x_train = np.expand_dims(x_train, axis=1)
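As a quick shape check, here is a minimal NumPy sketch using random stand-in data with the sizes from the question. The one-hot step is an addition not covered above: a Dense(2, activation='softmax') output trained with categorical_crossentropy expects targets of shape (batch_size, 2), so the 0/1 labels are converted here with np.eye (keras.utils.to_categorical would do the same job).

```python
import numpy as np

# Stand-in data with the sizes from the question (random values, for illustration only)
rng = np.random.default_rng(0)
x_train = rng.random((7300, 64))
y_train = rng.integers(0, 2, size=7300)   # class labels, 0 or 1

# Insert a timesteps axis of length 1: (7300, 64) -> (7300, 1, 64)
x_train_3d = np.expand_dims(x_train, axis=1)

# One-hot encode the labels: (7300,) -> (7300, 2)
y_train_onehot = np.eye(2)[y_train]

print(x_train_3d.shape, y_train_onehot.shape)
```

With these shapes, input_shape=(1, 64) in the LSTM layer matches the data, and each row of the targets matches the two softmax outputs.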


Preparing Pandas DataFrame for LSTM

I'm trying to fit an LSTM classifier using Keras but don't understand how to prepare the data for training.
I currently have two dataframes for the training data. X_train contains 48 hand-crafted temporal features from IMU data, and y_train contains corresponding labels (4 kinds) representing terrain. The shape of these dataframes is given below:
X_train = X_train.values.reshape(X_train.shape[0],X_train.shape[1],1)
print(X_train.shape, y_train.shape)
(268320, 48, 1) (268320,)
Model using batch_size = (32,5,48):
def def_model():
    model = Sequential()
    model.add(LSTM(units=144,batch_size=(32, 5, 48),return_sequences=True))
    model.add(Dropout(0.5))
    model.add(Dense(144, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4, activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
    return model
model_LSTM = def_model()
LSTM_history = model_LSTM.fit(X_train, y_train, epochs=15, validation_data=(X_valid, y_valid), verbose=1)
The error that I am getting:
ValueError: Shapes (32, 1) and (32, 48, 4) are incompatible
Any insight into how to fix this particular error and any intuition into what Keras is expecting?
What is the 5 in your batch size? The tuple you pass as batch_size to the LSTM layer says that your data should have the form (batch_size, time_steps, features_per_time_step). If I understand correctly, your data has time_steps = 1 and features_per_time_step = 48.
Here is a sample of working code and the shape of each of them.
def def_model():
    model = Sequential()
    model.add(LSTM(units=144,batch_size=(32, 1, 48),return_sequences=True))
    model.add(Dropout(0.5))
    model.add(Dense(144, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4, activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
    return model
model_LSTM = def_model()
X_train = np.random.random((10000,1,48))
y_train = np.random.random((10000,4))
y_train = y_train.reshape(-1,1,4)
data = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(32)
model_LSTM.fit(data, epochs=15, verbose=1)
Passing data to fit instead of X_train and y_train will fit the model properly.
If you want 5 timesteps per sample, you have to build X_train so that it has shape (n_samples, 5, 48).
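One way to get an array of shape (n_samples, 5, 48) is to group consecutive rows into overlapping windows. The sketch below is only an illustration: make_windows is a hypothetical helper, the data is random, and labelling each window with the label of its last row is an assumption that suits a model whose last LSTM layer does not return sequences.

```python
import numpy as np

def make_windows(X, y, timesteps):
    """Group consecutive rows into overlapping windows of length `timesteps`.

    X: (n_rows, n_features), y: (n_rows,). Each window is labelled with the
    label of its last row (an assumption; choose whatever suits the task).
    """
    n_windows = X.shape[0] - timesteps + 1
    windows = np.stack([X[i:i + timesteps] for i in range(n_windows)])
    labels = y[timesteps - 1:]
    return windows, labels

# Random stand-in data: 100 rows of 48 features, 4 classes
X = np.random.random((100, 48))
y = np.random.randint(0, 4, size=100)

X_seq, y_seq = make_windows(X, y, timesteps=5)
print(X_seq.shape, y_seq.shape)
```

If the rows come from several separate recordings, the windows should be built per recording so that no window spans two of them.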

ValueError: logits and labels must have the same shape ((1, 7, 7, 2) vs (1, 2))

I'm quite new to CNNs.
I'm trying to create the following model, but I get this error: "ValueError: logits and labels must have the same shape ((1, 7, 7, 2) vs (1, 2))"
Below is the code I'm trying to implement:
#create the training data set
train_data=scaled_data[0:training_data_len,:]
#define the number of periods
n_periods=28
#split the data into x_train and y_train data set
x_train=[]
y_train=[]
for i in range(n_periods,len(train_data)):
    x_train.append(train_data[i-n_periods:i,:28])
    y_train.append(train_data[i,29])
x_train=np.array(x_train)
y_train=np.array(y_train)
#Reshape the train data
x_train=x_train.reshape(x_train.shape[0],x_train.shape[1],x_train.shape[2],1)
x_train.shape
y_train = keras.utils.to_categorical(y_train,2)
# x_train has the following shape (3561, 28, 28, 1)
# y_train has the following shape (3561, 2, 2)
#Build the 2 D CNN model for regression
model= Sequential()
model.add(Conv2D(32,kernel_size=(3,3),padding='same',activation='relu',input_shape=(x_train.shape[1],x_train.shape[2],1)))
model.add(Conv2D(64,kernel_size=(3,3),padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=(4,4)))
model.add(Dropout(0.25))
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='sigmoid'))
model.add(Dense(2, activation='sigmoid'))
model.summary()
#compile the model
model.compile(optimizer='ADADELTA', loss='binary_crossentropy', metrics=['accuracy'])
#train the model
model.fit(x_train, y_train, batch_size=1, epochs=1, verbose=2)
There are two problems in your approach:
1. Convolutional and MaxPooling layers take and return 4D tensors of shape (batch_size, height, width, depth). The Dense layers you add afterwards are normally fed vectors, not feature maps; applied to a 4D tensor they act on the last axis only, which is why your logits come out as (1, 7, 7, 2). Flatten the MaxPooling output before the first Dense layer, i.e. add model.add(Flatten()) after model.add(Dropout(0.25)) and before model.add(Dense(128,activation='relu')).
2. You are doing binary classification, i.e. you have two classes, and you are using binary_crossentropy as the loss function. For this to work, keep your targets as they are (0 and 1) and do not call y_train = keras.utils.to_categorical(y_train,2). Your final layer should have 1 neuron, not 2: change model.add(Dense(2, activation='sigmoid')) into model.add(Dense(1, activation='sigmoid')).
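The shape mismatch in the error message can be reproduced with plain NumPy. This is a simplified sketch, not the Keras implementation: a dense layer is modelled as a matrix product over the last axis (bias and activation omitted), the intermediate Dense(128) and Dense(64) layers are skipped, and the weights are random.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shape after Conv2D(64, padding='same') and MaxPooling2D((4, 4)) on one 28x28 input
pooled = rng.random((1, 7, 7, 64))

# A dense layer is a matrix product over the last axis (bias and activation omitted)
w = rng.random((64, 2))
logits_without_flatten = pooled @ w        # the spatial axes are kept
print(logits_without_flatten.shape)        # (1, 7, 7, 2)

# Flatten first: (1, 7, 7, 64) -> (1, 3136), then map to a single output unit
flat = pooled.reshape(pooled.shape[0], -1)
w_flat = rng.random((flat.shape[1], 1))
logits_with_flatten = flat @ w_flat
print(logits_with_flatten.shape)           # (1, 1)
```

The first result has the (1, 7, 7, 2) shape from the error message; the second has one value per sample, which lines up with 0/1 targets once they are shaped (batch_size, 1).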

ValueError: Error when checking input: expected lstm_1_input to have shape (973, 215) but got array with shape (61, 215)

I have received several different ValueErrors while trying to solve the following problem by changing many parameters.
It is a time series problem: I have data from 60 shops, 215 items and 1034 days. I have split off 973 days for training and 61 for testing:
train_x = train_x.reshape((60, 973, 215))
test_x = test_x.reshape((60, 61, 215))
train_y = train_y.reshape((60, 973, 215))
test_y = test_y.reshape((60, 61, 215))
My model:
model = Sequential()
model.add(LSTM(100, input_shape=(train_x.shape[1], train_x.shape[2]),
               return_sequences='true'))
model.add(Dense(215))
model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=['accuracy'])
history = model.fit(train_x, train_y, epochs=10,
                    validation_data=(test_x, test_y), verbose=2, shuffle=False)
ValueError: Error when checking input: expected lstm_1_input to have
shape (973, 215) but got array with shape (61, 215)
You've split your data with respect to timesteps as opposed to the samples. You need to decide on what are your samples in the first instance. For the sake of the answer I will assume these are along the first axis (assuming the data has been framed as a supervised time-series problem).
The input_shape argument of LSTM expects a shape of (timesteps, data_dim), as explained in the Keras documentation, and these dimensions must be the same for every batch. In your example, the training and test samples have different numbers of timesteps (973 vs 61). Only the batch size can vary (unless it is fixed with the batch_size parameter).
Your data should be split between training and testing along the first axis. Here is an analogous example from Keras tutorials:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train,
          batch_size=64, epochs=5,
          validation_data=(x_val, y_val))
You will notice that timesteps is the same for the training and validation data, i.e. x_train.shape[1] == x_val.shape[1]. Only the number of samples along the first axis differs: x_train.shape[0] is 1000 and x_val.shape[0] is 100.
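For the shop data, one possible way to get training and test arrays with the same number of timesteps is to cut each period into fixed-length windows. The following is a sketch under assumptions: windows_per_shop is a hypothetical helper, the window length of 7 days is arbitrary, the array is a small random stand-in for the (60, 1034, 215) data, and each window predicts only the following day (which fits a model whose LSTM does not return sequences).

```python
import numpy as np

def windows_per_shop(data, window):
    """data: (shops, days, items) -> x: (n, window, items), y: (n, items).

    Every window of `window` consecutive days predicts the following day.
    """
    shops, days, items = data.shape
    x, y = [], []
    for s in range(shops):
        for start in range(days - window):
            x.append(data[s, start:start + window])
            y.append(data[s, start + window])
    return np.array(x), np.array(y)

# Small random stand-in for the (60, 1034, 215) array in the question
data = np.random.random((6, 40, 5))
window = 7                                   # hypothetical window length

train_x, train_y = windows_per_shop(data[:, :30], window)   # first 30 days
test_x, test_y = windows_per_shop(data[:, 30:], window)     # last 10 days
print(train_x.shape, test_x.shape)
```

Both arrays now share the (window, items) shape on the last two axes and differ only in the number of samples.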

What should be 'y_train' in Keras LSTM?

I refer to the example given at the Keras website here:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32
model.add(LSTM(32)) # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_val, y_val))
For a real-world example, what should be y_train and y_val? Should they be the same as x_train and x_val respectively, since they come from the same sequence?
Also, how should I understand data_dim and num_classes?
An LSTM layer is fed arrays of shape [batch_size, time_steps, input_features]; with return_sequences = True it emits one output vector per timestep (a "many-to-many" mapping). data_dim is simply the number of distinct features your model takes as input at each timestep, and num_classes is the number of target classes, i.e. the width of the final softmax. Your y_train will be of shape [1000, 10], one row of class targets per input sequence, so it is not the same array as x_train.
The key to understanding the excerpt of code you provided is that setting return_sequences = True lets an LSTM layer pass a sequence of values on to the subsequent layers in the network. Note that the final LSTM layer, which precedes the 10-way softmax, does not set return_sequences = True. This is because the Dense layer here is meant to receive one vector per sample rather than a sequence: the time_steps dimension is collapsed, and the Dense layer receives a vector of inputs, which it can process without issue.
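The shapes involved can be illustrated without Keras. This is a minimal NumPy sketch with random stand-in values: sequence_output plays the role of what an LSTM(32, return_sequences=True) layer emits, and the integer labels are made up for illustration.

```python
import numpy as np

batch_size, timesteps, units, num_classes = 4, 8, 32, 10

# Stand-in for the output of an LSTM(32, return_sequences=True) layer: one vector per timestep
sequence_output = np.random.random((batch_size, timesteps, units))

# With return_sequences=False only the output at the last timestep is kept
last_output = sequence_output[:, -1, :]

# For real data the targets would be one-hot class labels, one row per sequence
labels = np.array([3, 0, 9, 3])          # made-up class indices
y = np.eye(num_classes)[labels]

print(sequence_output.shape, last_output.shape, y.shape)
```

The last LSTM output has one row per sequence, and so do the targets, which is why y_train in the example has shape (1000, num_classes) rather than (1000, timesteps, num_classes).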
This sort of answers my question.

LSTM output Dense expects 2d input

I have features of shape (size, 2) and labels of shape (size, 1), i.e. for each [x, y] in the features the label is z. I want to build an LSTM in Keras for this task, since each label also depends on previous inputs, either one or several (I believe that is a hyperparameter).
Sample dataset values are:-
features    labels
[1,2]       [5]
[3,4]       [84]
Here is what I have done so far:-
print(labels.shape) #prints (1414,1)
print(features.shape) #prints (1414,2)
look_back=2
# reshape input to be [samples, time steps, features]
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
labels = np.reshape(labels, (labels.shape[0], 1, 1))
X_train, X_test, y_train, y_test = train_test_split(features,labels,test_size=0.2)
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back))) #executing correctly
model.add(Dense(1)) #error here is "ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (1131, 1, 1)"
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)
Can anyone please help me build a minimal LSTM example that runs with my data? Thank you. I don't understand how a Dense layer can have 2 dimensions; its argument is just an integer giving the number of units in the layer.
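You must not reshape your labels. The LSTM here does not return sequences, so Dense(1) outputs shape (batch_size, 1) and the targets must be 2D as well, i.e. (samples, 1).
Try this:
# reshape only the inputs to [samples, time steps, features]; leave the labels as (samples, 1)
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)
model = Sequential()
model.add(LSTM(4, input_shape=(1, features.shape[2])))  # after the reshape, the feature count is on axis 2
model.add(Dense(1))
model.summary()
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)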
