I refer to the example given at the Keras website here:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
input_shape=(timesteps, data_dim))) # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32
model.add(LSTM(32)) # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_val, y_val))
For a real-world example, what should be y_train and y_val? Should they be the same as x_train and x_val respectively, since they come from the same sequence?
Also, how should I understand data_dim and num_classes?
Since you set return_sequences = True, your LSTM layers are fed NumPy arrays of shape [batch_size, time_steps, input_features] and emit a sequence per sample, performing a "many-to-many" mapping. data_dim is simply the number of distinct features your model takes as input at each time step. Your y_train will be of shape (1000, 10).
The key to understanding the excerpt of code you provided is that setting return_sequences = True makes an LSTM layer propagate a full sequence of values to the downstream layers in the network. Note that the final LSTM layer, the one that precedes the 10-way softmax, does not set return_sequences = True. This is because the Dense layer cannot handle a sequence of inputs; with return_sequences left at its default of False, the time_steps dimension is collapsed and the Dense layer receives a single vector per sample, which it can process without issue.
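To make the target shapes concrete: in a real classification task, y_train and y_val would not be copies of x_train and x_val but class labels, one per sequence, one-hot encoded to match the softmax output. A minimal sketch, with illustrative random labels:
import numpy as np
from keras.utils import to_categorical
num_classes = 10
# one integer class label per training sequence, in 0..9
labels = np.random.randint(0, num_classes, size=(1000,))
# one-hot encode to shape (1000, 10), matching the 10-way softmax
y_train = to_categorical(labels, num_classes=num_classes)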
I have started to learn Keras and I am confused by LSTM. I do not understand the input parameters, such as the first positional argument (the number of units, n) and input_shape.
My dataset is numeric; it has 30 columns, where 29 are features and 1 is the output (1 or 0).
DataFrame shape (23991, 30)
x_train shape (19192, 29)
y_train shape (19192,)
x_test shape (4799, 29)
y_test shape (4799,)
Based on that, how should the parameters look in my layers?
First:
model = Sequential()
model.add(LSTM((?), input_shape = ?, return_sequences = ?, activation = ?))
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
Second:
model = Sequential()
model.add(LSTM((?), input_shape = ?, return_sequences = ?, activation = ?))
model.add(LSTM((?), input_shape = ?, return_sequences = ?, activation = ?))
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
Are these parameters the same if I use, for example, CuDNNLSTM?
x_train shape (19192, 29)
y_train shape (19192,)
x_test shape (4799, 29)
y_test shape (4799,)
If you have pandas DataFrames, convert them to NumPy arrays first:
x_train = x_train.to_numpy()
y_train = y_train.to_numpy()
x_test = x_test.to_numpy()
y_test = y_test.to_numpy()
Next, you need to reshape the data:
x_train = x_train.reshape(19192, 29, 1)
y_train = y_train.reshape(19192,1)
x_test = x_test.reshape(4799, 29, 1)
y_test = y_test.reshape(4799,1)
Now, for an LSTM the input dimensions are usually:
0 - Samples. One sequence is one sample. A batch is comprised of one or more samples.
1 - Time Steps. One time step is one point of observation in the sample.
2 - Features. One feature is one observation at a time step.
So the trailing 1 adds a dimension corresponding to the features.
The LSTM input shape will therefore be (29, 1), where 29 is the number of time steps and 1 is the number of features per time step (for intuition, you can think of it like the number of channels in a CNN).
model = Sequential()
model.add(LSTM(units = 10, input_shape = (29,1), return_sequences = False)) # keep other parameters default if you're not sure
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
Observe that in the two-layer model below we set return_sequences = True for the first LSTM layer but not for the second. The reason is that an LSTM layer needs 3D input (batch, time, features), while Dense needs 2D input (batch, features): with return_sequences = True a layer passes 3D data on to the next LSTM, and with the default return_sequences = False it passes 2D data on to the Dense layer.
model = Sequential()
model.add(LSTM(units = 10, input_shape = (29,1), return_sequences = True))
model.add(LSTM(units = 10, return_sequences = False))
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
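For completeness, a sketch of compiling and fitting either of these models with the reshaped arrays; the hyperparameters here are illustrative, not prescriptive:
# a sigmoid output with 0/1 targets pairs with binary_crossentropy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=64, epochs=10,
          validation_data=(x_test, y_test))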
I cannot find a hands-on tutorial on how to structure the data for use with a Keras LSTM.
Data
x_train = 7300 rows where each vector is length 64.
y_train = array of 7300 items either 0's or 1's (the class).
Model
model = Sequential()
model.add(LSTM(200, dropout=0.2, recurrent_dropout=0.2, input_shape = (1, 64)))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train,
epochs = 5,
batch_size = 32,
validation_split = 0.1,
callbacks=[EarlyStopping(monitor='val_loss', patience=3, min_delta=0.0001)])
My question is simply: why doesn't this work? Why isn't it as simple as giving a 2D array of vectors, and y values of matching length, to fit?
Keras LSTM expects input of shape [batch_size, timesteps, features]. Your data is of shape [batch_size, features].
To add the timestep dimension (where the number of timesteps is 1), do the following:
x_train = np.expand_dims(x_train, axis=1)
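After this, x_train.shape becomes (7300, 1, 64), which matches the input_shape=(1, 64) in your model. One further point worth checking (an assumption, since the question does not show how y_train was built): categorical_crossentropy with a 2-unit softmax expects one-hot targets, so 0/1 labels would need converting too, e.g.:
from keras.utils import to_categorical
y_train = to_categorical(y_train, num_classes=2)  # 0/1 labels -> one-hot, shape (7300, 2)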
I'm trying to build a multi-output Keras model starting from a working single-output model. Keras, however, is complaining about tensor dimensions.
The single-output model:
This GRU model is training and predicting fine:
timesteps = 250
features = 2
input_tensor = Input(shape=(timesteps, features), name="input")
conv = Conv1D(filters=128, kernel_size=6,use_bias=True)(input_tensor)
b = BatchNormalization()(conv)
s_gru, states = GRU(256, return_sequences=True, return_state=True, name="gru_1")(b)
biases = keras.initializers.Constant(value=88.15)
out = Dense(1, activation='linear', name="output")(s_gru)
model = Model(inputs=input_tensor, outputs=out)
My numpy arrays are:
train_x # shape:(7110, 250, 2)
train_y # shape: (7110, 250, 1)
If I fit the model with the following code, everything is fine:
model.fit(train_x, train_y,batch_size=128, epochs=10, verbose=1)
The Problem:
I want to use a slightly modified version of the network that also outputs the GRU states:
input_tensor = Input(shape=(timesteps, features), name="input")
conv = Conv1D(filters=128, kernel_size=6,use_bias=True)(input_tensor)
b = BatchNormalization()(conv)
s_gru, states = GRU(256, return_sequences=True, return_state=True, name="gru_1")(b)
biases = keras.initializers.Constant(value=88.15)
out = Dense(1, activation='linear', name="output")(s_gru)
model = Model(inputs=input_tensor, outputs=[out, states]) # multi output
# fit the model, but with a list of numpy arrays as y
model.compile(optimizer=optimizer, loss='mae', loss_weights=[0.5, 0.5])
history = model.fit(train_x, [train_y,train_y], batch_size=128, epochs=10, callbacks=[])
This training fails and keras is complaining about the target dimensions:
ValueError: Error when checking target: expected gru_1 to have 2 dimensions, but got array with shape (7110, 250, 1)
I'm using Keras 2.3.0 and Tensorflow 2.0.
What am I missing here?
Each model output and the corresponding target in the list you pass to fit must have matching shapes. In this case, states has shape (7110, 256), which cannot be compared to the train_y shape (which is (7110, 250, 1), as noted in the first code block). Make sure every output can be compared with a target of a matching shape.
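One way to satisfy Keras here, sketched under the assumption that you only want the states at prediction time rather than as a training signal, is to supply a dummy target of the right shape for the second output and give it zero loss weight:
import numpy as np
# dummy target matching the states output shape (batch, 256); with a zero loss
# weight, the states are still returned by predict() but ignored during training
states_dummy = np.zeros((train_x.shape[0], 256))
model.compile(optimizer=optimizer, loss='mae', loss_weights=[1.0, 0.0])
history = model.fit(train_x, [train_y, states_dummy], batch_size=128, epochs=10)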
I have received multiple different ValueErrors when trying to solve the following problem by changing many parameters.
It is a time series problem: I have data from 60 shops, 215 items, and 1034 days. I have split off 973 days for training and 61 for testing:
train_x = train_x.reshape((60, 973, 215))
test_x = test_x.reshape((60, 61, 215))
train_y = train_y.reshape((60, 973, 215))
test_y = test_y.reshape((60, 61, 215))
My model:
model = Sequential()
model.add(LSTM(100, input_shape=(train_x.shape[1], train_x.shape[2]),
return_sequences='true'))
model.add(Dense(215))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=
['accuracy'])
history = model.fit(train_x, train_y, epochs=10,
validation_data=(test_x, test_y), verbose=2, shuffle=False)
ValueError: Error when checking input: expected lstm_1_input to have
shape (973, 215) but got array with shape (61, 215)
You've split your data with respect to timesteps as opposed to samples. You need to decide what your samples are in the first instance. For the sake of the answer I will assume they lie along the first axis (i.e., the data has been framed as a supervised time-series problem).
The input_shape of an LSTM expects (timesteps, data_dim), as explained here, and these dimensions must remain the same for every batch. In your example, the training and testing samples have different dimensions. The batch size can differ (unless specified with the batch_size parameter).
Your data should be split between training and testing along the first axis. Here is an analogous example from Keras tutorials:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
input_shape=(timesteps, data_dim))) # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32
model.add(LSTM(32)) # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train,
batch_size=64, epochs=5,
validation_data=(x_val, y_val))
You will notice that timesteps is the same for the training and validation data, i.e. x_train.shape[1] == x_val.shape[1]. It is the number of samples that differs along the first axis: x_train.shape[0] is 1000 and x_val.shape[0] is 100.
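Applied to your shop data, one hypothetical framing (an assumption about how you want to pose the problem, with illustrative numbers) is to keep the number of days per sample fixed and split along the shop axis instead, so that train and test share the same (timesteps, data_dim):
import numpy as np
data = np.random.random((60, 1034, 215))  # stand-in for (shops, days, items)
x = data[:, :973, :]                      # same fixed window of timesteps for every sample
train_x, test_x = x[:48], x[48:]          # split along the first (sample) axis
# now train_x.shape[1:] == test_x.shape[1:] == (973, 215)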
I'm very new to Keras and also to Python.
I have a time series dataset with different sequence lengths (for example, the 1st sequence is 484000x128, the 2nd sequence is 563110x128, etc.).
I've put the sequences in 3D array.
My question is how to define the input shape, because I'm confused. I was using DL4J, but the concept of defining the network configuration is different there.
Here is my first trial code:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding,LSTM,Dense,Dropout
## Loading dummy data
sequences = np.array([[[1,2,3],[1,2,3]], [[4,5,6],[4,5,6],[4,5,6]]])
y = np.array([[[0],[0]], [[1],[1],[1]]])
x_test=np.array([[2,3,2],[4,6,7],[1,2,1]])
y_test=np.array([0,1,1])
n_epochs=40
# model configuration
model = Sequential()
model.add(LSTM(100, input_shape=(3,1), activation='tanh', recurrent_activation='hard_sigmoid')) # 100 num of LSTM units
model.add(LSTM(100, activation='tanh', recurrent_activation='hard_sigmoid'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print(model.summary())
## training with batches of size 1 (each batch is a sequence)
for epoch in range(n_epochs):
    for seq, label in zip(sequences, y):
        model.train_on_batch(np.array([seq]), np.array([label]))  # train a batch at a time..
scores = model.evaluate(x_test, y_test)  # evaluate batch at a time..
Here are the docs on input shapes for LSTMs:
Input shapes
3D tensor with shape (batch_size, timesteps, input_dim), (Optional) 2D
tensors with shape (batch_size, output_dim).
This implies that you're going to need a constant number of timesteps in each batch.
The canonical way of doing this is to pad your sequences using something like Keras's padding utility, pad_sequences.
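For instance, a minimal sketch with pad_sequences, where the maxlen value is an illustrative choice:
from keras.preprocessing.sequence import pad_sequences
# pad (or truncate) every sequence of 128-dim vectors to the same number of timesteps
padded = pad_sequences(sequences, maxlen=700000, dtype='float32',
                       padding='post', truncating='post')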
With the sequences padded to a common length, you can then try:
# say the timestep you choose is 700000 and the dimension of the vectors is 128
timestep = 700000
dims = 128
model = Sequential()
model.add(LSTM(100, input_shape=(timestep, dims),
               activation='tanh', recurrent_activation='hard_sigmoid'))
I edited the answer to remove the batch_size argument. With this setup the batch size is unspecified; you can set it when fitting the model (in model.fit()).
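For example, assuming padded holds the padded inputs and labels the per-sequence targets (both hypothetical names), the batch size is supplied only at fit time:
model.fit(padded, labels, batch_size=32, epochs=n_epochs)  # batch_size chosen here, not in the layer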