I have started learning Keras and I am confused about LSTM. I do not understand the input parameters, such as the first argument that goes inside the brackets (n) and input_shape.
My dataset is numeric: it has 30 columns, of which 29 are features and 1 is the output (0 or 1).
DataFrame shape (23991, 30)
x_train shape (19192, 29)
y_train shape (19192,)
x_test shape (4799, 29)
y_test shape (4799,)
Based on that, how should the parameters in my layers look?
First:
model = Sequential()
model.add(LSTM((?), input_shape = ?, return_sequences = ?, activation = ?))
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
Second:
model = Sequential()
model.add(LSTM((?), input_shape = ?, return_sequences = ?, activation = ?))
model.add(LSTM((?), input_shape = ?, return_sequences = ?, activation = ?))
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
Are these parameters the same if I use, for example, CuDNNLSTM?
If you have pandas DataFrames, first convert them to NumPy arrays:
x_train = x_train.to_numpy()
y_train = y_train.to_numpy()
x_test = x_test.to_numpy()
y_test = y_test.to_numpy()
Next, you need to reshape the data to add a feature dimension:
x_train = x_train.reshape(19192, 29, 1)
y_train = y_train.reshape(19192,1)
x_test = x_test.reshape(4799, 29, 1)
y_test = y_test.reshape(4799,1)
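If you'd rather not hardcode the sample counts, the same reshape can be written with -1, which lets NumPy infer that dimension:
x_train = x_train.reshape(-1, 29, 1)  # -1 infers the number of samples
y_train = y_train.reshape(-1, 1)
x_test = x_test.reshape(-1, 29, 1)
y_test = y_test.reshape(-1, 1)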
Now, the dimensions of an LSTM input are usually:
0 - Samples. One sequence is one sample. A batch is comprised of one or more samples.
1 - Time Steps. One time step is one point of observation in the sample.
2 - Features. One feature is one observation at a time step.
So the trailing 1 we added in the reshape is the feature dimension.
The LSTM input_shape will therefore be (29, 1): 29 time steps and 1 feature per time step (for intuition, you can think of the feature dimension like the channels dimension in a CNN).
model = Sequential()
model.add(LSTM(units = 10, input_shape = (29,1), return_sequences = False)) # keep other parameters default if you're not sure
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
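To actually train this first model, a minimal compile/fit sketch could look like the following (binary cross-entropy and Adam are assumptions that fit a single sigmoid output over 0/1 labels; batch size and epochs are arbitrary illustrative choices):
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])  # assumed loss/optimizer
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_test, y_test))  # illustrative settings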
Observe that we set return_sequences = True for the first LSTM layer but not for the second. The reason is that an LSTM layer needs 3D input (batch, time, features), while a Dense layer needs 2D input (batch, features). With return_sequences = True a layer passes the full sequence, i.e. 3D data, to the next layer; with return_sequences = False it passes only the last output, i.e. 2D data suitable for Dense.
model = Sequential()
model.add(LSTM(units = 10, input_shape = (29,1), return_sequences = True))
model.add(LSTM(units = 10, return_sequences = False))
model.add(Dropout(0.01))
model.add(Dense(1, activation='sigmoid'))
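As a quick check, model.summary() should show the first LSTM emitting a 3D sequence and the second a 2D vector:
model.summary()  # LSTM 1: (None, 29, 10) -> LSTM 2: (None, 10) -> Dense: (None, 1)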
Related
I have a dataset of 462 samples and 11 features, so the shape of my dataset is (462, 11). When I split the data with a train/test split, the shape of my X_train is (231, 11). I'm confused about what the input_shape in the Dense model should be. Would making it (231, 11) be correct? I have shown this below in code:
X_train, X_test, y_train, y_test = train_test_split (X,y, test_size = 0.5, random_state=45)
print(X_train.shape)
model = Sequential()
model.add(Dense(500, input_shape= (462,11), activation = 'relu'))
model.add(Dense(500, activation = 'relu'))
model.add(Dense(128, activation = 'relu'))
model.add(Dense(2, activation = 'linear'))
A Dense layer only looks at the last dimension of its input, which is treated as (batch_size, features); the batch dimension is never part of input_shape. With that in mind, do not hardcode the exact input shape; take it from the last dimension of X_train instead:
model.add(Dense(500, input_shape=(X_train.shape[1],), activation = 'relu'))
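Putting it together, a minimal corrected sketch (layer sizes and the 2-unit linear output come from the question's own code; the mean-squared-error loss is an assumption chosen to match the linear output):
model = Sequential()
model.add(Dense(500, input_shape=(X_train.shape[1],), activation = 'relu'))  # (11,) for this data
model.add(Dense(500, activation = 'relu'))
model.add(Dense(128, activation = 'relu'))
model.add(Dense(2, activation = 'linear'))
model.compile(loss='mse', optimizer='adam')  # assumed loss/optimizer, not from the question
model.fit(X_train, y_train, epochs=10, batch_size=32)  # illustrative settings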
I have a pandas dataframe X_train with 321 samples and 43 features. Also, there are 18 different classes in y_train.
I want to train a CNN on my data, but I am having trouble specifying the input shape when starting from a pandas DataFrame.
X.shape, y.shape
((321, 43), (321,))
X = np.array(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0, stratify = y)
X_train.shape, X_test.shape
((256, 43), (65, 43))
inputs = np.concatenate((X_train, X_test), axis=0)
targets = np.concatenate((y_train, y_test), axis=0)
inputs.shape, targets.shape
((321, 43), (321,))
In the first layer of my model, I am having trouble with input_shape.
I am new to CNNs, and all the tutorials use images, simply passing the height, width and channels as the input_shape parameter.
fold_no = 1
for train, test in kfold.split(inputs, targets):
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(???)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dense(18, activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    history = model.fit(inputs[train], targets[train], batch_size=5, epochs=50, validation_split=0.2, verbose=1)
    scores = model.evaluate(inputs[test], targets[test], verbose=0)
    fold_no = fold_no + 1
I am having trouble with input_shape in the first layer:
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(???)))
I tried to set the input shape like the following format:
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(None, train.shape[1])))
But that raised an error.
I also tried in this way:
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(321, 43)))
That also raised an error.
I also tried the following format:
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(None, 43)))
And that raised an error as well.
Conv1D takes a 3D shape as an input, but the 1st dimension is the batch size, so you can ignore it for input_shape. The other 2 dimensions are (steps, input_dim).
When dealing with numeric or text data, the two dimensions are usually (a) how many sequential rows you want your CNN layer to process at once, (b) how many features are in the row. If your data is naturally segmented into specific lengths (maybe 24, for hours in a day, or 3 words in a trigram), you'll want to specifically set the steps dimension. It will also affect your output shape, which will be (steps-kernel_size+1, filters). Try using some different shapes and look at the model summary to see how they change.
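For instance, a small sketch to watch the (steps - kernel_size + 1, filters) rule in action (24 steps and 43 features are just illustrative numbers):
from keras.models import Sequential
from keras.layers import Conv1D

model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(24, 43)))
model.summary()  # output shape: (None, 22, 64), since 24 - 3 + 1 = 22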
But as the documentation says, you can also use None as your steps, e.g. (None, 128) for variable-length sequences of 128-dimensional vectors.
So basically, I'd suggest this, where inputs[train].shape[1] should be 43 for you:
input_shape=(None, inputs[train].shape[1])
You could also try the full length of your dataset, e.g. (321, 43):
input_shape=inputs[train].shape
Take a look at this excellent answer and also this article for a good visual intuition of how Conv1D works on numeric/text input.
A tensor here is a stack of 1D samples (data points), where each sample is a vector of features.
In this case, one sample is a vector like [x0, x1, x2, ..., x42]; with 321 samples of 43 features each, the data tensor has shape (321, 43).
I cannot find a hands-on tutorial on how to structure data for use with a Keras LSTM.
Data
x_train: 7300 rows, where each vector is of length 64.
y_train: an array of 7300 items, each either 0 or 1 (the class).
Model
model = Sequential()
model.add(LSTM(200, dropout=0.2, recurrent_dropout=0.2, input_shape = (1, 64)))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs=5,
          batch_size=32,
          validation_split=0.1,
          callbacks=[EarlyStopping(monitor='val_loss', patience=3, min_delta=0.0001)])
My question is simply: why doesn't this work? Why isn't it as simple as giving a 2D array of vectors and a y array of the same length to fit?
Keras LSTM expects input of shape [batch_size, timesteps, features]. Your data is of shape [batch_size, features].
To add the timestep dimension (where number of timesteps is 1), do the following:
x_train = np.expand_dims(x_train, axis=1)
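After that, x_train.shape should be (7300, 1, 64), which matches the model's input_shape=(1, 64) once the batch dimension is added:
print(x_train.shape)  # (7300, 1, 64) -> [batch_size, timesteps, features]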
I'm new to Keras and I'm trying to implement a sequence-to-sequence LSTM.
Particularly, I have a dataset with 9 features and I want to predict 5 continuous values.
I split the data into training and test sets, whose shapes are, respectively:
X TRAIN (59010, 9)
X TEST (25291, 9)
Y TRAIN (59010, 5)
Y TEST (25291, 5)
The LSTM is extremely simple at the moment:
model = Sequential()
model.add(LSTM(100, input_shape=(9,), return_sequences=True))
model.compile(loss="mean_absolute_error", optimizer="adam", metrics= ['accuracy'])
history = model.fit(X_train,y_train,epochs=100, validation_data=(X_test,y_test))
But I have the following error:
ValueError: Input 0 is incompatible with layer lstm_1: expected
ndim=3, found ndim=2
Can anyone help me?
An LSTM layer expects inputs of shape (batch_size, timesteps, input_dim). In Keras you pass (timesteps, input_dim) as the input_shape argument, but you are setting input_shape=(9,), which does not include the timesteps dimension. The problem can be solved by adding a time dimension to both the data and input_shape; e.g., adding an extra dimension of size 1 is a simple fix. For this you have to reshape both the input data (X_train) and the targets (y_train). However, this may be problematic, because the time resolution becomes 1 and you are feeding length-one sequences. With length-one sequences as input, using an LSTM does not seem like the right choice.
X_train = X_train.reshape(-1, 1, 9)
X_test = X_test.reshape(-1, 1, 9)
y_train = y_train.reshape(-1, 1, 5)
y_test = y_test.reshape(-1, 1, 5)
model = Sequential()
model.add(LSTM(100, input_shape=(1, 9), return_sequences=True))
model.add(LSTM(5, return_sequences=True))
model.compile(loss="mean_absolute_error", optimizer="adam", metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test))
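With this setup, model.summary() should show a final output shape of (None, 1, 5), matching the reshaped targets:
model.summary()  # last LSTM: (None, 1, 5), the same shape as y_train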
I refer to the example given at the Keras website here:
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True)) # returns a sequence of vectors of dimension 32
model.add(LSTM(32)) # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train, batch_size=64, epochs=5, validation_data=(x_val, y_val))
For a real-world example, what should be y_train and y_val? Should they be the same as x_train and x_val respectively, since they come from the same sequence?
Also, how should I understand data_dim and num_classes?
Since you set return_sequences = True, your LSTM layers both consume and emit NumPy arrays of shape [batch_size, time_steps, input_features], performing a "many-to-many" mapping. data_dim is simply the number of distinct features your model takes as input at each time step. Your y_train will be of shape (1000, 10).
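For a real classification task, y_train would typically hold one-hot encoded class labels rather than random values; a sketch of producing that (1000, 10) shape from integer class ids (the random ids here are just stand-ins):
from keras.utils import to_categorical
labels = np.random.randint(0, num_classes, size=(1000,))  # dummy integer class ids
y_train = to_categorical(labels, num_classes)             # one-hot, shape (1000, 10)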
The key to understanding the excerpt of code you provided is that setting return_sequences = True makes an LSTM layer propagate full sequences of values to the downstream layers in the network. Note that the final LSTM layer, the one that precedes the 10-way softmax, does not set return_sequences = True. This is because the Dense layer cannot handle a sequence of inputs; hence, the time_steps dimension is collapsed and the Dense layer receives a single vector of inputs, which it can process without issue.
This sort of answers my question.