Keras in Python: LSTM Dimensions - python

I am building an LSTM network.
My data looks as follows:
X_train.shape = (134, 300000, 4)
X_train contains 134 sequences, with 300000 timesteps and 4 features.
Y_train.shape = (134, 2)
Y_train contains 134 labels, [1, 0] for True and [0, 1] for False.
Below is my model in Keras.
model = Sequential()
model.add(LSTM(4, input_shape=(300000, 4), return_sequences=True))
model.compile(loss='categorical_crossentropy', optimizer='adam')
Whenever I run the model, I get the following error:
Error when checking target: expected lstm_52 to have 3 dimensions, but got array with shape (113, 2)
It seems to be related to my Y_train data -- as its shape is (113, 2).
Thank you!

The output shape of your LSTM layer is (batch_size, 300000, 4) (because of return_sequences=True). Therefore your model expects the target y_train to have 3 dimensions but you are passing an array with only 2 dimensions (batch_size, 2).
You probably want to use return_sequences=False instead. In this case the output shape of the LSTM layer will be (batch_size, 4). Moreover, you should add a final softmax layer to your model in order to have the desired output shape of (batch_size, 2):
model = Sequential()
model.add(LSTM(4, input_shape=(300000, 4), return_sequences=False))
model.add(Dense(2, activation='softmax')) # 2 neurons because you have 2 classes
model.compile(loss='categorical_crossentropy', optimizer='adam')
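To see why return_sequences changes the number of output dimensions, here is a toy NumPy recurrence. It is a plain tanh RNN, not Keras' LSTM; the helper toy_rnn, its random weights and the reduced number of timesteps are assumptions made purely for illustration. Only the output shapes are meant to carry over.

```python
import numpy as np

def toy_rnn(x, units, return_sequences, seed=0):
    """Toy tanh RNN (NOT Keras' LSTM); illustrates output shapes only."""
    rng = np.random.default_rng(seed)
    batch, timesteps, features = x.shape
    w_in = rng.normal(size=(features, units))
    w_rec = rng.normal(size=(units, units))
    h = np.zeros((batch, units))
    states = []
    for t in range(timesteps):
        # One hidden state per timestep, each of shape (batch, units).
        h = np.tanh(x[:, t, :] @ w_in + h @ w_rec)
        states.append(h)
    if return_sequences:
        return np.stack(states, axis=1)  # (batch, timesteps, units)
    return h                             # (batch, units): last state only

x = np.zeros((134, 10, 4))  # 10 timesteps instead of 300000 to keep it fast
seq_out = toy_rnn(x, units=4, return_sequences=True)
last_out = toy_rnn(x, units=4, return_sequences=False)
print(seq_out.shape)   # (134, 10, 4)
print(last_out.shape)  # (134, 4)
```

A target of shape (134, 2) can only line up with the two-dimensional output, and only after a final layer maps the 4 units to 2 classes.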

Related

ValueError: logits and labels must have the same shape ((1, 7, 7, 2) vs (1, 2))

I'm quite new to CNN.
I'm trying to create the following model, but I get the following error: "ValueError: logits and labels must have the same shape ((1, 7, 7, 2) vs (1, 2))"
Below is the code I'm trying to implement:
#create the training data set
train_data=scaled_data[0:training_data_len,:]
#define the number of periods
n_periods=28
#split the data into x_train and y_train data set
x_train=[]
y_train=[]
for i in range(n_periods,len(train_data)):
    x_train.append(train_data[i-n_periods:i,:28])
    y_train.append(train_data[i,29])
x_train=np.array(x_train)
y_train=np.array(y_train)
#Reshape the train data
x_train=x_train.reshape(x_train.shape[0],x_train.shape[1],x_train.shape[2],1)
x_train.shape
y_train = keras.utils.to_categorical(y_train,2)
# x_train has the following shape (3561, 28, 28, 1)
# y_train has the following shape (3561, 2, 2)
#Build the 2 D CNN model for regression
model= Sequential()
model.add(Conv2D(32,kernel_size=(3,3),padding='same',activation='relu',input_shape=(x_train.shape[1],x_train.shape[2],1)))
model.add(Conv2D(64,kernel_size=(3,3),padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=(4,4)))
model.add(Dropout(0.25))
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='sigmoid'))
model.add(Dense(2, activation='sigmoid'))
model.summary()
#compile the model
model.compile(optimizer='ADADELTA', loss='binary_crossentropy', metrics=['accuracy'])
#train the model
model.fit(x_train, y_train, batch_size=1, epochs=1, verbose=2)
There are two problems in your approach:
You're using convolutional and max-pooling layers, whose inputs and outputs are 4D tensors of shape (batch_size, height, width, depth). You then add Dense layers, which here should receive one vector per sample rather than a stack of feature maps. Therefore, you have to flatten the output of the MaxPooling layer before passing it to the Dense layer, i.e., add a model.add(Flatten()) after model.add(Dropout(0.25)) and before model.add(Dense(128,activation='relu')).
You are doing binary classification, i.e., you have two classes, and you are using binary_crossentropy as the loss function. For this to work, keep your targets as they are (0 and 1) and do not apply y_train = keras.utils.to_categorical(y_train,2). Your final layer should also have 1 neuron, not 2: change model.add(Dense(2, activation='sigmoid')) to model.add(Dense(1, activation='sigmoid')).
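The shape in the error message, (1, 7, 7, 2), can be reproduced with plain NumPy. This is a sketch under assumptions: random arrays stand in for the real feature maps, the intermediate Dense(128) and Dense(64) layers are skipped, and the dense layer is modelled as a bare matrix product without bias.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shape after the two 'same'-padded convolutions and 4x4 max pooling:
# 28 // 4 = 7 in each spatial dimension, 64 channels from the second Conv2D.
pooled = rng.normal(size=(1, 7, 7, 64))

# A dense layer acts on the LAST axis only, so without flattening
# the height and width axes survive all the way to the output.
w_two_units = rng.normal(size=(64, 2))
unflattened = pooled @ w_two_units
print(unflattened.shape)  # (1, 7, 7, 2)

# Flattening first collapses height, width and channels into one vector.
flat = pooled.reshape(pooled.shape[0], -1)
print(flat.shape)  # (1, 3136)

# A single sigmoid unit then yields one probability per sample.
w_one_unit = 0.01 * rng.normal(size=(flat.shape[1], 1))
prob = 1.0 / (1.0 + np.exp(-(flat @ w_one_unit)))
print(prob.shape)  # (1, 1)
```

With one output per sample, 0/1 targets of shape (batch_size,) or (batch_size, 1) match what binary_crossentropy expects.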

Keras LSTM model

I cannot find a hands-on tutorial on how to structure the data for use with a Keras LSTM.
Data
x_train = 7300 rows where each vector is length 64.
y_train = array of 7300 items either 0's or 1's (the class).
Model
model = Sequential()
model.add(LSTM(200, dropout=0.2, recurrent_dropout=0.2, input_shape = (1, 64)))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train,
          epochs = 5,
          batch_size = 32,
          validation_split = 0.1,
          callbacks=[EarlyStopping(monitor='val_loss', patience=3, min_delta=0.0001)])
My question is simply: why doesn't this work? Why isn't it as simple as passing a 2D array of vectors and an equally long array of y values to fit?
Keras LSTM expects input of shape [batch_size, timesteps, features]. Your data is of shape [batch_size, features].
To add the timestep dimension (where number of timesteps is 1), do the following:
x_train = np.expand_dims(x_train, axis=1)
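A minimal sketch of the resulting shapes, using zero-filled placeholder arrays in place of the real data. The one-hot step is an addition not raised in the answer above: with Dense(2, activation='softmax') and categorical_crossentropy, the targets are expected to have shape (batch_size, 2), so integer 0/1 labels need converting (or the loss can be swapped for sparse_categorical_crossentropy, which accepts integer labels).

```python
import numpy as np

x_train = np.zeros((7300, 64))       # placeholder for the real features
y_train = np.zeros(7300, dtype=int)  # placeholder 0/1 class labels
y_train[::2] = 1

# Insert a timesteps axis of length 1 between batch and features.
x_train_3d = np.expand_dims(x_train, axis=1)
print(x_train_3d.shape)  # (7300, 1, 64)

# One-hot encode the labels: class 0 -> [1, 0], class 1 -> [0, 1].
y_one_hot = np.eye(2)[y_train]
print(y_one_hot.shape)  # (7300, 2)
```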

ValueError: Error when checking target: expected dense_13 to have shape (None, 6) but got array with shape (6, 1)

I am training a classification network with training data which has X.shape = (1119, 7) and Y.shape = (1119, 6). Below is my simple Keras network with an output dim of 6 (the size of the labels). The error that is returned is shown below the code.
hidden_size = 128
model = Sequential()
model.add(Embedding(7, hidden_size))
#model.add(LSTM(128, input_shape=(1,7)))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(Dense(output_dim=6, activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=["categorical_accuracy"])
ValueError: Error when checking target: expected dense_13 to have shape (None, 6) but got array with shape (6, 1)
I would prefer not to do this in TensorFlow because I am just prototyping, yet this is my first run at Keras and I am confused about why it cannot take this data. I attempted to reshape the data in a number of ways, none of which worked. Any advice as to why this isn't working would be greatly appreciated.
You should probably remove the parameter return_sequences=True from your last LSTM layer. When using return_sequences=True, the output of the LSTM layer has shape (seq_len, hidden_size). Passing this on to a Dense layer gives you an output shape of (seq_len, 6), which is incompatible with your labels. If you instead omit return_sequences=True, then your LSTM layer returns shape (hidden_size,) (it only returns the last element of the sequence) and subsequently your final Dense layer will have output shape (6,) like your labels.
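The shapes can be traced through the stack with NumPy. This is only a sketch: the embedding is modelled as a table lookup, and an element-wise tanh stands in for the LSTM layers (an assumption that preserves shapes but not behaviour); all weights and inputs are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, seq_len, vocab, hidden, n_classes = 1119, 7, 7, 128, 6

X = rng.integers(0, vocab, size=(n_samples, seq_len))  # integer ids in [0, 7)
embedding_table = rng.normal(size=(vocab, hidden))
w_dense = rng.normal(size=(hidden, n_classes))

embedded = embedding_table[X]  # lookup adds a feature axis: (1119, 7, 128)
seq_out = np.tanh(embedded)    # stand-in for LSTM(..., return_sequences=True)
last_out = seq_out[:, -1, :]   # stand-in for an LSTM returning only the last step

print((seq_out @ w_dense).shape)   # (1119, 7, 6)
print((last_out @ w_dense).shape)  # (1119, 6)
```

Only the second result has the same shape as Y, (1119, 6).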

expected ndim=3, found ndim=2

I'm new to Keras and I'm trying to implement a sequence-to-sequence LSTM.
In particular, I have a dataset with 9 features and I want to predict 5 continuous values.
I split the data into a training and a test set; their shapes are, respectively:
X TRAIN (59010, 9)
X TEST (25291, 9)
Y TRAIN (59010, 5)
Y TEST (25291, 5)
The LSTM is extremely simple at the moment:
model = Sequential()
model.add(LSTM(100, input_shape=(9,), return_sequences=True))
model.compile(loss="mean_absolute_error", optimizer="adam", metrics= ['accuracy'])
history = model.fit(X_train,y_train,epochs=100, validation_data=(X_test,y_test))
But I have the following error:
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2
Can anyone help me?
The LSTM layer expects inputs of shape (batch_size, timesteps, input_dim). In Keras, the input_shape argument takes (timesteps, input_dim), without the batch dimension. You are passing input_shape=(9,), which has no timesteps dimension. The simplest fix is to add a time dimension of length 1, which means reshaping both the inputs (X_train, X_test) and the targets; the targets need it because, with return_sequences=True, the model outputs shape (batch_size, 1, 5). Note, however, that this feeds the network sequences of length one, and with sequences that short an LSTM does not seem to be the right choice.
X_train = X_train.reshape(-1, 1, 9)
X_test = X_test.reshape(-1, 1, 9)
y_train = y_train.reshape(-1, 1, 5)
y_test = y_test.reshape(-1, 1, 5)
model = Sequential()
model.add(LSTM(100, input_shape=(1, 9), return_sequences=True))
model.add(LSTM(5, input_shape=(1, 9), return_sequences=True))
model.compile(loss="mean_absolute_error", optimizer="adam", metrics= ['accuracy'])
history = model.fit(X_train,y_train,epochs=100, validation_data=(X_test,y_test))
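If the rows happen to be consecutive observations in time (an assumption, nothing in the question confirms it), sequences longer than one step can be built by stacking overlapping windows. The helper make_windows below is hypothetical, and the small arrays are placeholders for the real data.

```python
import numpy as np

def make_windows(x, y, timesteps):
    """Stack overlapping windows of `timesteps` consecutive rows.

    Assumes rows are in chronological order. Each window is paired
    with the target belonging to its last row.
    """
    n_windows = len(x) - timesteps + 1
    x_win = np.stack([x[i:i + timesteps] for i in range(n_windows)])
    y_win = y[timesteps - 1:]
    return x_win, y_win

x = np.arange(20 * 9, dtype=float).reshape(20, 9)  # 20 rows, 9 features
y = np.arange(20 * 5, dtype=float).reshape(20, 5)  # 20 rows, 5 targets
x_win, y_win = make_windows(x, y, timesteps=4)
print(x_win.shape)  # (17, 4, 9)
print(y_win.shape)  # (17, 5)
```

Data shaped like this would pair with input_shape=(4, 9) and a model whose last recurrent layer does not return sequences, so that the output is (batch_size, 5).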

understanding shapes for Keras model

I am trying to wrap my head around the shape needed for my specific task. I am attempting to train a Q-learner on some time series data contained in a dataframe. My dataframe has the following columns: open, close, high, low, and I am trying to get a sliding window of, say, 50 timesteps. Here is example code for each window:
window = df.iloc[0:50]
df_norm = (window - window.mean()) / (window.max() - window.min())
x = df_norm.values
x = np.expand_dims(x, axis=0)
print(x.shape)
# (1, 50, 4)
Now that I know the shape is (1, 50, 4) for each item in X, I'm at a loss as to what shape to feed my model. Let's say I have the following:
model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(50,4)))
model.add(LSTM(32, return_sequences=True))
model.add(Dense(num_actions))
This gives the following error:
ValueError: could not broadcast input array from shape (50,4) into shape (1,50)
And here is another attempt:
model = Sequential()
model.add(Dense(hidden_size, input_shape=(50,4), activation='relu'))
model.add(Dense(hidden_size, activation='relu'))
model.add(Dense(num_actions))
model.compile(sgd(lr=.2), "mse")
which gives the following error:
ValueError: could not broadcast input array from shape (50,4) into shape (1,50)
Here is the shape the model is expecting and the state from my env:
print("Inputs: {}".format(model.input_shape))
print("actual: {}".format(env.state.shape))
#Inputs: (None, 50, 4)
#actual: (1, 50, 4)
Can someone explain where I am going wrong with the shapes here?
The recurrent layer takes inputs of shape (batch_size, timesteps, input_features). Since the shape of x is (1, 50, 4), the data is interpreted as a batch containing a single sequence of 50 timesteps, each with 4 features. When initializing the first layer of a model, you pass an input_shape: a tuple specifying the shape of the input, excluding the batch_size dimension. For LSTM layers, you can pass None as the timesteps dimension to allow sequences of any length. Hence, this is how the first layer of the network should be initialized:
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
The second LSTM layer is followed by a dense layer. So you don't need to return sequences for this layer. Hence, this is how you should initialize the second LSTM layer:
model.add(LSTM(32))
Every sequence of 50 timesteps in x is supposed to be mapped to a single action vector in y. Therefore, since the shape of x is (1, 50, 4), the shape of y must be (1, num_actions). Make sure y doesn't have a timesteps dimension.
Therefore, under the assumption that x and y have the right shapes, the following code should work:
model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
model.add(LSTM(32))
model.add(Dense(num_actions))
model.compile(sgd(lr=.2), "mse")
# x.shape == (1, 50, 4)
# y.shape == (1, num_actions)
history = model.fit(x, y)
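The role of None in these shapes can be made concrete with a small check. The helper is_compatible is hypothetical and not part of Keras; it merely mimics how a reported shape such as (None, 50, 4) is read, with None matching any size along that axis.

```python
def is_compatible(expected, actual):
    """Return True if `actual` fits `expected`, where None matches any size."""
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))

# Fixed 50 timesteps, any batch size.
print(is_compatible((None, 50, 4), (1, 50, 4)))    # True
# input_shape=(None, 4): any batch size and any number of timesteps.
print(is_compatible((None, None, 4), (1, 30, 4)))  # True
# A single window without the batch axis does not fit.
print(is_compatible((None, 50, 4), (50, 4)))       # False
```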
