understanding shapes for Keras model - python

I am trying to wrap my head around the shape needed for my specific task. I am attempting to train a Q-learner on some time series data contained in a dataframe. My dataframe has the following columns: open, close, high, low, and I am trying to get a sliding window of, say, 50 timesteps. Here is example code for one window:
import numpy as np

window = df.iloc[0:50]
df_norm = (window - window.mean()) / (window.max() - window.min())
x = df_norm.values
x = np.expand_dims(x, axis=0)
print(x.shape)
# (1, 50, 4)
Now that I know my shape is (1, 50, 4) for each item in X, I'm at a loss as to what shape to feed my model. Let's say I have the following:
model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(50,4)))
model.add(LSTM(32, return_sequences=True))
model.add(Dense(num_actions))
This gives the following error:
ValueError: could not broadcast input array from shape (50,4) into shape (1,50)
And here is another attempt:
model = Sequential()
model.add(Dense(hidden_size, input_shape=(50,4), activation='relu'))
model.add(Dense(hidden_size, activation='relu'))
model.add(Dense(num_actions))
model.compile(sgd(lr=.2), "mse")
which gives the following error:
ValueError: could not broadcast input array from shape (50,4) into shape (1,50)
Here is the shape the model is expecting and the state from my env:
print "Inputs: {}".format(model.input_shape)
print "actual: {}".format(env.state.shape)
#Inputs: (None, 50, 4)
#actual: (1, 50, 4)
Can someone explain where I am going wrong with the shapes here?

The recurrent layer takes inputs of shape (batch_size, timesteps, input_features). Since the shape of x is (1, 50, 4), the data is interpreted as a batch containing one sample of 50 timesteps, each with 4 features. When initializing the first layer of a model, you pass input_shape: a tuple specifying the shape of the input, excluding the batch_size dimension. For LSTM layers, you can pass None as the timesteps dimension. Hence, this is how the first layer of the network should be initialized:
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
The second LSTM layer is followed by a dense layer, so it doesn't need to return sequences. Hence, this is how you should initialize the second LSTM layer:
model.add(LSTM(32))
Each sample of 50 timesteps in x is supposed to be mapped to a single action vector in y. Therefore, since the shape of x is (1, 50, 4), the shape of y must be (1, num_actions). Make sure y doesn't have a timesteps dimension.
Therefore, under the assumption that x and y have the right shapes, the following code should work:
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.optimizers import SGD

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
model.add(LSTM(32))
model.add(Dense(num_actions))
model.compile(SGD(lr=.2), "mse")

# x.shape == (1, 50, 4)
# y.shape == (1, num_actions)
history = model.fit(x, y)
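If you want to train on more than one window at a time, stack the individual (50, 4) windows along the batch axis. A minimal sketch of that data preparation, assuming the dataframe df from the question and a hypothetical num_actions (use your environment's actual action-space size):
import numpy as np

num_actions = 3   # hypothetical value
window_size = 50

windows = []
for start in range(len(df) - window_size + 1):
    window = df.iloc[start:start + window_size]
    df_norm = (window - window.mean()) / (window.max() - window.min())
    windows.append(df_norm.values)

x = np.stack(windows)                # shape: (num_windows, 50, 4)
y = np.zeros((len(x), num_actions))  # placeholder targets, shape: (num_windows, num_actions)
history = model.fit(x, y)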

Related

Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray) - Already have converted the data to numpy array

Trying to train a single-layer NN for a text-based multi-label classification problem.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(20, input_dim=400, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(x_train, y_train, verbose=0, epochs=100)
I get the following error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
x_train is word2vec-vectorized text data: each token is a 300-dimensional vector, each instance is padded to a length of 400, and there are 462 records.
Observations on the training data are below:
print('#### Shape of input numpy array #####')
print(x_train.shape)
print('#### Shape of each element in the array #####')
print(x_train[0].shape)
print('#### Object type for input data #####')
print(type(x_train))
print('##### Object type for first element of input data ####')
print(type(x_train[0]))
#### Shape of input numpy array #####
(462,)
#### Shape of each element in the array #####
(400, 300)
#### Object type for input data #####
<class 'numpy.ndarray'>
##### Object type for first element of input data ####
<class 'numpy.ndarray'>
There are three problems.
Problem 1
This is your main problem, and it is what directly caused the error.
Something is wrong with how you initialize/convert x_train (either a bug, or an unusual way of constructing the data): x_train is in fact an array of arrays rather than a single multi-dimensional array, so TensorFlow sees a 1-D array of objects, which is not what you want.
The solution is to reconstruct the array before passing it to fit():
x_train = np.array([np.array(val) for val in x_train])
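After reconstruction, x_train should report a proper 3-D shape. Assuming all 462 elements really are (400, 300) arrays, np.stack does the same job more directly:
import numpy as np

x_train = np.stack(x_train)  # stacks the 462 equal-shaped (400, 300) arrays
print(x_train.shape)         # (462, 400, 300)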
Problem 2
The Dense layer expects its input to have shape (batch_size, ..., input_dim), which means the last dimension of x_train must equal input_dim; yours is 300, not 400.
According to your description, the input dimension (the word-vector dimension) is 300, so you should change input_dim to 300:
model.add(Dense(20, input_dim=300, kernel_initializer='he_uniform', activation='relu'))
or, equivalently, provide input_shape directly instead:
model.add(Dense(20, input_shape=(400, 300), kernel_initializer='he_uniform', activation='relu'))
Problem 3
A Dense (i.e. linear) layer expects each sample to be a one-dimensional vector, so its input usually has shape (batch_size, vector_length). When Dense receives an input with more than 2 dimensions (you have 3), it applies the Dense operation to the last dimension only. Quoting the TensorFlow documentation:
Note: If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 0 of the kernel (using tf.tensordot). For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).
This means your y would have to have shape (462, 400, 9) instead, which is most likely not what you want. (If it is what you want, the fixes for problems 1 and 2 already solve your issue.)
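A small sketch of what the documentation describes, using the (400, 300) input from the question: Dense(9) applied to a rank-3 input produces one 9-way output per padded token rather than per sample.
import tensorflow as tf

inputs = tf.keras.Input(shape=(400, 300))
outputs = tf.keras.layers.Dense(9)(inputs)
print(outputs.shape)  # (None, 400, 9): one prediction per row of the 400x300 matrix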
If you want to apply Dense to the whole 400x300 matrix, you need to flatten it into a one-dimensional vector first, like this:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

x_train = np.array([np.array(val) for val in x_train])  # reconstruct (Problem 1)
model = Sequential()
model.add(Flatten(input_shape=(400, 300)))
model.add(Dense(20, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(x_train, y_train, verbose=0, epochs=100)
Now the output shape will be (462, 9).
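As a quick sanity check (assuming y_train has shape (462, 9)), the layer output shapes can be inspected before training:
model.summary()
# Flatten -> (None, 120000)   i.e. 400 * 300 features per sample
# Dense   -> (None, 20)
# Dense   -> (None, 9)        one sigmoid per label, matching y_train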

ValueError: Error when checking target: expected dense_13 to have shape (None, 6) but got array with shape (6, 1)

I am training a classification network with training data which has X.shape = (1119, 7) and Y.shape = (1119, 6). Below is my simple Keras network with an output dim of 6 (the size of the labels). The error that is returned is below the code.
hidden_size = 128
model = Sequential()
model.add(Embedding(7, hidden_size))
#model.add(LSTM(128, input_shape=(1,7)))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(Dense(output_dim=6, activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=["categorical_accuracy"])
ValueError: Error when checking target: expected dense_13 to have shape (None, 6) but got array with shape (6, 1)
I would prefer not to do this in TensorFlow because I am just prototyping, yet it is my first run at Keras and I am confused about why it cannot take this data. I attempted to reshape the data in a number of ways, none of which worked. Any advice as to why this isn't working would be greatly appreciated.
You should probably remove the parameter return_sequences=True from your last LSTM layer. When using return_sequences=True, the output of the LSTM layer has shape (seq_len, hidden_size). Passing this on to a Dense layer gives you an output shape of (seq_len, 6), which is incompatible with your labels. If you instead omit return_sequences=True, then your LSTM layer returns shape (hidden_size,) (it only returns the last element of the sequence) and subsequently your final Dense layer will have output shape (6,) like your labels.
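A minimal sketch of that fix, keeping the layer sizes from the question and only dropping return_sequences=True from the last LSTM (Dense(6) replaces the deprecated output_dim argument):
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

hidden_size = 128

model = Sequential()
model.add(Embedding(7, hidden_size))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(LSTM(hidden_size))  # no return_sequences: output shape (batch, hidden_size)
model.add(Dense(6, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['categorical_accuracy'])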

Keras in Python: LSTM Dimensions

I am building an LSTM network.
My data looks as following:
X_train.shape = (134, 300000, 4)
X_train contains 134 sequences, with 300000 timesteps and 4 features.
Y_train.shape = (134, 2)
Y_train contains 134 labels, [1, 0] for True and [0, 1] for False.
Below is my model in Keras.
model = Sequential()
model.add(LSTM(4, input_shape=(300000, 4), return_sequences=True))
model.compile(loss='categorical_crossentropy', optimizer='adam')
Whenever I run the model, I get the following error:
Error when checking target: expected lstm_52 to have 3 dimensions, but got array with shape (113, 2)
It seems to be related to my Y_train data -- as its shape is (113, 2).
Thank you!
The output shape of your LSTM layer is (batch_size, 300000, 4) (because of return_sequences=True). Therefore your model expects the target y_train to have 3 dimensions but you are passing an array with only 2 dimensions (batch_size, 2).
You probably want to use return_sequences=False instead. In this case the output shape of the LSTM layer will be (batch_size, 4). Moreover, you should add a final softmax layer to your model in order to have the desired output shape of (batch_size, 2):
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(4, input_shape=(300000, 4), return_sequences=False))
model.add(Dense(2, activation='softmax'))  # 2 neurons because you have 2 classes
model.compile(loss='categorical_crossentropy', optimizer='adam')
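With those changes and the shapes from the question, fitting would look roughly like this (batch_size and epochs are arbitrary choices here):
# X_train.shape == (134, 300000, 4), Y_train.shape == (134, 2)
history = model.fit(X_train, Y_train, epochs=10, batch_size=8)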

expected ndim=3, found ndim=2

I'm new to Keras and I'm trying to implement a sequence-to-sequence LSTM.
In particular, I have a dataset with 9 features and I want to predict 5 continuous values.
I split the training and the test set, and their shapes are respectively:
X TRAIN (59010, 9)
X TEST (25291, 9)
Y TRAIN (59010, 5)
Y TEST (25291, 5)
The LSTM is extremely simple at the moment:
model = Sequential()
model.add(LSTM(100, input_shape=(9,), return_sequences=True))
model.compile(loss="mean_absolute_error", optimizer="adam", metrics= ['accuracy'])
history = model.fit(X_train,y_train,epochs=100, validation_data=(X_test,y_test))
But I have the following error:
ValueError: Input 0 is incompatible with layer lstm_1: expected
ndim=3, found ndim=2
Can anyone help me?
The LSTM layer expects inputs of shape (batch_size, timesteps, input_dim). In Keras you pass (timesteps, input_dim) as the input_shape argument, but you are setting input_shape=(9,), which does not include the timesteps dimension. The problem can be solved by adding an extra dimension of size 1 for time. To do this you have to reshape the input dataset (X_train) and the targets. Note that this may be problematic, because the time resolution becomes 1 and you are feeding sequences of length one; with length-one sequences, an LSTM does not seem like the right choice. Still, the reshaped version would look like this:
X_train = X_train.reshape(-1, 1, 9)
X_test = X_test.reshape(-1, 1, 9)
y_train = y_train.reshape(-1, 1, 5)
y_test = y_test.reshape(-1, 1, 5)

model = Sequential()
model.add(LSTM(100, input_shape=(1, 9), return_sequences=True))
model.add(LSTM(5, return_sequences=True))  # input_shape is only needed on the first layer
model.compile(loss="mean_absolute_error", optimizer="adam", metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test))
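Given that length-one sequences make the LSTM questionable, a plain feed-forward sketch on the original 2-D data is a reasonable alternative (assuming the 9 features and 5 continuous targets from the question; the hidden layer size is arbitrary):
from keras.models import Sequential
from keras.layers import Dense

# X_train.shape == (59010, 9), y_train.shape == (59010, 5)
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(9,)))
model.add(Dense(5))  # linear output for 5 continuous targets
model.compile(loss='mean_absolute_error', optimizer='adam')
history = model.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test))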

LSTM Initial state from Dense layer

I am using an LSTM on time series data. I have features about the time series that are not time-dependent. Imagine company stocks for the series and something like the company's location for the non-time-series features. This is not the actual use case, but it is the same idea. For this example, let's just predict the next value in the time series.
So a simple example would be:
feature_input = Input(shape=(None, data.training_features.shape[1]))
dense_1 = Dense(4, activation='relu')(feature_input)
dense_2 = Dense(8, activation='relu')(dense_1)
series_input = Input(shape=(None, data.training_series.shape[1]))
lstm = LSTM(8)(series_input, initial_state=dense_2)
out = Dense(1, activation="sigmoid")(lstm)
model = Model(inputs=[feature_input,series_input], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["mape"])
However, I am just not sure how to specify the initial state list correctly. I get:
ValueError: An initial_state was passed that is not compatible with `cell.state_size`. Received `state_spec`=[<keras.engine.topology.InputSpec object at 0x11691d518>]; However `cell.state_size` is (8, 8)
which I can see is caused by the 3D batch dimension. I tried using Flatten, Permute, and Reshape layers, but I don't believe that is correct. What am I missing, and how can I connect these layers?
The first problem is that an LSTM(8) layer expects two initial states, h_0 and c_0, each of dimension (None, 8). That's what "cell.state_size is (8, 8)" means in the error message.
If you only have one initial state, dense_2, you could switch to a GRU (which requires only h_0), or transform feature_input into two initial states.
The second problem is that h_0 and c_0 must have shape (batch_size, 8), but your dense_2 has shape (batch_size, timesteps, 8). You need to deal with the time dimension before using dense_2 as an initial state.
So you could change the feature input shape to (data.training_features.shape[1],), or average over the timesteps with GlobalAveragePooling1D.
A working example would be:
from keras.models import Model
from keras.layers import Input, Dense, LSTM

feature_input = Input(shape=(5,))
dense_1_h = Dense(4, activation='relu')(feature_input)
dense_2_h = Dense(8, activation='relu')(dense_1_h)
dense_1_c = Dense(4, activation='relu')(feature_input)
dense_2_c = Dense(8, activation='relu')(dense_1_c)

series_input = Input(shape=(None, 5))
lstm = LSTM(8)(series_input, initial_state=[dense_2_h, dense_2_c])
out = Dense(1, activation="sigmoid")(lstm)

model = Model(inputs=[feature_input, series_input], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["mape"])
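If you would rather not construct two states, here is a rough sketch of the GRU alternative mentioned above (a GRU cell has a single state, so one dense output is enough):
from keras.models import Model
from keras.layers import Input, Dense, GRU

feature_input = Input(shape=(5,))
dense_1 = Dense(4, activation='relu')(feature_input)
dense_2 = Dense(8, activation='relu')(dense_1)  # shape (batch_size, 8): a valid single state

series_input = Input(shape=(None, 5))
gru = GRU(8)(series_input, initial_state=dense_2)
out = Dense(1, activation="sigmoid")(gru)

model = Model(inputs=[feature_input, series_input], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["mape"])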
