#X_train.shape = 7x5x5 numpy array
#y_train.shape = 3x5x5 numpy array
#X_test.shape = (7,) numpy array
#y_test.shape = (3,) numpy array
I have binary output as 0 or 1.
timesteps = 5
data_dim = 5
model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(LSTM(32, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=1)
score = model.evaluate(X_test,y_test,batch_size=1)
ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (3, 5, 5)
I am trying to build an LSTM model on random data, and this error occurs. I have tried many things but could not get it to work.
Thanks in advance.
There are a few problems/misunderstandings here.
Your y is actually 3-dimensional. However, in the last LSTM layer you have left return_sequences at its default of False, meaning the LSTM returns a single 32-long vector and sends that into the Dense layer.
Furthermore, the use of multiple LSTMs here seems to lack purpose, though it does not necessarily harm anything.
To fit your presumed data, you would want the last LSTM to have return_sequences=True, and to have 5 units rather than 32, matching the final dimension of your y data.
Alternatively, you could drop the final LSTM layer entirely (you already have two LSTMs before it), give the second LSTM only 5 units, and wrap the last Dense layer in a TimeDistributed wrapper:
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
which applies the same Dense layer to every timestep of the data, as required by the shape of your y data.
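Putting the second option together, a minimal sketch might look like the following. This is illustrative only and assumes each of the 5 timesteps carries a single binary label, i.e. y reshaped to (samples, 5, 1):
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

timesteps, data_dim = 5, 5

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(timesteps, data_dim)))
model.add(LSTM(5, return_sequences=True))  # 5 units, one output vector per timestep
model.add(TimeDistributed(Dense(1, activation='sigmoid')))  # same Dense applied to every timestep
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])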
I am trying to train a single-layer NN for a text-based multi-label classification problem.
model= Sequential()
model.add(Dense(20, input_dim=400, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(x_train, y_train, verbose=0, epochs=100)
I am getting this error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
x_train is word2vec-vectorized text data with 300-dimensional vectors, each instance padded to length 400. It contains 462 records.
Observations on the training data are below:
print('#### Shape of input numpy array #####')
print(x_train.shape)
print('#### Shape of each element in the array #####')
print(x_train[0].shape)
print('#### Object type for input data #####')
print(type(x_train))
print('##### Object type for first element of input data ####')
print(type(x_train[0]))
#### Shape of input numpy array #####
(462,)
#### Shape of each element in the array #####
(400, 300)
#### Object type for input data #####
<class 'numpy.ndarray'>
##### Object type for first element of input data ####
<class 'numpy.ndarray'>
There are three problems.
Problem 1
This is your main problem, the one that directly caused the error.
Something is wrong with how you initialize/convert your x_train (I suspect a bug, or an unusual way of constructing the data): your x_train is in fact an array of arrays, not a single multi-dimensional array. So, going by its shape, TensorFlow "thinks" you have a 1D array, which is not what you want.
The solution is to reconstruct the array before passing it to fit():
x_train = np.array([np.array(val) for val in x_train])
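Equivalently, assuming every element of x_train really has shape (400, 300), np.stack should give the same result:
x_train = np.stack(x_train)  # shape becomes (462, 400, 300)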
Problem 2
A Dense layer expects its input to have shape (batch_size, ..., input_dim), which means the last dimension of x_train must equal input_dim; yours is 300, not 400.
According to your description, your input dimension (the word-vector dimension) is 300, so you should change input_dim to 300:
model.add(Dense(20, input_dim=300, kernel_initializer='he_uniform', activation='relu'))
Or, equivalently, provide input_shape directly instead:
model.add(Dense(20, input_shape=(400, 300), kernel_initializer='he_uniform', activation='relu'))
Problem 3
A Dense (i.e. fully connected, linear) layer expects each sample to be a one-dimensional vector, so its input is usually of shape (batch_size, vector_length). When Dense receives an input with more than 2 dimensions (you have 3), it applies the Dense operation along the last dimension only. Quoting the official TensorFlow documentation:
Note: If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 1 of the kernel (using tf.tensordot).
For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).
This means your y would have to have shape (462, 400, 9) instead, which is most likely not what you want (if it is, the fixes for problems 1 and 2 should already have solved your issue).
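You can see this behaviour with a quick shape check (an illustrative sketch, not part of the original code):
from keras.models import Sequential
from keras.layers import Dense

m = Sequential()
m.add(Dense(20, input_shape=(400, 300), activation='relu'))
m.add(Dense(9, activation='sigmoid'))
print(m.output_shape)  # (None, 400, 9), Dense acts on the last axis only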
If you want to apply the Dense layers to the whole 400x300 matrix, you first need to flatten it to a one-dimensional vector, like this:
x_train = np.array([np.array(val) for val in x_train]) # reconstruct
model= Sequential()
model.add(Flatten(input_shape=(400, 300)))
model.add(Dense(20, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(x_train, y_train, verbose=0, epochs=100)
Now the output will have shape (462, 9).
I am trying to build an lstm text classifier using Keras.
This is the model structure:
model_word2vec = Sequential()
model_word2vec.add(Embedding(input_dim=vocabulary_dimension,
                             output_dim=embedding_dim,
                             weights=[word2vec_weights],
                             input_length=longest_sentence,
                             mask_zero=True,
                             trainable=False))
model_word2vec.add(LSTM(units=embedding_dim, dropout=0.25, recurrent_dropout=0.25, return_sequences=True))
model_word2vec.add(Dense(3, activation='softmax'))
model_word2vec.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
results = model_word2vec.fit(X_tr_word2vec, y_tr_word2vec, validation_split=0.16, epochs=3, batch_size=128, verbose=0)
Here y_tr_word2vec is a one-hot encoded target variable with 3 classes.
When I run the code above, I get this error:
ValueError: Error when checking model target: expected dense_2 to have 3 dimensions, but got array with shape (15663, 3)
I suppose that the issue could be about y_tr_word2vec shape or the batch size dimension, but I'm not sure.
Update:
I have changed to return_sequences=False, converted y_tr_word2vec from one-hot to integer (categorical) labels, used 1 neuron in the Dense layer, and switched from categorical_crossentropy to sparse_categorical_crossentropy.
Now, I get this error: ValueError: invalid literal for int() with base 10: 'countess'.
Therefore now I suppose that, during fit(), something goes wrong with the input vector X_tr_word2vec, which contains the sentences.
The problem is in this code:
model_word2vec.add(LSTM(units=embedding_dim, dropout=0.25, recurrent_dropout=0.25, return_sequences=True))
model_word2vec.add(Dense(3, activation='softmax'))
You have set return_sequences=True, which means the LSTM returns a 3D array to the Dense layer, whereas the Dense layer does not need 3D data here. So delete return_sequences=True:
model_word2vec.add(LSTM(units=embedding_dim, dropout=0.25, recurrent_dropout=0.25))
model_word2vec.add(Dense(3, activation='softmax'))
Why did you set return_sequences=True?
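For reference, a small shape check (an illustrative sketch, not from the original post) shows the difference return_sequences makes:
from keras.models import Sequential
from keras.layers import LSTM

with_seq = Sequential([LSTM(8, return_sequences=True, input_shape=(10, 4))])
print(with_seq.output_shape)     # (None, 10, 8), one output vector per timestep (3D)

without_seq = Sequential([LSTM(8, input_shape=(10, 4))])
print(without_seq.output_shape)  # (None, 8), only the last output (2D)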
I have a training input in 3 dimensions (8,50,3).
I am trying to pass it as an input to the Sequential Model in Keras. Looking up the documentation I found that this should work:
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(50,3)))
model.add(Dense(100,init="uniform", activation='sigmoid'))
model.add(Dense(50,init="uniform", activation='relu'))
model.add(Dense(output_dim=1))
model.compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=['accuracy'])
When I try to train this model:
model.fit(train,labelTrain,epochs=1,batch_size=1,verbose=1)
I get the following error:
Error when checking model target: expected dense_148 to have 3 dimensions, but got array with shape (8, 1)
What can it mean?
Also, my first objective was to pass a 3D array where the middle dimension did not have a fixed size but I gave up after finding it impossible. Could it work?
"Target" means the expected result. The problem is in labelTrain, not in the input.
A Dense layer takes a number of neurons. You don't pass it an output shape; you pass the number of neurons, and the output shape is automatically (None, neurons).
Your last layer should be:
model.add(Dense(1, activation='I recommend an activation here'))
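As an aside (an illustrative sketch, not part of the original answer): with input_shape=(50, 3), the Dense layers are applied per timestep, so the model's output, and therefore its expected target, is 3D, which is why a labelTrain of shape (8, 1) does not match:
from keras.models import Sequential
from keras.layers import Dense

m = Sequential()
m.add(Dense(100, activation='relu', input_shape=(50, 3)))
m.add(Dense(1))
print(m.output_shape)  # (None, 50, 1), so a 3D target is expected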
I'm very new to keras and also to python.
I have a time-series dataset with different sequence lengths (for example, the 1st sequence is 484000x128, the 2nd sequence is 563110x128, etc.).
I've put the sequences in 3D array.
My question is how to define the input shape, because I'm confused. I was using DL4J, but the concept of defining the network configuration is different there.
Here is my first trial code:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding,LSTM,Dense,Dropout
## Loading dummy data
sequences = np.array([[[1,2,3],[1,2,3]], [[4,5,6],[4,5,6],[4,5,6]]])
y = np.array([[[0],[0]], [[1],[1],[1]]])
x_test=np.array([[2,3,2],[4,6,7],[1,2,1]])
y_test=np.array([0,1,1])
n_epochs=40
# model configuration
model = Sequential()
model.add(LSTM(100, input_shape=(3,1), activation='tanh', recurrent_activation='hard_sigmoid')) # 100 LSTM units
model.add(LSTM(100, activation='tanh', recurrent_activation='hard_sigmoid'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
print(model.summary())
## training with batches of size 1 (each batch is a sequence)
for epoch in range(n_epochs):
    for seq, label in zip(sequences, y):
        model.train_on_batch(np.array([seq]), np.array([label]))  # train a batch at a time
scores = model.evaluate(x_test, y_test)  # evaluate a batch at a time
Here are the docs on input shapes for LSTMs:
Input shapes
3D tensor with shape (batch_size, timesteps, input_dim), (Optional) 2D tensors with shape (batch_size, output_dim).
This implies that you're going to need a constant number of timesteps for each batch.
The canonical way of doing this is to pad your sequences using something like Keras's pad_sequences utility.
Then you can try:
# let's say the timestep you choose is 700000 and the dimension of the vectors is 128
timestep = 700000
dims = 128
model.add(LSTM(100, input_shape=(timestep, dims),
               activation='tanh', recurrent_activation='hard_sigmoid'))
I edited the answer to remove the batch_size argument. With this setup the batch size is left unspecified; you can set it when fitting the model (in model.fit()).
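For illustration, here is a minimal padding sketch (using toy lengths instead of the real ones, and assuming a recent Keras where pad_sequences handles sequences of vectors):
import numpy as np
from keras.preprocessing.sequence import pad_sequences

# Two toy ragged sequences of 128-dim feature vectors with different lengths (5 and 8 timesteps).
seq_a = np.random.rand(5, 128)
seq_b = np.random.rand(8, 128)

# Pad both to the same length so they fit into one (batch, timesteps, dims) array.
padded = pad_sequences([seq_a, seq_b], maxlen=8, dtype='float32', padding='post')
print(padded.shape)  # (2, 8, 128)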
I am trying to predict the next value in the time series using the previous 20 values. Here is a sample from my code:
X_train.shape is (15015, 20)
Y_train.shape is (15015,)
EMB_SIZE = 1
HIDDEN_RNN = 3
model = Sequential()
model.add(LSTM(input_shape = (EMB_SIZE,), input_dim=EMB_SIZE, output_dim=HIDDEN_RNN, return_sequences=True))
model.add(LSTM(input_shape = (EMB_SIZE,), input_dim=EMB_SIZE, output_dim=HIDDEN_RNN, return_sequences=False))
model.add(Dense(1))
model.add(Activation('softmax'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(X_train,
Y_train,
nb_epoch=5,
batch_size = 128,
verbose=1,
validation_split=0.1)
score = model.evaluate(X_test, Y_test, batch_size=128)
print score
Though when I ran my code I got the following error:
TypeError: ('Bad input argument to theano function with name "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py:484" at index 0(0-based)', 'Wrong number of dimensions: expected 3, got 2 with shape (32, 20).')
I was trying to replicate the results in this post: neural networks for algorithmic trading. Here is a link to the git repo: link
It seems to be a conceptual error. Please post any sources where I can get a better understanding of LSTMs for time series prediction. Also, please explain how I can fix this error so that I can reproduce the results from the article mentioned above.
If I understand your problem correctly, your input data is a set of 15015 one-dimensional sequences of length 20. According to the Keras docs, the input is a 3D tensor with shape (nb_samples, timesteps, input_dim). In your case, the shape of X should then be (15015, 20, 1).
Also, you only need to give input_dim to the first LSTM layer; input_shape is redundant there, and the second layer will infer its input shape automatically:
model = Sequential()
model.add(LSTM(input_dim=EMB_SIZE, output_dim=HIDDEN_RNN, return_sequences=True))
model.add(LSTM(output_dim=HIDDEN_RNN, return_sequences=False))
An LSTM in Keras expects an input tensor of shape (nb_samples, timesteps, feature_dim).
In your case, X_train should probably have an input shape of (15015, 20, 1). Just reshape it accordingly and the model should run.
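As a minimal sketch of that reshape (assuming the shapes quoted in the question):
X_train = X_train.reshape((X_train.shape[0], 20, 1))  # now (15015, 20, 1)
X_test = X_test.reshape((X_test.shape[0], 20, 1))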