I have created the ANN below with two fully connected layers and one recurrent layer. However, when running it I get the error: Exception: Input 0 is incompatible with layer lstm_11: expected ndim=3, found ndim=2. Why is this happening?
from keras.models import Sequential
from keras.layers import Dense
from sklearn.cross_validation import train_test_split
import numpy
from sklearn.preprocessing import StandardScaler
from keras.layers import LSTM
seed = 7
numpy.random.seed(seed)
dataset = numpy.loadtxt("sorted_output.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:15]
scaler = StandardScaler(copy=True, with_mean=True, with_std=True ) #data normalization
X = scaler.fit_transform(X) #data normalization
Y = dataset[:,15]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
model = Sequential()
model.add(Dense(12, input_dim=15, init='uniform', activation='relu'))
model.add(LSTM(10, return_sequences=True))
model.add(Dense(15, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=150, batch_size=10)
Based on the above, all of your other layers are Dense layers, yet in the LSTM you are returning the full sequence (return_sequences=True). Since every following layer is Dense, you should set return_sequences=False, and it should work.
The reason for this error is that LSTM expects the input to have a shape of 3 dimensions (for batch, sequence length and input dimension). But the Dense layer before it outputs a shape of 2 dimensions (for batch and output dimension).
You can see the output shape of the Dense layer by executing the lines of code below:
>>> model = Sequential()
>>> model.add(Dense(12, input_dim=15, init='uniform', activation='relu'))
>>> model.summary()
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_4 (Dense) (None, 12) 192 dense_input_2[0][0]
====================================================================================================
Total params: 192
Trainable params: 192
Non-trainable params: 0
____________________________________________________________________________________________________
However, you haven't explained your intention with the model, so I cannot give you further guidance on this issue. What is your input data? Do you expect the input to be a sequence?
If your input is a sequence, then I suggest removing the first Dense layer; if it is not, then I suggest removing the LSTM layer. Both options are sketched below.
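Here is a minimal sketch of both fixes, using the current Keras argument names (kernel_initializer and epochs instead of the deprecated init and nb_epoch). It assumes your 15 columns are either plain features (sketch A) or a sequence of 15 one-dimensional timesteps (sketch B):
from keras.models import Sequential
from keras.layers import Dense, LSTM

# Sketch A: the input is NOT a sequence - drop the LSTM entirely.
model_a = Sequential()
model_a.add(Dense(12, input_dim=15, kernel_initializer='uniform', activation='relu'))
model_a.add(Dense(15, kernel_initializer='uniform', activation='relu'))
model_a.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model_a.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Sketch B: the input IS a sequence - drop the first Dense layer and
# reshape X to 3D first: X = X.reshape((X.shape[0], 15, 1))
model_b = Sequential()
model_b.add(LSTM(10, input_shape=(15, 1), return_sequences=False))
model_b.add(Dense(15, kernel_initializer='uniform', activation='relu'))
model_b.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
model_b.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])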
I have a multi-label classification problem that I am trying to solve with a Neural Network using TensorFlow 2.
The problem: I am trying to predict a cause and its corresponding severity. There can be n causes, and each cause can have m possible severities.
Let's say for simplicity
number of causes = 2
number of possible severities per cause = 2
So we essentially have 4 possible outputs
We also have 4 possible input features
I wrote the code below:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras import Model
from tensorflow.keras.callbacks import ModelCheckpoint
def get_model_multilabel(n_inputs, n_outputs):
    opt = tf.keras.optimizers.SGD(lr=0.01, momentum=0.9)
    model = tf.keras.models.Sequential([
        # input layer
        Dense(10, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'),
        ## two hidden layers
        Dense(10, kernel_initializer='he_uniform', activation='relu'),
        Dropout(0.2),
        Dense(5, kernel_initializer='he_uniform', activation='relu'),
        Dropout(0.2),
        ## output layer
        Dense(n_outputs, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model
n_inputs = 4 # because we have 4 features
n_outputs = 4 # because we have 4 labels
mlmodel = get_model_multilabel(n_inputs, n_outputs)
## train the model
mlmodel.fit(X_train,y_train, epochs=50, batch_size=32, validation_split = 0.2, callbacks=callbacks_list)
X_train.shape is (1144, 4) and
y_train.shape is (1144,)
Note the sigmoid activation in the last layer and the binary_crossentropy loss function, as I am trying to model a multi-label classification problem. Reference: How do I implement multilabel classification neural network with keras.
When I train this, it throws the error:
ValueError: logits and labels must have the same shape ((None, 4) vs (None, 1))
Not sure what I am missing here. Please suggest.
Your y_train has the wrong shape: it should be (1144, n_outputs), but instead it is (1144,), which when reshaped is (1144, 1). Your code doesn't know the number of samples, so this becomes (None, 1). It must match the output shape, (None, 4). You have loaded the labels incorrectly. One way to fix the label encoding is sketched below.
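Assuming the 1144 labels are integer ids for the 4 possible outputs, a minimal sketch of encoding them into the (1144, 4) shape the Dense(4) output expects (to_categorical for single-label ids; MultiLabelBinarizer if each sample genuinely carries several labels at once):
from tensorflow.keras.utils import to_categorical

# If each sample has exactly one integer label in [0, 3],
# one-hot encoding gives the required (1144, 4) shape:
y_train = to_categorical(y_train, num_classes=4)

# If instead each sample can carry several labels at once (true
# multi-label), build a multi-hot matrix instead, e.g. with sklearn:
# from sklearn.preprocessing import MultiLabelBinarizer
# y_train = MultiLabelBinarizer().fit_transform(lists_of_labels_per_sample)

print(y_train.shape)  # (1144, 4) - now matches (None, 4)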
I have read about LSTMs, and I know the algorithm takes the values of the previous words into account when processing the next word.
Now I am trying to apply my first LSTM algorithm, and I have this code:
model = Sequential()
model.add(LSTM(units=6, input_shape = (X_train_count.shape[0], X_train_count.shape[1]), return_sequences = True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=ytrain.shape[1], return_sequences=True, name='output'))
model.compile(loss='cosine_proximity', optimizer='sgd', metrics = ['accuracy'])
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['acc'])
model.summary()
cp=ModelCheckpoint('model_cnn.hdf5',monitor='val_acc',verbose=1,save_best_only=True)
history = model.fit(X_train_count, ytrain,
                    epochs=20,
                    verbose=False,
                    validation_data=(X_test_count, yval),
                    batch_size=10,
                    callbacks=[cp])
1. I cannot see how the LSTM would know the word order, since my dataset is built from TF-IDF features.
2. I am getting the error:
ValueError: Input 0 of layer sequential_8 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18644]
The issue seems to be the shape of X_train_count; the LSTM input shape is always tricky.
If your X_train_count is not 3D, then reshape it using the line below:
X_train_count = X_train_count.reshape(X_train_count.shape[0], X_train_count.shape[1], 1)
In the LSTM layer, the input_shape should be (timesteps, data_dim).
Below is an example to illustrate this.
from sklearn.feature_extraction.text import TfidfVectorizer
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
X = ["first example","one more","good morning"]
Y = ["first example","one more","good morning"]
vectorizer = TfidfVectorizer().fit(X)
tfidf_vector_X = vectorizer.transform(X).toarray()
tfidf_vector_Y = vectorizer.transform(Y).toarray()
tfidf_vector_X = tfidf_vector_X[:, :, None]
tfidf_vector_Y = tfidf_vector_Y[:, :, None]
X_train, X_test, y_train, y_test = train_test_split(tfidf_vector_X, tfidf_vector_Y, test_size = 0.2, random_state = 1)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM
model = Sequential()
model.add(LSTM(units=6, input_shape = X_train.shape[1:], return_sequences = True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=1, return_sequences=True, name='output'))
model.compile(loss='cosine_proximity', optimizer='sgd', metrics = ['accuracy'])
Model Summary:
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_9 (LSTM) (None, 6, 6) 192
_________________________________________________________________
lstm_10 (LSTM) (None, 6, 6) 312
_________________________________________________________________
lstm_11 (LSTM) (None, 6, 6) 312
_________________________________________________________________
output (LSTM) (None, 6, 1) 32
=================================================================
Total params: 848
Trainable params: 848
Non-trainable params: 0
_________________________________________________________________
None
Here X_train is of shape (2, 6, 1)
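The snippet above only builds and compiles the model. To actually train it on the reshaped TF-IDF tensors, a fit call along these lines would complete the example (the epoch and batch-size values are arbitrary placeholders, not from the original answer):
model.fit(X_train, y_train,
          epochs=5,
          batch_size=2,
          validation_data=(X_test, y_test))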
To add to the solution: I would suggest using a dense vector representation instead of the sparse vectors produced by the TF-IDF approach, for example by loading pre-trained embeddings such as the Google News vectors or GloVe as the weights of an Embedding layer; this is usually better both in performance and in results. A sketch follows.
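For illustration, a minimal sketch of that idea, assuming a locally downloaded GloVe file (glove.6B.100d.txt is a placeholder path, not part of the original answer) and reusing the toy corpus from above:
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.layers import Embedding

# Build a word index from the corpus (reusing the toy texts above).
texts = ["first example", "one more", "good morning"]
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
word_index = tokenizer.word_index

# Load the pre-trained vectors; the file path is a placeholder.
embedding_dim = 100
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

# One row per word id; out-of-vocabulary words stay all-zero.
embedding_matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector

# Frozen pre-trained embeddings that could feed the LSTM stack
# in place of the reshaped TF-IDF tensors.
embedding_layer = Embedding(len(word_index) + 1,
                            embedding_dim,
                            weights=[embedding_matrix],
                            trainable=False)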
My input is simply a csv file with 237124 rows and 37 columns :
The first 36 columns as features
The last column is a Binary class label
I am trying to train a Conv1D model on this data.
I have tried to build a CNN with one layer, but I have some problems with it.
Running it produces the error:
ValueError:Error when checking input: expected conv1d_9_input to have shape
(213412, 36) but got array with shape (36, 1)
Code:
import pandas as pd
import numpy as np
import sklearn
from sklearn import metrics
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler
import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import Conv2D,Conv1D, MaxPooling2D,MaxPooling1D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Dropout,BatchNormalization
dataset=pd.read_csv("C:/Users/User/Desktop/data.csv",encoding='cp1252')
dataset.shape
#output: (237124, 37)
array = dataset.values
X = array[:,0:36]
Y = array[:,36]
kf = KFold(n_splits=10)
kf.get_n_splits(X)
for trainindex, testindex in kf.split(X):
    Xtrain, Xtest = X[trainindex], X[testindex]
    Ytrain, Ytest = Y[trainindex], Y[testindex]
Xtrain.shape[0]
#output: 213412
Xtrain.shape[1]
#output: 36
Ytrain.shape[0]
#output: 213412
n_timesteps, n_features, n_outputs = Xtrain.shape[0], Xtrain.shape[1], Ytrain.shape[0]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=1, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(Conv1D(filters=64, kernel_size=1, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(Xtrain, Ytrain, epochs=10, batch_size=32, verbose=0)
# Testing CNN model BY X test
Predictions = model.predict(Xtest,batch_size =100)
rounded = [round(x[0]) for x in Predictions]
Y_predection = pd.DataFrame(rounded)
Y_predection = Y_predection.iloc[:, 0]
...
I tried to modify the code this way:
Xtrain = np.expand_dims(Xtrain, axis=2)
But the error remains the same.
There are a couple of problems I notice in your code.
Xtrain needs to be a 3D tensor, because Conv1D cannot process anything else. So if you have 2D data, you need to add a new dimension to make it 3D.
Your input_shape needs to be changed to reflect that. For example, if you added only a single channel, it should be (n_features, 1).
# Here I'm assuming some dummy data
# Xtrain => [213412, 36, 1] (Note that you need Xtrain to be 3D not 2D - So we're adding a channel dimension of 1)
Xtrain = np.expand_dims(np.random.normal(size=(213412, 36)),axis=-1)
# Ytrain => [213412, 10]
Ytrain = np.random.choice([0,1], size=(213412,10))
n_timesteps, n_features, n_outputs =Xtrain.shape[0], Xtrain.shape[1], Ytrain.shape[1]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=1, activation='relu', input_shape=(n_features, 1)))
model.add(Conv1D(filters=64, kernel_size=1, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(Xtrain, Ytrain, epochs=10, batch_size=32, verbose=0)
For the input layer, you need to specify only how many dimensions each sample of X has, not how many samples you will pass.
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_features,)))
This means the input will be N samples, each of shape n_features.
For the last layer, you should change the number of units to the number of classes you have, rather than the number of rows in your data; a sketch of the resulting binary setup follows.
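Since the question's data is 36 features with a single binary label, a binary ending (one sigmoid unit with binary_crossentropy, rather than a softmax over Ytrain.shape[0] units) would sidestep the label-shape problem entirely. A minimal sketch with dummy data standing in for the CSV:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

# Dummy stand-ins for the real data: 36 features, one binary label.
Xtrain = np.random.normal(size=(1000, 36, 1))    # (samples, features, 1 channel)
Ytrain = np.random.choice([0, 1], size=(1000,))  # binary labels need no one-hot

model = Sequential()
model.add(Conv1D(64, kernel_size=3, activation='relu', input_shape=(36, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))        # one unit for a binary class
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(Xtrain, Ytrain, epochs=3, batch_size=32, verbose=0)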
I'm building a model to classify text into one of 9 classes, and I get this error when running it. activation_1 seems to refer to the Convolutional layer's input, but I'm unsure what's wrong with the input.
num_classes=9
Y_train = keras.utils.to_categorical(Y_train, num_classes)
#Reshape data to add new dimension
X_train = X_train.reshape((100, 150, 1))
Y_train = Y_train.reshape((100, 9, 1))
model = Sequential()
model.add(Conv1d(1, kernel_size=3, activation='relu', input_shape=(None, 1)))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=X_train,y=Y_train, epochs=200, batch_size=20)
Running this results in the following error:
"ValueError: Error when checking target: expected activation_1 to have shape (None, 9) but got array with shape (9,1)
There are several typos and bugs in your code.
Y_train = Y_train.reshape((100,9))
Since you reshape X_train to (100, 150, 1), I take your input to have 150 timesteps and 1 channel. So for the Conv1D layer (there is a typo, Conv1d, in your code), input_shape=(150, 1).
You need to flatten the output of the Conv1D layer before feeding it into the Dense layer.
import numpy as np
import keras
from keras import Sequential
from keras.layers import Conv1D, Dense, Flatten
X_train = np.random.normal(size=(100,150))
Y_train = np.random.randint(0,9,size=100)
num_classes=9
Y_train = keras.utils.to_categorical(Y_train, num_classes)
#Reshape data to add new dimension
X_train = X_train.reshape((100, 150, 1))
Y_train = Y_train.reshape((100, 9))
model = Sequential()
model.add(Conv1D(2, kernel_size=3, activation='relu', input_shape=(150,1)))
model.add(Flatten())
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=X_train,y=Y_train, epochs=200, batch_size=20)
I am new to Deep Learning, and I'm trying to create this simple LSTM architecture in Keras using Google Colab:
Input layer of 12 input neurons
One Recurrent hidden layer of 1 hidden neuron for now
Output layer of 1 output neuron
The original error was:
ValueError: Error when checking input: expected lstm_2_input to have 3 dimensions, but got array with shape (4982, 12).
Then I tried:
input_shape=train_x.shape[1:]
But I got:
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=2
Then I tried:
X_train = np.reshape(X_train, X_train.shape + (1,))
But I got another error again:
ValueError: Must pass 2-d input
Then I tried:
train_x = np.reshape(train_x, (train_x.shape[0], 1, train_x.shape[1]))
But it didn't work:
Must pass 2-d input
Here is my original code:
import pandas as pd
from sklearn import model_selection
from keras.models import Sequential
from keras.layers import LSTM, Dense

df_tea = pd.read_excel('cleaned2Apr2019pt2.xlsx')
df_tea.head()
train_x, valid_x = model_selection.train_test_split(df_tea,random_state=2, stratify=df_tea['offer_Offer'])
train_x.shape #(4982, 12)
valid_x.shape #(1661, 12)
model = Sequential()
model.add(LSTM(32, input_shape=train_x.shape, return_sequences=True))
model.add(LSTM(32, return_sequences=True))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['acc'])
history = model.fit(train_x, valid_x,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)
I have looked through several Stack Overflow and GitHub suggestions for similar problems, but none worked.
Could someone help me, please? I don't understand why all these methods failed.
According to your code, timesteps = 1 (in LSTM terminology) and input_dim = 12. Hence you should set
input_shape = (1, 12)
In general, the input tensor to an LSTM has shape (batch_size, timesteps, input_dim), and you pass
input_shape = (timesteps, input_dim)
with the batch size left out.
An example:
import numpy as np
from keras.layers import LSTM, Dense
from keras.models import Sequential
n_examples = 4982 #number of examples
n_ft = 12 #number of features
train_x= np.random.randn(n_examples, n_ft)
#valid_x.shape #(1661, 12)
model = Sequential()
model.add(LSTM(32, input_shape=(1, n_ft), return_sequences=True))
model.add(LSTM(32, return_sequences=True))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['acc'])
model.summary()
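Note that model.fit is not called above: with input_shape=(1, 12), the 2-D train_x would first have to be reshaped to (samples, 1, 12), and the targets made 3-D to match the (None, 1, 1) output. A minimal continuation with dummy labels (placeholders, not from the question):
# Reshape features to (samples, timesteps=1, features=12) and use
# dummy 3-D targets matching the (None, 1, 1) model output.
train_x_3d = train_x.reshape((n_examples, 1, n_ft))
train_y = np.random.randint(0, 2, size=(n_examples, 1, 1))
model.fit(train_x_3d, train_y, epochs=2, batch_size=128)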