How to specify input_shape for Keras Sequential model - python

How do you deal with this error?
Error when checking target: expected dense_3 to have shape (1,) but got array with shape (398,)
I tried changing input_shape to (14,), which is the number of columns in train_samples, but I still get the error.
set = pd.read_csv('NHL_DATA.csv')
set.head()
train_labels = [set['Won/Lost']]
train_samples = [set['team'], set['blocked'],set['faceOffWinPercentage'],set['giveaways'],set['goals'],set['hits'],
set['pim'], set['powerPlayGoals'], set['powerPlayOpportunities'], set['powerPlayPercentage'],
set['shots'], set['takeaways'], set['homeaway_away'],set['homeaway_home']]
train_labels = np.array(train_labels)
train_samples = np.array(train_samples)
scaler = MinMaxScaler(feature_range=(0,1))
scaled_train_samples = scaler.fit_transform(train_samples).reshape(-1,1)
model = Sequential()
model.add(Dense(16, input_shape=(14,), activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(Adam(lr=.0001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(scaled_train_samples, train_labels, batch_size=1, epochs=20, shuffle=True, verbose=2)

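1) You reshape your training samples with .reshape(-1,1), which means every sample has a single feature. However, you define the network with input_shape=(14,), which tells it to expect 14 features per sample. I guess this is one problem with your model.
2) You used sparse_categorical_crossentropy, which expects the ground-truth labels to be integer class indices, one per sample (so train_labels should have shape (n_samples,) or (n_samples, 1)), but I guess yours are not: wrapping the column in a list gives an array of shape (1, 398). With one-hot labels of shape (n_samples, 2), use categorical_crossentropy instead.
Here is an example of how your input should be shaped, using one-hot labels: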
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
x = np.zeros([1000, 14])
y = np.zeros([1000, 2])
model = Sequential()
model.add(Dense(16, input_shape=(14,), activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile('adam', 'categorical_crossentropy')
model.fit(x, y, batch_size=1, epochs=1)
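To connect this back to the question's data, here is a minimal NumPy-only sketch of how the arrays end up with the wrong shapes, and one way to fix them. The random arrays are stand-ins for the DataFrame columns (assumed: 398 rows, 14 feature columns, a binary Won/Lost label), not the real NHL data.

```python
import numpy as np

n_rows, n_cols = 398, 14
rng = np.random.default_rng(0)

# Stand-in for the 14 DataFrame columns: a list of 14 arrays with 398 values each.
columns = [rng.random(n_rows) for _ in range(n_cols)]
labels_column = rng.integers(0, 2, size=n_rows)

# What the question's code builds.
wrong_samples = np.array(columns)         # (14, 398): one row per column
wrong_labels = np.array([labels_column])  # (1, 398): the extra list adds an axis

# One row per sample, one integer class index per sample.
train_samples = np.array(columns).T       # (398, 14)
train_labels = np.array(labels_column)    # (398,)

# One-hot version of the same labels, for categorical_crossentropy.
one_hot_labels = np.eye(2)[train_labels]  # (398, 2)
```

With these shapes, train_labels would pair with sparse_categorical_crossentropy and one_hot_labels with categorical_crossentropy. Scaling should be applied to the (398, 14) matrix without the trailing .reshape(-1,1).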

Related

Using conv1D “Error when checking input: expected conv1d_input to have 3 dimensions, but got array with shape (213412, 36)”

My input is simply a CSV file with 237124 rows and 37 columns:
The first 36 columns are features
The last column is a binary class label
I am trying to train a Conv1D model on my data.
I have tried to build a CNN with one layer, but I have some problems with it.
The compiler outputs:
ValueError:Error when checking input: expected conv1d_9_input to have shape
(213412, 36) but got array with shape (36, 1)
Code:
import pandas as pd
import numpy as np
import sklearn
from sklearn import metrics
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler
import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import Conv2D,Conv1D, MaxPooling2D,MaxPooling1D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Dropout,BatchNormalization
dataset=pd.read_csv("C:/Users/User/Desktop/data.csv",encoding='cp1252')
dataset.shape
#output: (237124, 37)
array = dataset.values
X = array[:,0:36]
Y = array[:,36]
kf = KFold(n_splits=10)
kf.get_n_splits(X)
for trainindex, testindex in kf.split(X):
    Xtrain, Xtest = X[trainindex], X[testindex]
    Ytrain, Ytest = Y[trainindex], Y[testindex]
Xtrain.shape[0]
#output: 213412
Xtrain.shape[1]
#output: 36
Ytrain.shape[0]
#output: 213412
n_timesteps, n_features, n_outputs = Xtrain.shape[0], Xtrain.shape[1], Ytrain.shape[0]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=1,
activation='relu',input_shape=(n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=1, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=
['accuracy'])
# fit network
model.fit(Xtrain, Ytrain, epochs=10, batch_size=32, verbose=0)
# Testing CNN model BY X test
Predictions = model.predict(Xtest,batch_size =100)
rounded = [round(x[0]) for x in Predictions]
Y_predection = pd.DataFrame(rounded)
Y_predection = Y_predection.iloc[:, 0]
.
.
.
I tried to modify the code this way:
Xtrain = np.expand_dims(Xtrain, axis=2)
But the error remains the same.
I notice a couple of problems with your code.
Xtrain needs to be a 3D tensor, because Conv1D cannot process anything else. If you have 2D data, you need to add a new dimension to make it 3D.
Your input_shape needs to change to reflect that. For example, if you add a single channel, it should be (n_features, 1).
# Here I'm assuming some dummy data
# Xtrain => [213412, 36, 1] (Note that you need Xtrain to be 3D not 2D - So we're adding a channel dimension of 1)
Xtrain = np.expand_dims(np.random.normal(size=(213412, 36)),axis=-1)
# Ytrain => [213412, 10]
Ytrain = np.random.choice([0,1], size=(213412,10))
n_timesteps, n_features, n_outputs =Xtrain.shape[0], Xtrain.shape[1], Ytrain.shape[1]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=1,
activation='relu',input_shape=(n_features,1)))
model.add(Conv1D(filters=64, kernel_size=1, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(Xtrain, Ytrain, epochs=10, batch_size=32, verbose=0)
You only need to specify the shape of a single sample in input_shape, not how many samples you will pass to the input layer. Conv1D expects each sample to be 2D (steps, channels), so with one channel:
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_features, 1)))
This means the input will be N samples, each of shape (n_features, 1).
For the last layer, set the number of units to the number of classes you have rather than the number of rows in your data.
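The shape bookkeeping from both answers can be checked with NumPy alone before building any model. This is a rough sketch on a small made-up array (100 rows instead of 213412, with alternating 0/1 labels as an assumption), not the asker's data.

```python
import numpy as np

# Small stand-in for the CSV: 36 feature columns plus a binary label column.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 36))
labels = (np.arange(100) % 2).reshape(-1, 1)
data = np.hstack([features, labels])

X = data[:, :36]             # (100, 36)
Y = data[:, 36].astype(int)  # (100,)

# Conv1D wants (samples, steps, channels), so add a channel axis of size 1.
X3d = np.expand_dims(X, axis=-1)  # (100, 36, 1)

# input_shape describes one sample, so drop the leading sample axis.
input_shape = X3d.shape[1:]       # (36, 1)

# The output layer size is the number of classes, not the number of rows.
n_outputs = len(np.unique(Y))     # 2 for a binary label
```

Here input_shape is what would be passed to the first Conv1D layer and n_outputs to the final Dense layer. With integer labels like Y, the loss would need to be sparse_categorical_crossentropy, or the labels one-hot encoded first.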

How to use Embedding layer for RNN with a categorical feature - Classification Task for RecoSys

I would like to build a model (RNN >> LSTM) with an Embedding layer for a categorical feature (item ID). My training set looks like this:
train_x = [[[184563.1], [184324.1], [187853.1], [174963.1], [181663.1]], [[…],[…],[…],[…],[…]], …]
I predict the sixth item ID:
train_y = [0,1,2, …., 12691]
I have 12692 unique item IDs, length of timesteps = 5 and this is a classification task.
This is a brief summary of what I've done so far (please correct me if I'm wrong):
One-hot-encoding for the categorical feature:
train_x = [[[1 0 0 … 0 0 0], [0 1 0 … 0 0 0], [0 0 1 … 0 0 0], […], […]], [[…],[…],[…],[…],[…]], …]
Build model:
model = Sequential()
model.add(Embedding(input_dim=12692 , output_dim=250, input_length=5))
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(12692, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
print(model.summary())
history = model.fit(
train_x, train_y,
batch_size=64,
epochs=epochs,
validation_data=(validation_x, validation_y))
score = model.evaluate(validation_x, validation_y, verbose=0)
I get this model summary:
Train on 131204 samples, validate on 107904 samples
But after that, this error appears:
ValueError: Error when checking input: expected embedding_input to have 2 dimensions, but got array with shape (131204, 5, 12692)
Where is my mistake and what would be the solution?
The Embedding layer turns positive integers (indexes) into dense vectors of fixed size (Docs). So your train_x should not be one-hot encoded; each entry should be the integer index of the item in the vocabulary, that is, the integer corresponding to the categorical feature.
train_x.shape will be (number of samples, 5), with each entry being the index of the categorical feature.
train_y.shape will be (number of samples,), with each entry being the index of the sixth item in your time series.
Working sample
import numpy as np
import keras
from keras.layers import Embedding, LSTM, Dense
n_samples = 100
train_x = np.random.randint(0,12692,size=(n_samples ,5))
train_y = np.random.randint(0,12692,size=(n_samples))
model = keras.models.Sequential()
model.add(Embedding(input_dim=12692+1, output_dim=250, input_length=5))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(32, activation='relu'))
model.add(Dense(12692, activation='softmax'))
opt = keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
print(model.summary())
history = model.fit(
train_x, train_y,
batch_size=64,
epochs=32)
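If the data currently holds raw item IDs or one-hot vectors, something like the following NumPy sketch could produce the integer indices the Embedding layer expects. The ID values and the tiny vocabulary are made up for illustration; the real data has 12692 distinct IDs.

```python
import numpy as np

# Hypothetical raw item IDs: two sequences of five timesteps each.
raw_ids = np.array([
    [184563.1, 184324.1, 187853.1, 174963.1, 181663.1],
    [184324.1, 184563.1, 174963.1, 187853.1, 184563.1],
])

# Map each distinct ID to an index in 0..vocab_size-1.
vocab, flat_index = np.unique(raw_ids.ravel(), return_inverse=True)
train_x = flat_index.reshape(raw_ids.shape)  # integer indices, shape (2, 5)
vocab_size = len(vocab)                      # 5 distinct IDs in this toy example

# If the data is already one-hot encoded, argmax over the last axis
# recovers the same indices.
one_hot = np.eye(vocab_size)[train_x]        # (2, 5, 5)
recovered = one_hot.argmax(axis=-1)          # (2, 5)
```

The same mapping has to be applied to train_y and to the validation data so that every index refers to the same item.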

TensorFlow "Please provide as model inputs a single array or a list of arrays"

This is the error and data I entered into my model. I just can't figure out why it won't work since the dimensions are okay and it literally prints a list of arrays.
My Model + Code before:
import numpy as np
training = np.array(training)
training_inputs = list(training[:,0])
training_outputs = list(training[:,1])
print("train inputs ", training_inputs)
print("train outputs ", training_outputs)
# Now lets create our tensorflow model
from tensorflow.python.keras import Sequential
from tensorflow.python.keras.layers import LSTM, Dense
model = Sequential()
model.add(Dense(training_inputs[0], activation='linear'))
model.add(Dense(15, activation='linear'))
model.add(Dense(15, activation='linear'))
model.add(Dense(15, activation='linear'))
model.add(Dense(len(training_outputs[0]), activation='softmax'))
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy', 'loss']
)
model.fit(x=training_inputs, y=training_outputs,
epochs=10000,
batch_size=20,
verbose=True,
shuffle=True)
model.save('models/basic_chat.json')
You need to specify the input shape on the first layer of your model, and pass the inputs and outputs as NumPy arrays rather than lists:
...
model = Sequential()
model.add(Dense(15, activation='linear', input_shape=( len(training_inputs[0]),)))
model.add(Dense(15, activation='linear'))
...
# list(...) first, so the per-sample lists are stacked into one 2D array
training_inputs = np.array(list(training[:,0]))
training_outputs = np.array(list(training[:,1]))
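As a rough illustration of the difference between a list of per-sample arrays and a single stacked array, here is a NumPy-only sketch. The pairs below are invented bag-of-words inputs and one-hot outputs, not the asker's data.

```python
import numpy as np

# Hypothetical training pairs: (bag-of-words input, one-hot output).
pairs = [
    ([1, 0, 1, 0], [1, 0]),
    ([0, 1, 1, 0], [0, 1]),
    ([1, 1, 0, 1], [1, 0]),
]

# A plain Python list of per-sample arrays; Keras may read this as
# several separate model inputs instead of one batch.
inputs_as_list = [np.array(x) for x, _ in pairs]

# Stacking gives one array with the sample axis first.
training_inputs = np.array([x for x, _ in pairs])   # (3, 4)
training_outputs = np.array([y for _, y in pairs])  # (3, 2)

# Shape of a single sample, for input_shape on the first layer.
input_shape = (training_inputs.shape[1],)           # (4,)
n_classes = training_outputs.shape[1]               # 2
```

Here input_shape plays the role of (len(training_inputs[0]),) in the answer above, and n_classes is the size of the final softmax layer.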

ValueError: Error when checking input: expected cu_dnnlstm_22_input to have 3 dimensions, but got array with shape (2101, 17)

I am new to machine learning. I am having trouble getting my data into my network.
This is the error that I am receiving:
ValueError: Error when checking input: expected cu_dnnlstm_22_input to have 3 dimensions, but got array with shape (2101, 17)
I have tried adding model.add(Flatten()) before the dense layer. I would really appreciate your help!
BATCH_SIZE = 64
test_size_length = int(len(main_df)*TESTING_SIZE)
training_df = main_df[:test_size_length]
validation_df = main_df[test_size_length:]
train_x, train_y = training_df.drop('target',1).to_numpy(), training_df['target'].tolist()
validation_x, validation_y = validation_df.drop('target',1).to_numpy(), validation_df['target'].tolist()
#train_x.shape is (2101, 17)
model = Sequential()
# model.add(Flatten())
model.add(CuDNNLSTM(128, input_shape=(train_x.shape), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(CuDNNLSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(CuDNNLSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(2, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
# Compile model
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy']
)
tensorboard = TensorBoard(log_dir="logs/{}".format(NAME))
filepath = "RNN_Final-{epoch:02d}-{val_acc:.3f}" # unique file name that will include the epoch and the validation acc for that epoch
checkpoint = ModelCheckpoint("models/{}.model".format(filepath), monitor='val_acc', verbose=1, save_best_only=True, mode='max') # saves only the best ones
# Train model
history = model.fit(
train_x, train_y,
batch_size=BATCH_SIZE,
epochs=EPOCHS,
validation_data=(validation_x, validation_y),
callbacks=[tensorboard, checkpoint],
)
# Score model
score = model.evaluate(validation_x, validation_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
# Save model
model.save("models/{}".format(NAME))
The input to your LSTM layer (CuDNNLSTM) should have shape: (batch_size, timesteps, input_dim).
It looks like you are missing one of these dimensions.
Often we can overlook the last dimension when the input dimension is 1. If this is the case with your model (if you are predicting from a sequence of single numbers), then you might consider expanding the dimensions before the CuDNNLSTM layer with something like this:
model.add(Lambda(lambda t: tf.expand_dims(t, axis=-1)))
model.add(CuDNNLSTM(128))
Without knowing the problem you are working on, it's hard to know whether this is a valid way forward, but you should certainly keep in mind the input shape an LSTM layer requires and reshape/expand dims accordingly.
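The same reshaping can also be done on the array itself before calling fit. The sketch below uses a zero array with the shape from the question and shows two possible interpretations; which one is right depends on what the 17 columns mean, which the question does not say.

```python
import numpy as np

train_x = np.zeros((2101, 17))  # stand-in with the shape from the question

# Option A: treat each row as 17 timesteps with 1 feature each.
x_steps = np.expand_dims(train_x, axis=-1)  # (2101, 17, 1)

# Option B: treat each row as 1 timestep with 17 features.
x_single = train_x.reshape(train_x.shape[0], 1, train_x.shape[1])  # (2101, 1, 17)

# input_shape excludes the sample axis, so it is shape[1:], not the full shape.
input_shape_a = x_steps.shape[1:]   # (17, 1)
input_shape_b = x_single.shape[1:]  # (1, 17)
```

Note that the question passes input_shape=(train_x.shape), which includes the 2101 samples. Whichever option is chosen, validation_x needs the same reshaping.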

Keras Model: Same array that is used for model.fit is not being processed in model.predict

I have a model:
model.add(Dense(16, input_dim = X.shape[1], activation = 'tanh'))
model.add(Dropout(0.2))
model.add(Dense(8, activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(4, activation = 'tanh'))
model.add(Dropout(0.2))
model.add(Dense(2, activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])
During model.fit it works just fine with X as input:
history = model.fit(X, Y, validation_split=0.2, epochs=10, callbacks= [PrintDot()], batch_size=10, verbose=0)
But during prediction, when I use X[1], it throws an error:
ValueError: Error when checking input: expected dense_8_input to have shape (500,) but got array with shape (1,)
But X[1].shape is (500,):
X[1].shape
--> (500,)
How can I fix this error? Any help is appreciated.
Keras model.predict expects input of shape (number_of_samples, features).
So even when predicting a single sample, you must reshape it to (1, features), which in your case is (1, 500).
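A few equivalent ways to keep the sample axis when picking one row, sketched with a zero array standing in for X (20 rows is an arbitrary choice):

```python
import numpy as np

X = np.zeros((20, 500))  # stand-in for the training matrix

single = X[1]                           # (500,): the sample axis is gone
batch_a = X[1:2]                        # (1, 500): slicing keeps the sample axis
batch_b = X[1].reshape(1, -1)           # (1, 500)
batch_c = np.expand_dims(X[1], axis=0)  # (1, 500)
```

Any of the three (1, 500) arrays could then be passed to model.predict. The (1,) in the error message suggests the 1D array was read as 500 samples of one feature each.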
