Please help me define appropriate Dense input shapes for a Keras model. Maybe I have to reshape my data first. My data set has the dimensions shown below:
The shapes are X_train: (2858, 2037), y_train: (2858, 1), X_test: (715, 2037), y_test: (715, 1).
The number of features (the input dimension) is 2037.
I want to define a Sequential Keras model like this:
```python
batch_size = 128
num_classes = 2
epochs = 20
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(X_input_shape,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.summary()
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(),
              from_logits=True,
              metrics=['accuracy'])
```
Model summary:
```
Layer (type)                 Output Shape              Param #
=================================================================
dense_20 (Dense)             (None, 512)               1043456
_________________________________________________________________
dropout_12 (Dropout)         (None, 512)               0
_________________________________________________________________
dense_21 (Dense)             (None, 512)               262656
=================================================================
Total params: 1,306,112
Trainable params: 1,306,112
Non-trainable params: 0
```
And when I try to fit it...
```python
history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test))
```
I got an error:
```
ValueError: Error when checking target: expected dense_21 to have shape (512,) but got array with shape (1,)
```
Change
```python
model.add(Dense(512, activation='relu'))
```
to
```python
model.add(Dense(1, activation='sigmoid'))
```
The output must have size 1 to match y_train.shape[1], and with binary_crossentropy a sigmoid output activation is the usual choice rather than relu.
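For completeness, here is a minimal end-to-end sketch of the corrected model using the shapes from the post (2037 input features, one binary target per row). The sigmoid output and moving from_logits out of compile() are my adjustments; from_logits is an argument of the loss constructor (e.g. tf.keras.losses.BinaryCrossentropy), not of compile():

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import RMSprop

num_features = 2037  # X_train.shape[1]

model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(num_features,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # one unit, matching y_train.shape[1]

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
model.summary()
```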
I've been trying to train a CNN model on facial data for creating emojis from facial expressions. I'm actually new to machine learning. The code isn't my own, but I keep getting this ValueError while trying to train the model:
```
ValueError: One of the dimensions in the output is <= 0 due to downsampling in conv2d. Consider increasing the input size. Received input shape [None, 100, 100, 1] which would produce output shape with a zero or negative value in a dimension.
```
The code which I'm trying to run is:
```python
def cnn_model():
    num_of_classes = get_num_of_classes()
    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(image_x, image_y, 1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(10, 10), strides=(10, 10), padding='same'))
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.6))
    model.add(Dense(num_of_classes, activation='softmax'))
    sgd = optimizers.SGD(lr=1e-2)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    filepath = "cnn_model_keras.h5"
    checkpoint1 = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
    callbacks_list = [checkpoint1]
    from keras.utils import plot_model
    plot_model(model, to_file='model.png', show_shapes=True)
    return model, callbacks_list
```
Here, num_of_classes is 12, and image_x and image_y are both 100.
I reorganized your code. You are defining a function to build the model; it is better to keep only what relates to the model architecture inside it, and to move the callbacks and model plotting out of the function.
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dropout, Dense
from tensorflow.keras import optimizers
import numpy as np

def cnn_model():
    image_x = 100
    image_y = 100
    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(image_x, image_y, 1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(10, 10), strides=(10, 10), padding='same'))
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.6))
    model.add(Dense(12, activation='softmax'))
    sgd = optimizers.SGD(lr=1e-2)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model

########################
# Model summary
model = cnn_model()
model.summary()

#################### TEST
x = np.ones((1, 100, 100, 1))
print("Output:", model.predict(x))
```
Output:
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 96, 96, 32)        832
_________________________________________________________________
batch_normalization (BatchNo (None, 96, 96, 32)        128
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 10, 10, 32)        0
_________________________________________________________________
flatten (Flatten)            (None, 3200)              0
_________________________________________________________________
dense (Dense)                (None, 1024)              3277824
_________________________________________________________________
batch_normalization_1 (Batch (None, 1024)              4096
_________________________________________________________________
dropout (Dropout)            (None, 1024)              0
_________________________________________________________________
dense_1 (Dense)              (None, 12)                12300
=================================================================
Total params: 3,295,180
Trainable params: 3,293,068
Non-trainable params: 2,112
_________________________________________________________________
```
```
Output: [[0.06614503 0.10535268 0.07621874 0.08486015 0.08070944 0.08046351
  0.06786356 0.06059184 0.10280456 0.05683669 0.12510006 0.09305366]]
```
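As a side note on the original ValueError: with padding='valid', a convolution or pooling layer produces floor((n - k) / s) + 1 output steps per spatial dimension, which is zero or negative whenever the kernel or pool is larger than its input; with padding='same' the output is ceil(n / s), which never collapses. A quick check of this model's spatial sizes (the helper names below are just for illustration):

```python
import math

def valid_out(n, k, s):
    """Output length of a 'valid' conv/pool: floor((n - k) / s) + 1."""
    return math.floor((n - k) / s) + 1

def same_out(n, s):
    """Output length of a 'same' conv/pool: ceil(n / s)."""
    return math.ceil(n / s)

n = 100                 # input height/width
n = valid_out(n, 5, 1)  # Conv2D with a 5x5 kernel -> 96
n = same_out(n, 10)     # MaxPooling2D(10x10, stride 10, padding='same') -> 10
print(n)                # any value <= 0 along the way triggers the ValueError
```

With the reorganized model above, every intermediate size stays positive for 100x100 inputs, which is why it runs.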
My training data is an overlapping sliding window over users' daily data. Its shape is (1470, 3, 256, 18): 1470 windows of 3 days of data, where each day has 256 samples of 18 features each.
My targets' shape is (1470,): one label value per window.
I want to train an LSTM to predict [3-day window] -> [one target].
Each day's 256 samples are padded with -10 on days that were missing samples.
I've written the following code to build the model:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense, Masking, Flatten
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint
from tensorflow.keras import metrics

def build_model(num_samples, num_features):
    opt = RMSprop(0.001)
    model = Sequential()
    model.add(Masking(mask_value=-10., input_shape=(num_samples, num_features)))
    model.add(LSTM(32, return_sequences=True, activation='tanh'))
    model.add(Dropout(0.3))
    model.add(LSTM(16, return_sequences=False, activation='tanh'))
    model.add(Dropout(0.3))
    model.add(Dense(16, activation='tanh'))
    model.add(Dense(8, activation='tanh'))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer=opt, metrics=['mae', 'mse'])
    return model

model = build_model(256, 18)
model.summary()
```
Model: "sequential_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
masking_7 (Masking) (None, 256, 18) 0
_________________________________________________________________
lstm_14 (LSTM) (None, 256, 32) 6528
_________________________________________________________________
dropout_7 (Dropout) (None, 256, 32) 0
_________________________________________________________________
lstm_15 (LSTM) (None, 16) 3136
_________________________________________________________________
dropout_8 (Dropout) (None, 16) 0
_________________________________________________________________
dense_6 (Dense) (None, 16) 272
_________________________________________________________________
dense_7 (Dense) (None, 8) 136
_________________________________________________________________
dense_8 (Dense) (None, 1) 9
=================================================================
Total params: 10,081
Trainable params: 10,081
Non-trainable params: 0
_________________________________________________________________
I can see that the shapes are incompatible, but I can't figure out how to change the code to fit my problem. Any help would be appreciated.
Update: I've reshaped my data like so: train_data.reshape(1470*3, 256, 18). Is that right?
I think you are looking for TimeDistributed(LSTM(...)). With a Flatten before the dense head, the model emits a single value per 3-day window rather than one per day:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, TimeDistributed, LSTM, Dropout, Dense, Flatten

days, num_samples, num_features = 3, 256, 18

model = Sequential()
model.add(Masking(mask_value=-10., input_shape=(days, num_samples, num_features)))
model.add(TimeDistributed(LSTM(32, return_sequences=True, activation='tanh')))
model.add(Dropout(0.3))
model.add(TimeDistributed(LSTM(16, return_sequences=False, activation='tanh')))  # -> (None, 3, 16)
model.add(Dropout(0.3))
model.add(Flatten())  # collapse the per-day vectors so the output is one value per window
model.add(Dense(16, activation='tanh'))
model.add(Dense(8, activation='tanh'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam', metrics=['mae', 'mse'])
model.summary()
```
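As a quick smoke test with the shapes from the question ((1470, 3, 256, 18) inputs, (1470,) targets), stand-in random data confirms the shapes flow through. Note that no reshape to (1470*3, 256, 18) is needed, since the model consumes whole 3-day windows:

```python
import numpy as np

# Random stand-ins with the question's shapes, purely to check the plumbing
X = np.random.random((1470, 3, 256, 18)).astype("float32")
y = np.random.random((1470,)).astype("float32")

model.fit(X, y, epochs=1, batch_size=32)
```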
I'm trying to run a Keras model as follows:
```python
model = Sequential()
model.add(Dense(10, activation='relu', input_shape=(286,)))
model.add(Dense(1, activation='softmax', input_shape=(324827, 286)))
```
This code works, but when I try to add an embedding layer:
```python
model = Sequential()
model.add(Embedding(286, 64, input_shape=(286,)))
model.add(Dense(10, activation='relu', input_shape=(286,)))
model.add(Dense(1, activation='softmax', input_shape=(324827, 286)))
```
I'm getting the following error:
```
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (324827, 1)
```
My data has 286 features and 324827 rows. I'm probably doing something wrong with the shape definitions; can you tell me what it is? Thanks!
You only need to provide input_shape on the first layer of the model, not on the Dense layers; the shapes of the following layers are computed automatically:
```python
from tensorflow.keras.layers import Embedding, Dense
from tensorflow.keras.models import Sequential

# 286 features and 324827 rows: (324827, 286)
model = Sequential()
model.add(Embedding(286, 64, input_shape=(286,)))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='mse', optimizer='adam')
model.summary()
```
which returns:
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_2 (Embedding) (None, 286, 64) 18304
_________________________________________________________________
dense_2 (Dense) (None, 286, 10) 650
_________________________________________________________________
dense_3 (Dense) (None, 286, 1) 11
=================================================================
Total params: 18,965
Trainable params: 18,965
Non-trainable params: 0
_________________________________________________________________
I hope it's what you're looking for.
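One caveat, visible in the summary above: this model's output shape is (None, 286, 1), so it still will not train against targets of shape (324827, 1), and a softmax over a single unit always outputs 1. A sketch of one common fix, assuming one binary label per row, is to pool the embedded sequence down to 2D and use a sigmoid output:

```python
from tensorflow.keras.layers import Embedding, GlobalAveragePooling1D, Dense
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Embedding(286, 64, input_shape=(286,)))
model.add(GlobalAveragePooling1D())        # (None, 286, 64) -> (None, 64)
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # (None, 1) matches targets shaped (324827, 1)
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()
```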
In my last post, linked here, it was said that I have to modify my model for it to be better. To quote the only answerer's comment on my question (again, thank you, sir):
The accuracy of prediction is a metric of how good your neural network architecture is and it also depends on your train/validation data. You will have to tune your neural network in such a way that you generalize well by adjusting the hyper parameters such as number of layers, type of layers, learning rate, optimizer etc. ...
I would like to know how to do what was mentioned, or at least be pointed in the right direction. I am honestly lost in both theory and practice.
The only thing I have been able to do is raise the number of epochs above 100. I have also cleaned the images to be identified as much as I can.
Currently, here is how I create my model. It is based only on TensorFlow 2.0's tutorial:
```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Load and prepare the MNIST dataset. Convert the samples from integers to floating-point numbers:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def createModel():
    # Build the tf.keras.Sequential model by stacking layers.
    # Choose an optimizer and loss function used for training:
    model = tf.keras.models.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(128, activation='relu'),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = createModel()
model.fit(x_train, y_train, epochs=102, validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)
```
It gave a validation accuracy of around 0.9800 for me, but its performance on images of handwritten characters I've extracted from documents is dismal. I would also like to extend it so it can read other selected characters, but I guess that can be another question for another day. Thanks!
You could have several convolution/max-pooling layers at the beginning that perform feature extraction by scanning the image, and after that a fully connected network like you had before, ending in a softmax. You could create such a CNN model this way:
```python
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.models import Sequential

# Create the model
model = Sequential()

# Add the 1st convolution/max-pool
model.add(Conv2D(40, kernel_size=5, padding="same", input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# 2nd convolution/max-pool
model.add(Conv2D(200, kernel_size=3, padding="same", activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(1, 1)))

# 3rd convolution/max-pool
model.add(Conv2D(512, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(1, 1)))

# Reduce dimensions from 2D to 1D
model.add(Flatten())
model.add(Dense(units=100, activation='relu'))

# Add dropout to prevent overfitting
model.add(Dropout(0.5))

# Final fully connected layer
model.add(Dense(10, activation="softmax"))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
```
Which returns the following model:
```
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 28, 28, 40)        1040
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 40)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 200)       72200
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 12, 12, 200)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 10, 10, 512)       922112
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 8, 8, 512)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 32768)             0
_________________________________________________________________
dense_1 (Dense)              (None, 100)               3276900
_________________________________________________________________
dropout_1 (Dropout)          (None, 100)               0
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010
=================================================================
Total params: 4,273,262
Trainable params: 4,273,262
Non-trainable params: 0
_________________________________________________________________
```
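To actually train this CNN on the MNIST arrays loaded in the question, two details are left implicit: the images need an explicit channel axis, and categorical_crossentropy expects one-hot labels (unlike the sparse variant used earlier). A minimal sketch, assuming the x_train/y_train variables from above (the epoch count here is arbitrary):

```python
from keras.utils import to_categorical

# Add the channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
x_train_cnn = x_train.reshape(-1, 28, 28, 1)
x_test_cnn = x_test.reshape(-1, 28, 28, 1)

# One-hot encode the integer labels for categorical_crossentropy
y_train_cat = to_categorical(y_train, num_classes=10)
y_test_cat = to_categorical(y_test, num_classes=10)

model.fit(x_train_cnn, y_train_cat, epochs=5,
          validation_data=(x_test_cnn, y_test_cat))
```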
I am new to Keras, so I really appreciate any help here. For my project, I am trying to train a neural network on multiple time series. I got it to work by running a for loop that fits each time series to the model. The code looks like this:
```python
for i in range(len(train)):
    history = model.fit(train_X[i], train_Y[i], epochs=20, batch_size=6, verbose=0, shuffle=True)
```
If I am not wrong, I am doing online training here. Now I'm trying to do batch training to see if the result is better. I tried to fit a list containing all the time series (each converted into a NumPy array), but I get this error:
```
Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 56 arrays:
```
Here is the info about the dataset and the model:
```python
model = Sequential()
model.add(LSTM(1, input_shape=(1, 16), return_sequences=True))
model.add(Flatten())
model.add(Dense(1, activation='tanh'))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.summary()
```
```
Layer (type)                 Output Shape              Param #
=================================================================
lstm_2 (LSTM)                (None, 1, 1)              72
_________________________________________________________________
flatten_2 (Flatten)          (None, 1)                 0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 2
=================================================================
Total params: 74
Trainable params: 74
Non-trainable params: 0
```
```python
print(len(train_X), train_X[0].shape, len(train_Y), train_Y[0].shape)
# 56 (1, 23, 16) 56 (1, 23, 1)
```
Here is the block of code that gives me the error:
```python
pyplot.figure(figsize=(16, 25))
history = model.fit(train_X, train_Y, epochs=1, verbose=1, shuffle=False,
                    batch_size=len(train_X))
```
The input of an LSTM should have shape (batch_size, timesteps, features), but you don't include batch_size in input_shape; if you want to specify it, use batch_input_shape instead.
```python
model = Sequential()
model.add(LSTM(1, input_shape=(23, 16), return_sequences=True))
# model.add(Flatten())
model.add(Dense(1, activation='tanh'))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.summary()

X = np.random.random((56, 1, 23, 16))
y = np.random.random((56, 1, 23, 1))
X = np.squeeze(X, axis=1)  # input shape should be (batch_size, timesteps, features)
y = np.squeeze(y, axis=1)
model.fit(X, y, epochs=1, verbose=1, shuffle=False, batch_size=len(X))
```
I am not sure if it serves your purpose.
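For the real data, the equivalent step is to turn the list of 56 arrays, each shaped (1, 23, 16), into one batched array rather than passing the list itself. A sketch, assuming train_X and train_Y are the lists from the question:

```python
import numpy as np

# Stack the per-series arrays into single batched arrays
X = np.concatenate(train_X, axis=0)  # 56 x (1, 23, 16) -> (56, 23, 16)
Y = np.concatenate(train_Y, axis=0)  # 56 x (1, 23, 1)  -> (56, 23, 1)

history = model.fit(X, Y, epochs=1, verbose=1, shuffle=False, batch_size=len(X))
```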