Keras tuner Bayesian Optimization graph error - python

I am trying to optimize a convolutional neural network with the Bayesian Optimization algorithm provided in the Keras Tuner library.
When I run the line tuner_cnn.search(datagen.flow(X_trainRusReshaped, Y_trainRusHot), epochs=50, batch_size=256)
I encounter this error:
InvalidArgumentError: Graph execution error
I one-hot-encode y_train and y_test as follows:
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
X_trainShape = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_testShape = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_trainFlat = X_train.reshape(X_train.shape[0], X_trainShape)
X_testFlat = X_test.reshape(X_test.shape[0], X_testShape)
# One-hot-encoding
Y_trainRusHot = to_categorical(Y_trainRus, num_classes = 2)
Y_testRusHot = to_categorical(Y_testRus, num_classes = 2)
I defined my data augmentation and my model builder like this:
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=180,
    horizontal_flip=True,
    vertical_flip=True)
def model_builder(hp):
    model = Sequential()
    #model.add(Input(shape=(50,50,3)))
    for i in range(hp.Int('num_blocks', 1, 2)):
        hp_padding = hp.Choice('padding_' + str(i), values=['valid', 'same'])
        hp_filters = hp.Choice('filters_' + str(i), values=[32, 64])
        model.add(Conv2D(hp_filters, (3, 3), padding=hp_padding, activation='relu', kernel_initializer='he_uniform', input_shape=(50, 50, 3)))
        model.add(MaxPooling2D((2, 2)))
        model.add(Dropout(hp.Choice('dropout_' + str(i), values=[0.0, 0.1, 0.2])))
    model.add(Flatten())
    hp_units = hp.Int('units', min_value=25, max_value=150, step=25)
    model.add(Dense(hp_units, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation="softmax"))
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3])
    hp_optimizer = hp.Choice('Optimizer', values=['Adam', 'SGD'])
    if hp_optimizer == 'Adam':
        hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3])
    elif hp_optimizer == 'SGD':
        hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3])
        nesterov = True
        momentum = 0.9
    model.compile(loss=keras.losses.binary_crossentropy, optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate), metrics=['accuracy'])
    return model
Then I perform the tuner search:
tuner_cnn = kt.tuners.BayesianOptimization(
    model_builder,
    objective='val_loss',
    max_trials=100,
    directory='.',
    project_name='tuning-cnn')
tuner_cnn.search(datagen.flow(X_trainRusReshaped,Y_trainRusHot), epochs=50, batch_size=256)
I also tried to do:
tuner_cnn.search(X_trainRusReshaped, Y_trainRusHot, epochs=80, validation_data=(X_testRusReshaped, Y_testRusHot), callbacks=[stop_early])
But that does not work either. Any idea?

From the full error message I was able to narrow down where the issue is coming from. Your last Dense layer has 10 units, which means you expect 10 classes (you even chose the right activation function for that number of units). However, your loss is binary cross-entropy.
So either you have 10 classes and should use categorical or sparse categorical cross-entropy, or you have 2 classes and binary cross-entropy is indeed the right loss.
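For example, since your labels are one-hot encoded with num_classes=2, one way to make the output layer and the loss consistent is the following minimal sketch, reusing the variables from your model_builder:

# 2-unit softmax output matching the 2 one-hot-encoded classes
model.add(Dense(2, activation="softmax"))
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
              metrics=['accuracy'])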

Related

How to calculate mean relative error on test datasets

def create_model():
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(40002, 12)))
    model.add(LSTM(50, return_sequences=True))
    model.add(LSTM(50, return_sequences=True))
    model.add(tf.keras.layers.LSTM(30))
    model.add(Dense(2, activation='linear'))

    def rmse(Y_test, prediction):
        return K.sqrt(K.mean(K.square(Y_test - prediction)))

    # compile
    model.compile(optimizer='adam', loss=rmse, metrics=['mean_squared_error', rmse])
    return model

# fit the model
model = create_model()
model.fit(x_train, Y_train, shuffle=False, verbose=1, epochs=10)

# predict model
prediction = model.predict(x_test, verbose=0)
print(prediction)
How do I calculate the mean relative error for tensor inputs, i.e. when Y_test and prediction are tensors with 2 values per sample?
Example:
Y_test = [[0.2, 0.003],
          [0.3, 0.008]]
prediction = [[0.4, 0.005],
              [0.5, 0.007]]
mean_relative_error = [mean(absolute(0.2-0.4)/0.2 + absolute(0.003-0.005)/0.003),
                       mean(absolute(0.3-0.5)/0.3 + absolute(0.008-0.007)/0.008)]
mean_relative_error = [0.533, 0.3925]
Please note that I don't want to use it for backpropagation to improve the network.
I would have added it like this:
from tensorflow.math import reduce_mean, abs, reduce_sum

# per-sample relative error, summed over the 2 values of each sample
relative_error = reduce_sum(abs(prediction - Y_test) / prediction, axis=1)
# [0.9, 0.54285717]
mean_relative_error = reduce_mean(relative_error)
# 0.7214286
I couldn't use tf.keras.losses.MeanAbsoluteError(reduction=tf.keras.losses.Reduction.NONE) because of a bug: MeanAbsoluteError still reduces to the mean despite being told not to. The bug is reported HERE.
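If it helps, the same computation can be wrapped in a small helper (a sketch; the function name is mine) so it can be applied directly to the output of model.predict:

import tensorflow as tf

def mean_relative_error(y_true, y_pred):
    # sum |error| / prediction over the values of each sample, then average over samples
    per_sample = tf.reduce_sum(tf.abs(y_pred - y_true) / y_pred, axis=1)
    return tf.reduce_mean(per_sample)

# e.g. mean_relative_error(Y_test, model.predict(x_test, verbose=0))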

Tensorflow Neural Network like MLPClassifier (sklearn)

So, I am creating an AI that predicts how long a user will take to finish exercises. I previously created a NN with scikit-learn, but I want to move to TensorFlow.
I have 6 features as input and 1 output, which is a number.
I tried this, but it does not seem to work:
# Train data
X_train = X[:1500]
y_train = y[:1500]
# Test data
X_test = X[1500:]
y_test = y[1500:]

# Create the TF model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(6,)),
    tf.keras.layers.Dense(256, activation='softmax'),
    tf.keras.layers.Dense(128, activation='softmax'),
    tf.keras.layers.Dense(64, activation='softmax'),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
The same data used to work with a simple MLPClassifier.
I also managed to get this nice error, which changing the layers does not fix:
Received a label value of 1209638408 which is outside the valid range of [0, 1).
So I changed it a bit and came up with this:
features_train = features[:1500]
output_train = output[:1500]
features_test = features[1500:]
output_test = output[1500:]
classifier = Sequential()
classifier.add(Dense(units = 16, activation = 'relu', input_dim = 6))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 64, activation = 'relu'))
classifier.add(Dense(units = 32, activation = 'relu'))
classifier.add(Dense(units = 8, activation = 'relu'))
classifier.add(Dense(units = 2, activation = 'relu'))
classifier.add(Dense(units = 1))
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy')
classifier.fit(features_train, output_train, batch_size = 1, epochs = 10)
But now I get a loss of 100%.
You should use a smaller network: try fewer Dense layers, 2 or 3 at most. If you use the binary_crossentropy loss, use a sigmoid activation in the last Dense layer. You can also pass metrics=['accuracy'] when compiling the model to monitor accuracy.
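A minimal sketch of what this suggests, reusing features_train and output_train from the second attempt (the layer sizes here are only an example):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# smaller network: two hidden layers, sigmoid output to pair with binary_crossentropy
classifier = Sequential([
    Dense(64, activation='relu', input_dim=6),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid'),
])
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
classifier.fit(features_train, output_train, batch_size=32, epochs=10)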

LSTM used for regression

Problem: I have S sequences of T timesteps each, and each timestep contains F features, so collectively the dataset has shape (S x T x F), and each s in S is described by 2 values (Target_1 and Target_2).
Goal: train an architecture using LSTMs in order to learn a function approximator M that, given a sequence s, predicts Target_1 and Target_2.
Something like this:
M(s) ~ (Target_1, Target_2)
I'm really struggling to find a way. Below is a Keras implementation of an attempt that probably does not work. I made 2 models, one for the first target value and one for the second.
model1 = Sequential()
model1.add(Masking(mask_value=-10.0))
model1.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences = True))
model1.add(Flatten())
model1.add(Dense(hidden_units, activation = "relu"))
model1.add(Dense(1, activation = "linear"))
model1.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model1.fit(x_train, y_train[:,0], validation_data=(x_test, y_test[:,0]), epochs=epochs, batch_size=batch, shuffle=False)
model2 = Sequential()
model2.add(Masking(mask_value=-10.0))
model2.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences=True))
model2.add(Flatten())
model2.add(Dense(hidden_units, activation = "relu"))
model2.add(Dense(1, activation = "linear"))
model2.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model2.fit(x_train, y_train[:,1], validation_data=(x_test, y_test[:,1]), epochs=epochs, batch_size=batch, shuffle=False)
I want to make somehow good use of LSTMs time relevant memory in order to achieve good regression.
IIUC, you can start off with a simple (naive) approach by using two output layers:
import tensorflow as tf
timesteps, features = 20, 5
inputs = tf.keras.layers.Input((timesteps, features))
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
x = tf.keras.layers.LSTM(32, return_sequences=False)(x)
x = tf.keras.layers.Dense(32, activation = "relu")(x)
output1 = tf.keras.layers.Dense(1, activation="linear", name='output1')(x)
output2 = tf.keras.layers.Dense(1, activation="linear", name='output2')(x)
model = tf.keras.Model(inputs, [output1, output2])
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
x_train = tf.random.normal((500, timesteps, features))
y_train = tf.random.normal((500, 2))
model.fit(x_train, [y_train[:,0],y_train[:,1]] , epochs=5, batch_size=32, shuffle=False)
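If two separate heads feel unnecessary, a possible variant (a sketch under the same assumptions, not part of the original answer) is a single Dense(2) output trained on both targets at once:

# variant: one 2-unit head instead of two 1-unit heads
inputs = tf.keras.layers.Input((timesteps, features))
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
x = tf.keras.layers.LSTM(32)(x)
x = tf.keras.layers.Dense(32, activation="relu")(x)
outputs = tf.keras.layers.Dense(2, activation="linear")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
model.fit(x_train, y_train, epochs=5, batch_size=32, shuffle=False)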

TypeError: fit() missing 1 required positional argument: 'y' while GridSearching CNN

train_dataset = train.flow_from_directory('/kaggle/input/temp-frames/frames/train', target_size=(64,64), batch_size=256, class_mode='categorical')
validation_dataset = train.flow_from_directory('/kaggle/input/temp-frames/frames/validation', target_size=(64,64), batch_size=256, class_mode='categorical')
test_dataset = train.flow_from_directory('/kaggle/input/temp-frames/frames/test',target_size=(64,64), batch_size=256, class_mode='categorical')
def create_model():
    model = Sequential()
    model.add(Conv2D(filters=128, kernel_size=(3,3), activation='relu', strides=(2,2), padding='valid', input_shape=(64,64,3)))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(filters=256, kernel_size=(3,3), activation='relu', strides=(2,2), padding='valid'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Dropout(0.5))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Flatten())
    model.add(Dense(37))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
import sklearn

seed = 7
np.random.seed(seed)
model = tf.keras.wrappers.scikit_learn.KerasRegressor(build_fn=create_model, verbose=10)

batch_size_list = [10]
epochs = [10]
param_grid = dict(batch_size=batch_size_list, nb_epoch=epochs)
grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(train_dataset)  # error here
I get an error at the last line:
TypeError: fit() missing 1 required positional argument: 'y'
How should I fix this? I was previously able to run history = model.fit(train_dataset, batch_size=2048, epochs=100, validation_data=validation_dataset, shuffle=True) without any problems.
sklearn.model_selection.GridSearchCV works like any other scikit-learn estimator: for supervised learning, its .fit method requires both X (input features) and y (true labels) - see the documentation.
So you need to split train_dataset into two separate datasets, one of dimensions [n, m-p] holding the inputs and one of dimensions [n, p] holding the labels, and feed both to grid.fit.
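A sketch of one way to do that with the flow_from_directory iterator defined above (the loop simply materializes the generator into explicit X and y arrays; the variable names are mine):

import numpy as np

X_batches, y_batches = [], []
for _ in range(len(train_dataset)):
    x_batch, y_batch = next(train_dataset)  # each batch is an (images, one-hot labels) pair
    X_batches.append(x_batch)
    y_batches.append(y_batch)
X_images = np.concatenate(X_batches)
y_labels = np.concatenate(y_batches)

grid_result = grid.fit(X_images, y_labels)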

Metrics and Loss function keras

Code -
def define_model():
    # channel 1
    inputs1 = Input(shape=(32,1))
    conv1 = Conv1D(filters=256, kernel_size=2, activation='relu')(inputs1)
    #bat1 = BatchNormalization(momentum=0.9)(conv1)
    pool1 = MaxPooling1D(pool_size=2)(conv1)
    flat1 = Flatten()(pool1)
    # channel 2
    inputs2 = Input(shape=(32,1))
    conv2 = Conv1D(filters=256, kernel_size=4, activation='relu')(inputs2)
    pool2 = MaxPooling1D(pool_size=2)(conv2)
    flat2 = Flatten()(pool2)
    # channel 3
    inputs3 = Input(shape=(32,1))
    conv3 = Conv1D(filters=256, kernel_size=4, activation='relu')(inputs3)
    pool3 = MaxPooling1D(pool_size=2)(conv3)
    flat3 = Flatten()(pool3)
    # channel 4
    inputs4 = Input(shape=(32,1))
    conv4 = Conv1D(filters=256, kernel_size=6, activation='relu')(inputs4)
    pool4 = MaxPooling1D(pool_size=2)(conv4)
    flat4 = Flatten()(pool4)
    # merge
    merged = concatenate([flat1, flat2, flat3, flat4])
    # interpretation
    dense1 = Dense(128, activation='relu')(merged)
    dense2 = Dense(96, activation='relu')(dense1)
    outputs = Dense(10, activation='softmax')(dense2)
    model = Model(inputs=[inputs1, inputs2, inputs3, inputs4], outputs=outputs)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[categorical_accuracy])
    plot_model(model, show_shapes=True, to_file='/content/q.png')
    return model
model_concat = define_model()

# fit model
red_lr = ReduceLROnPlateau(monitor='val_loss', patience=2, verbose=2, factor=0.001, min_delta=0.01)
check = ModelCheckpoint(filepath=r'/content/drive/My Drive/Colab Notebooks/gen/concatcnn.hdf5', verbose=1, save_best_only=True)
History = model_concat.fit([X_train, X_train, X_train, X_train], y_train, epochs=20, verbose=1, validation_data=([X_test, X_test, X_test, X_test], y_test), callbacks=[check, red_lr], batch_size=32)
model_concat.summary()
Unfortunately, I used binary crossentropy as the loss and 'accuracy' as the metric, and I got above 90% val_accuracy.
Then I found this link: Keras binary_crossentropy vs categorical_crossentropy performance?.
After reading the first answer, I kept binary crossentropy as the loss and used categorical accuracy as the metric...
Even though I changed this, the val_acc is not improving; it stays around 62%. What should I do?
I reduced the model complexity so it could learn the data, but the accuracy is still not improving. Am I missing anything?
Dataset shapes: x_train is (800, 32), x_test is (200, 32), y_train is (800, 10) and y_test is (200, 10). Before feeding the data into the network, I applied a standard scaler to x and reshaped x_train and x_test to (800, 32, 1) and (200, 32, 1).
Thanks
