Implement AdaHessian as an optimizer in a neural network model - python

I want to use a second-order optimizer instead of SGD, Adam, Adagrad, AdaDelta, etc. in my neural network model, so I found AdaHessian. I have the following code:
# define model
model = Sequential()
model.add(Dense(132, input_dim=66, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(88, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(44, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(22, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='relu'))
model.compile(loss='mse', optimizer=Adahessian(params=model.parameters()), metrics=['accuracy'])
# fit model
model.fit(trainX, trainy, epochs=50, validation_split=0.2, verbose=0)
# evaluate the model
loss, test_acc = model.evaluate(testX, testy, verbose=0)
return model, loss, test_acc
Can anyone explain what params= means in AdaHessian() and what I should pass there, e.g. model.parameters() or something else? When I tried model.parameters() I got this error message:
AttributeError: 'Sequential' object has no attribute 'parameters'
Please show me the steps to follow to use a second-order optimizer when compiling the model.
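Note that AdaHessian is implemented for PyTorch (for example in the pip-installable torch-optimizer package), not for Keras: params= expects the iterable returned by a PyTorch model's parameters() method, which a Keras Sequential model does not have, and that is exactly the AttributeError above. A minimal PyTorch sketch of the same architecture, assuming the torch-optimizer package and that trainX and trainy are float32 tensors:
import torch
import torch.nn as nn
import torch_optimizer as optim  # pip install torch-optimizer

# Rough PyTorch equivalent of the Keras model above
model = nn.Sequential(
    nn.Linear(66, 132), nn.Dropout(0.5),
    nn.Linear(132, 88), nn.Dropout(0.5),
    nn.Linear(88, 44), nn.Dropout(0.5),
    nn.Linear(44, 22), nn.Dropout(0.5),
    nn.Linear(22, 1), nn.ReLU(),
)

# params is the iterable of tensors the optimizer should update:
# in PyTorch that is model.parameters(), not the model object itself
optimizer = optim.Adahessian(model.parameters(), lr=1.0)
loss_fn = nn.MSELoss()

for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(trainX), trainy)
    # AdaHessian estimates the Hessian diagonal from the gradient graph,
    # so the backward pass must keep the graph alive
    loss.backward(create_graph=True)
    optimizer.step()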

Related

Why is my multiclass neural model not training (accuracy and loss staying same)?

I am learning neural networks. I get 98% accuracy with classical ML methods on the same data, so I think I made a coding error. The neural network model is not learning.
Things I tried:
Changing X and y to float64 or float32
Normalizing data
Changing the activation to "linear" or "relu"
Removing Flatten()
Adding hidden layers
Using stochastic gradient descent as optimizer, instead of "adam".
Changing the y label with another label
X_train has 9 feature columns and y_train has 8 different classes.
(X_train and y_train previews omitted.)
Code:
model = keras.models.Sequential()
model.add(keras.layers.Input(shape=(9,)))
model.add(keras.layers.Dense(8, activation='softmax'))
model.add(layers.Flatten())
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
Fitting:
I tried the following variants, changing the target label. None of them helped the model train: some give "nan" loss, some fluctuate slightly up and down, but all of them stay below 0.1% accuracy:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)
or this:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(3, activation='relu', name='relu1'))
model.add(layers.Dense(16, activation='relu', name='relu2'))
model.add(layers.Dense(16, activation='relu', name='relu3'))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = model.fit(x=X_train, y=y_train, epochs=20)
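For comparison, a setup that can actually learn an 8-way classification needs 8 output units with softmax and a loss that matches the label encoding. A minimal sketch, assuming y_train holds integer class ids 0-7:
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(9,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(8, activation='softmax'),  # one unit per class
])
# sparse_categorical_crossentropy expects integer labels 0..7;
# use categorical_crossentropy only with one-hot labels
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=20, batch_size=24)
A Dense(1) output, as in the variants above, cannot represent 8 classes, and accuracy is not a meaningful metric for mean_squared_error on integer class targets.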

Val_loss metric error in Keras/TensorFlow

I am currently trying to write a neural network for a multi-class classification problem, using early stopping that monitors validation loss, but I keep receiving the error:
WARNING:tensorflow:Early stopping conditioned on metric `val_loss` which is not available.
The code for my network is below:
model = Sequential()
model.add(Dense(4, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)
model.fit(x=X_train, y=y_train, epochs=100, validation_data=(X_test, y_test), callbacks=[early_stop])
EDIT:
I have uninstalled and reinstalled all packages and am now receiving this error instead:
UnimplementedError: Cast string to float is not supported
[[node loss/output_1_loss/Cast (defined at /opt/anaconda3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_distributed_function_938]
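That cast error usually means X or y still contains strings (e.g. object-dtype pandas columns). A minimal sketch of the usual check and fix, assuming NumPy/pandas inputs and integer class labels:
import numpy as np
from keras.utils import to_categorical

print(X_train.dtype, y_train.dtype)  # object dtype means strings are present

X_train = np.asarray(X_train).astype('float32')
X_test = np.asarray(X_test).astype('float32')

# categorical_crossentropy with a 3-unit softmax expects one-hot labels
y_train = to_categorical(y_train, num_classes=3)
y_test = to_categorical(y_test, num_classes=3)
Once fit can actually compute a validation loss on numeric data, the EarlyStopping warning about val_loss should also go away.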

Training with Adam gets slower each epoch

I am training a simple neural network in Keras with the Theano backend, consisting of 4 dense layers connected to a Merge layer and then to a softmax classifier layer. Using Adam for training, the first few epochs take about 60s each (on the CPU), but after that the training time per epoch starts increasing, exceeding 400s by epoch 70, which makes it unusable.
Is there anything wrong with my code, or is this supposed to happen?
This only happens when using Adam, not with sgd, adadelta, rmsprop or adagrad. I'd use any of the other methods but Adam produces far better results.
The code:
modela = Sequential()
modela.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelb = Sequential()
modelb.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelc = Sequential()
modelc.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modeld = Sequential()
modeld.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
model = Sequential()
model.add(Merge([modela, modelb, modelc, modeld], mode='concat', concat_axis=1))
model.add(Dense(258, init='uniform', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
hist = model.fit([Xa, Xb, Xc, Xd], Ycat, validation_split=.25, nb_epoch=80, batch_size=100, verbose=2)
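One way to make the slowdown measurable before digging into Adam's internals is to time each epoch with a callback; a minimal sketch using the standard Keras Callback interface:
import time
from keras.callbacks import Callback

class EpochTimer(Callback):
    # record wall-clock seconds per epoch to see how quickly training degrades
    def on_train_begin(self, logs=None):
        self.times = []
    def on_epoch_begin(self, epoch, logs=None):
        self._start = time.time()
    def on_epoch_end(self, epoch, logs=None):
        self.times.append(time.time() - self._start)
        print('epoch %d: %.1fs' % (epoch, self.times[-1]))

timer = EpochTimer()
hist = model.fit([Xa, Xb, Xc, Xd], Ycat, validation_split=.25,
                 nb_epoch=80, batch_size=100, verbose=2, callbacks=[timer])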

How to boost a Keras based neural network using AdaBoost?

Assuming I fit the following neural network for a binary classification problem:
model = Sequential()
model.add(Dense(21, input_dim=19, init='uniform', activation='relu'))
model.add(Dense(80, init='uniform', activation='relu'))
model.add(Dense(80, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(x2, training_target, nb_epoch=10, batch_size=32, verbose=0,validation_split=0.1, shuffle=True,callbacks=[hist])
How would I boost the neural network using AdaBoost? Does keras have any commands for this?
This can be done as follows:
First create a model (for reproducibility, wrap it in a function):
from keras.models import Sequential
from keras.layers import Dense, Dropout

def simple_model():
    # create model
    model = Sequential()
    model.add(Dense(25, input_dim=x_train.shape[1], kernel_initializer='normal', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(10, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
Then put it inside the sklearn wrapper:
from keras.wrappers.scikit_learn import KerasRegressor

ann_estimator = KerasRegressor(build_fn=simple_model, epochs=100, batch_size=10, verbose=0)
Then, finally, boost it:
from sklearn.ensemble import AdaBoostRegressor

boosted_ann = AdaBoostRegressor(base_estimator=ann_estimator)
boosted_ann.fit(rescaledX, y_train.values.ravel())  # rescaledX is the scaled training data
boosted_ann.predict(rescaledX_Test)
Keras itself does not implement adaboost. However, Keras models are compatible with scikit-learn, so you probably can use AdaBoostClassifier from there: link. Use your model as the base_estimator after you compile it, and fit the AdaBoostClassifier instance instead of model.
This way, however, you will not be able to use the arguments you pass to fit, such as number of epochs or batch_size, so the defaults will be used. If the defaults are not good enough, you might need to build your own class that implements the scikit-learn interface on top of your model and passes proper arguments to fit.
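A minimal sketch of such a class (the name KerasAdaBoostWrapper and the build_model argument are illustrative, not a real Keras or sklearn API); the key point is that AdaBoost reweights samples each round, so fit must accept sample_weight and pass it through:
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class KerasAdaBoostWrapper(BaseEstimator, ClassifierMixin):
    def __init__(self, build_model, epochs=10, batch_size=32):
        self.build_model = build_model
        self.epochs = epochs
        self.batch_size = batch_size

    def fit(self, X, y, sample_weight=None):
        self.model_ = self.build_model()  # returns a compiled Keras model
        self.classes_ = np.unique(y)
        # forward the boosting weights and the fit arguments Keras needs
        self.model_.fit(X, y, epochs=self.epochs, batch_size=self.batch_size,
                        sample_weight=sample_weight, verbose=0)
        return self

    def predict(self, X):
        return (self.model_.predict(X).ravel() > 0.5).astype(int)
With AdaBoostClassifier's default SAMME.R algorithm the estimator also needs a predict_proba method, so either add one or pass algorithm='SAMME'.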
Apparently, neural networks are not compatible with the sklearn AdaBoost out of the box; see https://github.com/scikit-learn/scikit-learn/issues/1752

NN Accuracy Saturates After the Very First Epoch with Keras

I'm trying to fit a simple neural network to predict a binary target using keras-1.0.6. The output saturates after the very first epoch. I tried playing around with the learning rate (from 0.1 to 1e-6), the decay and momentum of the SGD optimizer, and with the layers (10-512 hidden neurons, 1-2 hidden layers) and their activation functions, but nothing worked: the prediction accuracy stayed the same.
My training set has shape (13602, 115) and my validation set has shape (3400, 115). The target variables y_train and y_test are encoded as 1 and 0 (60% are 1's and 40% are 0's). At first the data was not normalized; after normalizing it I got the same results.
Verifying the output, I see that the model is predicting only 1 class. Sometimes it predicts only 1's and other times only 0's (when I tweak the model).
I also tried to encode the target variable in the shape (n_sample, 2) but the output was the same.
I followed some questions here and Google results that suggest tuning the learning rate and not using 'softmax' activation, but I couldn't improve the results.
Some of the models I tried are below:
The simplest model:
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
Model 2:
model = Sequential()
model.add(Dense(512, input_dim=X_train.shape[1]))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))
Model 3
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
Model 4
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='sigmoid'))
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
and to compile and fit the model:
sgd = SGD(lr=0.01, decay=0.1, momentum=0.0, nesterov=True)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train2, nb_epoch=5, batch_size=50, validation_split=0.2)
model.predict(X_test)
The output gives either [0,0,0,0,0,0,0,...] or [1,1,1,1,1,1,1,1,...]
Does anybody have a clue on what's going on here?
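Two quick checks that fit this symptom: confirm the model has collapsed to a single class, and note that the decay=0.1 used above shrinks the learning rate very aggressively (Keras SGD scales lr by 1 / (1 + decay * iterations) on every update). A minimal sketch:
import numpy as np
from keras.optimizers import SGD

# Has the model collapsed to one class? Compare against the 60/40 base rate
preds = (model.predict(X_test) > 0.5).astype(int).ravel()
print(np.bincount(preds), np.bincount(y_test.astype(int)))

# With decay=0.1 the effective lr drops ~100x within about 1000 batches;
# a much smaller decay is a reasonable thing to try
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)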
