I want to use a second-order optimizer instead of SGD, Adam, Adagrad, AdaDelta, etc. in my neural network model, and I found AdaHessian. I have the following code:
# define model
model = Sequential()
model.add(Dense(132, input_dim=66, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(88, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(44, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(22, activation='linear'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='relu'))
model.compile(loss='mse', optimizer=Adahessian(params=model), metrics=['accuracy'])
# fit model
model.fit(trainX, trainy, epochs=50, validation_split=0.2,verbose=0)
# evaluate the model
loss, test_acc = model.evaluate(testX, testy, verbose=0)
return model, loss, test_acc
Can anyone explain what params= means in AdaHessian() and what I should pass there, e.g. model.parameters() or something else? When I try that, I get this error message:
AttributeError: 'Sequential' object has no attribute 'parameters'
Please show me instructions I can follow to implement a second-order optimizer when compiling the model.
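For context, AdaHessian is distributed as a PyTorch-style optimizer, so its params argument expects an iterable of torch parameters (what torch.nn.Module.parameters() returns), not a Keras Sequential model, which is why the Keras model has no parameters attribute. A minimal sketch of the usual PyTorch usage, assuming the torch-optimizer package's Adahessian and a hypothetical torch model and data loader, looks roughly like this:

import torch
import torch_optimizer  # assumes the torch-optimizer package, which ships an Adahessian implementation

# hypothetical PyTorch counterpart of the Keras network
torch_model = torch.nn.Sequential(
    torch.nn.Linear(66, 132), torch.nn.ReLU(),
    torch.nn.Linear(132, 1))
optimizer = torch_optimizer.Adahessian(torch_model.parameters(), lr=0.1)
criterion = torch.nn.MSELoss()

for x, y in loader:  # 'loader' is a placeholder DataLoader
    optimizer.zero_grad()
    loss = criterion(torch_model(x), y)
    loss.backward(create_graph=True)  # AdaHessian needs create_graph=True to estimate Hessian information
    optimizer.step()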
Can anyone explain the functionality of this code, please?
embedding_vector_length = 32
model = Sequential()
model.add(Embedding(vocab_size, embedding_vector_length, input_length=200))
model.add(SpatialDropout1D(0.25))
model.add(LSTM(50, dropout=0.5, recurrent_dropout=0.5))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
This code defines a recurrent neural network built with Keras's Sequential API. It is set up for natural language processing: an Embedding layer maps tokens to 32-dimensional vectors, an LSTM layer processes the sequence, and dropout is applied at several points; the model is trained with a binary cross-entropy loss, the Adam optimizer, and a single sigmoid output for binary classification.
I'm having some trouble understanding why my Keras model has problems generating proper results (it now always returns 0). I have been able to find some others with this problem (ref 1, ref 2), but I haven't been able to understand the underlying cause.
Question: Why is my model only giving one, constant prediction?
Training Data Example
The last column is the prediction, 0 or 1.
32856500,1,1,200,6842314460,0
32800000,-1,0,0,0,0
32800000,-1,1,0,6845343222,0
32800000,-1,2,0,13692319489,0
32800000,-1,3,0,20539336035,0
32769900,-1,4,-30100,27389628085,0
32769900,-1,5,-30100,34239941481,0
32750000,-1,6,-50000,41091099905,0
32750000,-1,7,-50000,47945852379,1
Keras Code for Training
I'm using the sigmoid activation for the binary results, but I'm not sure whether the issue lies there or, for example, in the binary_crossentropy loss or the SGD optimizer.
def trainKerasModel(X, Y, path, dimensions):
    # Create model
    model = Sequential()
    model.add(Dense(120, input_dim=dimensions, activation='sigmoid'))
    model.add(Dense(100, activation='sigmoid'))
    model.add(Dense(80, activation='sigmoid'))
    model.add(Dense(60, activation='sigmoid'))
    model.add(Dense(40, activation='sigmoid'))
    model.add(Dense(20, activation='sigmoid'))
    model.add(Dense(12, activation='sigmoid'))
    model.add(Dense(10, activation='sigmoid'))
    model.add(Dense(8, activation='sigmoid'))
    model.add(Dense(6, activation='sigmoid'))
    model.add(Dense(4, activation='sigmoid'))
    model.add(Dense(2, activation='sigmoid'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01), metrics=['accuracy'])
    # Fit the model
    model.fit(X, Y, epochs=EPOCHS, batch_size=BATCHSIZE)
    # Evaluate
    scores = model.evaluate(X, Y)
    Helpers().Log(model.metrics_names[1], scores[1]*100)
    # Save model
    with open(path+".json", "w") as json_file:
        json_file.write(model.to_json())
    # serialize weights to HDF5
    model.save_weights(path+".h5")
    Helpers().Log("Saved model to disk")
someFilePath = "file.csv"
dataset = numpy.loadtxt(someFilePath, delimiter=",")
dimensions = len(dataset[0]) - 1
trainKerasModel(dataset[:,0:dimensions], dataset[:,dimensions], someFilePath, dimensions)
Keras Code for Predictions
model = model_from_json(loaded_model_json)
model.load_weights(someWeightsFile)
Xnew = preprocess_input(numpy.array([[32856500,1,1,200,6842314460,0], [32800000,-1,3,0,20539336035,0], [32750000,-1,7,-50000,47945852379,1]]))
Ynew = model.predict_classes(Xnew)
print(Ynew)
Twelve stacked sigmoid fully-connected layers will never learn anything: the gradients vanish before they reach the early layers. Read up on the theory.
Maybe you should try just 3 layers with tanh, and no activation function on the input layer if you use tanh there; use -1 for false and 1 for true. Also apply tanh to the input data, since they are not normalized. Cross-entropy also makes little sense if you have only one output.
Besides, expanding 5 inputs to 120 features and then stacking 12 layers is a horrible overfit. You should have about 3 layers here, with roughly 20, 16, and 10 units, tanh activations, MSE loss, and a learning rate of around 1e-3 to 1e-4.
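For illustration only, here is a minimal Keras sketch of the architecture suggested above. The layer sizes, tanh activations, MSE loss, and learning rate come from this answer; the single tanh output unit, the SGD optimizer, and input_dim=5 (the five feature columns in the training data) are assumptions:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(20, input_dim=5, activation='tanh'))  # 5 feature columns, label column excluded
model.add(Dense(16, activation='tanh'))
model.add(Dense(10, activation='tanh'))
model.add(Dense(1, activation='tanh'))                # output in [-1, 1]: -1 for false, 1 for true
model.compile(loss='mse', optimizer=SGD(lr=1e-3))
# Inputs would also need to be normalized (or squashed with tanh) before training, as suggested above.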
I have seen some examples of transfer learning where one can use pre-trained models from keras.applications (Xception, VGG16, VGG19, ResNet50, etc.), but what I want is to transfer the learning from the model I saved using model.save('model.h5').
This is my current model:
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(LSTM(32))
model.add(Dropout(0.6))
model.add(Dense(2, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy',metrics=['acc'])
model.fit(sequences, labels, epochs=10, batch_size=32, validation_split=0.2)
Now, instead of saying
model_base = keras.applications.vgg16.VGG16(include_top=False, weights='imagenet')
I want to load the saved model probably with load_model('model.h5') and add it as a layer to my current model.
Try this:
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(LSTM(32))
model.add(Dropout(0.6))
model.add(Dense(2, activation='sigmoid'))
model.layers[0].set_weights([embedding_matrix])
model.layers[0].trainable = False
model.load_weights('model.h5')
model.compile(optimizer='rmsprop', loss='binary_crossentropy',metrics=['acc'])
model_base = model
Don't forget to remove your classifier.
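If you would rather load the whole saved model and reuse it as a base, as the question describes, one possible sketch (hypothetical variable names; it assumes the standard Keras load_model and functional Model API) is:

from keras.models import load_model, Model

base = load_model('model.h5')
# drop the final Dense classifier and expose the layer before it as the new output
feature_extractor = Model(inputs=base.input, outputs=base.layers[-2].output)
feature_extractor.trainable = False  # freeze the transferred layers if they should not be retrained

New layers can then be stacked on top of feature_extractor's output and the combined model compiled as usual.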
I'm having trouble making an MLP in MxNet learn. It tends to output fairly constant values, only occasionally outputting anything different. I'm using the Pima Indians dataset to do binary classification, but no matter what I do (normalisation, scaling, changing activations, objective functions, number of neurons, batch size, epochs), it doesn't produce anything useful.
The same MLP in Keras works fine.
Here's the MxNet code:
batch_size = 10
train_iter = mx.io.NDArrayIter(mx.nd.array(df_train), mx.nd.array(y_train),
                               batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(mx.nd.array(df_test), mx.nd.array(y_test), batch_size)
data=mx.sym.var('data')
fc1 = mx.sym.FullyConnected(data=data, num_hidden=12)
act1 = mx.sym.Activation(data=fc1, act_type='relu')
fc2 = mx.sym.FullyConnected(data=act1, num_hidden=8)
act2 = mx.sym.Activation(data=fc2, act_type='relu')
fcfinal = mx.sym.FullyConnected(data=act2, num_hidden=2)
mlp = mx.sym.SoftmaxOutput(data=fcfinal, name='softmax')
mlp_model = mx.mod.Module(symbol=mlp, context=mx.cpu())
mlp_model.fit(train_iter,
              eval_data=val_iter,
              optimizer='sgd',
              eval_metric='ce',
              num_epoch=150)
And the same MLP in Keras:
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(df_train_res, y_train_res)
I would recommend that you initialize your parameters before you start training. Having all parameters start at zero is not ideal.
You could pass the following as a parameter to your model.fit() call:
initializer=mx.init.Xavier(rnd_type='gaussian')
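In context, that would mean something like the following sketch, based on the question's code; the initializer line is the only addition, with the Xavier settings taken from this answer:

mlp_model.fit(train_iter,
              eval_data=val_iter,
              initializer=mx.init.Xavier(rnd_type='gaussian'),  # layer-size-aware random weight initialization
              optimizer='sgd',
              eval_metric='ce',
              num_epoch=150)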
See here for more discussion
https://mxnet.incubator.apache.org/api/python/optimization.html