I am trying to optimize the hyperparameters of my NN using Keras and sklearn.
I am wrapping up with KerasClassifier (it´s a classification problem).
I am trying to optimize the number of hidden layers.
I can´t figure it out how to do it with keras (actually I am wondering how to set up the function create_model in order to maximize the number of hidden layers)
Could anyone please help me?
My code (just the important part):
## Import `Sequential` from `keras.models`
from keras.models import Sequential
# Import `Dense` from `keras.layers`
from keras.layers import Dense
def create_model(optimizer='adam', activation = 'sigmoid'):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(5, activation=activation, input_shape=(5,)))
# Add one hidden layer
model.add(Dense(8, activation=activation))
# Add an output layer
model.add(Dense(1, activation=activation))
#compile model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
my_classifier = KerasClassifier(build_fn=create_model, verbose=0)# Create
hyperparameter space
epochs = [5, 10]
batches = [5, 10, 100]
optimizers = ['rmsprop', 'adam']
activation1 = ['relu','sigmoid']
# Create grid search
grid = RandomizedSearchCV(estimator=my_classifier,
param_distributions=hyperparameters) #inserir param_distributions
# Fit grid search
grid_result = grid.fit(X_train, y_train)
# Create hyperparameter options
hyperparameters = dict(optimizer=optimizers, epochs=epochs,
batch_size=batches, activation=activation1)
# View hyperparameters of best neural network
grid_result.best_params_
If you want to make the number of hidden layers a hyperparameter you have to add it as parameter to your KerasClassifier build_fn like:
def create_model(optimizer='adam', activation = 'sigmoid', hidden_layers=1):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(5, activation=activation, input_shape=(5,)))
for i in range(hidden_layers):
# Add one hidden layer
model.add(Dense(8, activation=activation))
# Add an output layer
model.add(Dense(1, activation=activation))
#compile model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
Then you will be able to optimize the number of hidden layers by adding it to the dictionary, which is passed to RandomizedSearchCV's param_distributions.
One more thing, you probably should separate the activation you use for the output layer from the other layers.
Different classes of activation functions are suitable for hidden layers and for output layers used in binary classification.
Related
I'm newbie in Neural Network. I'm going to do a text classification research using MLP model with keras. Input layer consisting of 900 nodes, 2 hidden layers, and 2 outputs.
The code I use is as follows:
model = Sequential()
model.add(Dense(units = 100, activation = 'sigmoid'))
model.add(Dense(units = 100, activation = 'sigmoid'))
model.add(Dense(units = 2, activation = 'sigmoid'))
opt = Adam (learning_rate=0.001)
model.compile(loss = 'binary_crossentropy', optimizer = opt, metrics = ['accuracy'])
print(model.summary())
But get error:
This model has not yet been built. Build the model first by calling `build()` or by calling the model on a batch of data.
How to fix that error? Thank you.
With all I know. pretrained CNN can do way better than CNN. I have a dataset of 855 images. I have applied CNN and got 94% accuracy.Then I applied Pretrained model (VGG16, ResNet50, Inception_V3, MobileNet)also with fine tuning but still i got highest 60% and two of them are doing very bad on classification. Can CNN really do better than pretrained model or my implementation is wrong. I've converted my image into 100 by 100 dimensions and followed the way of keras application. Then What is the issue ??
Naive CNN approach :
def cnn_model():
size = (100,100,1)
num_cnn_layers =2
NUM_FILTERS = 32
KERNEL = (3, 3)
MAX_NEURONS = 120
model = Sequential()
for i in range(1, num_cnn_layers+1):
if i == 1:
model.add(Conv2D(NUM_FILTERS*i, KERNEL, input_shape=size,
activation='relu', padding='same'))
else:
model.add(Conv2D(NUM_FILTERS*i, KERNEL, activation='relu',
padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(int(MAX_NEURONS), activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(int(MAX_NEURONS/2), activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
return model
VGG16 approach:
def vgg():
` `vgg_model = keras.applications.vgg16.VGG16(weights='imagenet',include_top=False,input_shape = (100,100,3))
model = Sequential()
for layer in vgg_model.layers:
model.add(layer)
# Freeze the layers
for layer in model.layers:
layer.trainable = False
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(3, activation='softmax'))
model.compile(optimizer=keras.optimizers.Adam(lr=1e-5),
loss='categorical_crossentropy',
metrics=['accuracy'])
return model
What you're referring to as CNN in both cases talk about the same thing, which is a type of a neural network model. It's just that the pre-trained model has been trained on some other data instead of the dataset you're working on and trying to classify.
What is usually used here is called Transfer Learning. Instead of freezing all the layers, trying leaving the last few layers open so they can be retrained with your own data, so that the pretrained model can edit its weights and biases to match your needs as well. It could be the case that the dataset you're trying to classify is foreign to the pretrained models.
Here's an example from my own work, there are additional pieces of code but you can make it work with your own code, the logic remains the same
#You extract the layer which you want to manipulate, usually the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024,activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1,activation='sigmoid')(x)
#Here we combine your newly added layers and the pre-trained model.
model = Model( pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
loss = 'binary_crossentropy',
metrics = ['accuracy'])
Adding to what #Ilknur Mustafa mentioned, as your dataset may be foreign to the images used for pre-training, you can try to re-train few last layers of the pre-trained model instead of adding a whole new layers. The below example code doesn't add any additional trainable layer other than the output layer. In this way, you can benefit by retraining the last few layers on the existing weights, rather than training from scratch. This may be beneficial if you don't have a large dataset to train on.
# load model without classifier layers
vgg = VGG16(include_top=False, input_shape=(100, 100, 3), weights='imagenet', pooling='avg')
# make only last 2 conv layers trainable
for layer in vgg.layers[:-4]:
layer.trainable = False
# add output layer
out_layer = Dense(3, activation='softmax')(vgg.layers[-1].output)
model_pre_vgg = Model(vgg.input, out_layer)
# compile model
opt = SGD(lr=1e-5)
model_pre_vgg.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
#You extract the layer which you want to manipulate, usually the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024,activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1,activation='sigmoid')(x)
#Here we combine your newly added layers and the pre-trained model.
model = Model( pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
loss = 'binary_crossentropy',
metrics = ['accuracy'])
I have a neural network model that is created in convnet.js that I have to define using Keras. Does anyone have an idea how can I do that?
neural = {
net : new convnetjs.Net(),
layer_defs : [
{type:'input', out_sx:4, out_sy:4, out_depth:1},
{type:'fc', num_neurons:25, activation:"regression"},
{type:'regression', num_neurons:5}
],
neuralDepth: 1
}
this is what I could do so far. I cannot ve sure if it's correct.
#---Build Model-----
model = models.Sequential()
# Input - Layer
model.add(layers.Dense(4, activation = "relu", input_shape=(4,)))
# Hidden - Layers
model.add(layers.Dense(25, activation = "relu"))
model.add(layers.Dense(5, activation = "relu"))
# Output- Layer
model.add(layers.Dense(1, activation = "linear"))
model.summary()
# Compile Model
model.compile(loss= "mean_squared_error" , optimizer="adam", metrics=["mean_squared_error"])
From the Convnet.js doc : "your last layer must be a loss layer ('softmax' or 'svm' for classification, or 'regression' for regression)."
Also : "Create a regression layer which takes a list of targets (arbitrary numbers, not necessarily a single discrete class label as in softmax/svm) and backprops the L2 Loss."
It's unclear. I suspect "regression" layer is just another layer of Dense (Fully connected) neurons. The 'regression' word probably refers to linear activity. So, no 'relu' this time ?
Anyway, it would probably look something like (no sequential mode):
from keras.layers import Dense
from keras.models import Model
my_input = Input(shape = (4, ))
x = Dense(25, activation='relu')(x)
x = Dense(4)(x)
my_model = Model(input=my_input, output=x, loss='mse', metrics='mse')
my_model.compile(optimizer=Adam(LEARNING_RATE), loss='binary_crossentropy', metrics=['mse'])
After reading a bit of the docs, the convnet.js seems like a nice project. It would be much better with somebody with neural network knowledge on board.
This is part of my codes.
model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='softmax'))
model.compile(Adam(lr=0.1),
loss='categorical_crossentropy',
metrics=['accuracy'])
with this code, it will apply softmax to all the outputs at once. So the output indicates probability among all. However, I am working on non-exclusive classifire, which means I want the outputs to have independent probability.
Sorry my English is bad...
But what I want to do is to apply sigmoid function to each outputs so that they will have independent probabilities.
There is no need to create 3 separate outputs like suggested by the accepted answer.
The same result can be achieved with just one line:
model.add(Dense(3, input_shape=(4,), activation='sigmoid'))
You can just use 'sigmoid' activation for the last layer:
from tensorflow.keras.layers import GRU
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
import numpy as np
from tensorflow.keras.optimizers import Adam
model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='sigmoid'))
model.compile(Adam(lr=0.1),
loss='categorical_crossentropy',
metrics=['accuracy'])
pred = model.predict(np.random.rand(5, 4))
print(pred)
Output of independent probabilities:
[[0.58463055 0.53531045 0.51800555]
[0.56402034 0.51676977 0.506389 ]
[0.665879 0.58982867 0.5555959 ]
[0.66690147 0.57951677 0.5439698 ]
[0.56204814 0.54893976 0.5488999 ]]
As you can see the classes probabilities are independent from each other. The sigmoid is applied to every class separately.
You can try using Functional API to create a model with n outputs where each output is activated with sigmoid.
You can do it like this
in = Input(shape=(4, ))
dense_1 = Dense(units=4, activation='relu')(in)
out_1 = Dense(units=1, activation='sigmoid')(dense_1)
out_2 = Dense(units=1, activation='sigmoid')(dense_1)
out_3 = Dense(units=1, activation='sigmoid')(dense_1)
model = Model(inputs=[in], outputs=[out_1, out_2, out_3])
Suppose I have trained the model below for an epoch:
model = Sequential([
Dense(32, input_dim=784), # first number is output_dim
Activation('relu'),
Dense(10), # output_dim, input_dim is taken for granted from above
Activation('softmax'),
])
And I got the weights dense1_w, biases dense1_b of first hidden layer (named it dense1) and a single data sample sample.
How do I use these to get the output of dense1 on the sample in keras?
Thanks!
The easiest way is to use the keras backend. With the keras backend you can define a function that gives you the intermediate output of a keras model as defined here (https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer).
So in essence:
get_1st_layer_output = K.function([model.layers[0].input],
[model.layers[1].output])
layer_output = get_1st_layer_output([X])
Just recreate the first part of the model up until the layer for which you would like the output (in your case only the first dense layer). Afterwards you can load the trained weights of the first part in your newly created model and compile it.
The output of the prediction with this new model will be the output of the layer (in your case the first dense layer).
from keras.models import Sequential
from keras.layers import Dense, Activation
import numpy as np
model = Sequential([
Dense(32, input_dim=784), # first number is output_dim
Activation('relu'),
Dense(10), # output_dim, input_dim is taken for granted from above
Activation('softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
#create some random data
n_features = 5
samples = np.random.randint(0, 10, 784*n_features).reshape(-1,784)
labels = np.arange(10*n_features).reshape(-1, 10)
#train your sample model
model.fit(samples, labels)
#create new model
new_model= Sequential([
Dense(32, input_dim=784), # first number is output_dim
Activation('relu')])
#set weights of the first layer
new_model.set_weights(model.layers[0].get_weights())
#compile it after setting the weights
new_model.compile(optimizer='adam', loss='categorical_crossentropy')
#get output of the first dens layer
output = new_model.predict(samples)
As for weights, I had a none-Sequential model. What I did was using model.summary() to get the desired layers name and then model.get_layer("layer_name").get_weights() to get the weights.