How do MLP Classification using Keras - python

I'm newbie in Neural Network. I'm going to do a text classification research using MLP model with keras. Input layer consisting of 900 nodes, 2 hidden layers, and 2 outputs.
The code I use is as follows:
model = Sequential()
model.add(Dense(units = 100, activation = 'sigmoid'))
model.add(Dense(units = 100, activation = 'sigmoid'))
model.add(Dense(units = 2, activation = 'sigmoid'))
opt = Adam (learning_rate=0.001)
model.compile(loss = 'binary_crossentropy', optimizer = opt, metrics = ['accuracy'])
print(model.summary())
But get error:
This model has not yet been built. Build the model first by calling `build()` or by calling the model on a batch of data.
How to fix that error? Thank you.

Related

Tensorflow Neural Network like MLPClassifier (sklearn)

So, I am creating an AI which predicts how long a user will take to finish exercises. I previously created a NN with Sklearn, but I want to integrate Tensorflow.
I have 6 features as input and 1 output, which is a number.
I tried this but it does not seem to be willing to work:
# Train data
X_train = X[:1500]
y_train = y[:1500]
# Test data
X_test = X[1500:]
y_test = y[1500:]
# Create the TF model
model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(6,)),
tf.keras.layers.Dense(256, activation='softmax'),
tf.keras.layers.Dense(128, activation='softmax'),
tf.keras.layers.Dense(64, activation='softmax'),
tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
With that, it used to work with a simple MLPClassifier.
I also managed to get this nice error which does not seem to be fixed by changing the layers:
Received a label value of 1209638408 which is outside the valid range of [0, 1).
So I changed it a bit and came up with this:
features_train = features[:1500]
output_train = output[:1500]
features_test = features[1500:]
output_test = output[1500:]
classifier = Sequential()
classifier.add(Dense(units = 16, activation = 'relu', input_dim = 6))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 64, activation = 'relu'))
classifier.add(Dense(units = 32, activation = 'relu'))
classifier.add(Dense(units = 8, activation = 'relu'))
classifier.add(Dense(units = 2, activation = 'relu'))
classifier.add(Dense(units = 1))
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy')
classifier.fit(features_train, output_train, batch_size = 1, epochs = 10)
But now I get a loss of 100%.
You should use a smaller network. Try with fewer Dense layers, 2 or 3 maximum. If you use the binary_crossentropy loss, use a sigmoid activation in the last Dense layer. You can also pass metrics=['accuracy'] when compiling the model to monitor the accuracy.

Trying to recreate BiLSTM model from Adhikari et al. 2019 (LSTM_reg) in Tensorflow

Im trying to recreate this model LSTM_reg from this paper in TensorFlow to use in my problem. I've come up with the following code:
def get_model(lr=0.001):
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Embedding(nb_words, output_dim=embed_size, weights=[embedding_matrix], input_length = maxlen, trainable=False))
model.add(tf.keras.layers.Dropout(0.2)) # embedding dropouts
model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(256, return_sequences=True, recurrent_dropout=0.2, activation = 'tanh'))) # weight drop on recurrent layers using recurrent_dropout
model.add(tf.keras.layers.MaxPooling1D(pool_size=2, padding = 'valid'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(20))
model.add(tf.keras.layers.Activation('sigmoid'))
model.compile(loss = 'categorical_crossentropy' , optimizer = 'adam', metrics = ['accuracy', tfa.metrics.F1Score(num_classes = 20)])
return model
Have I gone about this the right way? Got some pretty weird values while training my dataset, hence was wondering about my implementation.. There is a pytorch implementation for this model here. But I'm not sure if I have reproduced this correctly.
One major difference I can see is that the paper uses a global max pooling, whereas you've only used max pooling with a kernel size of 2:
def get_model(lr=0.001):
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Embedding(nb_words, output_dim=embed_size, weights=[embedding_matrix], input_length = maxlen, trainable=False))
model.add(tf.keras.layers.Dropout(0.2)) # embedding dropouts
model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(256, return_sequences=True, recurrent_dropout=0.2, activation = 'tanh'))) # weight drop on recurrent layers using recurrent_dropout
model.add(tf.keras.layers.GlobalMaxPooling1D())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(20))
model.add(tf.keras.layers.Activation('sigmoid'))
model.compile(loss = 'categorical_crossentropy' , optimizer = 'adam', metrics = ['accuracy', tfa.metrics.F1Score(num_classes = 20)])
return model
I obviously don't have your data so make sure that the arguments are set correctly: https://keras.io/api/layers/pooling_layers/global_max_pooling1d/.
Another change is that the pytorch repo you shared has a ReLU after the LSTM (I don't know why). You could try adding that in and see if it helps.

Can CNN do better than pretrained CNN?

With all I know. pretrained CNN can do way better than CNN. I have a dataset of 855 images. I have applied CNN and got 94% accuracy.Then I applied Pretrained model (VGG16, ResNet50, Inception_V3, MobileNet)also with fine tuning but still i got highest 60% and two of them are doing very bad on classification. Can CNN really do better than pretrained model or my implementation is wrong. I've converted my image into 100 by 100 dimensions and followed the way of keras application. Then What is the issue ??
Naive CNN approach :
def cnn_model():
size = (100,100,1)
num_cnn_layers =2
NUM_FILTERS = 32
KERNEL = (3, 3)
MAX_NEURONS = 120
model = Sequential()
for i in range(1, num_cnn_layers+1):
if i == 1:
model.add(Conv2D(NUM_FILTERS*i, KERNEL, input_shape=size,
activation='relu', padding='same'))
else:
model.add(Conv2D(NUM_FILTERS*i, KERNEL, activation='relu',
padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(int(MAX_NEURONS), activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(int(MAX_NEURONS/2), activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
return model
VGG16 approach:
def vgg():
` `vgg_model = keras.applications.vgg16.VGG16(weights='imagenet',include_top=False,input_shape = (100,100,3))
model = Sequential()
for layer in vgg_model.layers:
model.add(layer)
# Freeze the layers
for layer in model.layers:
layer.trainable = False
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(3, activation='softmax'))
model.compile(optimizer=keras.optimizers.Adam(lr=1e-5),
loss='categorical_crossentropy',
metrics=['accuracy'])
return model
What you're referring to as CNN in both cases talk about the same thing, which is a type of a neural network model. It's just that the pre-trained model has been trained on some other data instead of the dataset you're working on and trying to classify.
What is usually used here is called Transfer Learning. Instead of freezing all the layers, trying leaving the last few layers open so they can be retrained with your own data, so that the pretrained model can edit its weights and biases to match your needs as well. It could be the case that the dataset you're trying to classify is foreign to the pretrained models.
Here's an example from my own work, there are additional pieces of code but you can make it work with your own code, the logic remains the same
#You extract the layer which you want to manipulate, usually the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024,activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1,activation='sigmoid')(x)
#Here we combine your newly added layers and the pre-trained model.
model = Model( pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
loss = 'binary_crossentropy',
metrics = ['accuracy'])
Adding to what #Ilknur Mustafa mentioned, as your dataset may be foreign to the images used for pre-training, you can try to re-train few last layers of the pre-trained model instead of adding a whole new layers. The below example code doesn't add any additional trainable layer other than the output layer. In this way, you can benefit by retraining the last few layers on the existing weights, rather than training from scratch. This may be beneficial if you don't have a large dataset to train on.
# load model without classifier layers
vgg = VGG16(include_top=False, input_shape=(100, 100, 3), weights='imagenet', pooling='avg')
# make only last 2 conv layers trainable
for layer in vgg.layers[:-4]:
layer.trainable = False
# add output layer
out_layer = Dense(3, activation='softmax')(vgg.layers[-1].output)
model_pre_vgg = Model(vgg.input, out_layer)
# compile model
opt = SGD(lr=1e-5)
model_pre_vgg.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
#You extract the layer which you want to manipulate, usually the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024,activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1,activation='sigmoid')(x)
#Here we combine your newly added layers and the pre-trained model.
model = Model( pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
loss = 'binary_crossentropy',
metrics = ['accuracy'])

Loading weights TensorFlow 2.0 model error

I am using Python 3.X and TensorFlow 2.0 along with "tensorflow_model_optimization" package for neural network pruning. The code I have is as follows-
from tensorflow_model_optimization.sparsity import keras as sparsity
l = tf.keras.layers
# Original model without pruning-
model = Sequential()
model.add(l.InputLayer(input_shape = (784, )))
model.add(Flatten())
model.add(Dense(units = 300, activation='relu', kernel_initializer = tf.initializers.GlorotUniform()))
model.add(l.Dropout(0.2))
model.add(Dense(units = 100, activation='relu', kernel_initializer = tf.initializers.GlorotUniform()))
model.add(l.Dropout(0.1))
model.add(Dense(units = num_classes, activation='softmax'))
# Define callbacks-
callbacks = [
# tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch = 0),
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience = 3)
]
# Compile designed Neural Network-
model.compile(
loss = tf.keras.losses.categorical_crossentropy,
optimizer = 'adam',
metrics = ['accuracy'])
# Save untrained and initial weights to disk-
model.save_weights("Initial_non_trained_weights.h5")
epochs = 12
num_train_samples = X_train.shape[0]
end_step = np.ceil(1.0 * num_train_samples / batch_size).astype(np.int32) * epochs
print("end_step parameter for this dataset = {0}".format(end_step))
# end_step = 5628
# Specify the parameters to be used for layer-wise pruning:
pruning_params = {
'pruning_schedule': sparsity.PolynomialDecay(
initial_sparsity=0.50, final_sparsity=0.90,
begin_step=2000, end_step=end_step, frequency=100)
}
# Neural network which is to be pruned-
pruned_model = Sequential()
pruned_model.add(l.InputLayer(input_shape=(784, )))
pruned_model.add(Flatten())
pruned_model.add(sparsity.prune_low_magnitude(Dense(units = 300, activation='relu', kernel_initializer=tf.initializers.GlorotUniform()),
**pruning_params))
pruned_model.add(l.Dropout(0.2))
pruned_model.add(sparsity.prune_low_magnitude(Dense(units = 100, activation='relu', kernel_initializer=tf.initializers.GlorotUniform()),
**pruning_params))
pruned_model.add(l.Dropout(0.1))
pruned_model.add(sparsity.prune_low_magnitude(Dense(units = num_classes, activation='softmax'), **pruning_params))
# Compile pruned CNN-
pruned_model.compile(
loss=tf.keras.losses.categorical_crossentropy,
optimizer='adam',
metrics=['accuracy'])
# Load weights from before-
pruned_model.load_weights("Initial_non_trained_weights.h5")
This last line of loading initial weights into the pruned model gives me error:
ValueError: Layer #0 (named "prune_low_magnitude_dense_9" in the current model) was found to correspond to layer dense in the save file.
However the new layer prune_low_magnitude_dense_9 expects 5 weights, but the saved weights have 2 elements.
What's going wrong?
Thanks!

Grid Search the number of hidden layers with keras

I am trying to optimize the hyperparameters of my NN using Keras and sklearn.
I am wrapping up with KerasClassifier (it´s a classification problem).
I am trying to optimize the number of hidden layers.
I can´t figure it out how to do it with keras (actually I am wondering how to set up the function create_model in order to maximize the number of hidden layers)
Could anyone please help me?
My code (just the important part):
## Import `Sequential` from `keras.models`
from keras.models import Sequential
# Import `Dense` from `keras.layers`
from keras.layers import Dense
def create_model(optimizer='adam', activation = 'sigmoid'):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(5, activation=activation, input_shape=(5,)))
# Add one hidden layer
model.add(Dense(8, activation=activation))
# Add an output layer
model.add(Dense(1, activation=activation))
#compile model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
my_classifier = KerasClassifier(build_fn=create_model, verbose=0)# Create
hyperparameter space
epochs = [5, 10]
batches = [5, 10, 100]
optimizers = ['rmsprop', 'adam']
activation1 = ['relu','sigmoid']
# Create grid search
grid = RandomizedSearchCV(estimator=my_classifier,
param_distributions=hyperparameters) #inserir param_distributions
# Fit grid search
grid_result = grid.fit(X_train, y_train)
# Create hyperparameter options
hyperparameters = dict(optimizer=optimizers, epochs=epochs,
batch_size=batches, activation=activation1)
# View hyperparameters of best neural network
grid_result.best_params_
If you want to make the number of hidden layers a hyperparameter you have to add it as parameter to your KerasClassifier build_fn like:
def create_model(optimizer='adam', activation = 'sigmoid', hidden_layers=1):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(5, activation=activation, input_shape=(5,)))
for i in range(hidden_layers):
# Add one hidden layer
model.add(Dense(8, activation=activation))
# Add an output layer
model.add(Dense(1, activation=activation))
#compile model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
Then you will be able to optimize the number of hidden layers by adding it to the dictionary, which is passed to RandomizedSearchCV's param_distributions.
One more thing, you probably should separate the activation you use for the output layer from the other layers.
Different classes of activation functions are suitable for hidden layers and for output layers used in binary classification.

Categories