Can CNN do better than pretrained CNN? - python

With all I know. pretrained CNN can do way better than CNN. I have a dataset of 855 images. I have applied CNN and got 94% accuracy.Then I applied Pretrained model (VGG16, ResNet50, Inception_V3, MobileNet)also with fine tuning but still i got highest 60% and two of them are doing very bad on classification. Can CNN really do better than pretrained model or my implementation is wrong. I've converted my image into 100 by 100 dimensions and followed the way of keras application. Then What is the issue ??
Naive CNN approach :
def cnn_model():
size = (100,100,1)
num_cnn_layers =2
NUM_FILTERS = 32
KERNEL = (3, 3)
MAX_NEURONS = 120
model = Sequential()
for i in range(1, num_cnn_layers+1):
if i == 1:
model.add(Conv2D(NUM_FILTERS*i, KERNEL, input_shape=size,
activation='relu', padding='same'))
else:
model.add(Conv2D(NUM_FILTERS*i, KERNEL, activation='relu',
padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(int(MAX_NEURONS), activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(int(MAX_NEURONS/2), activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
return model
VGG16 approach:
def vgg():
` `vgg_model = keras.applications.vgg16.VGG16(weights='imagenet',include_top=False,input_shape = (100,100,3))
model = Sequential()
for layer in vgg_model.layers:
model.add(layer)
# Freeze the layers
for layer in model.layers:
layer.trainable = False
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(3, activation='softmax'))
model.compile(optimizer=keras.optimizers.Adam(lr=1e-5),
loss='categorical_crossentropy',
metrics=['accuracy'])
return model

What you're referring to as CNN in both cases talk about the same thing, which is a type of a neural network model. It's just that the pre-trained model has been trained on some other data instead of the dataset you're working on and trying to classify.
What is usually used here is called Transfer Learning. Instead of freezing all the layers, trying leaving the last few layers open so they can be retrained with your own data, so that the pretrained model can edit its weights and biases to match your needs as well. It could be the case that the dataset you're trying to classify is foreign to the pretrained models.
Here's an example from my own work, there are additional pieces of code but you can make it work with your own code, the logic remains the same
#You extract the layer which you want to manipulate, usually the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024,activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1,activation='sigmoid')(x)
#Here we combine your newly added layers and the pre-trained model.
model = Model( pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
loss = 'binary_crossentropy',
metrics = ['accuracy'])

Adding to what #Ilknur Mustafa mentioned, as your dataset may be foreign to the images used for pre-training, you can try to re-train few last layers of the pre-trained model instead of adding a whole new layers. The below example code doesn't add any additional trainable layer other than the output layer. In this way, you can benefit by retraining the last few layers on the existing weights, rather than training from scratch. This may be beneficial if you don't have a large dataset to train on.
# load model without classifier layers
vgg = VGG16(include_top=False, input_shape=(100, 100, 3), weights='imagenet', pooling='avg')
# make only last 2 conv layers trainable
for layer in vgg.layers[:-4]:
layer.trainable = False
# add output layer
out_layer = Dense(3, activation='softmax')(vgg.layers[-1].output)
model_pre_vgg = Model(vgg.input, out_layer)
# compile model
opt = SGD(lr=1e-5)
model_pre_vgg.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])

#You extract the layer which you want to manipulate, usually the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024,activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1,activation='sigmoid')(x)
#Here we combine your newly added layers and the pre-trained model.
model = Model( pre_trained_model.input, x)
model.compile(optimizer = RMSprop(lr=0.0001),
loss = 'binary_crossentropy',
metrics = ['accuracy'])

Related

Training CNN with transfer learning in Keras - image input doesn't work but vector input does

I'm trying to do transfer learning in Keras. I set up a ResNet50 network set to not trainable with some extra layers:
# Image input
model = Sequential()
model.add(ResNet50(include_top=False, pooling='avg')) # output is 2048
model.add(Dropout(0.05))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.15))
model.add(Dense(512, activation='relu'))
model.add(Dense(7, activation='softmax'))
model.layers[0].trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
Then I create input data: x_batch using the ResNet50 preprocess_input function, along with the one hot encoded labels y_batch and do the fitting as so:
model.fit(x_batch,
y_batch,
epochs=nb_epochs,
batch_size=64,
shuffle=True,
validation_split=0.2,
callbacks=[lrate])
Training accuracy gets close to 100% after ten or so epochs, but validation accuracy actually decreases from around 50% to 30% with validation loss steadily increasing.
However if I instead create a network with just the last layers:
# Vector input
model2 = Sequential()
model2.add(Dropout(0.05, input_shape=(2048,)))
model2.add(Dense(512, activation='relu'))
model2.add(Dropout(0.15))
model2.add(Dense(512, activation='relu'))
model2.add(Dense(7, activation='softmax'))
model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model2.summary()
and feed in the output of the ResNet50 prediction:
resnet = ResNet50(include_top=False, pooling='avg')
x_batch = resnet.predict(x_batch)
Then validation accuracy gets up to around 85%... What is going on? Why won't the image input method work?
Update:
This problem is really bizarre. If I change ResNet50 to VGG19 it seems to work ok.
After a lot of googling I found that the problem is to do with the Batch Normalisation layers in ResNet. There are no batch normalisation layers in VGGNet which is why it works for that topology.
There is a pull request to fix this in Keras here, which explains in more detail:
Assume we use one of the pre-trained CNNs of Keras and we want to fine-tune it. Unfortunately, we get no guarantees that the mean and variance of our new dataset inside the BN layers will be similar to the ones of the original dataset. As a result, if we fine-tune the top layers, their weights will be adjusted to the mean/variance of the new dataset. Nevertheless, during inference the top layers will receive data which are scaled using the mean/variance of the original dataset. This discrepancy can lead to reduced accuracy.
This means that the BN layers are adjusting to the training data, however when validation is performed, the original parameters of the BN layers are used. From what I can tell, the fix is to allow the frozen BN layers to use the updated mean and variance from training.
A work around is to pre-compute the ResNet output. In fact, this decreases training time considerably, as we are not repeating that part of the calculation.
you can try :
Res = keras.applications.resnet.ResNet50(include_top=False,
weights='imagenet', input_shape=(IMG_SIZE , IMG_SIZE , 3 ) )
# Freeze the layers except the last 4 layers
for layer in vgg_conv.layers :
layer.trainable = False
# Check the trainable status of the individual layers
for layer in vgg_conv.layers:
print(layer, layer.trainable)
# Vector input
model2 = Sequential()
model2.add(Res)
model2.add(Flatten())
model2.add(Dropout(0.05 ))
model2.add(Dense(512, activation='relu'))
model2.add(Dropout(0.15))
model2.add(Dense(512, activation='relu'))
model2.add(Dense(7, activation='softmax'))
model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics =(['accuracy'])
model2.summary()

Convert Convnet.js neural network model to Keras Tensorflow

I have a neural network model that is created in convnet.js that I have to define using Keras. Does anyone have an idea how can I do that?
neural = {
net : new convnetjs.Net(),
layer_defs : [
{type:'input', out_sx:4, out_sy:4, out_depth:1},
{type:'fc', num_neurons:25, activation:"regression"},
{type:'regression', num_neurons:5}
],
neuralDepth: 1
}
this is what I could do so far. I cannot ve sure if it's correct.
#---Build Model-----
model = models.Sequential()
# Input - Layer
model.add(layers.Dense(4, activation = "relu", input_shape=(4,)))
# Hidden - Layers
model.add(layers.Dense(25, activation = "relu"))
model.add(layers.Dense(5, activation = "relu"))
# Output- Layer
model.add(layers.Dense(1, activation = "linear"))
model.summary()
# Compile Model
model.compile(loss= "mean_squared_error" , optimizer="adam", metrics=["mean_squared_error"])
From the Convnet.js doc : "your last layer must be a loss layer ('softmax' or 'svm' for classification, or 'regression' for regression)."
Also : "Create a regression layer which takes a list of targets (arbitrary numbers, not necessarily a single discrete class label as in softmax/svm) and backprops the L2 Loss."
It's unclear. I suspect "regression" layer is just another layer of Dense (Fully connected) neurons. The 'regression' word probably refers to linear activity. So, no 'relu' this time ?
Anyway, it would probably look something like (no sequential mode):
from keras.layers import Dense
from keras.models import Model
my_input = Input(shape = (4, ))
x = Dense(25, activation='relu')(x)
x = Dense(4)(x)
my_model = Model(input=my_input, output=x, loss='mse', metrics='mse')
my_model.compile(optimizer=Adam(LEARNING_RATE), loss='binary_crossentropy', metrics=['mse'])
After reading a bit of the docs, the convnet.js seems like a nice project. It would be much better with somebody with neural network knowledge on board.

Increasing Accuracy of Image Classification Model

I'm trying to use do image classification on two different classes using the pre-trained Inception V3 model. I have a data set of around 1400 images which are roughly balanced. When I run my program I get results that are off at the first couple epochs. Is this normal when training the model?
epochs = 175
batch_size = 64
#include_top = false to accomodate new classes
base_model = keras.applications.InceptionV3(
weights ='imagenet',
include_top=False,
input_shape = (img_width,img_height,3))
#Classifier Model ontop of Convolutional Model
model_top = keras.models.Sequential()
model_top.add(keras.layers.GlobalAveragePooling2D(input_shape=base_model.output_shape[1:], data_format=None)),
model_top.add(keras.layers.Dense(350,activation='relu'))
model_top.add(keras.layers.Dropout(0.4))
model_top.add(keras.layers.Dense(1,activation = 'sigmoid'))
model = keras.models.Model(inputs = base_model.input, outputs = model_top(base_model.output))
#freeze the convolutional layers of InceptionV3
for layer in model.layers[:30]:
layer.trainable = False
#Compiling model using Adam Optimizer
model.compile(optimizer = keras.optimizers.Adam(
lr=0.000001,
beta_1=0.9,
beta_2=0.999,
epsilon=1e-08),
loss='binary_crossentropy',
metrics=['accuracy'])
With my current parameters I only get an accuracy of 89% with a test loss of 0.3 when testing on a separated set of images. Do I need to add more layers to my model to increase this accuracy?
There are several issues with your code...
To start with, your way to build model_top is quite unconventional (and IMHO quite messy as well); in such cases, the documentation examples are your best friend. So, start with replacing your model_top part with:
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(350, activation='relu')(x)
x = Dropout(0.4)(x)
predictions = Dense(1, activation='sigmoid')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
Notice that I have not changed your parameters of choice - you could certainly experiment with more units in the dense layer (the example in the docs uses 1024)...
Second, it is not clear why you choose to freeze only 30 layers of the InceptionV3, which has no less than 311 layers:
len(base_model.layers)
# 311
So, replace also this part with
for layer in base_model.layers:
layer.trainable = False
Third, your learning rate seems way too small; the Adam optimizer is supposed to work well enough out of the box with its default parameters, so I also suggest to compile your model simply as
model.compile(optimizer = keras.optimizers.Adam(),
loss='binary_crossentropy',
metrics=['accuracy'])

Keras LSTM predict two features from one input in Text classification?

I have X as text, with two different labels(columns) to train.
--input.csv--
content, category, rate
text test, 1, 3
new test, 2, 2
Here my input X will be content. I have converted it to sequence matrix. I need both category and rate to be trained along with content. I couldn't figure out how to pass this inside the layers.
def RNN():
num_categories = 2
num_rates = 3
inputs = Input(name='inputs',shape=[max_len])
layer = Embedding(max_words,150,input_length=max_len)(inputs)
layer = LSTM(100)(layer)
shared_layer = Dense(256, activation='relu', name='FC1')(layer)
shared_layer = Dropout(0.5)(shared_layer)
cat_out = Dense(num_categories, activation='softmax', name='cat_out')(shared_layer)
rate_out = Dense(num_rates, activation='softmax', name='rate_out')(shared_layer)
model = Model(inputs=inputs,outputs=[cat_out, rate_out])
return model
model = RNN()
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(sequences_matrix,[Y_train, Z_train])
Y_train contains only category. I want to add rate to the training. Does any one know?
I want two results. One should be about category and another is Rate.
Currently its returning only the label. Not with the rate. I don't know the way to add a layer for the Rate column.
You can achieve this with the functional API, just let the network have 2 outputs from the shared feature layer:
shared_layer = Dense(256, activation='relu', name='FC1')(layer)
shared_layer = Dropout(0.5)(shared_layer)
cat_out = Dense(num_categories, activation='softmax', name='cat_out')(shared_layer)
rate_out = Dense(num_rates, activation='softmax', name='rate_out')(shared_layer)
model = Model(inputs=inputs,outputs=[cat_out, rate_out])
return model
You will now train with two targets, y_train_cat and y_train_rate and give them as a list to model.fit(X_train, [y_train_cat, y_train_rate]) and the model will make two distinct predictions.
Have a look at the functional API documentation on how to handle multi-input / multi-output models.

Grid Search the number of hidden layers with keras

I am trying to optimize the hyperparameters of my NN using Keras and sklearn.
I am wrapping up with KerasClassifier (it´s a classification problem).
I am trying to optimize the number of hidden layers.
I can´t figure it out how to do it with keras (actually I am wondering how to set up the function create_model in order to maximize the number of hidden layers)
Could anyone please help me?
My code (just the important part):
## Import `Sequential` from `keras.models`
from keras.models import Sequential
# Import `Dense` from `keras.layers`
from keras.layers import Dense
def create_model(optimizer='adam', activation = 'sigmoid'):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(5, activation=activation, input_shape=(5,)))
# Add one hidden layer
model.add(Dense(8, activation=activation))
# Add an output layer
model.add(Dense(1, activation=activation))
#compile model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
my_classifier = KerasClassifier(build_fn=create_model, verbose=0)# Create
hyperparameter space
epochs = [5, 10]
batches = [5, 10, 100]
optimizers = ['rmsprop', 'adam']
activation1 = ['relu','sigmoid']
# Create grid search
grid = RandomizedSearchCV(estimator=my_classifier,
param_distributions=hyperparameters) #inserir param_distributions
# Fit grid search
grid_result = grid.fit(X_train, y_train)
# Create hyperparameter options
hyperparameters = dict(optimizer=optimizers, epochs=epochs,
batch_size=batches, activation=activation1)
# View hyperparameters of best neural network
grid_result.best_params_
If you want to make the number of hidden layers a hyperparameter you have to add it as parameter to your KerasClassifier build_fn like:
def create_model(optimizer='adam', activation = 'sigmoid', hidden_layers=1):
# Initialize the constructor
model = Sequential()
# Add an input layer
model.add(Dense(5, activation=activation, input_shape=(5,)))
for i in range(hidden_layers):
# Add one hidden layer
model.add(Dense(8, activation=activation))
# Add an output layer
model.add(Dense(1, activation=activation))
#compile model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=
['accuracy'])
return model
Then you will be able to optimize the number of hidden layers by adding it to the dictionary, which is passed to RandomizedSearchCV's param_distributions.
One more thing, you probably should separate the activation you use for the output layer from the other layers.
Different classes of activation functions are suitable for hidden layers and for output layers used in binary classification.

Categories