Train with custom weights - python

Currently I am using transfer learning to train a neural network.
I am using the ResNet50 pretrained model provided by keras.
from keras.applications.resnet50 import ResNet50
from keras.layers import Activation, BatchNormalization, Dense, Dropout, Flatten
from keras.models import Model

base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# function to finetune model
def build_finetune_model(base_model, dropout, fc_layers, num_classes):
    # Freeze the pretrained layers
    for layer in base_model.layers:
        layer.trainable = False
    x = base_model.output
    x = Flatten()(x)
    for fc in fc_layers:
        # New FC layer, random init
        x = Dense(fc, use_bias=False)(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = Dropout(dropout)(x)
    # New softmax layer
    x = Dense(num_classes, use_bias=False)(x)
    x = BatchNormalization()(x)
    predictions = Activation('softmax')(x)
    finetune_model = Model(inputs=base_model.input, outputs=predictions)
    return finetune_model

FC_LAYERS = [1024, 512]
dropout = 0.5
model = build_finetune_model(base_model, dropout=dropout, fc_layers=FC_LAYERS, num_classes=len(categories))
Now I want to see whether using Resnet50 1by2 (among others) would increase my accuracy. These models are provided as Caffe models. I used the Caffe weight converter to convert them to a Keras h5 file.
The problem is that these files do not contain a model that can be trained, only the weights.
How can I use the weights to train a model in Keras?

If you only have saved weights, you can load them only into a network with the same architecture. Assuming the weights you have match the architecture of the Keras Applications model, you can:
base_model = ResNet50(...,weights=None)
base_model.load_weights('my_weights_file.h5')
# Freeze the loaded layers so that only newly added layers are trained
for layer in base_model.layers:
    layer.trainable = False
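The "same architecture" condition can be checked before loading. The following is a minimal sketch in plain Python rather than Keras code: it assumes you have already collected the kernel shape of each layer into dictionaries. The helper name find_shape_mismatches and the example shapes are hypothetical, and the halved filter count for a 1by2 variant is only an assumption for illustration.

```python
def find_shape_mismatches(model_shapes, file_shapes):
    """Return {layer_name: (expected, found)} for layers that cannot be loaded.

    Both arguments map a layer name to the shape tuple of its kernel.
    A layer missing from the weights file is reported with found=None.
    """
    mismatches = {}
    for name, expected in model_shapes.items():
        found = file_shapes.get(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Illustrative shapes only: (kernel_h, kernel_w, in_channels, out_channels).
keras_resnet50 = {"conv1": (7, 7, 3, 64)}
converted_file = {"conv1": (7, 7, 3, 32)}  # assumed: half as many filters

print(find_shape_mismatches(keras_resnet50, converted_file))
print(find_shape_mismatches(keras_resnet50, keras_resnet50))
```

A non-empty result would suggest that the weights belong to a different network definition, in which case the stock ResNet50 cannot load them and a model with matching layer sizes has to be built first.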

Related

Changing input layer size in Keras pretrained model

I'm using the Inception model in Keras with the pretrained ImageNet weights.
The problem is that the default input size for this model is 299x299 as per the Keras documentation, while my images are 230x350, and I don't want to resize them because that would distort them. So I am trying to find a way to change the input layer size.
Below is what I have tried so far. However, I doubt that the ImageNet weights are being preserved, as I think the architecture changes when I change the input size.
Any ideas?
input = Input(shape=(230, 350, 3), name='image_input')
base_model = InceptionV3(weights='imagenet', include_top=False, input_tensor=input)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(64, activation='relu')(x)
predictions = Dense(1, activation='sigmoid')(x)
model = Model(inputs=input, outputs=predictions)
for layer in base_model.layers:
    layer.trainable = True
model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=0.0001),
              metrics=['accuracy'])
Without its top, Inception V3 is a fully convolutional model, and convolution kernels do not depend on the spatial size of the input, so the pretrained weights still apply. You use global pooling on top of the convolutional encoder, so a moderate deviation from 299x299 should not be a big deal. If your code runs without error messages, it is fine to use it like this.
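To see why global pooling makes the head independent of the input size, here is a minimal plain-Python sketch (not Keras code). The helper names and the toy feature-map sizes are made up for illustration and are not the real Inception V3 output shapes.

```python
def global_average_pool(feature_map):
    """Average a feature map given as nested lists [height][width][channels]."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [total / (height * width) for total in totals]


def constant_map(height, width, channels, value=1.0):
    """Build a toy feature map filled with one value."""
    return [[[value] * channels for _ in range(width)] for _ in range(height)]


# Two feature maps with different spatial sizes but the same channel count.
small = global_average_pool(constant_map(5, 9, 4))
large = global_average_pool(constant_map(8, 8, 4))
print(len(small), len(large))  # one entry per channel in both cases
```

Because the pooled vector has one entry per channel, the Dense layers that follow see the same input length whatever the image size was.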

Combining CNN with LSTM using Tensorflow Keras

I'm using a pre-trained ResNet-50 model and want to feed the outputs of the penultimate layer to an LSTM network. Here is my sample code containing only the CNN (ResNet-50):
N = NUMBER_OF_CLASSES
#img_size = (224,224,3)....same as that of ImageNet
base_model = ResNet50(include_top=False, weights='imagenet',pooling=None)
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(1024, activation='relu')(x)
model = Model(inputs=base_model.input, outputs=predictions)
Next, I want to feed it to a LSTM network, as follows...
final_model = Sequential()
final_model.add((model))
final_model.add(LSTM(64, return_sequences=True, stateful=True))
final_model.add(Dense(N, activation='softmax'))
But I'm confused about how to reshape the output into the LSTM input. My original input to the CNN is (224*224*3).
Also, should I use TimeDistributed?
Any kind of help is appreciated.
Adding an LSTM after a CNN does not make a lot of sense here, as an LSTM is mostly used for temporal/sequence information, whereas your data seems to be only spatial. However, if you still want to use it, just use
x = Reshape((1024,1))(x)
This converts the output to a sequence of 1024 timesteps with 1 feature each.
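As a rough picture of what that reshape does to a single sample (ignoring the batch axis), here is a plain-Python sketch; to_sequence is a hypothetical helper, not a Keras function, and the input values are placeholders.

```python
def to_sequence(features):
    """Turn a flat vector of length n into n timesteps with 1 feature each."""
    return [[value] for value in features]


features = [0.1 * i for i in range(1024)]  # stand-in for the Dense(1024) output
sequence = to_sequence(features)
print(len(sequence), len(sequence[0]))
```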
If you are working with spatio-temporal data, wrap the ResNet model in TimeDistributed, and then you can use ConvLSTM2D.
Example of using a pretrained network with an LSTM:
inputs = Input(shape=(config.N_FRAMES_IN_SEQUENCE, config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
cnn = VGG16(include_top=False, weights='imagenet', input_shape=(config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
x = TimeDistributed(cnn)(inputs)
x = TimeDistributed(Flatten())(x)
x = LSTM(256)(x)
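Conceptually, TimeDistributed applies the same network to every frame and stacks the results into a sequence that the LSTM can consume. A minimal plain-Python sketch of that idea follows; time_distributed and toy_cnn are hypothetical stand-ins, not Keras APIs.

```python
def time_distributed(layer_fn, sequence):
    """Apply the same function independently to every timestep."""
    return [layer_fn(frame) for frame in sequence]


def toy_cnn(frame):
    """Stand-in for a CNN feature extractor: returns [mean, max] of a frame."""
    pixels = [p for row in frame for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


# A clip of 3 frames, each a 2x2 grayscale image.
clip = [
    [[0, 1], [2, 3]],
    [[4, 4], [4, 4]],
    [[1, 0], [0, 1]],
]
features = time_distributed(toy_cnn, clip)
print(features)  # one feature vector per frame, in frame order
```

The result has shape (frames, features), which is the layout a recurrent layer expects for one sample.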

Training pretrained model keras_vggface produces very high loss after adding batch normalization

I am trying to train my model using the pretrained Keras VGGFace on a dataset (all faces) of 1774 training images and 313 validation images in 12 classes.
I recently added batch normalization and dropout to my script because it was overfitting (my training accuracy was around 99 and my validation accuracy around 80). This is my code:
train_data_path = 'dataset_cfps/train'
validation_data_path = 'dataset_cfps/validation'
# Parameters
img_width, img_height = 224, 224
vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
last_layer = vggface.get_layer('avg_pool').output
x = Flatten(name='flatten')(last_layer)
x1 = Dense(12, activation='sigmoid', name='classifier')(x)
x2 = BatchNormalization()(x1)
x3 = Dropout(0.5)(x2)
custom_vgg_model = Model(vggface.input, x3)
# Create the model
model = models.Sequential()
# Add the convolutional base model
model.add(custom_vgg_model)
When I try to train the model, the loss goes above 10, which is not supposed to happen. Are the batch normalization and dropout layers added in the correct place?
I tried to adapt the code from their GitHub repo.
In your code, batch normalization and dropout are applied after the sigmoid classification layer. Dropout causes some of the sigmoid activations (the probabilities output by the Dense layer) to be discarded, which is why predictions can be wrong when you look for the class with the maximum probability.
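The effect can be sketched in plain Python. This is an illustration of training-time (inverted) dropout under the assumption that dropped units are zeroed and the rest are scaled by 1 / (1 - rate); inverted_dropout is a hypothetical helper and the input values are made up.

```python
import random


def inverted_dropout(values, rate, rng):
    """Training-time dropout: zero each value with probability `rate`
    and scale the survivors by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
sigmoid_outputs = [0.9, 0.8, 0.7, 0.6]
dropped = inverted_dropout(sigmoid_outputs, rate=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 or twice the original value
```

With a rate of 0.5, every value that reaches the loss is either zero or doubled, so the model output no longer behaves like a set of probabilities.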
Please try modifying your code as follows to include two Dense layers. As a rule of thumb, the first (hidden) Dense layer can have a dimension of around half the sum of the input and output dimensions. The final Dense layer predicts the output class probabilities with a sigmoid activation. The regularization layers go between the two Dense layers.
train_data_path = 'dataset_cfps/train'
validation_data_path = 'dataset_cfps/validation'
# Parameters
img_width, img_height = 224, 224
vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
last_layer = vggface.get_layer('avg_pool').output
x = Flatten(name='flatten')(last_layer)
x1 = Dense(1024, activation='relu')(x)
x2 = BatchNormalization()(x1)
x3 = Dropout(0.5)(x2)
x4 = Dense(12, activation='sigmoid', name='classifier')(x3)
custom_vgg_model = Model(vggface.input, x4)
# Create the model
model = models.Sequential()
# Add the convolutional base model
model.add(custom_vgg_model)
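The sizing rule of thumb from this answer is simple arithmetic. The sketch below assumes that the flattened avg_pool output has 2048 features, which is not stated in the thread; suggested_hidden_units is a hypothetical helper.

```python
def suggested_hidden_units(input_dim, output_dim):
    """Rule of thumb from the answer: about half of (input + output)."""
    return (input_dim + output_dim) // 2


# Assumption: 2048 flattened input features, 12 output classes.
print(suggested_hidden_units(2048, 12))
```

Under that assumption the rule gives 1030, which is close to the 1024 units used in the modified code.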

keras fine-tuning a pre-trained model does not change weights when using multi_gpu_model and layers.trainable = True

I'm loading the pretrained VGG16 model, adding a couple of dense layers, and fine-tuning the last 5 layers of the base VGG16. I'm training my model on multiple GPUs. I saved the model before and after training. The weights are the same in spite of having layers.trainable = True.
Please help!
Here's the code:
from keras import applications
from keras import Model
<import other relevant Keras layers, etc.>
model = applications.VGG16(weights = "imagenet", include_top = False, input_shape = (224,224,3))
model.save('./before_training')
for layer in model.layers:
    layer.trainable = False
for layer in model.layers[-5:]:
    layer.trainable = True
x = model.output
x = Flatten()(x)
x = Dense(1024, activation = "relu")(x)
x = Dropout(0.5)(x)
x = Dense(1024, activation = "relu")(x)
predictions = Dense(2, activation = "softmax")(x)
model_final = Model(input = model.input, output = predictions)
from keras.utils import multi_gpu_model
parallel_model = multi_gpu_model(model_final, gpus = 4)
parallel_model.compile(loss = "categorical_crossentropy" ..... )
datagen = ImageDataGenerator(....)
early = EarlyStopping(...)
train_generator = datagen.flow_from_directory(train_data_dir,...)
validation_generator = datagen.flow_from_directory(test_data_dir,...)
parallel_model.fit_generator(train_generator, validation_data = validation_generator,...)
model_final.save('./after_training')
The weights in the before_training and after_training models are the same, which is not what I expected!

Keras - All layer names should be unique

I combine two VGG nets in Keras for a classification task. When I run the program, it shows an error:
RuntimeError: The name "predictions" is used 2 times in the model. All layer names should be unique.
I was confused because I only use the prediction layer once in my code:
from keras.layers import Dense
import keras
from keras.models import Model
model1 = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
                                        input_tensor=None, input_shape=None,
                                        pooling=None,
                                        classes=1000)
model1.layers.pop()
model2 = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
                                        input_tensor=None, input_shape=None,
                                        pooling=None,
                                        classes=1000)
model2.layers.pop()
for layer in model2.layers:
    layer.name = layer.name + str("two")
model1.summary()
model2.summary()
featureLayer1 = model1.output
featureLayer2 = model2.output
combineFeatureLayer = keras.layers.concatenate([featureLayer1, featureLayer2])
prediction = Dense(1, activation='sigmoid', name='main_output')(combineFeatureLayer)
model = Model(inputs=[model1.input, model2.input], outputs= prediction)
model.summary()
Thanks to @putonspectacles for the help. I followed his instructions and found something interesting. If you use model2.layers.pop() and combine the last layers of the two models using "model.layers.keras.layers.concatenate([model1.output, model2.output])", the popped layers still show up in model.summary(), even though they no longer exist in the structure. Instead, you can use model.layers.keras.layers.concatenate([model1.layers[-1].output, model2.layers[-1].output]). It looks tricky, but it works. I think the summary log and the actual structure are out of sync.
First, based on the code you posted, none of the layers you define yourself is named 'predictions', so this error has nothing to do with your Dense layer prediction, i.e.:
prediction = Dense(1, activation='sigmoid',
                   name='main_output')(combineFeatureLayer)
The VGG16 model has a Dense layer with name predictions. In particular this line:
x = Dense(classes, activation='softmax', name='predictions')(x)
And since you're using two of these models you have layers with duplicate names.
What you could do is rename the layer in the second model to something other than predictions, maybe predictions_1, like so:
model2 = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
                                        input_tensor=None, input_shape=None,
                                        pooling=None,
                                        classes=1000)
# now change the name of the layer in place.
model2.get_layer(name='predictions').name='predictions_1'
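To find out which names clash before building the combined model, the layer names of both models can be compared. The following plain-Python sketch works on lists of names; duplicate_names and with_suffix are hypothetical helpers, and the name lists are a short illustrative subset rather than the full VGG16 layer list. In Keras the lists could be collected with something like [layer.name for layer in model.layers].

```python
from collections import Counter


def duplicate_names(*name_lists):
    """Return the sorted names that occur more than once across all lists."""
    counts = Counter(name for names in name_lists for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


def with_suffix(names, suffix):
    """Return a copy of the names with a suffix appended to each."""
    return [name + suffix for name in names]


# Illustrative subset of layer names for two instances of the same network.
model1_names = ["input_1", "block1_conv1", "fc1", "predictions"]
model2_names = ["input_2", "block1_conv1", "fc1", "predictions"]

print(duplicate_names(model1_names, model2_names))
print(duplicate_names(model1_names, with_suffix(model2_names, "_2")))
```

The first call lists the clashing names; after suffixing every name of the second model, the second call returns an empty list.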
You can change a layer's name in keras; don't use 'tensorflow.python.keras'.
Here is my sample code:
from keras.layers import Dense, concatenate
from keras.models import Model
from keras.applications import vgg16
num_classes = 10
model = vgg16.VGG16(include_top=False, weights='imagenet', input_tensor=None, input_shape=(64,64,3), pooling='avg')
inp = model.input
out = model.output
model2 = vgg16.VGG16(include_top=False,weights='imagenet', input_tensor=None, input_shape=(64,64,3), pooling='avg')
# Rename every layer of the second model so that no name is used twice
for layer in model2.layers:
    layer.name = layer.name + str("_2")
inp2 = model2.input
out2 = model2.output
merged = concatenate([out, out2])
merged = Dense(1024, activation='relu')(merged)
merged = Dense(num_classes, activation='softmax')(merged)
model_fusion = Model([inp, inp2], merged)
model_fusion.summary()
Example:
# Network for affine transform estimation
affine_transform_estimator = MobileNet(
    input_tensor=None,
    input_shape=(config.IMAGE_H // 2, config.IMAGE_W // 2, config.N_CHANNELS),
    alpha=1.0,
    depth_multiplier=1,
    include_top=False,
    weights='imagenet'
)
affine_transform_estimator.name = 'affine_transform_estimator'
for layer in affine_transform_estimator.layers:
    layer.name = layer.name + str("_1")
# Network for landmarks regression
landmarks_regressor = MobileNet(
    input_tensor=None,
    input_shape=(config.IMAGE_H // 2, config.IMAGE_W // 2, config.N_CHANNELS),
    alpha=1.0,
    depth_multiplier=1,
    include_top=False,
    weights='imagenet'
)
landmarks_regressor.name = 'landmarks_regressor'
for layer in landmarks_regressor.layers:
    layer.name = layer.name + str("_2")
input_image = Input(shape=(config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
downsampled_image = MaxPooling2D(pool_size=(2,2))(input_image)
x1 = affine_transform_estimator(downsampled_image)
x2 = landmarks_regressor(downsampled_image)
x3 = add([x1,x2])
model = Model(inputs=input_image, outputs=x3)
optimizer = Adadelta()
model.compile(optimizer=optimizer, loss=mae_loss_masked)
