I am trying to build a siamese model with a rather complex base network. After building the base network, I use the following code to build my siamese network:
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

base_network = create_base_model(0.2)

img1 = Input(shape=(256, 256, 3))
img2 = Input(shape=(256, 256, 3))
text_input1 = Input(shape=(), dtype=tf.string, name='text_1')
text_input2 = Input(shape=(), dtype=tf.string, name='text_2')

output1 = base_network([img1, text_input1])
output2 = base_network([img2, text_input2])

distance = Lambda(euclidean_distance)([output1, output2])
siamese_model = Model([[img1, text_input1], [img2, text_input2]], distance)
The base network itself is a two-input model of the form
model = Model(inputs=[input1, input2], outputs=[z])
The issue is that after training the siamese network, I want to use the output of the base network as an embedding so I can run unsupervised learning algorithms on it. I also want to train the siamese network 10 epochs at a time, save it, and continue training later if needed. In this scenario I am not sure how to save or access the base network when I save and reload the siamese model. For example, I get the following plot for the siamese model, which requires 2 inputs per branch (my base model itself takes 2 inputs, so technically there are 4 inputs as shown in the diagram), but after training I only want to use the base model, which takes a single pair of inputs.
Can anyone give me advice on how to load the updated base model from the saved siamese model, or on a better approach to saving it in the first place?
Thanks very much.
if epoch % 5 == 0:
    path = f'/tmp/model{epoch}.h5'
    base_network.save(path)

base_network = tf.keras.models.load_model(path)
Isn't that okay?
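If you prefer to save the whole siamese model instead, the base network can still be pulled back out of it afterwards, because it is stored as a single nested layer. A minimal sketch, assuming you gave the base model an explicit name (e.g. name='base_network' inside create_base_model) and that euclidean_distance is importable when reloading; images and texts are placeholders for your data:

import tensorflow as tf

# Save the full siamese model (architecture + weights) at a checkpoint.
siamese_model.save('/tmp/siamese_checkpoint.h5')

# Reload it later; the Lambda layer may need its function passed explicitly.
siamese_model = tf.keras.models.load_model(
    '/tmp/siamese_checkpoint.h5',
    custom_objects={'euclidean_distance': euclidean_distance})

# The base network lives inside the siamese model as a nested Model,
# so it can be recovered by name and used on its own to produce embeddings.
base_network = siamese_model.get_layer('base_network')
embeddings = base_network.predict([images, texts])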
If a pretrained model such as ResNet101 was trained on the ImageNet dataset and I then change some layers inside it, can I still use the pretrained weights on a different dataset, say ABC?
Let's say this is a ResNet34 model.
It is pretrained on ImageNet and saved as a ResNet.pt file.
If I change some layers inside it, let's say I make it deeper by introducing some layers in conv4_x (check the image),
import torch
import torch.optim as optim

model = Resnet34()  # I have changed some layers inside this ResNet34()
optimizer = optim.Adam(model.parameters(), lr=0.00005)

# These are the pretrained weights of the ResNet before my changes
model.load_state_dict(torch.load('Resnet.pt')['state_dict'])
optimizer.load_state_dict(torch.load('Resnet.pt')['optimizer'])
Can I do this, or is there another method?
You can do anything you like - the question is: would it be better than training from scratch?
Here are a few issues you might encounter:
1. A mismatch between the weights saved in ResNet.pt (the trained weights of the original ResNet34) and the state_dict of your modified model.
You would probably need to manually make sure that the old weights are correctly assigned to the original layers and that only the new layers are left with their fresh initialization.
2. Initializing the weights of the new layer.
Since you are training a ResNet, you can take advantage of the residual connections and initialize the weights of the new layer such that it initially makes no contribution to the predicted value and only passes the input directly to the output via the residual link.
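A rough PyTorch sketch of both points, under the assumption that the new blocks have parameter names that do not exist in the old checkpoint (the exact module path of the new block below is only illustrative):

import torch

model = Resnet34()  # your modified architecture
checkpoint = torch.load('Resnet.pt')

# strict=False keeps every old weight whose name and shape still match
# and simply skips the rest, instead of raising an error.
missing, unexpected = model.load_state_dict(checkpoint['state_dict'], strict=False)
print('left at fresh init:', missing)

# Zero-init the last conv of each newly added residual block so that the block
# initially contributes nothing and the residual link passes the input through.
torch.nn.init.zeros_(model.layer3[3].conv2.weight)  # module path is an assumption

# The saved optimizer state no longer matches the new parameter list,
# so it is usually safer to start from a fresh optimizer.
optimizer = torch.optim.Adam(model.parameters(), lr=0.00005)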
I am training a CNN model for image classification using Keras. I am using the VGG19 model and a custom dataset with 200 classes and uniformly distributed 90,000 training images, 10,000 validation images and 10,000 test images. Even after 200 epochs of training, the accuracy stays at a constant 0.0050, and so does the loss, at 5.2988. I am using Kaggle's TPU instance to run this model.
How can I make the model more accurate? Can you suggest any different pretrained models for this purpose?
Your CNN model is behaving like a random model.
I know this because, with 200 classes, the probability of picking the correct class at random is 1/200 = 0.0050, which is exactly the accuracy you have.
This can happen when you build the model with the tensorflow/keras functional API instead of Sequential() and wire it up incorrectly.
Since you are using VGG19, if you are trying to do transfer learning, then maybe you have frozen the wrong layers.
If you are using the functional API, you have to do

model = Model(inputs=input_layer, outputs=output_layer)  # not required with Sequential()
print(model.layers)  # works with both the functional API and Sequential(); use it to check your layers

Then freeze the required layers with

model.layers[index_of_layer_to_freeze].trainable = False
If you are not freezing any layers, then try a lower learning rate, since VGG19 is very sensitive to the learning rate (0.00001 or less, depending on your setup).
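Here is a minimal sketch combining both suggestions (freeze the VGG19 backbone, train only a new head, and use a small learning rate); it assumes 224x224 RGB inputs and 200 classes, and the head sizes and hyperparameters are only illustrative:

import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG19

base = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional backbone, train only the new head

inputs = tf.keras.Input(shape=(224, 224, 3))
x = base(inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(512, activation='relu')(x)
outputs = layers.Dense(200, activation='softmax')(x)

model = Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])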
I am trying to build a face verification system using Keras and a ResNet50 model with VGGFace weights. The way I am trying to achieve this is by the following steps:
given two images, I first extract the faces using MTCNN and compute their embeddings
then I calculate the cosine distance between the two embedding vectors; the distance ranges from 0 to 1 (note that the lower the distance, the more likely the two faces belong to the same person)
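Roughly, that distance step looks like this (emb1 and emb2 are the two embedding vectors, and the threshold is tuned separately on a validation set):

from scipy.spatial.distance import cosine

distance = cosine(emb1, emb2)   # 0 for identical directions, larger means less similar
same_person = distance < threshold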
Using the pre-trained ResNet50 model I get fairly good results. But since the model was trained mostly on European faces and I want face verification for the Indian subcontinent, I cannot rely on it. I want to train it on my own dataset. I have 10,000 classes with each class containing 2 images. With image augmentation I can create 10-15 images per class from those two images.
Here is the sample code I am using for training:
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input

base_model = VGGFace(model='resnet50', include_top=False, input_shape=(224, 224, 3))
base_model.layers.pop()
base_model.summary()

for layer in base_model.layers:
    layer.trainable = False

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)  # we add dense layers so that the model can learn more complex functions and classify for better results
x = Dense(1024, activation='relu')(x)  # dense layer 2
x = Dense(512, activation='relu')(x)   # dense layer 3
preds = Dense(8322, activation='softmax')(x)  # final layer with softmax activation

model = Model(inputs=base_model.input, outputs=preds)
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)  # included in our dependencies
train_generator = train_datagen.flow_from_directory(
    '/Users/imac/Desktop/Fayed/Facematching/testenv/facenet/Dataset/train',  # this is where you specify the path to the main data folder
    target_size=(224, 224),
    color_mode='rgb',
    batch_size=32,
    class_mode='categorical',
    shuffle=True)

step_size_train = train_generator.n // train_generator.batch_size
model.fit_generator(generator=train_generator,
                    steps_per_epoch=step_size_train,
                    epochs=10)

model.save('directory')
As far as the code is concerned, my understanding is that I drop the last layer, add 4 new layers, train them, and save the model to a directory.
I then load the model using
from keras.models import load_model

model = load_model('directory of my saved model')
model.summary()
yhat = model.predict(samples)
I predict the embeddings of two images and then calculate the cosine distance. But the problem is that the predictions get worse with my trained model. For two images of the same person, the pre-trained model gives a distance of 0.3 whereas my trained model shows a distance of 1.0. Although the training loss decreases with each epoch and the accuracy increases, that does not reflect in my prediction output. I want to improve on the prediction results of the pre-trained model.
How can I achieve that with my own data?
N.B.: I am relatively new to machine learning and don't know a lot about model layers.
What I would suggest is to go with a triplet or siamese setup given this many classes. Use MTCNN to extract the faces and then a facenet-style architecture to generate 512-dimensional embedding vectors, then visualize them using a t-SNE plot. Every face will be assigned to a small embedding cluster. Go through this link for generating face embeddings with Keras: Link.
Then, try triplet semi-hard and hard loss on your dataset to cluster it into the 10,000 classes. It might help. Go through this detailed blog on triplet loss: Triplets. Code from some of the repositories to go through: Code.
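Here is a minimal sketch of the semi-hard triplet setup with tensorflow_addons. The backbone (MobileNetV2) is only a placeholder for your facenet-style encoder, and train_ds is assumed to yield batches of (image, integer identity label) pairs with several images per identity so that triplets can be mined:

import tensorflow as tf
import tensorflow_addons as tfa

# Any convolutional backbone works here; MobileNetV2 is just a stand-in.
backbone = tf.keras.applications.MobileNetV2(input_shape=(160, 160, 3),
                                             include_top=False, pooling='avg',
                                             weights='imagenet')
x = tf.keras.layers.Dense(512)(backbone.output)  # 512-dimensional embedding
x = tf.keras.layers.Lambda(lambda e: tf.math.l2_normalize(e, axis=1))(x)
embedding_model = tf.keras.Model(backbone.input, x)

# TripletSemiHardLoss mines semi-hard triplets within each batch using the labels.
embedding_model.compile(optimizer='adam',
                        loss=tfa.losses.TripletSemiHardLoss(margin=0.2))
embedding_model.fit(train_ds, epochs=10)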
I am working on a signature verification project. I have used the ICDAR 2011 Signature Dataset. Currently, I am pairing the encoding of an original image and a forgery to get a training sample (labelled 0). The encodings are obtained from a pre-trained VGG-16 convolutional neural network (with the fully connected layers removed). I have then replaced the fully connected part with a head having the following architecture:
Input size: 50177
1st hidden layer: 1000 units (activation: sigmoid, dropout: 0.5)
2nd hidden layer: 500 units (activation: sigmoid, dropout: 0.2)
Output layer: 1 unit (activation: sigmoid)
The issue is that although the training set accuracy increases, the validation accuracy fluctuates randomly. It performs very badly on the test set.
I have tried different architectures but nothing seems to work.
So is there any other way to prepare the data, or should I continue trying different architectures?
I don't think that using a VGG16 model for feature extraction is the right way to go for your task. You are using a model that was trained on relatively complex RGB images and then trying to use it for a dataset that basically consists of grayscale images of edges (signatures). And you are using the last embedding layer, which contains the most complex and specialized representation of the ImageNet dataset (the original training dataset for the VGG model).
The features you get have no real meaning and that is probably why the training accuracy and validation accuracy are not correlated at all when you try to fine-tune the model.
My suggestion is to either use an earlier layer of the VGG16 for feature extraction (I'm talking somewhere around layer no.5-6), or better yet, use a simpler model that was trained on a more similar dataset, like the MNIST dataset.
The MNIST dataset consists of handwritten digits so it is considerably more similar to your task and any model trained on it will act as a much better feature extractor for your task.
You can pick any model from the following list of benchmark results on the MNIST and use it as a feature extractor:
MNIST Benchmark Results
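Here is a minimal sketch of the first option, taking features from an earlier VGG16 block; the choice of block3_pool and the signature_batch placeholder are only illustrative:

import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

vgg = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Cut the network at an earlier block so the features stay closer to simple edges/strokes.
feature_extractor = tf.keras.Model(vgg.input, vgg.get_layer('block3_pool').output)

x = preprocess_input(signature_batch)           # grayscale signatures stacked to 3 channels
features = feature_extractor.predict(x)
features = features.reshape(len(features), -1)  # flatten before the dense verification head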
I'm trying to implement an adversarial loss in keras.
The model consists of two networks, one auto-encoder (the target model) and one discriminator. The two models share the encoder.
I created the adversarial loss of the auto-encoder by using a Keras variable:
def get_adv_loss(d_loss):
    def loss(y_true, y_pred):
        return some_loss(y_true, y_pred) - d_loss
    return loss

discriminator_loss = K.variable(0.0)
L = get_adv_loss(discriminator_loss)
autoencoder.compile(..., loss=L)
and during training I interleave train_on_batch calls of the discriminator and the autoencoder to update discriminator_loss:
d_loss = discriminator.train_on_batch(x, y_domain)
discriminator_loss.assign(d_loss)
a_loss, ... = self.segmenter.train_on_batch(x, y_target)
However, I found out that the value of these variables is frozen when the model is compiled. I tried to recompile the model during training but that raises the error
Node 'IsVariableInitialized_13644': Unknown input node
'training_12/Adam/Variable'
which I guess means I can't recompile during training? Any suggestions on how I can inject the discriminator loss into the autoencoder?
Keras models support multiple outputs. So just include your discriminator in your Keras model and freeze the discriminator layers if the discriminator should not be trained.
The next question is how to combine the autoencoder loss and the discriminator loss. Luckily, Keras model.compile supports loss weights. If the autoencoder is your first output and the discriminator is your second, you could do something like loss_weights=[1, -1]. So a better discriminator is worse for the autoencoder.
Edit: Here is an example of how to implement an adversarial network:
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

# Build your architecture
auto_encoder_input = Input((5,))
auto_encoder_net = Dense(10)(auto_encoder_input)
auto_encoder_output = Dense(5)(auto_encoder_net)
discriminator_net = Dense(20)(auto_encoder_output)
discriminator_output = Dense(5)(discriminator_net)

# Define the outputs of your models
train_autoencoder_model = Model(auto_encoder_input,
                                [auto_encoder_output, discriminator_output])
train_discriminator_model = Model(auto_encoder_input, discriminator_output)

# Compile the models (compile the first model and then change the trainable attribute for the second)
for layer_index, layer in enumerate(train_autoencoder_model.layers):
    layer.trainable = layer_index < 3
train_autoencoder_model.compile('Adam', loss=['mse', 'mse'], loss_weights=[1, -1])

for layer_index, layer in enumerate(train_discriminator_model.layers):
    layer.trainable = layer_index >= 3
train_discriminator_model.compile('Adam', loss='mse')

# A simple example of what training can look like
for i in range(10):
    auto_input = np.random.sample((10, 5))
    discrimi_output = np.random.sample((10, 5))
    train_discriminator_model.fit(auto_input, discrimi_output, steps_per_epoch=5, epochs=1)
    train_autoencoder_model.fit(auto_input, [auto_input, discrimi_output], steps_per_epoch=1, epochs=1)
As you can see, there is not much magic behind building an adversarial model with Keras.
Unless you decide to dig deep into the Keras source code, I don't think you can do this easily. Before writing your own adversarial module, you should check the existing work carefully. As far as I know, keras-adversarial is still used by many people. Of course, it only supports old Keras versions, e.g. 2.0.8.
Several other things:
be careful when you freeze your model weights. If you first compile a model and then freeze some weights, those weights are still trainable, because the train function is generated during compilation. So you should freeze the weights first and then compile (see the sketch after this list).
keras-adversarial does this job in a more elegant way: instead of making two models that share weights but freeze them in different ways, it creates two train functions, one for each player.
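A minimal sketch of the freeze-first-then-compile point, with two compiled views over the same shared layers, one per player (the layer names, sizes and losses are only illustrative):

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inp = Input((5,))
code = Dense(3, name='encoder')(inp)
recon = Dense(5, name='decoder')(code)
score = Dense(1, activation='sigmoid', name='disc')(recon)

autoencoder_player = Model(inp, [recon, score])
discriminator_player = Model(inp, score)

# Player 1: updates encoder/decoder; the discriminator is frozen BEFORE compiling.
autoencoder_player.get_layer('disc').trainable = False
autoencoder_player.compile('adam', loss=['mse', 'binary_crossentropy'],
                           loss_weights=[1.0, -1.0])

# Player 2: updates only the discriminator; everything else is frozen BEFORE compiling.
discriminator_player.get_layer('encoder').trainable = False
discriminator_player.get_layer('decoder').trainable = False
discriminator_player.get_layer('disc').trainable = True
discriminator_player.compile('adam', loss='binary_crossentropy')

# Each compiled model keeps the trainable flags it saw at compile time,
# which is exactly why freezing has to happen before compile.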