How to add a preprocessing layer to a pretrained caffe model? - python

I have a pre-trained image classification model saved in Caffe. The model expects grayscale (one-channel) images, but I want to use it in a tool that only feeds RGB (three-channel) input to the model. It is not possible to change the way this tool provides images, so I thought of adding a layer before the input layer that transforms the input to one channel only. Is that possible in Caffe, and how?
I'm looking for a solution that doesn't require defining new Caffe layers, if possible.
Note that I have the ".prototxt" and the ".weights" files of the model.
I previously did a similar thing in TensorFlow, but I don't know whether this is possible in Caffe, and I didn't find much material online.

You can add a Python layer to do it for you.
What is a Python layer?
An example of such a layer can be found here.
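For instance, here is a minimal sketch of such a layer (the class name, module name, and the BT.601 luminance weights below are illustrative assumptions, not taken from the question):

import caffe

class RGBToGray(caffe.Layer):
    """Converts a 3-channel RGB bottom blob into a 1-channel grayscale top blob."""
    def setup(self, bottom, top):
        if len(bottom) != 1 or bottom[0].channels != 3:
            raise Exception("Expected exactly one 3-channel (RGB) bottom blob.")
    def reshape(self, bottom, top):
        n, _, h, w = bottom[0].data.shape
        top[0].reshape(n, 1, h, w)
    def forward(self, bottom, top):
        r, g, b = bottom[0].data[:, 0], bottom[0].data[:, 1], bottom[0].data[:, 2]
        # ITU-R BT.601 luminance; adjust if your model was trained on a different conversion.
        top[0].data[:, 0] = 0.299 * r + 0.587 * g + 0.114 * b
    def backward(self, top, propagate_down, bottom):
        pass  # inference only, no gradient needed

You would then insert it in the ".prototxt" between the data layer and the first original layer, and point that layer's bottom at "gray" instead of "data":

layer {
  name: "rgb2gray"
  type: "Python"
  bottom: "data"
  top: "gray"
  python_param {
    module: "rgb_to_gray_layer"  # the .py file above, which must be on your PYTHONPATH
    layer: "RGBToGray"
  }
}

Note that Caffe must be compiled with WITH_PYTHON_LAYER := 1 for Python layers to be available.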

Related

Using a siamese model to obtain an embedding by cutting off one half?

I used Keras to build a Siamese network, following the format of one of the questions posted (please see the code sample here). Briefly, I built a Siamese network on top of a pretrained EfficientNet, so that each copy of the network produces a dense layer, and the two are then combined into an L1-similarity output.
However, at prediction time I only want the dense output of one of the branches (as an embedding). I plan to run a variety of unsupervised learning methods (including KNN) on these outputs.
During prediction, how can I ask Keras to run only one copy of my network graph on a single input? Can I extract only part of the NN graph? I don't want to always have to generate pairs of images, or incur the cost of running two images when I only need one output.
Let me just make sure that I understand your question and context: you are using a Siamese network (EfficientNet) and you want to generate embeddings for your input images.
From the image below, you only want to save the image encodings for one of the ConvNets?
If that is the case, I don't really see the point of building a Siamese network at all. Just go for a single ConvNet (using EfficientNet), because the Siamese model will always ask you for image pairs.
If you go for only a single ConvNet model, and you identify the layer from which you want the embeddings, then you can use tf.keras.backend.function like this:
get_layer_output = tf.keras.backend.function([fine_tuned_model.layers[0].input],
                                             [fine_tuned_model.layers[-2].output])
Then, at prediction time, you can call it like this:
features = get_layer_output([x])[0]
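An equivalent approach (a sketch, reusing the fine_tuned_model from above) is to wrap the sub-graph in its own tf.keras.Model, which then works with the regular predict API on single images:

import tensorflow as tf

# Standalone embedding model: from the network input up to the
# second-to-last layer, sharing the already-trained weights.
embedding_model = tf.keras.Model(inputs=fine_tuned_model.layers[0].input,
                                 outputs=fine_tuned_model.layers[-2].output)

features = embedding_model.predict(x)  # x: a batch of single images, not pairs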

Which layer of VGG19 should I use to extract feature

I want image features so I can compute similarity between images. We can easily get features using a pre-trained VGG19 model in TensorFlow, but VGG19 has many layers, and I don't know which layer I should use. Which layer's output is appropriate for this problem?
# I think this is how to correctly extract features
model = tf.keras.applications.VGG19(include_top=True,
                                    weights='imagenet')
inputs = model.input
outputs = model.layers[-2].output  # fc2, the second-to-last layer
extract_model = tf.keras.Model(inputs, outputs)
My intuition is that the closer a layer is to the final output, the more powerful its features. But some tutorials say to use include_top=False to extract features (e.g. Image Captioning with Attention in TensorFlow).
So I don't know which layer I should use. Please help me out here.
include_top=False may be suggested because the last 3 layers (for that specific model) are fully connected layers, which typically do not make good feature vectors. If the model already outputs a usable feature vector directly, you don't need that option.
Most people use the last layer for transfer learning, but it may depend on your application. For example, Gatys et al. show that the first few layers of VGG are sensitive to the style of the image while later layers are sensitive to its content.
I would probably try all of them in a hyperparameter search and see which gives the best performance. If by image similarity you mean the similarity of objects contained inside, I would probably start with the last layer.
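For completeness, a short sketch of the include_top=False variant mentioned above (the pooling choice is an illustrative assumption): dropping the fully connected head and global-average-pooling the last convolutional feature map yields a fixed-size feature vector per image.

import tensorflow as tf

# Convolutional features only; pooling='avg' turns the final
# (7, 7, 512) feature map into a single 512-dim vector per image.
feature_extractor = tf.keras.applications.VGG19(include_top=False,
                                                weights='imagenet',
                                                pooling='avg')
# images: a batch preprocessed with tf.keras.applications.vgg19.preprocess_input
features = feature_extractor.predict(images)  # shape: (n_images, 512)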

In a CNN, how to view the weights of multiple filters?

I am trying to get a better understanding of CNNs, so I am using Keras to build a small CNN and want to go through the calculations by hand.
I downloaded the images from the GTSRB database, then used the PIL library to convert the image set to grayscale and resize the images to 6x6.
The code below shows the CNN I've created.
It includes 1 convolution layer (with 2 filters of size 2x2), 1 max pooling layer (2x2), a flattening layer and a dense layer at the end.
model = keras.models.Sequential()
# Convolution: 2 filters of size 2x2 over the 6x6 single-channel input
model.add(keras.layers.Conv2D(2, kernel_size=(2,2), activation='relu', input_shape=(6,6,1)))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Flatten())
# One output unit per traffic-sign class
model.add(keras.layers.Dense(len(sign_label_list), activation='relu'))
I then trained the network and saved the model and weights.
I read online that to inspect the weights (stored as an .h5 file), I need a viewer, so I downloaded the HDFView tool.
Now I am trying to view the weights for each of the filters, but I can only see the weights of 1 of the filters.
Filter weights
How would I get the weights of both the filters?
Does anyone know if there is a way to view the weights through python?
Originally, I wanted to test with only one filter, but I got nan when I viewed the weights.
Looking through the documentation and the Keras FAQ (found here), the suggested way to view the weights of a particular layer is:
weights,biases = model.layers[0].get_weights()
I then printed the weights to the console using print(weights) and this displayed the values of all filters.
However, I still had trouble viewing the weights of multiple filters using the HDFView tool.
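If you want each filter separately in Python (a small sketch, assuming the model defined in the question), you can slice the weight tensor along its last axis, which indexes the filters:

weights, biases = model.layers[0].get_weights()
print(weights.shape)  # (2, 2, 1, 2): kernel_h, kernel_w, in_channels, n_filters
for i in range(weights.shape[-1]):
    print("Filter", i)
    print(weights[:, :, 0, i])  # the 2x2 kernel of filter i

If the printed values are nan, the problem lies in training itself (e.g. a diverging loss), not in how the weights are viewed.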

Tensorflow: How to load a pre-trained ResNet model

I want to use a pre-trained ResNet model which Tensorflow provides here.
First I downloaded the code (resnet_v1.py) to reconstruct the model's graph here. The model's weights (resnet_v1_50.ckpt) can be found on the same page here.
The model can be tested using the following script (resnet_v1_test.py) from here. However, I have trouble extracting the right information from resnet_v1_test.py; I don't understand much of what happens in this script. Which functions are essential for passing a random image through the network? How can I access the weights and activations for further work?
What are the next steps from here? I would appreciate any help!
TL;DR: How can I use the resnet_v1_test.py script to perform classification and access weights and activations?
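A rough sketch of the usual pattern with the slim resnet_v1 module (assuming TF 1.x; the placeholder shape, class count, and variable name below are illustrative): build the graph with resnet_v1_50, restore the checkpoint, then fetch whichever tensors you need.

import numpy as np
import tensorflow as tf
from nets import resnet_v1  # the downloaded resnet_v1.py; adjust the import to its location
slim = tf.contrib.slim

images = tf.placeholder(tf.float32, [None, 224, 224, 3])
with slim.arg_scope(resnet_v1.resnet_arg_scope()):
    # logits: class scores; end_points: dict of intermediate activations
    logits, end_points = resnet_v1.resnet_v1_50(images, num_classes=1000,
                                                is_training=False)

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'resnet_v1_50.ckpt')
    img = np.random.rand(1, 224, 224, 3).astype(np.float32)
    scores, activations = sess.run([logits, end_points], {images: img})
    # Weights are ordinary graph tensors and can be fetched the same way:
    conv1_weights = sess.run(tf.get_default_graph()
                             .get_tensor_by_name('resnet_v1_50/conv1/weights:0'))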

Access lower layer output in higher layer using CNTK and transfer learning

I am searching for a way to forward a lower layer's output to a higher layer with a loaded VGG16 model using CNTK.
The background of my problem is:
I reimplemented some parts of Fully Convolutional Networks for Semantic Segmentation, but then I ran into some problems. Starting with this example, I first replaced the fully connected layers with fully convolutional ones and split the sequence in the model-definition part into chunks, so that I could simply access pool3 and pool4 for later usage in e.g. Convolution2D((1,1), num_classes, name='score_pool4')(pool4). This works fine, but after building the model I noticed that I needed to implement my own way of reading batches, because the built-in reader does not support 2D labels right now. So I simply read the images using OpenCV and replaced training_session(...).train() with a for loop and trainer.train_minibatch({model['features']: my_loaded_features, model['labels']: my_2D_labels}). This works well, but because of the removed training_session part I don't know where I can apply the existing VGG16 weights.
My problem is:
I looked at transfer-learning examples where models are loaded with C.load_model(...) and the needed layers are then cloned, but now I am wondering how I can access cloned_layers->pool4 (in the middle of the loaded model) if I also want to use it in deeper layers.
I tried Convolution2D((1,1), num_classes, name='score_pool4')(cloned_layers.find_by_name('pool4')), but I ended up with error messages during learner initialization because of "unknown shape information" in the used weight variables.
So how can I access those layers within the loaded model for later (deeper) usage?
Thanks for reading (and maybe helping)!
If you are looking to read custom data, there are two tutorials on building your own readers: https://cntk.ai/pythondocs/manuals.html
Regarding cloning parts of a network, here is a link to another StackOverflow post that has example code.
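As a sketch of that cloning pattern (the model path, node names, input shape, and num_classes below are assumptions about the loaded VGG16 graph): clone the sub-graph up to pool4 with a placeholder input, then apply the clone to a real input variable so its output has known shape information that deeper layers can consume.

import cntk as C
from cntk.logging.graph import find_by_name

base_model = C.load_model('VGG16_ImageNet.model')  # illustrative path

data_node = find_by_name(base_model, 'data')    # assumed input node name
pool4_node = find_by_name(base_model, 'pool4')  # last node to keep

# Clone everything between 'data' and 'pool4', freezing the VGG16 weights;
# the placeholder is substituted when the clone is applied to a real input.
to_pool4 = C.combine([pool4_node.owner]).clone(
    C.CloneMethod.freeze,
    {data_node: C.placeholder(name='features')})

features = C.input_variable((3, 224, 224), name='features')
pool4 = to_pool4(features)  # output now has fully known shape

score_pool4 = C.layers.Convolution2D((1, 1), num_classes,
                                     name='score_pool4')(pool4)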
