If a pretrained model such as ResNet101 was trained on the ImageNet dataset and I then change some layers inside it, can I still use the pretrained weights on a different dataset (ABC)?
Let's say this is a ResNet34 model.
It is pretrained on ImageNet and saved as a ResNet.pt file.
Suppose I change some layers inside it, say I make it deeper by introducing some layers in conv4_x (check image):
import torch
import torch.optim as optim

model = Resnet34()  # I have changed some layers inside this Resnet34()
optimizer = optim.Adam(model.parameters(), lr=0.00005)
model.load_state_dict(torch.load('Resnet.pt')['state_dict'])  # the pretrained ResNet weights saved before the changes
optimizer.load_state_dict(torch.load('Resnet.pt')['optimizer'])
Can I do this, or is there another method?
You can do anything you like - the question is: would it be better than training from scratch?
Here are a few issues you might encounter:
1. A mismatch between the weights saved in ResNet.pt (the trained weights of the original ResNet34) and the state_dict of your modified model.
You would probably need to manually make sure that the old weights are correctly assigned to the layers you kept, and that only the new layers are left with their fresh initialization (see the sketch after this list).
2. Initializing the weights of the new layers.
Since you are training a ResNet, you can take advantage of the residual connections and initialize the new block so that it initially makes no contribution to the predicted value and only passes its input directly to the output via the residual link (also covered in the sketch below).
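A minimal PyTorch sketch of both points, assuming the checkpoint layout from the question; the attribute names for the newly added block (layer3_extra, bn2) are hypothetical and must be adapted to your actual model:

import torch
import torch.nn as nn
import torch.optim as optim

model = Resnet34()   # the modified architecture from the question

# 1. Copy over only the checkpoint entries whose name and shape still match
#    the modified model; everything else keeps its fresh initialization.
old_state = torch.load('Resnet.pt')['state_dict']
new_state = model.state_dict()
compatible = {k: v for k, v in old_state.items()
              if k in new_state and v.shape == new_state[k].shape}
new_state.update(compatible)
model.load_state_dict(new_state)

# 2. Let the new residual block start out as (roughly) the identity: zero the
#    scale of its last batch norm (or, if it has none, its last conv weight) so
#    the residual branch initially outputs zeros and the skip connection
#    passes the input straight through.
nn.init.zeros_(model.layer3_extra.bn2.weight)   # 'layer3_extra' and 'bn2' are hypothetical names

# The saved optimizer state refers to the old parameter set, so it is usually
# simpler to create a fresh optimizer for the modified model.
optimizer = optim.Adam(model.parameters(), lr=0.00005)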
I am trying to train an image classifier with two classes, and here is my neural net:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

model = Sequential()
model.add(tf.keras.layers.Conv2D(3, (3, 3), activation="relu"))
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation="relu"))
model.add(tf.keras.layers.Conv2D(16, (3, 3), activation="relu"))
model.add(tf.keras.layers.Conv2D(8, (3, 3), activation="relu"))
model.add(Flatten())
model.add(Dense(2, activation="softmax"))
This works fine when all images are resized to a particular size, but I wish to train it without resizing the images. When I remove the Flatten layer, my model produces an output for an image of any size, whereas when I use the Flatten layer with a different image size, it gives me an error the second time I use the model.
Is there any alternative to the Flatten layer that works on any input shape? Please let me know.
You can make a CNN without a pre-specified input shape. You need to replace Flatten with GlobalMaxPool2D. This works because, contrary to Flatten, GlobalMaxPool2D outputs one value per feature map irrespective of each feature map's spatial size. Flatten freezes the size by collapsing the spatial dimensions into a single dimension, and the spatial size of each feature map depends on the input size, whereas the number of feature maps is fixed by the model. Specify the input shape as (None, None, channels); this tells the model that the number of elements in these dimensions is not constant (just like the batch dimension in batch training).
The answer may seem a bit messy, but in summary you have to do the following (a sketch follows this list):
Change Flatten to GlobalMaxPool2D.
Change the input shape to (None, None, channels): repeat None for each spatial image dimension; the number of channels is mandatory.
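A minimal sketch of the modified model under these assumptions (tf.keras, 3-channel images, layer widths copied from the question):

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, GlobalMaxPooling2D, Dense

model = Sequential()
# spatial dimensions are left as None so images of any size are accepted
model.add(Conv2D(3, (3, 3), activation="relu", input_shape=(None, None, 3)))
model.add(Conv2D(32, (3, 3), activation="relu"))
model.add(Conv2D(16, (3, 3), activation="relu"))
model.add(Conv2D(8, (3, 3), activation="relu"))
# one value per feature map, regardless of the feature map's spatial size
model.add(GlobalMaxPooling2D())
model.add(Dense(2, activation="softmax"))

# assuming integer class labels (0 or 1)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

Note that with variable image sizes every batch must still contain images of a single size (or use a batch size of 1).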
You can't train this CNN on a variable input shape, because different input sizes produce different feature-map sizes for the Flatten layer. You can instead use a function to resize all the input images:
import cv2

def my_resize(img, size=(32, 32)):
    # cv2.resize expects the target (width, height); the channels are preserved
    return cv2.resize(img, dsize=size)

inputs = [...]  # list of input images
outputs = list(map(my_resize, inputs))
You can do this simply by changing the input shape to (None, None) for the spatial dimensions. To see why this works: a convolutional layer learns the parameters of its filters, and the filter shape is entirely independent of the shape of the input. You must, however, make sure that you train the model on images of varying shape, otherwise the learned weights will not generalize well to other input shapes. Simply upscaling low-resolution images typically would not suffice; you need actual data of varying shape, although downscaling high-resolution data should work fine, I suppose.
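For example, a minimal sketch of training on images of varying shape, assuming a head that tolerates variable spatial sizes (such as the GlobalMaxPooling2D sketch above) and hypothetical lists train_images / train_labels of differently sized arrays; since a batch must contain images of a single size, a batch of one is used here:

import numpy as np

for epoch in range(5):
    for img, label in zip(train_images, train_labels):
        x = np.expand_dims(img, axis=0)   # shape (1, H, W, channels); H and W differ per image
        y = np.array([label])
        model.train_on_batch(x, y)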
This is not possible. If you add even a simple perceptron (a dense layer), it has to be initialized with a fixed set of weights, and they will be completely random to begin with; the number of those weights depends on the flattened input size. Practically, you can't do what you are intending to do, so you should resize the images before sending them into the model.
Here is what I wanna do:
I want to use some transfer learning techniques to deal with sequence problems:
First, use dataset_1 to train an LSTM model.
Second, insert another LSTM layer before the output layer,
and then use dataset_2 to train only the newly added layer; the variables of the other layers are imported from the first training stage and remain unchanged.
Here's the problem: the existing methods all require the variable names of the weights/biases when restoring the pre-trained model, and I want to use the function tf.contrib.rnn.MultiRNNCell(*) in my code. However, that function is a black box, and I am unable to obtain the concrete variable names. How can I realize this idea?
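To illustrate, this is roughly the structure I have in mind, sketched with plain TF 1.x APIs; the scope names, sizes, and checkpoint path are hypothetical, and I am not sure whether grouping variables by variable scope like this is the right way to restore the old layers and train only the new one:

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 20, 8])   # (batch, time, features)
targets = tf.placeholder(tf.float32, [None, 1])

# stage-1 layers live under one scope so their variable names are predictable
with tf.variable_scope("pretrained"):
    cell = tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.BasicLSTMCell(32) for _ in range(2)])
    rnn_out, _ = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

# the newly inserted LSTM layer and output layer live under a second scope
with tf.variable_scope("new"):
    new_cell = tf.contrib.rnn.BasicLSTMCell(32)
    new_out, _ = tf.nn.dynamic_rnn(new_cell, rnn_out, dtype=tf.float32)
    preds = tf.layers.dense(new_out[:, -1, :], 1)

loss = tf.losses.mean_squared_error(targets, preds)

pretrained_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="pretrained")
new_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="new")

saver = tf.train.Saver(var_list=pretrained_vars)                            # restore only stage-1 weights
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=new_vars)   # update only the new layer

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, "stage1.ckpt")   # hypothetical checkpoint from the first training stage
    # ... feed batches from dataset_2 and run train_op ...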
This might be too stupid to ask ... but ...
When using an LSTM after the initial Embedding layer in Keras (for example, in the Keras LSTM-IMDB tutorial code), how does the Embedding layer know that there is a time dimension? In other words, how does the Embedding layer know the length of each sequence in the training data set? How does the Embedding layer know I am training on sentences, not on individual words? Does it simply infer this during training?
The Embedding layer is usually either the first or the second layer of your model. If it is the first (usually when you use the Sequential API), then you need to specify its input shape, which is either (seq_len,) or (None,). If it is the second layer (usually when you use the Functional API), then the first layer is an Input layer, and you need to specify its shape in the same way. When the shape is (None,), the sequence length is inferred from the shape of each batch of data fed to the model.
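A minimal sketch of both variants in tf.keras; the vocabulary size, embedding dimension, and LSTM size are made-up numbers:

import tensorflow as tf
from tensorflow.keras import layers, models

# Sequential API: input_shape=(None,) declares sequences of any length,
# each element being an integer word index
seq_model = models.Sequential([
    layers.Embedding(input_dim=20000, output_dim=128, input_shape=(None,)),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),
])

# Functional API: the Input layer declares the per-sample shape explicitly
inputs = tf.keras.Input(shape=(None,), dtype="int32")
x = layers.Embedding(input_dim=20000, output_dim=128)(inputs)
x = layers.LSTM(64)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
func_model = tf.keras.Model(inputs, outputs)

In both cases the time dimension comes from the 2-D (batch, timesteps) shape of the input data, not from the Embedding layer itself; the layer just maps each integer index to a vector, and the LSTM then consumes the resulting (batch, timesteps, embedding_dim) tensor.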
I am studying deep learning and trying to implement it using Caffe's Python interface. Can anybody tell me how we can assign the weights to each node in the input layer, instead of using a weight filler in Caffe?
There is a fundamental difference between weights and input data: the training data is used to learn the weights (aka "trainable parameters") during training. Once the net is trained, the training data is no longer needed while the weights are kept as part of the model to be used for testing/deployment.
Make sure this difference is clear to you before you proceed.
Layers with trainable parameters have a filler that sets their weights initially.
An input data layer, on the other hand, has no trainable parameters; it only supplies the net with input data. Thus, input layers have no filler.
Based on the type of input layer you use, you will need to prepare your training data accordingly.
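A small pycaffe sketch of this distinction, assuming a net with an InnerProduct layer named 'ip1' and an input blob named 'data' (both names, and the prototxt path, are hypothetical): trainable weights live in net.params and are normally set by the filler but can be overwritten by hand, while input data is written into net.blobs.

import caffe
import numpy as np

net = caffe.Net('deploy.prototxt', caffe.TEST)   # hypothetical prototxt path

# Weights of a trainable layer: [0] is the weight blob, [1] the bias blob.
# The weight_filler in the prototxt initializes these, but you can also
# assign them manually after the net is created:
w = net.params['ip1'][0].data
net.params['ip1'][0].data[...] = np.random.randn(*w.shape) * 0.01

# Input data is not a parameter: you copy each batch into the input blob
# before calling forward()
net.blobs['data'].data[...] = np.zeros(net.blobs['data'].data.shape)
out = net.forward()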