What are pretrained weights and initialization weights in Mask R-CNN? - python

I'm trying to train Mask R-CNN for instance segmentation. What pretrained models are available? Are these weights for the whole neural net or only for the encoder/backbone (for instance ResNet-50)? Also, there are initialization weights using ImageNet or COCO. With the latter, are all the weights in the decoder random?

Here are some of the best Mask RCNN implementations:
1. https://github.com/matterport/Mask_RCNN
2. https://github.com/CharlesShang/FastMaskRCNN
3. https://github.com/multimodallearning/pytorch-mask-rcnn
4. https://github.com/wannabeOG/Mask-RCNN
Pretrained weights and initialization weights are normally for the whole network and not for the backbone.
I recommend doing your own research before posting on Stack Overflow. Go to GitHub and search for Mask RCNN and you'll find loads of repos.
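For example, in the matterport repo (link 1 above) the choice between COCO and ImageNet initialization is made when loading weights. Here is a rough sketch based on that repo's training samples; the config class, paths and class count are placeholders, not from the question:

from mrcnn.config import Config
from mrcnn import model as modellib

class TreesConfig(Config):   # hypothetical dataset config
    NAME = "trees"
    NUM_CLASSES = 1 + 1      # background + one foreground class

model = modellib.MaskRCNN(mode="training", config=TreesConfig(), model_dir="./logs")

# COCO init: weights for the whole network (backbone + RPN + heads); the class-specific
# head layers are excluded because NUM_CLASSES differs from COCO's 81.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])

# ImageNet init: only the ResNet backbone comes pretrained; everything after it starts random.
# model.load_weights(model.get_imagenet_weights(), by_name=True)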

Related

Does it make sense to train a pre-trained architecture (ResNet) with specific images to further train and evaluate with my own specific imagery?

I was wondering if it is useful to further train a ResNet that was pre-trained on ImageNet with images that are closer to my classification problem. I want to use 50,000 labeled images of trees from a paper to update the weights of the pre-trained ResNet. Then I would like to use these weights to re-train and evaluate the ResNet, hopefully better fitted this way, on my own set of images of trees.
I already used the pre-trained ResNet on my own images with moderate success. Due to the small size of my dataset (~5,000 images) I thought it might be smart to further train the pre-trained ResNet on more similar data first.
Any suggestions or experiences you want to share?
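In case it helps, here is a minimal sketch of that two-stage fine-tuning in PyTorch; it assumes a torchvision ResNet-50 and placeholder class counts, with the actual training loops omitted:

import torch
import torch.nn as nn
from torchvision import models

num_tree_classes = 12   # placeholder: classes in the 50k intermediate tree dataset
num_my_classes = 5      # placeholder: classes in your own ~5,000 images

# Stage 1: start from ImageNet weights, swap the head, train on the intermediate tree data.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_tree_classes)
# ... train on the 50,000 labeled tree images, then save the adapted weights:
torch.save(model.state_dict(), "resnet50_trees.pt")

# Stage 2: reload those weights and swap the head again for your own labels.
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, num_tree_classes)
model.load_state_dict(torch.load("resnet50_trees.pt"))
model.fc = nn.Linear(model.fc.in_features, num_my_classes)
# ... fine-tune on your ~5,000 images, typically with a smaller learning rate.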

TorchVision using pretrained weights for entire model vs backbone

TorchVision Detection models have a weights and a weights_backbone parameter. Does using pretrained weights imply that the model uses pretrained weights_backbone under the hood? I am training a RetinaNet model and am unsure which of the two options I should use and what the differences are.
The difference is pretty simple: you can either choose to do transfer learning on the backbone only or on the whole network.
RetinaNet from Torchvision has a Resnet50 backbone. You should be able to do both of:
from torchvision.models import ResNet50_Weights
from torchvision.models.detection import retinanet_resnet50_fpn, RetinaNet_ResNet50_FPN_Weights
retinanet_resnet50_fpn(weights=RetinaNet_ResNet50_FPN_Weights.COCO_V1)
retinanet_resnet50_fpn(weights_backbone=ResNet50_Weights.IMAGENET1K_V1)
As implied by their names, the backbone weights end up different: the former were trained on COCO (object detection) while the latter were trained on ImageNet (classification).
To answer your question, using pretrained weights implies that the whole network, including the backbone, is initialized from them. However, I don't think it calls weights_backbone under the hood.

Can I train my pretrained model with a totally different architecture?

I have trained a pretrained ResNet18 model with my custom dataset in PyTorch and wondered whether I could transfer my model file to train another one with a different architecture, e.g. ResNet50. I know I have to save my model accordingly (explained well in another post here), but this is a question I had never thought about before.
I was planning to use more advanced models like Vision Transformers (ViT), but I couldn't figure out whether I had to start with an already pretrained ViT or whether I could just take my previous model file and use it as the pretrained model to train a ViT.
Example Scenario: ResNet18 --> ResNet50 --> Inception v3 --> ViT
My best guess is that it's not possible due to the number of weights, neurons and layer structures, but I would love to hear if I'm missing a crucial point here. Thanks!
Between models that only differ in the number of layers (ResNet-18 and ResNet-50), it has been done to initialize some layers of the larger model from the weights of the smaller model's layers. Inversely, you can truncate a larger model by taking a subset of regularly spaced layers and initialize a smaller model. In both cases, you need to retrain everything at the end if you hope to achieve semi-decent performance.
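Here is a sketch of the first approach, copying only the parameters whose names and shapes happen to match between torchvision's ResNet-18 and ResNet-50 (because the block types differ, that is essentially just the stem); the filtering is a common hand-rolled pattern, not a built-in API:

import torch
from torchvision import models

small = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
large = models.resnet50(weights=None)

src = small.state_dict()
dst = large.state_dict()

# Keep only entries that exist in both models with identical tensor shapes.
matched = {k: v for k, v in src.items() if k in dst and v.shape == dst[k].shape}
dst.update(matched)
large.load_state_dict(dst)

print(f"transferred {len(matched)} / {len(dst)} tensors")  # only a handful match here
# Everything else stays randomly initialized, so the larger model must be retrained anyway.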
The whole point of using architectures that vastly differ (vision transformers vs CNNs) is to learn different features from the inputs and unlock new levels of semantic understanding. Recent models like BeiT also use new self-supervised training schemes that have nothing to do with the classic ImageNet pretraining. Using trained weights from another model would go against the point.
Having said that, if you want to use a ViT, why not start from the pretrained weights available on Hugging Face and fine-tune it on the data you used to train your ResNet?
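A minimal sketch of that suggestion, assuming the Hugging Face transformers library and the google/vit-base-patch16-224-in21k checkpoint; the label count is a placeholder:

from transformers import ViTForImageClassification, ViTImageProcessor

num_labels = 10  # placeholder: number of classes in your own dataset

# Start from publicly available pretrained ViT weights and attach a fresh classification head.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k", num_labels=num_labels)
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

# ... fine-tune on the same dataset you used for the ResNet, e.g. with the Trainer API.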

If we expand or reduce the layers of the same model, can we still train from the pretrained model in PyTorch?

Suppose a pretrained model such as ResNet-101 was trained on the ImageNet dataset, and then I change some layers inside it. Can I still use the pretrained weights on a different ABC dataset?
Let's say this is the ResNet-34 model. It is pretrained on ImageNet and saved as a ResNet.pt file.
If I changed some layers inside it, let's say I made it deeper by introducing some layers in conv4_x (see the ResNet-34 architecture diagram):
import torch
import torch.optim as optim

model = Resnet34()  # I have changed some layers inside this ResNet34()
optimizer = optim.Adam(model.parameters(), lr=0.00005)
model.load_state_dict(torch.load('Resnet.pt')['state_dict'])  # the pretrained checkpoint saved before my changes
optimizer.load_state_dict(torch.load('Resnet.pt')['optimizer'])
Can I do this, or is there another method?
You can do anything you like - the question is: would it be better than training from scratch?
Here are a few issues you might encounter:
1. A mismatch between the weights saved in ResNet.pt (the trained weights of the original ResNet-34) and the state_dict of your modified model.
You would probably need to manually make sure that the old weights are correctly assigned to the original layers and that only the new layers are left with their fresh random initialization.
2. Initializing the weights of the new layers.
Since you are training a ResNet, you can take advantage of the residual connections and initialize the new layers such that they initially make no contribution to the predicted value and simply pass the input through to the output via the residual link. A sketch of both steps is shown below.
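A rough sketch of both steps, using torchvision's ResNet-34 with one extra BasicBlock appended to conv4_x (layer3) as a stand-in for your modified model; the checkpoint layout is taken from your snippet, everything else is an assumption:

import torch
import torch.nn as nn
from torchvision import models
from torchvision.models.resnet import BasicBlock

# Stand-in for the modified network: one extra BasicBlock appended to the end of layer3
# (conv4_x), so the parameter names of the original blocks are unchanged.
model = models.resnet34(weights=None)
extra = BasicBlock(256, 256)
model.layer3 = nn.Sequential(*list(model.layer3), extra)

# 1. Load the old checkpoint, keeping only tensors whose names and shapes still match.
old = torch.load('Resnet.pt')['state_dict']
own = model.state_dict()
own.update({k: v for k, v in old.items() if k in own and v.shape == own[k].shape})
model.load_state_dict(own)  # the new block keeps its random initialization

# 2. Zero the last BatchNorm of the new block so its residual branch initially adds
#    nothing and the block behaves like an identity mapping at the start of training.
nn.init.zeros_(extra.bn2.weight)
nn.init.zeros_(extra.bn2.bias)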

Initializing the weights of a MLP with the RBM weights

I want to build a Deep Belief Network with scikit-learn. As far as I know, one should train many Restricted Boltzmann Machines (RBMs) individually. Then one should create a Multilayer Perceptron (MLP) that has the same number of layers as there are RBMs, and the weights of the MLP should be initialized with the weights of the RBMs. However, I'm unable to find a way to get the weights of the RBMs from scikit-learn's BernoulliRBM, and there also doesn't seem to be a way to initialize the weights of an MLP in scikit-learn.
Is there a way to do what I described?
scikit-learn does not currently have an MLP implementation that you can initialize via an RBM, but you can still access the RBM weights, which are stored in the components_ attribute, and the hidden biases, which are stored in the intercept_hidden_ attribute.
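A small sketch of pulling those attributes out of a fitted BernoulliRBM; the toy data is a placeholder:

import numpy as np
from sklearn.neural_network import BernoulliRBM

X = np.random.rand(100, 64)  # placeholder data scaled to [0, 1]
rbm = BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(X)

W = rbm.components_                  # shape (n_components, n_features): weight matrix
b_hidden = rbm.intercept_hidden_     # shape (n_components,): hidden-unit biases
b_visible = rbm.intercept_visible_   # shape (n_features,): visible-unit biases

# These arrays can then be copied into the first layer of an MLP built in another framework.
print(W.shape, b_hidden.shape, b_visible.shape)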
If you're interested in using modern MLPs, torch7, pylearn2, and deepnet are all modern libraries and most of them contain pretraining routines like you describe.
