Hello Stack Overflow!
I would like to take a ResNet50 face classification model and turn it into an SSD, YOLO, or EfficientDet detector. Is this even possible? Basically, I want to use a trained model that detects a single class in an image to detect more than a single class in an image: partition an input image and detect the objects (faces) in the given image based on my ResNet50 classification model, where I give YOLO my ResNet classification model as a parameter.
Thanks in advance!
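In case it helps, a hedged sketch of the usual approach with torchvision: you don't convert the classifier itself, you reuse its weights as the backbone of a detector, then train the detection head on images with bounding-box annotations (class labels alone are not enough). Faster R-CNN is used here purely to illustrate the idea, and the weight file name is an assumption:

import torch
import torchvision

# Detector with a ResNet50-FPN backbone; 2 classes = background + face.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=False, num_classes=2)

# Load the trained ResNet50 classification weights into the backbone;
# strict=False ignores the classifier's fc layer, which the detector doesn't use.
resnet_state = torch.load('my_resnet50_face_classifier.pth')  # assumed file name
detector.backbone.body.load_state_dict(resnet_state, strict=False)

# The detection head (box regression/classification) still has to be trained
# on face images annotated with bounding boxes.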
Currently, I'm working with a dataset where I have two kinds of images: a "sharp" version of each image and a "blurry" version of the same image, where the blur was added synthetically. My goal is to train a model that takes the blurry version of an image and deblurs it as much as it can, so that the "deblurred image" is closer to the sharp version. In the literature, the U-Net architecture seemed to be a model with good results. Additionally, I can use a pre-trained U-Net via PyTorch (https://pytorch.org/hub/mateuszbuda_brain-segmentation-pytorch_unet/).
My problem is now: When I train this pre-trained U-Net with my images and then try it on my test set, I get the following output:
The original image:
I know that this pre-trained model is usually used for biomedical image segmentation but I'm rather confused about how I have to modify the model to use it for an Image Deblurring/Reconstruction task. Does anyone have any advice on how to do this?
I would appreciate any feedback :)
The U-Net you're using is for segmentation (classification of each pixel of the image), whereas you're trying to denoise the image (make it "sharper"/remove the blur). That explains the results you got.
To get what you want, as DerekG said, you first need to modify the number of output channels. Once you modify it, you can't load the whole pretrained model; you will have to copy the parameters layer by layer, up to the last one.
Since the last layer is then initialized randomly, you can retrain the model with your training set, and you can freeze the pretrained parts or not; see the sketch below.
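A minimal sketch of that parameter copy, assuming the mateuszbuda hub entry point and that only the final convolution changes shape when out_channels goes from 1 to 3:

import torch

# Load the pretrained 1-channel U-Net and a fresh 3-channel one.
pretrained = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
    in_channels=3, out_channels=1, init_features=32, pretrained=True)
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
    in_channels=3, out_channels=3, init_features=32, pretrained=False)

# Copy every parameter whose shape still matches; the final conv keeps its random init.
src, dst = pretrained.state_dict(), model.state_dict()
for name, param in src.items():
    if name in dst and dst[name].shape == param.shape:
        dst[name] = param
model.load_state_dict(dst)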
Also, I'm not sure what your new dataset is, but if it's really not related to biomedical images you should retrain your network from scratch (transfer learning shouldn't be done in these cases), and maybe even change the encoder-decoder network.
From the included link:
import torch
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
in_channels=3, out_channels=1, init_features=32, pretrained=True)
The model you define has a single output channel, resulting in a grayscale image output. You need 3 output channels for an RGB image.
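So, assuming the same hub entry point accepts out_channels=3: the pretrained weights for the final conv then no longer match, so load with pretrained=False and transplant the compatible weights as in the answer above.

import torch
# Fresh 3-channel-output U-Net; pretrained weights are copied in manually afterwards.
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
    in_channels=3, out_channels=3, init_features=32, pretrained=False)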
I have created my own model and trained it with Keras' ImageDataGenerator using flow_from_directory.
Like this: how to train model with batches. Everything works fine; I checked the generated batches, and the pictures are as they should be.
My problem is that I want to use this trained model for online face detection. I crop the faces to the desired width and height and convert them into an array, but the prediction is horrible.
I think the live-streamed image has to be preprocessed the same way as what the ImageDataGenerator creates (batches). Any idea how I can convert a cv2.imread(path) image into a batch to predict the class?
You just have to add the batch dimension to convert it to a batch with 1 sample: np.expand_dims(img, axis=0).
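A minimal sketch, assuming the generator was set up with rescale=1./255 on RGB images; path, width, height and model are placeholders, and the preprocessing must match whatever your training generator did:

import cv2
import numpy as np

img = cv2.imread(path)                      # BGR, shape (H, W, 3)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # ImageDataGenerator yields RGB
img = cv2.resize(img, (width, height))      # same target_size as in training
img = img.astype('float32') / 255.0         # same rescale as in training
batch = np.expand_dims(img, axis=0)         # shape (1, H, W, 3)
probs = model.predict(batch)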
I have used a combination of MTCNN (for face detection) and a Facenet model trained on different faces, and I have saved the generated face embeddings into an .npz file. I have used the Keras API to load the model, train it, and use it for inference for face recognition. This whole setup is working fine.
Now I want to use the same weights for face recognition in an Android app using the Firebase AutoML custom model implementation, which supports only TensorFlow Lite models. So I want to convert the trained Facenet weights (the face embeddings in '.npz' file format) into a TensorFlow Lite (.tflite) model.
But I am not able to find any solution for this; there are options to convert a Facenet frozen model ('.pb' file) into tflite. Click here for details.
Please help if you have any idea about this conversion.
Thanks
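A minimal sketch of the Keras-to-TFLite conversion, assuming the Facenet network itself is available as a Keras model (the file name facenet_keras.h5 is an assumption); the .npz embeddings are data rather than a model, so they would ship alongside the converted network:

import tensorflow as tf

model = tf.keras.models.load_model('facenet_keras.h5')      # assumed file name
converter = tf.lite.TFLiteConverter.from_keras_model(model)  # Keras model -> TFLite
tflite_model = converter.convert()
with open('facenet.tflite', 'wb') as f:
    f.write(tflite_model)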
I saved an image classifier I trained on two different classes and want to classify a new image using the classifier. Once I have my model loaded what tf function do I call to return the softmax prediction of the final layer after feeding an image?
Thank you
You should run model.predict(image_to_classify). If you just want the index of the prediction and not the probabilities, run np.argmax(model.predict(image_to_classify)).
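A minimal end-to-end sketch, assuming a Keras model saved to 'classifier.h5' and an image already preprocessed to the input shape used in training (both the file name and the preprocessing are assumptions):

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('classifier.h5')
probs = model.predict(np.expand_dims(image_to_classify, axis=0))  # softmax output, shape (1, 2)
predicted_class = np.argmax(probs, axis=-1)[0]                    # index of the winning class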
I'm looking for examples of classifying labeled images using an RNN on custom data, but I can't find any example other than the MNIST dataset. Any help like this repository, where a CNN is used for classification, would be appreciated. Any help regarding the classification of images using an RNN would be helpful. I'm trying to replace the CNN network in the following tutorial.
Aymericdamien has some of the best examples out there, and they have an example of using an RNN with images.
https://github.com/aymericdamien/TensorFlow-Examples
https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb
The example is using MNIST, but it can be applied to any image.
However, I'll point out that you're unlikely to find many examples of using an RNN to classify an image because RNNs are inferior to CNNs for most image processing tasks. The example linked to above is for educational purposes more than practical purposes.
Now, if you are attempting to use an RNN because you have a sequence of images you wish to process, such as a video, a more natural approach would be to combine a CNN (for the image processing part) with an RNN (for the sequence processing part). To do this you would typically pretrain the CNN on some classification task such as ImageNet, feed each image through the CNN, and use the last layer of the CNN as the input to each timestep of the RNN. You would then train the entire network with the loss function defined on the RNN's output, as sketched below.
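A minimal Keras sketch of that CNN-then-RNN combination; the frame count, image size, class count, and the MobileNetV2/LSTM choices are illustrative assumptions:

import tensorflow as tf
from tensorflow.keras import layers, models

num_frames, height, width, num_classes = 16, 224, 224, 10

# Pretrained CNN backbone (ImageNet weights) applied to every frame.
cnn = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg',
    input_shape=(height, width, 3))

model = models.Sequential([
    layers.TimeDistributed(cnn, input_shape=(num_frames, height, width, 3)),
    layers.LSTM(256),                                   # sequence model over per-frame features
    layers.Dense(num_classes, activation='softmax'),    # video-level classification
])
model.compile(optimizer='adam', loss='categorical_crossentropy')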