For a project, I need a detector that can recognize many different objects. The 90 classes of COCO are not enough, because I would like to be able to detect more categories.
I have seen that ImageNet, for instance, has many more classes; however, I couldn't find a model trained to detect ImageNet classes.
I am programming in Python and I want to avoid retraining a network to detect more classes myself.
I have looked at torchvision and a couple of other repositories, but I didn't find anything.
Thanks in advance.
EDIT: I have found a good one now: the LVIS dataset has 1200 classes for detection and uses the images from COCO (they relabelled them). There is a good model for it in Detectron2 from Facebook AI: https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
I think it's only available in a CUDA environment though (I have no GPU :( )
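For what it's worth, Detectron2 can also run inference on CPU by setting the device in the config. A minimal sketch, assuming the LVIS config name below still matches the current model zoo and that "test.jpg" is a placeholder image:
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
# LVIS v0.5 Mask R-CNN config from the Detectron2 model zoo (name assumed from MODEL_ZOO.md)
cfg.merge_from_file(model_zoo.get_config_file("LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.DEVICE = "cpu"  # inference works without a GPU, just slower
predictor = DefaultPredictor(cfg)
image = cv2.imread("test.jpg")  # placeholder input image
outputs = predictor(image)
print(outputs["instances"].pred_classes)  # LVIS class indices
print(outputs["instances"].pred_boxes)    # predicted bounding boxes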
Check https://keras.io/api/applications/ for more models/datasets
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
# Loads ResNet50 with weights pre-trained on ImageNet (1000 classes)
model = ResNet50(weights='imagenet')
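Note that this is an image classifier, not a detector. Continuing the snippet above, classifying a single image would look roughly like this (the file name is a placeholder):
img = image.load_img('elephant.jpg', target_size=(224, 224))  # placeholder path
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)          # add a batch dimension
x = preprocess_input(x)                # ResNet50-specific preprocessing
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 (class_id, name, probability) tuples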
ImageNet unfortunately is not labelled for that purpose.
Off the shelf, you can try the TensorFlow Object Detection API. There are models trained on the Open Images dataset, which has 600 classes.
You can use them straight away for inference or, if you wish, retrain them.
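For example, a Faster R-CNN trained on Open Images is published on TensorFlow Hub. A minimal inference sketch, assuming the hub handle below is still available and "test.jpg" is a placeholder image:
import tensorflow as tf
import tensorflow_hub as hub
# Faster R-CNN + Inception-ResNet-v2 trained on Open Images V4 (600 classes)
detector = hub.load("https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1").signatures['default']
img = tf.io.read_file("test.jpg")  # placeholder path
img = tf.image.decode_jpeg(img, channels=3)
img = tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]  # [1, H, W, 3] in [0, 1]
result = detector(img)
print(result["detection_class_entities"][:5])  # human-readable class names
print(result["detection_scores"][:5])          # confidence scores
print(result["detection_boxes"][:5])           # [ymin, xmin, ymax, xmax], normalized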
YOLOv5 is an object detection model that can be exported to several different frameworks, including TensorFlow and Core ML.
https://github.com/ultralytics/yolov5
I have been able to train a YOLOv5 model and export it to TensorFlow (TF1 GraphDef or TF2 SavedModel), and have also tried Apple Core ML.
I have not been able to find any YOLOv5 examples showing how to use these models once exported,
i.e. how to take an image file and get the detected objects/labels/coordinates.
I tried Python code similar to the TF1 object detection tutorial, but the exported model does not seem compatible with it:
https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/tensorflow-1.14/
nor with TF2:
https://www.tensorflow.org/hub/tutorials/object_detection
nor with TFLite:
https://www.tensorflow.org/lite/examples/object_detection/overview
Use the weights that YOLOv5 produced during training. They are usually at this path:
yolov5/runs/train/your_yolo_model/weights/best.pt
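If you stay in PyTorch, a minimal sketch of running inference with those weights via torch.hub (the weights path and image file are placeholders):
import torch
# Load the custom-trained weights through the ultralytics/yolov5 hub entry point
model = torch.hub.load('ultralytics/yolov5', 'custom', path='yolov5/runs/train/your_yolo_model/weights/best.pt')
results = model('test.jpg')  # placeholder image path
results.print()              # summary of detections
print(results.xyxy[0])       # tensor of [xmin, ymin, xmax, ymax, confidence, class]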
I have trained a door-detection model with YOLOv3, but now I need it in TensorFlow Lite. The problem is that if I want to train the model for TensorFlow, I need annotation files in ".csv" or ".xml", but the ones I have are "*.txt". I did find software to create annotation files manually by drawing rectangles on pictures, but I cannot do that for thousands of images due to a shortage of time.
Can anyone guide me on how to handle this situation?
I followed the link below, but the resulting model did not work:
https://medium.com/analytics-vidhya/yolov3-to-tensorflow-lite-conversion-4602cec5c239
I think it would be good to train a TensorFlow implementation on your data; converting the TensorFlow model to TFLite should then be easy.
Here is YOLOv3 in TensorFlow: https://github.com/YunYang1994/tensorflow-yolov3
Then use the official TensorFlow tooling to convert to TFLite: https://www.tensorflow.org/lite/convert
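The TFLite step itself is short. A rough sketch, assuming you exported the trained TensorFlow model as a SavedModel directory (the paths are placeholders):
import tensorflow as tf
# Convert a SavedModel directory to a .tflite flatbuffer
converter = tf.lite.TFLiteConverter.from_saved_model('exported_yolov3_savedmodel')
tflite_model = converter.convert()
with open('yolov3.tflite', 'wb') as f:
    f.write(tflite_model)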
In my project I need to train an object detection model that is able to recognize hands in different poses in real time from an RGB webcam.
Thus, I'm using the TensorFlow Object Detection API.
What I have done so far is train the model based on the ssd_inception_v2 architecture, with the ssd_inception_v2_coco model as the fine-tune checkpoint.
I want to detect 10 different classes of hand poses. Each class has 300 images, which are augmented. In total there are 2400 images for training and 600 for evaluation. The labelling was done with LabelImg.
The problem is that the model isn't able to detect the different classes properly. Even though the results still weren't good, I got much better ones by training with the same images but with only about 3 different classes. It seems like the problem is the SSD architecture; I've read several times that SSD networks are not good at detecting small objects.
Finally, my questions are the following:
Could I get better results by using a faster_rcnn_inception architecture for this use case?
Does anyone have advice on how to optimize the model?
Do I need more images?
Thanks for your answers!
I am new to TensorFlow and to implementing deep learning. I have a dataset of images (images of the same object).
I want to train a neural network model for object detection using Python and TensorFlow.
I am trying to import the data into TensorFlow, but I am not sure what the right way to do it is.
Most of the tutorials available online use public datasets (e.g. MNIST), which are straightforward to import but not helpful when I need to use my own data.
Is there a procedure or tutorial that I can follow?
There are many ways to import images for training. You can use TensorFlow directly, but the images will be imported as TensorFlow tensors, which you won't be able to visualize until you run the session.
My favorite tool to import images is skimage.io.imread. The imported images will have shape (height, width, channels).
Alternatively, you can use the image-reading tool from scipy.misc (note that scipy.misc.imread is deprecated in recent SciPy versions).
To resize images, you can use skimage.transform.resize.
Before training, you will need to normalize all the images so that their values lie between 0 and 1. To do that, you simply divide the images by 255.
The next step is to one-hot encode your labels into arrays of 0s and 1s.
Then you can build and train your CNN.
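Putting those steps together, a minimal sketch (the file names and labels are placeholders) could look like this:
import numpy as np
from skimage.io import imread
from skimage.transform import resize
from tensorflow.keras.utils import to_categorical
file_names = ['img_0.jpg', 'img_1.jpg']  # placeholder image paths
labels = [0, 1]                          # placeholder integer class labels
# Load each image, resize to a fixed shape and stack into one array
images = np.array([resize(imread(f), (224, 224)) for f in file_names], dtype=np.float32)
# skimage.transform.resize already returns floats in [0, 1]; if you keep the raw
# uint8 arrays instead, divide them by 255.0 here.
# One-hot encode the labels
y = to_categorical(labels, num_classes=2)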
You could create a data directory containing one subdirectory per image class, with the respective image files inside, and use flow_from_directory of tf.keras.preprocessing.image.ImageDataGenerator.
A tutorial on how to use this can be found in the Keras Blog.
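A minimal sketch of that approach, assuming a hypothetical layout of data/class_a/*.jpg, data/class_b/*.jpg, and so on:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Rescale pixel values to [0, 1] and reserve 20% of the images for validation
datagen = ImageDataGenerator(rescale=1. / 255, validation_split=0.2)
train_gen = datagen.flow_from_directory(
    'data',                  # hypothetical root directory, one subdirectory per class
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    subset='training')
val_gen = datagen.flow_from_directory(
    'data',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    subset='validation')
# model.fit(train_gen, validation_data=val_gen, epochs=10)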
Is it possible to get bounding-box predictions using TensorFlow?
I found TensorBox on GitHub, but I'm looking for a better-supported or maybe official way to address this problem.
I need to retrain the model for my own classes.
It is unclear what exactly you mean. Do you need object detection? I assume so from the 'bounding boxes'. If so, Inception networks are not directly applicable to your task; they are classification networks.
You should look for object detection models, like the Single Shot Detector (SSD) or You Only Look Once (YOLO). They often use pre-trained convolutional layers from classification networks, but have additional layers on top. If you want Inception (aka GoogLeNet), YOLO is based on that. Take a look at this implementation, or any other you can find on Google: https://github.com/thtrieu/darkflow
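As a rough sketch of how darkflow is typically used for inference (the config, weight and image file names are placeholders taken from its README, so adjust them to your setup):
import cv2
from darkflow.net.build import TFNet
# Build the network from a YOLO config and the matching pre-trained weights
options = {"model": "cfg/yolo.cfg", "load": "bin/yolo.weights", "threshold": 0.1}
tfnet = TFNet(options)
imgcv = cv2.imread("sample_dog.jpg")  # placeholder image
result = tfnet.return_predict(imgcv)  # list of dicts with label, confidence, topleft, bottomright
print(result)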
The COCO 2016 winner for object detection was implemented in TensorFlow. Some state-of-the-art techniques are Faster R-CNN, R-FCN, and SSD. Check the slides at http://image-net.org/challenges/talks/2016/GRMI-COCO-slidedeck.pdf (slide 14 has the key TensorFlow ops for you to recreate this pipeline).
Edit 6/19/2017:
TensorFlow has since released the Object Detection API, which predicts bounding boxes:
https://research.googleblog.com/2017/06/supercharge-your-computer-vision-models.html