I am using MRCNN in Python to train an object detection model on 20 images (with the annotations saved as a JSON file). The problem is that, in the best case, the loss is around 4, which shows that the model has not learned well (the loss also fluctuates a lot across epochs). Unsurprisingly, when I use the trained model for detection the result is wrong: it cannot detect the object and randomly selects some pixels as the object.
Can someone kindly suggest how I can improve the performance, and also give some hints about the initial weights to use when the object is not one of the classes in the COCO dataset?
Thanks in advance.
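In case it is relevant: my understanding is that the usual pattern with the Matterport Mask R-CNN package is to start from the released COCO weights and first train only the head layers, even when the target object is not a COCO class. A rough sketch of that pattern (not my actual code; MyConfig and MyDataset are placeholders for my own config and dataset classes):

```python
# Sketch: head-only fine-tuning with the Matterport Mask R-CNN package,
# starting from the released COCO weights. MyConfig/MyDataset are placeholders.
import mrcnn.model as modellib
from mrcnn.config import Config
from mrcnn.utils import Dataset

class MyConfig(Config):
    NAME = "my_object"
    NUM_CLASSES = 1 + 1        # background + one custom class
    IMAGES_PER_GPU = 1
    STEPS_PER_EPOCH = 100

class MyDataset(Dataset):
    # Placeholder: in practice this would parse the JSON annotations and
    # register everything via add_class()/add_image(), then call prepare().
    pass

config = MyConfig()
dataset_train, dataset_val = MyDataset(), MyDataset()

model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")

# Load COCO weights but skip the layers whose shape depends on NUM_CLASSES,
# since the custom object is not one of the COCO classes.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# Train only the head layers first; with 20 images, training the whole
# network from scratch is very unlikely to converge.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers="heads")
```

Is that roughly the right direction?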
I'm trying to train a model with YOLOv5 to detect multiple objects on sales flyers. Each image in the training dataset contains only one object and, consequently, a single bounding box.
I'm wondering whether that will hurt the performance of the model, because what I ultimately want is to detect multiple objects on each sales flyer.
Thank you for your help.
It will probably lower your AP if you train like this, but give it a try. It really depends on your training images and your data augmentations. I don't know about YOLOv5, but YOLOv4 has its Mosaic data augmentation, which should address your problem to some extent.
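Roughly, Mosaic stitches four training images into one canvas, so even if every source image contains a single box, each training sample ends up with several objects. A minimal sketch of the idea (not the actual YOLOv4 implementation; boxes are assumed to be pixel (x_min, y_min, x_max, y_max) tuples):

```python
# Minimal sketch of mosaic augmentation: four images are resized into the four
# quadrants of one canvas, and their boxes are scaled/shifted along with them.
import numpy as np
import cv2

def mosaic(images, boxes_per_image, out_size=640):
    """images: list of 4 HxWx3 arrays; boxes_per_image: list of 4 box lists."""
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    offsets = [(0, 0), (half, 0), (0, half), (half, half)]   # (x, y) per quadrant
    merged_boxes = []
    for img, boxes, (ox, oy) in zip(images, boxes_per_image, offsets):
        h, w = img.shape[:2]
        sx, sy = half / w, half / h                  # scale factors into a quadrant
        canvas[oy:oy + half, ox:ox + half] = cv2.resize(img, (half, half))
        for x1, y1, x2, y2 in boxes:
            merged_boxes.append((x1 * sx + ox, y1 * sy + oy,
                                 x2 * sx + ox, y2 * sy + oy))
    return canvas, merged_boxes
```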
In my project I need to train an object detection model that can recognize hands in different poses in real time from an RGB webcam.
Thus, I'm using the TensorFlow object detection API.
What I have done so far is train a model based on the ssd_inception_v2 architecture, using the ssd_inception_v2_coco model as the fine-tune checkpoint.
I want to detect 10 different classes of hand poses. Each class has 300 images, which are augmented. In total there are 2400 images for training and 600 for evaluation. The labeling was done with LabelImg.
The problem is that the model isn't able to detect the different classes properly. Even though the results still weren't good, I got much better results by training on the same images with only 3 classes. It seems like the problem is the SSD architecture: I've read several times that SSD networks are not good at detecting small objects.
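To judge whether that is actually the case here, the relative size of the labelled hands can be measured directly from the LabelImg annotations. A rough sketch, assuming the annotations are Pascal VOC XML files and "annotations/" is a placeholder for the real directory:

```python
# Sketch: estimate how small the labelled objects are relative to the image,
# reading the Pascal VOC XML files that LabelImg produces.
import glob
import xml.etree.ElementTree as ET

fractions = []
for path in glob.glob("annotations/*.xml"):
    root = ET.parse(path).getroot()
    img_w = int(root.find("size/width").text)
    img_h = int(root.find("size/height").text)
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        w = int(box.find("xmax").text) - int(box.find("xmin").text)
        h = int(box.find("ymax").text) - int(box.find("ymin").text)
        fractions.append((w * h) / (img_w * img_h))

# If most boxes cover only a few percent of the image area, SSD's known
# weakness with small objects becomes a plausible explanation.
print("mean box/image area:", sum(fractions) / max(len(fractions), 1))
```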
Finally, my questions are the following:
Could I get better results by using a faster_rcnn_inception architecture for this use case?
Does anyone have advice on how to optimize the model?
Do I need more images?
Thanks for your answers!
I trained a TensorFlow object detection model on my dataset using both the SSD and Faster R-CNN models. There were 220 training and 30 test images in my dataset.
I trained the model for 200k steps and got the loss under 1. But when I tested the trained model on a video, it was detecting and labelling almost everything in the video.
Can anyone tell me why is that happening?
Thank you
You are using just one class: you trained your model only on images belonging to that class and tested it on the same.
So the problem is that the model is skewed (it predicts the same thing for every image).
No matter which image you test it on, you will get the same output.
Solution:
Train your model with a nearly equal number of negative images.
Example: 220 images containing the object to be identified (label them as 1) and roughly another 220 images not containing the object (label them as 0).
Use the F1 score to check your accuracy, because it will help you understand whether the dataset is skewed or not (see the sketch below).
Check this to learn about different kinds of accuracy measures.
Take this course to learn more about CNNs.
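For the F1 check mentioned above, something like this with scikit-learn is enough (a sketch; the two label lists are placeholders for the per-image ground truth and the model's predictions):

```python
# Sketch: F1 score on a skewed binary problem. A skewed model that predicts
# "object" for everything can still have decent accuracy, while the F1 score
# and the confusion matrix expose the problem.
from sklearn.metrics import f1_score, confusion_matrix

# Placeholder labels: 1 = image contains the object, 0 = negative image.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1]   # model that says "object" for everything

print("F1:", f1_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
```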
I am trying to use the ssd_inception_v2_coco pre-trained model from the TensorFlow Object Detection API, training it on a single-class dataset with transfer learning. I trained the net for around 20k steps (total loss around 1) and, using the checkpoint data, created the inference_graph.pb and used it in the detection code.
To my surprise, when I tested the net on the training data itself, the graph was not able to detect even 1 out of 11 cases (0/11). I am at a loss to find the issue.
What might the possible mistake be?
P.S.: I am not able to run train.py and eval.py at the same time due to memory issues, so I don't have precision info from TensorBoard.
Has anyone faced a similar kind of issue?
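One quick way to narrow an issue like this down is to load the exported graph and print the raw detection scores before any score threshold is applied. A minimal TF1-style sketch (the graph path and test image are placeholders):

```python
# Sketch: load the exported frozen graph and print the raw detection scores,
# to see whether the model detects anything below the usual 0.5 visualisation
# threshold. The graph path and test image are placeholders.
import numpy as np
import tensorflow as tf          # TF1-style graph API via tf.compat.v1
from PIL import Image

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

image = np.expand_dims(np.array(Image.open("test.jpg").convert("RGB")), axis=0)

with tf.compat.v1.Session(graph=graph) as sess:
    scores, classes = sess.run(
        ["detection_scores:0", "detection_classes:0"],
        feed_dict={"image_tensor:0": image})

# If the top scores are, say, 0.2-0.4, the model does detect something but the
# visualisation threshold hides it; if they are near zero, training or the
# export step went wrong.
print("top scores:", scores[0][:5], "classes:", classes[0][:5])
```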
I have trained the TensorFlow Object Detection API on my own dataset with 1 class, using the rfcn_resnet101 model. First I used the raccoon dataset and trained for 264,600 steps, and the detection result is weird: it can detect the object, but there are some other little boxes around the right box.
Then I used another dataset containing one class with 80,000 images, and I ran into the same phenomenon. I am very confused.
Has anyone ever run into the same situation? What can I do to solve this problem? Thanks in advance!
I had the same behavior on the PASCAL VOC dataset. I haven't fixed it, because I only implemented the model for a proof-of-concept system. My guess is that the model predicts the proposal regions and keeps any box whose IoU with the best one stays below a defined threshold, so lowering the nms_iou_threshold might solve the problem.
This assumption also seems to fit your examples: all of the predicted bounding boxes appear to overlap the ground-truth box.
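To illustrate the mechanism: non-maximum suppression only discards a box when it overlaps a higher-scoring box by more than the IoU threshold, so a threshold that is too high leaves the small extra boxes around the correct one. A minimal sketch of the idea (not the TensorFlow implementation):

```python
# Minimal sketch of non-maximum suppression: a lower iou_threshold suppresses
# more of the overlapping "little boxes" around the highest-scoring detection.
# Boxes are (x_min, y_min, x_max, y_max); scores are the detection confidences.
import numpy as np

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    for i in order:
        # keep a box only if it does not overlap an already-kept box too much
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(int(i))
    return keep

boxes = np.array([[10, 10, 100, 100], [12, 14, 98, 102], [15, 8, 105, 95]])
scores = np.array([0.9, 0.75, 0.6])
print(nms(boxes, scores, iou_threshold=0.5))   # -> [0]: extra boxes suppressed
print(nms(boxes, scores, iou_threshold=0.95))  # -> [0, 1, 2]: extra boxes survive
```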