I trained my dataset for tensorflow object detection using both ssd and faster r-cnn model.There were 220 train and 30 test images in my dataset.
I trained the model for 200k steps and got loss under 1.But when i tested my trained model on video it was detecting and labelling almost everything in the video.
Can anyone tell me why is that happening?
Thank you
The number of classes you are using is just one and you trained your model with images belonging to the same class and tested it for the same.
So the problem is the model is skewed(predicts the same for all images)
No matter whatever image you test it on, you will get the same output.
Solution:
Train you model with an nearly equal number of negative images.
Ex:220 images containing the object to be identified(label them as 1) and another nearly 220 images not containing the object(label them as 0)
Use F1 score to check your accuracy because it will help you understand if the dataset is skewed or not.
Check this to learn about different kinds of accuracy measures.
Take this course to learn more about CNNs.
Related
I am using MRCNN in python to train 20 images (with annotated images info saved as json file) for object detection. The problem is that at the best case the loss is around 4 which it shows that the model has not learned well (the loss fluctuates a lot during the learning process for each epochs). Obviously, when using the trained model for detection the result is wrong, it means that it cannot detect the object and randomly selects some pixels as the object.
Can someone kindly help me how I can improve the performance and also some hints about initial weights if the object is not one of the objects in COCO database.
Thanks in advance.
There's something about GAN's training that i don't understand. I am making a GAN for Anomaly Detection. To start I followed this guide here to create a DCGAN (and understand how it works) and then move into the Anomaly Detection stuff.
I understand how the two training's phases work for GANs and after nearly 2000 epochs the generator generate some good fake images. The problem is that the discriminator is not good to detect anomalies: if i try to give in input a real image, it produce a value between 0.5 and 1, no matter if the image has anomaly or not.
So basically, the discriminator is good to distinguish real images from fake images, but not good to discriminate real images with anomalies.
I tried to train the model some more but the results won't change (instead, it seems worst than before!). The two losses keep varing around 0 and 1, for example now the model has:
gen_loss: 0.97844017, disc_loss: 0.9973822
What should I do to improve my net and make anomaly detection? It needs to be trained even more to make a better discriminator or for make anomaly detection should i add something more?
Thanks in advice, i'm definitely doing something wrong. If needed i can post some code and more information about my net.
P.S. My notebook is very similar to the one i linked before, the only difference is that i tried to feed test images to the discriminator after the training.
There is this interesting paper Efficient GAN-based anomaly detection.
To evaluate the anomaly detection, they use the following experimental setting
MNIST: We generated 10 different datasets from MNIST by successively
making each digit class an anomaly and treating the remaining 9 digits
as normal examples. The training set consists of 80% of the normal data
and the test set consists of the remaining 20% of normal data and all
of the anomalous data. All models were trained only with normal data and
tested with both normal and anomalous data.
I made a simple CNN that classifies dogs and cats and I want this CNN to detect images that aren't cats or dogs and this must be a third different class. How to implement this? should I use R-CNN or something else?
P.S I use Keras for CNN
What you want to do is called "transfer learning" using the learned weights of a net to solve a new problem.
Please note that this is very hard and acts under many constraints i.e. using a CNN that can detect cars to detect trucks is simpler than using a CNN trained to detect people to also detect cats.
In any case you would use your pre-trained model, load the weights and continue to train it with new data and examples.
Whether this is faster or indeed even better than simply training a new model on all desired classes depends on the actual implementation and problem.
Tl:Dr
Transfer learning is hard! Unless you know what you are doing or have a specific reason, just train a new model on all classes.
You can train that almost with the same architecture (of course it depends on this architecture, if it is already bad then it will not be useful on more classes too. I would suggest to use the state of the art model architecture for dogs and cats classification) but you will also need the dogs and cats dataset in addition to this third class dataset. Unfortunately, it is not possible to use pre-trained for making predictions between all 3 classes by only training on the third class later.
So, cut to short, you will need to have all three datasets and train the model from scratch if you want to make predictions between these three classes otherwise use the pre-trained and after training it on third class it can predict if some image belongs to this third class of not.
You should train with new_category by add one more category, it contains images that are not in 2 category before. I mean
--cat_dir
-*.jpg
--dog_dir
-*.jpg
--not_at_all_dir
-*.jpg
so.. total categories you will train are 3 categories.
(categories or classes whatever it is)
then add the output of final dense fullyconnected (3 categories)
I'm starting out with GANs and I am training a DC-GAN on MNIST dataset. I want to evaluate my model using Frechet Inception Distance (FID).
Since Inception network is not trained to classify MNIST digits, can I use any simple MNIST classifier or are there any conditions on what kind of classifier I need to use? Or should I use Inception net only? I have few other questions
Does it make sense to compute FID for MNIST GAN?
How many images from real dataset should be used while computing FID
For a classifier I'm using, I'm getting FID in the order of 10^6. Is the value okay or is something horribly wrong?
If you can answer any of these questions, even partially, that would be of immense help to me. Thanks!
you can refer this.
Use a auto-encoder trained on MNIST and the bottleneck activations as the features as explained here
Model trained on Mnist dont do well on FID computation. As far as I can tell, major reasons are data distribution is too narrow(Gan images are too far from distribution model is trained on) and model is not deep enough to learn a lot of feature variation.
Training a few-convolutional layers model gives 10^6 values on FID. To test the above hypothesis, just adding L2 regularization, the FID values dropped to around 3k(confirming to data distribution being narrow), however the FID value dont improve as GAN training goes on. :(.
Finally, directly computing FID values from Inception network gives a nice plot as images becomes better.
[Note:- You need to rescale mnist image and convert to RGB by repeating one channel 3 times. Make sure real image and generated image have same intensity scales.]
I'm currently using a custom version of YOLO v2 from pjreddie.com written with Tensorflow and Keras. I've successfully got the model to start and finish training over 100 epochs with 10000 training images and 2400 testing images which I randomly generated along with the associated JSON files all on some Titan X gpus with CUDA. I only wish to detect two classes. However, after leaving the training going, the loss function decreases but the test accuracy hovers at below 3%. All the images appear to be getting converted to black and white. The model seems to perform reasonably on one of the classes when using the training data, so the model appears overfitted. What can I do to my code to get the model to become accurate?
Okay, so it turned out that YOLOv2 was performing very well on unseen data except that the unseen data has to be the same size of images as the ones it's trained on. Don't feed Yolo with 800x800 images if it's been trained on 400x400and 300x400 images. Also the Keras accuracy measure is meaningless for detection. It might say 2% accuracy and actually be detecting all objects. Passing unseen data of the same size solved the problem.