I am using the TensorFlow retraining model for image classification, and I am doing single-label classification.
I want to set a threshold for correct classification.
In other words, if the highest probability is less than a given threshold, I can say that the image is "unknown", i.e. if np.max(results) < 0.5, set the label to "unknown".
So, is there any industry standard for setting this threshold? I can set an arbitrary value, say 60%, but is there any literature to back such a threshold?
Any links or references will be very helpful.
Thanks a lot.
One-class classification is not something neural networks can do "off-the-shelf".
How would you train it? With only data relevant to your target class? Then your model will only ever learn to output that one class.
You have two strategies:
You use the same strategy as in the "HotDog or Not HotDog" app: you put the whole ImageNet dataset into two folders, one containing the class you want, the other containing everything else.
You use the convnet as a feature extractor and then fit a second model such as a One-Class SVM (see the sketch below).
You have to understand that one-class classification is not as simple and direct a problem as binary classification.
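A minimal sketch of the second strategy, assuming a pre-trained Keras MobileNetV2 as the feature extractor and scikit-learn's OneClassSVM (the model choice, input size and nu/gamma values are just illustrative):

# Sketch only: extract features with a pre-trained convnet, then fit a One-Class SVM
# on features of the single "known" class. known_images / new_images are assumed to be
# float arrays of shape (n, 224, 224, 3).
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from sklearn.svm import OneClassSVM

extractor = MobileNetV2(weights="imagenet", include_top=False, pooling="avg")

def features(images):
    return extractor.predict(preprocess_input(images))

known_feats = features(known_images)          # images of the one class you care about
ocsvm = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(known_feats)

pred = ocsvm.predict(features(new_images))    # +1 = looks like the known class, -1 = outlier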
To explain the title better, I am looking to classify pictures between two classes. For example, let's say that 0 is white and 1 is black. I train and validate the system with pictures that are gray, some lighter than others. In other words, none of the training/validation (t/v) pictures are 0 and none are 1. The t/v pictures range between 0 and 1 depending on how dark the gray is.
Of course, this is just a hypothetical situation, but I want to apply a similar scenario to my work. All of the information I have found online is based on binary classification (either 1 or 0), rather than spectrum classification (between 1 and 0).
I assume that this is possible, but I have no idea where to start. I do have a binary classification model written with good accuracy, though.
Based on your example, maybe a classification approach is not the best one. I think what you have is a regression problem, as you want your output to be a continuous value in some range that is meaningful in itself (higher or lower values have their own meaning).
Regression tasks usually have an output with linear activation, and they expect a continuous value as the ground truth.
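A minimal sketch of such a regression setup in Keras (the architecture, input shape and layer sizes are just placeholders):

# Sketch only: a small convnet with a single linear output for a continuous "darkness" value.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 1)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="linear"),   # continuous output instead of a class
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
# model.fit(x_train, y_train, ...) where y_train holds continuous targets, e.g. values in [0, 1]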
I think you could start by taking a look at this tutorial.
Hope this helps!
If I understand you correctly, it's definitely possible.
The creator of Keras, François Chollet, wrote Deep Learning with Python, which is worth reading. In it he describes how you could accomplish what you would like.
I have worked through examples in his book and shared the code: whyboris/ml-with-python-and-keras
There are many approaches, but a fast one is to use a pre-trained model that can recognize a wide variety of images (for example, one that classifies 1,000 different categories). You use it "headless" (without the last classification layer, the one that takes the feature vectors and decides which of the 1,000 categories an image falls into most), and you train just the "last step" of the model (freezing all the previous layers) as your binary classifier.
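A minimal sketch of that "headless" approach in Keras, assuming a MobileNetV2 backbone (the base model, input size and layer choices are just one possibility):

# Sketch only: freeze a pre-trained backbone and train a new binary head on top of it.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

base = MobileNetV2(weights="imagenet", include_top=False, pooling="avg",
                   input_shape=(224, 224, 3))
base.trainable = False                       # freeze all pre-trained layers

model = models.Sequential([
    base,
    layers.Dense(1, activation="sigmoid"),   # the new "last step": a binary classifier
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, ...) with labels 0/1 for your two classes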
Alternatively, you could train your own classifier from scratch. Specifically, take a look at my example (based off the book), cat-dog-classifier, which trains its own binary classifier.
I was trying to tackle an ML problem with TensorFlow, but I'm not sure which algorithm I should use. I have tagged images in my dataset. When a new image comes in, I want to correlate it with the images I have, based on the tags. Where should I start? O.o
What do you mean by correlate the images? Are you attempting to cluster the images based on their tags?
If so, you could train an encoder that runs over your images and produces a feature vector, and then cluster those feature vectors based on their image tags. For example, suppose your images carry two tags: cars and cats. You could run an encoder (consisting of convolutional layers), flatten the final layer to get a feature vector, and run a clustering algorithm like K-means (with K=2, since you only have the 2 tags, cars and cats).
Depending on the size and nature of the images in your dataset, you might have to play around with the encoder architecture, collect more data, use alternate clustering algorithms, etc.
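A minimal sketch of that pipeline, assuming a pre-trained Keras convnet as the encoder and scikit-learn's KMeans (both choices, and the input size, are just illustrative):

# Sketch only: turn each image into a feature vector, then cluster the vectors.
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from sklearn.cluster import KMeans

encoder = MobileNetV2(weights="imagenet", include_top=False, pooling="avg")

def encode(images):
    # images: float array of shape (n, 224, 224, 3)
    return encoder.predict(preprocess_input(images))

features = encode(images)                                     # your tagged images as an array
kmeans = KMeans(n_clusters=2, random_state=0).fit(features)   # K=2 for cars & cats
new_clusters = kmeans.predict(encode(new_images))             # cluster index for each new image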
In the event that your image feature vector can belong to multiple classes and you would like to return possible tags, you'll have to opt for soft clustering algorithms such as GMMs (Gaussian Mixture Models) or FCMs (Fuzzy C-Means). These algorithms don't output a single class; they output a class score for each data point. So if you want the top 5 tags of a new image, you could (see the sketch after these steps):
Run an encoder to get a feature vector
Perform soft clustering on the feature vectors
Get the 5 highest scoring classes
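A minimal sketch of the soft-clustering variant, reusing the encode() helper and features array from the sketch above and scikit-learn's GaussianMixture (the number of components and the "top 5" are illustrative):

# Sketch only: soft clustering with a Gaussian Mixture Model, then take the highest-scoring clusters.
import numpy as np
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=10, random_state=0).fit(features)   # assume ~10 tags/clusters
scores = gmm.predict_proba(encode(new_image[None]))[0]                 # one score per cluster
top5 = np.argsort(scores)[::-1][:5]                                    # the 5 highest-scoring clusters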
I trained my CNN classifier (using TensorFlow) with 3 data categories (ID card, passport, bills).
When I test it with images that belong to one of the 3 categories, it gives the right prediction. However, when I test it with an unrelated image (a car image, for example) it still gives a prediction (i.e. it predicts that the car belongs to the ID card category).
Is there a way to make it display an error message instead of giving a wrong prediction?
This should be tackled differently. It is known as the open set recognition problem. You can google it and find out more, but basically it comes down to this:
You cannot train your classifier on every class imaginable. It will always run into some other class that it's not familiar with and that it hasn't seen before.
There are a few solutions, of which I will single out three:
1. Separate binary classifier - You can build a separate binary classifier that sorts images into two categories depending on whether a bill, passport or ID is in the image or not. If one of them is, it lets the algorithm you have already built process the image and classify it into one of the 3 categories. If the first classifier says that some other object is in the image, you can immediately discard it, because it's not an image of a bill/passport/ID.
2. Thresholding - When an ID is in the image, the probability of the ID class is high and the probabilities for bill and passport are fairly low. When the image shows something else (e.g. a car), the probabilities are most likely about the same for all 3 classes; none of them really stands out. By default the classifier still picks the highest of those probabilities and outputs that class, even if the winning probability is only 0.4 or so. To resolve this, you can set a threshold at, let's say, 0.7, and decide that if none of the probabilities is over that threshold, there is something else in the picture (not an ID, passport or bill) - see the sketch after this list.
3. Create a fourth class: unknown - If you pick this option, you should add a number of other images to the dataset and label them unknown. Then train the classifier and see what the result is.
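A minimal sketch of option 2, assuming the model outputs softmax probabilities over the three classes (the 0.7 threshold and the label names are just the example values from above):

# Sketch only: reject the prediction when no class probability clears the threshold.
import numpy as np

CLASS_NAMES = ["id_card", "passport", "bill"]    # assumed label order
THRESHOLD = 0.7

def classify(model, image):
    probs = model.predict(image[None])[0]        # softmax output for a single image
    if np.max(probs) < THRESHOLD:
        return "unknown"
    return CLASS_NAMES[int(np.argmax(probs))]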
I would recommend 1 or 2. Hope it helps :)
This is not really a programming problem, it's way more complicated. What you want is called out-of-distribution detection, where the classifier has a way to tell you that the sample is not from the training set.
There are recent research papers that deal with this problem, such as https://arxiv.org/abs/1802.04865 and https://arxiv.org/abs/1711.09325
In general you cannot use a model that has not been trained specifically for this; for example, the probabilities produced by a softmax classifier are not calibrated for this purpose, so thresholding these probabilities will not work at all.
The easiest way is to simply add a fourth category for anything but the other three and train it with a variety of completely random photos.
I was searching for the same solution and it brought me here. To solve this, I used the math.isclose() function to compare the values of my prediction.
import math

def check_distribution(self, prediction):
    # Keep only the probabilities that are (almost) exactly 1.0
    checker = [x for x in prediction[0] if math.isclose(1, x, abs_tol=1e-9)]
    # Treat the prediction as trustworthy only if at least one probability is close to 1
    return len(checker) > 0
Feel free to alter the abs_tol parameter depending on how brutal you want to be.
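For example, a hypothetical usage inside the same class (model, image, class_names and the numpy import are all assumed to exist elsewhere in your code):

# Sketch only: gate on the check before trusting the predicted class.
prediction = self.model.predict(image[None])        # Keras-style model, output shape (1, num_classes)
if self.check_distribution(prediction):
    label = class_names[int(np.argmax(prediction[0]))]
else:
    label = "unknown"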
I am working on a multi-label problem classifying images. I don't have enough data, so I am using transfer learning with a CNN as the feature extractor. As I have enough data for some of the classes, I have formulated the problem this way:
30 classes, with a 31st class for the "rest" of the images, so I can distinguish them.
The 31st "rest" class mostly drags my accuracy and other metrics down. I was thinking about creating a multi-output network in Keras, where one output would be a binary classification of whether an image is "good" or "rest", and the second output would be trained only if the first classifies the image as good.
I do understand that I will need to evaluate the second output too, as that's how computation graphs work, but is there a way to tell the layer: don't adapt on this bad example, based on the input from the other softmax?
Thanks
I think I understand, to an extent, what you want to achieve. The way to go about this would be to train two models: the first would be a binary classifier, "good" vs "rest", and its output, if "good", would be passed to the second, 30-class model. This is actually a fairly common way to approach problems like yours (see the sketch below).
I worked on a helmet detection problem earlier, and I noticed that instead of detecting helmets directly, it worked better if I detected persons with one model and passed those boxes to a classification model: "helmet" or "no helmet".
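A minimal sketch of that two-stage cascade, assuming two already-trained Keras models (the model names, threshold and label handling are illustrative):

# Sketch only: run the binary "good vs rest" gate first, then the 30-class model.
import numpy as np

def predict_cascade(gate_model, class_model, image, class_names, threshold=0.5):
    is_good = gate_model.predict(image[None])[0][0]   # sigmoid output of the binary model
    if is_good < threshold:
        return "rest"
    probs = class_model.predict(image[None])[0]       # softmax over the 30 "good" classes
    return class_names[int(np.argmax(probs))]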
I am new to image processing. For my project I am building an "image classifier using SVM". The idea for my final software is: I select some image and give it as input to my software, and it classifies that image. If I give it the image of an animal, it classifies it as a cat or a snake, as appropriate.
When I google this, it says "First you need to train the SVM".
What does it mean to train an SVM?
What is the actual input to the SVM in my case (image classification)?
SVM is just a classifier, so how does it classify images? Is it necessary for me to convert the image to any particular format? Please help.
A Support Vector Machine (SVM) is a machine learning model for supervised data classification. SVMs essentially learn a hyper-plane which separates the data space into 2 regions (in the 2-class case). In your case, suppose you have images of snakes and cats and you need to classify them. The steps you'll need to follow are:
Extract 'features' from the images.
These 'features' may be functions of the appearance of the snake/cat in your case, e.g. the colour of the animal, the shape of the animal, etc. By concatenating these features you get a multi-dimensional feature vector.
Train an SVM classifier
Training essentially learns a separating hyper-plane between the feature vectors of the snake class and the cat class. For example, if your feature vector is 2-dimensional, training an SVM classifier amounts to 'learning' a line which best separates your labeled training data.
You could use any of the multitude of freely available machine learning libraries. If you speak Python, you could use sklearn for the task (see the sketch at the end of this answer).
This task of learning (a hyper-plane, in the linear SVM case) is what is referred to as training.
Classify the images.
Once you have trained your model, you can then use it to classify images whose class is not known.
Note: I am simplifying a lot of the details/issues involved in this answer. I suggest you read up on SVMs.
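A minimal sketch of those three steps with scikit-learn, using flattened small grayscale images as a crude stand-in for real features (the feature choice, image size and the load_features_and_labels() helper are all assumptions):

# Sketch only: crude features, an SVM with a linear kernel, and prediction on unseen images.
from sklearn import svm
from sklearn.model_selection import train_test_split

# 1. Extract features: e.g. resize each image to 32x32 grayscale and flatten it to a
#    1024-dimensional vector. X has shape (n_images, 1024); y holds labels (0 = cat, 1 = snake).
X, y = load_features_and_labels()    # hypothetical helper you would write yourself

# 2. Train the SVM classifier (learn the separating hyper-plane).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = svm.SVC(kernel="linear")
clf.fit(X_train, y_train)

# 3. Classify unseen images.
print("accuracy:", clf.score(X_test, y_test))
print("prediction:", clf.predict(X_test[:1]))   # 0 -> cat, 1 -> snake (per the labels above)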