There's something about GAN training that I don't understand. I am building a GAN for anomaly detection. To start, I followed this guide here to create a DCGAN (and understand how it works) before moving on to the anomaly detection part.
I understand how the two training phases of a GAN work, and after nearly 2000 epochs the generator produces some good fake images. The problem is that the discriminator is not good at detecting anomalies: if I feed it a real image, it produces a value between 0.5 and 1, regardless of whether the image contains an anomaly or not.
So basically, the discriminator is good at distinguishing real images from fake images, but not at telling real images with anomalies apart from normal ones.
I tried training the model some more, but the results don't change (if anything, they seem worse than before!). The two losses keep oscillating between 0 and 1; for example, the model currently has:
gen_loss: 0.97844017, disc_loss: 0.9973822
What should I do to improve my network and perform anomaly detection? Does it simply need more training to get a better discriminator, or do I have to add something else for anomaly detection?
Thanks in advance; I'm definitely doing something wrong. If needed, I can post some code and more information about my network.
P.S. My notebook is very similar to the one I linked above; the only difference is that I feed test images to the discriminator after training.
There is an interesting paper on this, Efficient GAN-based anomaly detection.
To evaluate anomaly detection, they use the following experimental setting:
MNIST: We generated 10 different datasets from MNIST by successively
making each digit class an anomaly and treating the remaining 9 digits
as normal examples. The training set consists of 80% of the normal data
and the test set consists of the remaining 20% of normal data and all
of the anomalous data. All models were trained only with normal data and
tested with both normal and anomalous data.
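As a rough illustration of that protocol, here is a minimal sketch (assuming tensorflow.keras and NumPy are available; the function name and split logic are just an example) of how such a dataset could be built:

```python
import numpy as np
from tensorflow.keras.datasets import mnist

def make_anomaly_split(anomalous_digit, train_fraction=0.8, seed=0):
    """Treat one digit as the anomaly class and the remaining nine as normal."""
    (x_a, y_a), (x_b, y_b) = mnist.load_data()
    x = np.concatenate([x_a, x_b]).astype("float32") / 255.0
    y = np.concatenate([y_a, y_b])

    normal = x[y != anomalous_digit]
    anomalous = x[y == anomalous_digit]

    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(normal))
    n_train = int(train_fraction * len(normal))

    x_train = normal[idx[:n_train]]                     # 80% of the normal data for training
    x_test = np.concatenate([normal[idx[n_train:]],     # remaining 20% of the normal data
                             anomalous])                # plus all of the anomalous data
    y_test = np.concatenate([np.zeros(len(normal) - n_train),
                             np.ones(len(anomalous))])  # 1 = anomaly, used only for evaluation
    return x_train, x_test, y_test
```

The GAN is then trained only on x_train (normal data), and the anomaly score is evaluated on x_test against y_test.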
Related
I am currently training a CNN to classify ICs into the classes "scratch" and "no scratch" (binary classification). I am fairly new to deep learning. After training my CNN for a bit I got very good accuracies (good validation accuracy as well), but I quickly learned that my models were not as good as I thought, because when I used them on a separate test dataset they produced quite a lot of misclassifications (false positives and false negatives). In my opinion there are two problems:
There is too little training data (about 1000 images per class).
The ICs have markings (text) on them, which change with every production batch, so my training data contains images of ICs with varying markings. And since some batches have more scratched ICs and others have fewer or none, the number of IC images with the different markings is unbalanced.
Here are two example images of ICs from the training set of the class "scratch":
As you can see, the text varies a lot. Every line has different characters, and the number of characters also varies.
I ask myself how the CNN is supposed to differentiate between a scratch and a character.
Nevertheless, I am trying to train my CNN, and this is, for example, one model I recently trained (the other models look quite similar):
There are some points during training where the validation accuracy goes up and then down again. What could that mean? My guess is that there is a feature in the validation set that is not covered by my training set. Could this be the cause?
As you can see, data augmentation is not an option (or so I think) because of the text. One thing that came to my mind is to separate the marking from the IC (cut out the text region) with preprocessing (I don't know how to do this properly and fast) and then classify them separately, but I don't know if this would be the right approach.
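Something like the following sketch is what I have in mind (using OpenCV; the coordinates of the text region are purely hypothetical and would need to be determined per image or per IC type):

```python
import cv2
import numpy as np

def mask_marking(image_path, text_region=(40, 30, 200, 80)):
    """Blank out the (hypothetical) marking area so only the IC surface remains."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    x, y, w, h = text_region                        # assumed fixed position of the printed text
    masked = img.copy()
    masked[y:y + h, x:x + w] = int(np.median(img))  # fill with a neutral gray value
    return masked
```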
I first used VGG16, ResNet, and InceptionV3 (with transfer learning). Now I am trying to train my own custom CNN (inspired by VGG, but with 10 layers, similar to this: https://journals.sagepub.com/doi/full/10.1177/1558925019897396).
Do you know how I should approach this problem, or do you have any tips?
I am training a model to detect buildings in satellite images of rural Africa. For labels, I use OpenStreetMap geometries. I use the TensorFlow Object Detection API with SSD Inception V2 as the model and the default config file. I trained separate models on two different datasets (in different geographical regions). In one area, the model behaves as I would expect:
When training the model on the other area, however, its performance jumps up and down:
Note that I use the exact same model, configuration, and batch size, the training area is of the same size, etc. In the second case, the model's predictions change extremely rapidly and I cannot see why. For example, here is a comparison of the predictions the model makes at 107k and 108k global steps (i.e., I'd expect the predictions to be similar):
I am quite new to deep learning and cannot understand why this might happen. It might be something simple that I am overlooking. I've checked the labels and they are OK. I also thought it could be a bad batch that turns the training in the wrong direction every epoch, but this is not the case: the performance stays down like that for up to several epochs.
I would be very grateful for any tips on where to look. I am using TF 1.14.
Let me know if I should provide more information.
To explain the title better, I am looking to classify pictures between two classes. For example, let's say that white is 0 and black is 1. I train and validate the system with pictures that are gray, some lighter than others. In other words, none of the training/validation (t/v) pictures are 0, and none are 1. The t/v pictures range between 0 and 1 depending on how dark the gray is.
Of course, this is just a hypothetical situation, but I want to apply a similar scenario to my work. All the information I have found online covers binary classification (either 1 or 0), rather than classification along a spectrum (between 0 and 1).
I assume this is possible, but I have no idea where to start. I do, however, already have a binary classifier written that achieves good accuracy.
Based on your example, maybe a classification approach is not the best one. I think what you have is a regression problem, since you want your output to be a continuous value in some range that is meaningful in itself (higher and lower values have their own meaning).
Regression models usually use an output layer with linear activation and expect a continuous value as the ground truth.
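As a minimal sketch in Keras (the architecture and layer sizes here are just placeholders), it could look like this:

```python
from tensorflow.keras import layers, models

# Placeholder architecture: a small convolutional feature extractor
# followed by a single linear output unit for the continuous "darkness" value.
model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 1)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="linear"),  # continuous output instead of a class probability
])

# Mean squared error is the usual loss for regression targets in [0, 1].
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
```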
I think you could start by taking a look at this tutorial.
Hope this helps!
If I understand you correctly, it's definitely possible.
The creator of Keras, François Chollet, wrote Deep Learning with Python, which is worth reading. In it, he describes how you could accomplish what you would like.
I have worked through examples in his book and shared the code: whyboris/ml-with-python-and-keras
There are many approaches, but a fast one is to use a pre-trained model that can already recognize a wide variety of images (for example, one that classifies 1,000 different categories). You use it "headless" (without the final classification layer that takes the feature vectors and decides which of the 1,000 categories the image most likely belongs to), freeze all the previous layers, and train just the "last step" of the model as your binary classifier.
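A minimal sketch of that idea in Keras (assuming MobileNetV2 as the pre-trained base; any ImageNet model used headless would work the same way):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

# Pre-trained ImageNet feature extractor, used "headless" (include_top=False).
base = MobileNetV2(input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze all pre-trained layers

# Only this small head is trained for the binary task.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```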
Alternatively, you could train your own classifier from scratch. Specifically, take a glance at my example (based on the book), cat-dog-classifier, which trains its own binary classifier.
I am trying to apply transfer learning to a mobilenet_v2_coco model on the publicly available GTSRB (German Traffic Signs) dataset.
I selected 3 classes to keep training time short, and I've already trained for about 10,000 epochs. Usually I get decent results by this point, but my SSD fails to find anything in a live video stream that I access through a small Python program with my webcam. It even classifies almost the entire screen as one of the provided classes (the one with the most training data) with >90% confidence.
My guess is that this is either because of the unbalanced dataset (class1 = 2000 images, class2 = 1000 images, class3 = 800) or because the training images are almost entirely filled by the object, without much background or noise. So basically the ROI is almost as big as the dataset image, whereas the detector is meant to work on dash-cam-like videos, where the signs are usually very small.
Or do I just have to train harder and longer this time to get decent results?
The second part of my question is whether there is a rule of thumb for what the images in the dataset need to fulfil in order to produce good predictions.
https://github.com/wenxinxu/resnet-in-tensorflow#overall-structure
The link above is a ResNet model for CIFAR-10.
I am modifying the above code to do object detection using ResNet, with CIFAR-10 as the training/validation dataset. (I know the dataset is meant for object classification.) I know it sounds strange, but hear me out: I use CIFAR-10 for training and validation, and then during testing I use a sliding-window approach and classify each window into one of the 10 classes plus a "background" class.
For the background class, I use images from ImageNet. I search ImageNet with the following keywords: construction, landscape, byway, mountain, sky, ocean, furniture, forest, room, store, carpet, and floor. Then I clean out bad images as much as I can, including images that contain CIFAR-10 classes; for example, I delete a few "floor" images that have dogs in them.
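To make the test-time procedure concrete, this is roughly what I do (a simplified sketch; the window size, stride, and Keras-style predict call are placeholders for my actual setup):

```python
import numpy as np

def sliding_window_predict(image, model, window=32, stride=16):
    """Classify each 32x32 window into one of the 10 CIFAR-10 classes or background."""
    detections = []
    h, w, _ = image.shape
    for y in range(0, h - window + 1, stride):
        for x in range(0, w - window + 1, stride):
            patch = image[y:y + window, x:x + window]
            probs = model.predict(patch[np.newaxis])[0]  # 11 scores: 10 classes + background
            label = int(np.argmax(probs))
            if label != 10:                              # index 10 is the background class
                detections.append((x, y, label, float(probs[label])))
    return detections
```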
I am currently running this on FloydHub. The total number of steps I am running is 60,000, which is where the "training curve" section from the link above suggests that the results start to consolidate and do not converge further (I have run this code myself and can back up that claim).
My question is:
What is the cause of the sudden step down in the training and validation curves, which occurs at about the same step?
Is it possible for the training and validation curves not to step down at about the same step? What I mean is, for example, could the training error step down at around 40,000 steps while the validation error just converges smoothly, with no step-down?
The sudden step down is caused by the learning rate decay that happens at 40k steps (you can find this parameter in hyper_parameters.py). The learning rate suddenly gets divided by 10, which lets you tune the parameters more precisely and, in this case, improves your performance a lot. You still need the first part, with a fairly large learning rate, to get into a "good" region of parameter space; then the part with a 10x smaller learning rate refines the solution and finds a very good spot within that region.
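As an illustration of that kind of schedule (not the exact code from the repository), a piecewise-constant decay in TensorFlow/Keras could look like this; the boundaries and values are examples:

```python
import tensorflow as tf

# Piecewise-constant schedule: a large learning rate until step 40k,
# then 10x smaller (and 10x smaller again at 60k), producing the step-down in the curves.
schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[40_000, 60_000],
    values=[0.1, 0.01, 0.001],
)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)
```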
As for your second question: this would be surprising, since the change at 40k clearly affects training and validation in the same way. You could still see different behaviors from that point on: for instance, you might start overfitting once the learning rate is that small, and see your training error keep dropping while the validation error goes up, because the refinements you are making are too specific to the training data.