I have built a TensorFlow model to detect any type of violence in video. I trained it on approximately 2000 videos, splitting each one into frames.
But when I run the model on an unseen or real-time video, its predictions are wrong.
Can anyone tell me whether I have chosen suitable hidden layers, and whether there are any tweaks I can make to get correct predictions?
neural_v2.ipynb trains the model. test_v2.py loads the model, captures video, and runs the predictions.
If you need any more technical clarification please ask me.
If anyone can help in any way, I would really appreciate it.
Dataset Link
Code Link
Ideally, you would split your data into three sets: training, validation, and test (at the moment you are using your test data as your validation data).
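A minimal sketch of such a three-way split, assuming each video has an ID and that the split is done per video rather than per frame (so frames from one video do not end up in both training and test). The function name, the 70/15/15 fractions, and the integer IDs are illustrative assumptions, not taken from the linked code.

```python
import random

def split_by_video(video_ids, val_frac=0.15, test_frac=0.15, seed=0):
    """Split video IDs (not individual frames) into train/val/test lists."""
    ids = sorted(set(video_ids))
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_test = round(len(ids) * test_frac)
    n_val = round(len(ids) * val_frac)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test

# Hypothetical IDs 0..1999 standing in for roughly 2000 videos.
train_ids, val_ids, test_ids = split_by_video(range(2000))
print(len(train_ids), len(val_ids), len(test_ids))  # 1400 300 300
```

Frames would then be assigned to a set according to the video they came from, and the test set would only be touched once, after tuning on the validation set.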
As finko's answer suggests, I would try more epochs, but more importantly a denser model. Experiment with some state-of-the-art models (such as VGG16, ResNet152, or MobileNet). All of these are available as Keras applications (https://www.tensorflow.org/api_docs/python/tf/keras/applications).
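For illustration, here is a rough sketch of putting a small classification head on one of those Keras applications (MobileNetV2 here). The function name, input size, dropout rate, learning rate, and the single sigmoid output for "violence / no violence" are all assumptions; the original notebook may be organised differently.

```python
import tensorflow as tf

def build_frame_classifier(input_shape=(224, 224, 3), weights="imagenet"):
    """Binary frame classifier on top of a frozen MobileNetV2 backbone."""
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights
    )
    backbone.trainable = False  # keep the pretrained features fixed at first

    inputs = tf.keras.Input(shape=input_shape)
    # Assumes raw pixels in [0, 255]; MobileNetV2 expects inputs in [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = backbone(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_frame_classifier() downloads the ImageNet weights on first use; passing weights=None builds the same architecture with random initialisation, which is only useful for checking that the code runs.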
You could set epochs=50 and train again; that should give better results.
Has anyone successfully trained the method from this paper?
I have tried many times but always failed.
I really want to train this model for raw image denoising.
Could someone please help me?
The paper's GitHub repository is below:
https://github.com/zhangyi-3/IDR
I would like to use the dataset provided by the author, but I don't know its data structure or how to get started.
I am currently working on a system that extracts certain features from 3D objects (voxel grids, to be precise). I would like to compare those features with automatically learned features in terms of classification performance in a TensorFlow CNN on some other data, but that is just background and not the point here.
My idea is to take a dataset (ModelNet10), train a TensorFlow CNN to classify it, and then use what it learned on my own dataset: not to classify, but to extract features.
So I want to throw away everything the CNN does except for the features it extracts from the objects.
Is there any way to get these features, and how would I do that? I really have no idea.
Yes, it is possible to train models exclusively for feature extraction. This is called transfer learning: you can either train your own model and then extract features from it, or extract features from a pre-trained model and use them in your task, provided your task is similar in nature to the one the pre-trained model was trained for. There is a lot of material online on these topics; the links below give details on how to go about it:
https://keras.io/api/applications/
https://keras.io/guides/transfer_learning/
https://machinelearningmastery.com/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/
https://www.pyimagesearch.com/2019/05/27/keras-feature-extraction-on-large-datasets-with-deep-learning/
https://www.kaggle.com/angqx95/feature-extractor-fine-tuning-with-keras
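As a minimal sketch of the first option (train your own classifier, then reuse an intermediate layer as the feature extractor): the small Conv3D architecture, the 32x32x32 grid size, the layer names, and the random placeholder data are all assumptions for illustration, and the training call is left as a comment.

```python
import numpy as np
import tensorflow as tf

def build_voxel_cnn(grid=32, num_classes=10):
    """Small 3D CNN classifier with a named penultimate 'features' layer."""
    inputs = tf.keras.Input(shape=(grid, grid, grid, 1))
    x = tf.keras.layers.Conv3D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling3D(2)(x)
    x = tf.keras.layers.Conv3D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling3D()(x)
    feats = tf.keras.layers.Dense(64, activation="relu", name="features")(x)
    outputs = tf.keras.layers.Dense(
        num_classes, activation="softmax", name="classifier"
    )(feats)
    return tf.keras.Model(inputs, outputs)

classifier = build_voxel_cnn()
# Train on ModelNet10 here, e.g. classifier.compile(...); classifier.fit(...)

# Reuse the trained layers up to 'features' and drop the classification head.
feature_extractor = tf.keras.Model(
    inputs=classifier.input,
    outputs=classifier.get_layer("features").output,
)

my_voxels = np.random.rand(4, 32, 32, 32, 1).astype("float32")  # placeholder
features = feature_extractor.predict(my_voxels, verbose=0)
print(features.shape)  # (4, 64)
```

The resulting 64-dimensional vectors can then be compared against the hand-designed features with whatever downstream classifier is used for the comparison.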
I googled this before asking, but there doesn't seem to be much direct mention of these modes. The TensorFlow documentation mentions a "test" mode in passing, which, on further reading, didn't make much sense to me.
From what I've gathered, my best guess is that, to reduce RAM usage, when your model is in prediction mode you just use a pretrained model to make predictions based on your input?
If someone could help me understand this, I would be extremely grateful.
Training refers to the part where your neural network learns. By learning I mean how your model changes its weights to improve its performance on a task given a dataset. This is achieved using the backpropagation algorithm.
Predicting, on the other hand, does not involve any learning. It is only to see how well your model performs after it has been trained. There are no changes made to the model when it is in prediction mode.
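To make the difference concrete, here is a toy sketch in plain Python, assuming a one-weight linear model instead of a real network (so the gradient is written out by hand rather than computed by backpropagation). The training step returns updated weights; the prediction step only reads them.

```python
def predict(w, b, x):
    """Prediction: uses the weights, never changes them."""
    return w * x + b

def train_step(w, b, xs, ys, lr=0.05):
    """Training: one gradient-descent update on mean squared error."""
    n = len(xs)
    grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

w, b = 0.0, 0.0
for _ in range(2000):          # training mode: weights change every step
    w, b = train_step(w, b, xs, ys)

w_before, b_before = w, b
y_hat = predict(w, b, 4.0)     # prediction mode: weights are left untouched
assert (w, b) == (w_before, b_before)
print(round(w, 2), round(b, 2), round(y_hat, 2))
```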
I am looking to train a large model (ResNet or VGG) for face identification.
Is it a valid strategy to train on a few faces (1 to 3) to validate a model?
In other words, if a model learns one face well, is that evidence that the model is good for the task?
The point is that I don't want to spend a week of expensive GPU time only to find out that my model is no good, my data has errors, or my TF code has a bug.
Short answer: No, because deep learning works well on huge amounts of data.
Long answer: No. The problem is that learning only one face could overfit your model to that specific face, without it learning features that are not present in your examples. For example, the model may learn to detect your face thanks to a specific, very simple pattern in that face (this is called overfitting).
As a deliberately simple example: your model learns to detect that face because there is a mole on your right cheek, and what it has really learned is to identify the mole.
To make your model perform well in the general case, you need a huge amount of data, so that the model can learn different kinds of patterns.
Suggestion:
Because training a deep neural network is time consuming, one usually does not train a single network at a time; instead, many networks are trained in parallel with different hyperparameters (layers, nodes, activation functions, learning rate, etc.).
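A small sketch of how such a set of hyperparameter combinations could be enumerated before handing each one to its own training job. The search-space values and the make_configs helper are made up for illustration; how the jobs are actually run in parallel (separate GPUs, a cluster scheduler, and so on) is left open.

```python
from itertools import product

# Hypothetical search space; the values are placeholders.
search_space = {
    "learning_rate": [1e-2, 1e-3],
    "num_layers": [2, 4],
    "activation": ["relu", "tanh"],
}

def make_configs(space):
    """Return one dict per combination of hyperparameter values."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in product(*(space[k] for k in keys))]

configs = make_configs(search_space)
print(len(configs))  # 8
print(configs[0])    # {'learning_rate': 0.01, 'num_layers': 2, 'activation': 'relu'}
```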
Edit because of the discussion below:
If your dataset is small, it is nearly impossible to get good performance in the general case, because the neural network will learn the easiest pattern, which is usually not the most general or best one.
By adding data you force the neural network to extract good patterns that work in the general case.
It's a tradeoff, but training on a small dataset usually does not lead to a good classifier in the general case.
Edit 2: rephrasing everything to make it clearer. Good performance on a small dataset doesn't tell you whether your model will be good when trained on the whole dataset. That's why you train on the majority of your dataset and test/validate on a smaller part of it.
For face recognition, a siamese network or a triplet loss is usually used. This is an approach to one-shot learning, which means it can perform really well given only a few examples per class (here, per person's face), but you still need to train it on many examples (many different people's faces). See for example:
https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d
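For reference, a minimal sketch of the triplet loss itself in plain Python, assuming squared Euclidean distance between embeddings as in the FaceNet formulation. The margin value and the tiny 2-D "embeddings" are placeholders; in practice the embeddings come from the network and the loss is averaged over many triplets.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther from the anchor than the
    positive by at least `margin`; positive otherwise."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

anchor = [0.0, 0.0]      # embedding of person A, image 1
positive = [0.1, 0.0]    # embedding of person A, image 2
far_negative = [1.0, 0.0]    # a different person, already well separated
near_negative = [0.2, 0.0]   # a different person, too close to the anchor

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # roughly 0.17
```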
In any case, you wouldn't train your model from scratch; you would use a pretrained model and fine-tune it for your task.
You could also have a look at pretrained face recognition models, such as FaceNet, for better results:
https://github.com/davidsandberg/facenet
I have a model that was trained on a large training corpus. I also have a feedback loop that gives me feedback from users. The model is built on top of Theano and Python.
How can I add this feedback to my model? Right now I am considering two approaches:
Add the mini-batch to the training corpus and train again. This is straightforward, but it will take a lot of time to train.
Use the saved state of the trained model and train only on the mini-batch. This looks promising, but I am stuck on how to do it with Theano.
Can someone help me with the second case?
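A framework-agnostic sketch of the second approach, under stated assumptions: the toy linear model, the pickle file name, the learning rate, and the feedback values are all made up for illustration. In Theano the parameters would normally live in shared variables, whose values can be read with get_value() and restored with set_value(); the idea is the same, restore the saved parameters and take a few small gradient steps on the feedback mini-batch only.

```python
import os
import pickle
import tempfile

def mse(params, batch):
    """Mean squared error of y = w*x + b on a list of (x, y) pairs."""
    return sum((params["w"] * x + params["b"] - y) ** 2
               for x, y in batch) / len(batch)

def sgd_step(params, batch, lr):
    """One gradient-descent update on the given mini-batch."""
    w, b = params["w"], params["b"]
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return {"w": w - lr * grad_w, "b": b - lr * grad_b}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_params.pkl")  # hypothetical file name
    # Stand-in for the state saved after the original, expensive training.
    with open(path, "wb") as f:
        pickle.dump({"w": 2.0, "b": 1.0}, f)
    # Later: restore the saved state instead of retraining from scratch.
    with open(path, "rb") as f:
        params = pickle.load(f)

feedback_batch = [(1.0, 3.5), (2.0, 5.5)]  # hypothetical user feedback
loss_before = mse(params, feedback_batch)
for _ in range(20):  # a few small steps on the feedback only
    params = sgd_step(params, feedback_batch, lr=0.01)
loss_after = mse(params, feedback_batch)
print(loss_before, loss_after)
```

A small learning rate and a limited number of steps are used here on the assumption that the feedback should nudge the model rather than overwrite what it learned from the full corpus; checking performance on held-out data from the original corpus after each update would show whether that holds.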