I have a model trained on a large training corpus, and I also have a feedback loop that provides feedback from users. The model is built on top of Theano and Python.
How can I add this feedback into my model? Right now I am thinking about two approaches:
Add the mini-batch to the training corpus and retrain from scratch. This is straightforward, but it will take a lot of time to train.
Use the saved state of the trained model and train only on the mini-batch. This looks promising, but right now I am stuck on how to do it with Theano (see the sketch below for what I have in mind).
Can someone help me with the second case?
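To be concrete, here is a minimal sketch of what I imagine the second approach looks like, assuming the model parameters are Theano shared variables and the compiled training function from the original run is still available (`params`, `train_fn`, the pickle filename, and `feedback_x`/`feedback_y` are placeholders for my own code):

```python
import pickle

# --- after the initial large-corpus training: save the parameter values ---
# `params` is the list of theano.shared variables holding the model weights
with open('model_params.pkl', 'wb') as f:
    pickle.dump([p.get_value() for p in params], f)

# --- later: restore the saved state into the same (recompiled) model ---
with open('model_params.pkl', 'rb') as f:
    saved_values = pickle.load(f)
for p, value in zip(params, saved_values):
    p.set_value(value)

# --- continue training only on the feedback mini-batch ---
# `train_fn` is the compiled Theano function that performs one gradient step;
# it operates on the shared variables, so no recompilation is needed.
for epoch in range(5):                        # a few passes over the mini-batch
    cost = train_fn(feedback_x, feedback_y)   # feedback_x/feedback_y: user feedback data
```

Is this the right way to do it, or is there a better pattern?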
I am new to reinforcement learning, and currently I am working on a small Q-learning project, but I am a little confused.
1) What is the testing phase of a Q-learning model, and how do we make a prediction (try it on a single, unseen state) with it? At this point I have created the functions needed for choosing an action, getting the reward, etc., and I was able to run 10,000 episodes, but I believe this is only the training phase (see my sketch at the end of this question).
2) What metrics do we use to say that our model has learned and performs well or not, something like accuracy in a classification setting for example?
Thank you.
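Edit: to make my confusion concrete, here is roughly how I currently picture the two phases, using a toy corridor environment and a tabular Q table as a stand-in for my actual project (all names here are made up for illustration):

```python
import random
from collections import defaultdict

# Toy 1-D corridor environment: start at 0, reach position 4 for reward +1.
GOAL = 4
ACTIONS = [-1, +1]

def step(state, action):
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

Q = defaultdict(float)          # Q[(state, action)] -> value

def choose_action(state, epsilon):
    # epsilon-greedy: explore during training, purely greedy when epsilon = 0
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

# --- training phase: explore and update Q after every step ---
alpha, gamma = 0.1, 0.9
for episode in range(1000):
    state, done = 0, False
    while not done:
        action = choose_action(state, epsilon=0.1)
        next_state, reward, done = step(state, action)
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# --- "testing" phase: greedy policy, no Q updates, just measure performance ---
total_steps = 0
for episode in range(100):
    state, done, steps = 0, False, 0
    while not done and steps < 50:
        action = choose_action(state, epsilon=0.0)   # no exploration
        state, _, done = step(state, action)
        steps += 1
    total_steps += steps
print('average steps to goal:', total_steps / 100)
```

Is the second loop what is meant by the testing phase, and is something like average steps (or total reward) per greedy episode a reasonable metric?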
I've googled this before asking, obviously; however, there doesn't seem to be much direct mention of these modes. The TensorFlow documentation mentions "test" mode in passing, which, upon further reading, didn't make much sense to me.
From what I've gathered, my best guess is that, to reduce RAM usage, prediction mode simply uses an already trained model to make predictions based on your input. Is that right?
If someone could help me understand this, I would be extremely grateful.
Training refers to the part where your neural network learns. By learning I mean how your model changes its weights to improve its performance on a task given a dataset. This is achieved using the backpropagation algorithm.
Predicting, on the other hand, does not involve any learning. It is only used to see how well your model performs after it has been trained. No changes are made to the model when it is in prediction mode.
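To make the distinction concrete, here is a minimal Keras sketch (the tiny model and the random data are just placeholders):

```python
import numpy as np
from tensorflow import keras

# toy data, only to make the example self-contained
x_train = np.random.rand(100, 4)
y_train = np.random.randint(0, 2, size=(100,))

model = keras.Sequential([
    keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# Training: backpropagation updates the weights
model.fit(x_train, y_train, epochs=5, verbose=0)

# Prediction (inference): only a forward pass is run, the weights stay fixed
x_new = np.random.rand(3, 4)
predictions = model.predict(x_new)
```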
I am using LSTMs for time-series prediction with Keras. I use 3 LSTM layers with dropout=0.3, hence my training loss is higher than my validation loss. To monitor convergence, I plot the training loss and validation loss together. The results look like the following.
After researching the topic, I have seen multiple answers (for example the two listed below), but I have found several contradictory arguments in various places on the internet, which makes me a little confused. I am listing some of them here:
1) The articles by Jason Brownlee suggest that the validation and training curves should meet for convergence, and if they don't, I might be under-fitting the data.
https://machinelearningmastery.com/diagnose-overfitting-underfitting-lstm-models/
https://machinelearningmastery.com/learning-curves-for-diagnosing-machine-learning-model-performance/
2) However, the following answer here suggests that my model has simply converged:
How do we analyse a loss vs epochs graph?
Hence, I am just a bit confused about the whole concept in general. Any help will be appreciated.
Convergence implies you have something to converge to. For a learning system to converge, you would need to know the right model beforehand. Then you would train your model until it was the same as the right model. At that point you could say the model converged! ... but the whole point of machine learning is that we don't know the right model to begin with.
So when do you stop training? In practice, you stop when the model works well enough to do what you want it to do. This might be when validation error drops below a certain threshold. It might just be when you can't afford any more computing power. It's really up to you.
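For example, the "stop when validation error stops improving" rule is commonly expressed in Keras with an EarlyStopping callback. A sketch, assuming you already have a compiled model and training arrays x_train/y_train:

```python
from tensorflow import keras

# Stop training once validation loss has not improved for 10 epochs,
# and roll back to the weights from the best epoch.
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
)

history = model.fit(
    x_train, y_train,
    validation_split=0.2,
    epochs=500,                # an upper bound; training usually stops earlier
    callbacks=[early_stop],
)
```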
I am looking to train a large model (ResNet or VGG) for face identification.
Is it a valid strategy to train on a few faces (1-3) to validate a model?
In other words, if a model learns one face well, is that evidence that the model is good for the task?
The point here is that I don't want to spend a week of expensive GPU time only to find out that my model is no good, my data has errors, or my TF code has a bug.
Short answer: No, because deep learning works well on huge amounts of data.
Long answer: No. The problem is that learning only one face could overfit your model on that specific face, without it learning features that are not present in your examples. For example, the model might learn to detect your face thanks to a specific, very simple pattern in that face (that's called overfitting).
As an overly simple example, your model might learn to detect that face only because there is a mole on the right cheek, and it learns to identify the mole rather than the face.
To make your model perform well in the general case, you need a huge amount of data, so that your model is able to learn many different kinds of patterns.
Suggestion:
Because training a deep neural network is a time-consuming task, usually one does not train a single neural network at a time; many neural networks are trained in parallel with different hyperparameters (layers, nodes, activation functions, learning rate, etc.), as in the sketch below.
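A rough sketch of that idea (shown sequentially here for simplicity; in practice each configuration would typically run as its own job or on its own GPU, and build_model, input_dim, x_train, and y_train are placeholders):

```python
from itertools import product
from tensorflow import keras

def build_model(n_layers, n_units, learning_rate):
    # hypothetical builder for a small fully connected network
    model = keras.Sequential()
    model.add(keras.layers.Dense(n_units, activation='relu',
                                 input_shape=(input_dim,)))   # input_dim: your feature count
    for _ in range(n_layers - 1):
        model.add(keras.layers.Dense(n_units, activation='relu'))
    model.add(keras.layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer=keras.optimizers.Adam(learning_rate),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# try every combination of a small hyperparameter grid
results = {}
for n_layers, n_units, lr in product([2, 3], [64, 128], [1e-3, 1e-4]):
    model = build_model(n_layers, n_units, lr)
    history = model.fit(x_train, y_train, validation_split=0.2,
                        epochs=20, verbose=0)
    results[(n_layers, n_units, lr)] = min(history.history['val_loss'])

best_config = min(results, key=results.get)
```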
Edit because of the discussion below:
If your dataset is small, it is quite hard to get good performance in the general case, because the neural network will learn the easiest pattern, which is usually not the general/best one.
By adding data you force the neural network to extract good patterns that work in the general case.
It's a tradeoff, but usually training on a small dataset will not lead to a good classifier for the general case.
Edit 2: rephrasing everything to make it clearer. Good performance on a small dataset doesn't tell you whether your model, when trained on the whole dataset, will be a good model. That's why you train on the majority of your dataset and test/validate on a smaller one.
For face recognition, a Siamese network or triplet loss is usually used. This is an approach to one-shot learning, which means it can perform really well given only a few examples per class (a person's face here), but you still need to train it on many examples (faces of different people). See for example:
https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d
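A minimal sketch of the triplet loss idea, assuming the network maps each face image to an embedding vector (the margin value is just a typical example):

```python
import tensorflow as tf

def triplet_loss(anchor, positive, negative, margin=0.2):
    # anchor/positive: embeddings of two images of the same person,
    # negative: embedding of a different person
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    # push same-person pairs closer than different-person pairs by `margin`
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))
```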
You wouldn't train your model from scratch anyway; you would use a pretrained model and fine-tune it for your task.
You could also have a look at pretrained face recognition models, such as FaceNet, for better results:
https://github.com/davidsandberg/facenet
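As a sketch of the fine-tuning idea, here is a generic Keras example that swaps in an ImageNet-pretrained ResNet50 backbone (not a dedicated face model; num_identities and the training arrays are placeholders):

```python
from tensorflow import keras

num_identities = 100          # placeholder: number of people in your dataset

# Pretrained backbone without its classification head
base = keras.applications.ResNet50(weights='imagenet', include_top=False,
                                   pooling='avg', input_shape=(224, 224, 3))
base.trainable = False        # freeze the pretrained weights at first

# New head for your own identities
model = keras.Sequential([
    base,
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dense(num_identities, activation='softmax'),
])
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# model.fit(train_images, train_labels, epochs=...)   # fine-tune the head
# Optionally unfreeze part of `base` later and train with a lower learning rate.
```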
I'm working on a machine learning classification task in which I have trained many models with different algorithms in scikit-learn, and RandomForestClassifier performed the best. Now I want to train the model further with new examples, but if I call the fit method on the new examples, it will retrain the model from the beginning and erase the old parameters.
So, how can I continue training an already trained model with new examples in scikit-learn?
I got the idea from reading online to pickle and unpickle the model, but I don't see how that would help.
You should use incremental learning and estimators implementing the partial_fit API.
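RandomForestClassifier does not implement partial_fit, but estimators such as SGDClassifier do. A sketch with placeholder data:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# placeholder batches standing in for the original data and the new examples
X_batch1, y_batch1 = np.random.rand(200, 5), np.random.randint(0, 2, 200)
X_new, y_new = np.random.rand(50, 5), np.random.randint(0, 2, 50)

clf = SGDClassifier()

# the full set of classes must be given on the first call to partial_fit
clf.partial_fit(X_batch1, y_batch1, classes=np.array([0, 1]))

# later, when new examples arrive, keep training without starting over
clf.partial_fit(X_new, y_new)
```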
RandomForestClassifier has a warm_start flag. Note that this will not give the same results as training on both sets at once.
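A sketch of the warm_start pattern with placeholder data; the extra trees grown after the warm start only see the new examples, which is why the result differs from training on everything at once:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# placeholder data standing in for the original and the new examples
X_old, y_old = np.random.rand(200, 5), np.random.randint(0, 2, 200)
X_new, y_new = np.random.rand(50, 5), np.random.randint(0, 2, 50)

clf = RandomForestClassifier(n_estimators=100, warm_start=True)
clf.fit(X_old, y_old)          # initial training on the original data

clf.n_estimators += 50         # grow the forest instead of rebuilding it
clf.fit(X_new, y_new)          # the 50 extra trees see only the new data
```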
Append the new data to your existing dataset and train over the whole thing. You might want to reserve some of the new data for your test set.
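A sketch of that approach with placeholder arrays (the split fraction is arbitrary):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# placeholder arrays standing in for the existing and the newly collected data
X_old, y_old = np.random.rand(200, 5), np.random.randint(0, 2, 200)
X_new, y_new = np.random.rand(50, 5), np.random.randint(0, 2, 50)

# keep part of the new data aside as an updated test set
X_new_train, X_new_test, y_new_train, y_new_test = train_test_split(
    X_new, y_new, test_size=0.2)

# merge old and new training data and retrain from scratch
X_all = np.concatenate([X_old, X_new_train])
y_all = np.concatenate([y_old, y_new_train])

clf = RandomForestClassifier(n_estimators=100).fit(X_all, y_all)
print(clf.score(X_new_test, y_new_test))
```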