I'm working on a machine learning classification task in which I have trained many models with different algorithms in scikit-learn and Random Forest Classifier performed the best. Now I want to train the model further with new examples but if I train the same model by calling the fit method on new examples then it will start training the model from beginning by erasing the old parameters.
So, how can I train the trained model by training it with new examples in scikit-learn?
I got some idea by reading online to pickle and unpickle the model but how would it help I don't know.
You should use incremental learning and estimators implementing the partial_fit API.
RandomForrestClassifier has a flag warm_start. Note that this will not give the same results as if you train on both sets at once.
Append the new data to your existing dataset, and train over the whole thing. Might want to reserve some of the new data for your testset.
Related
I am trying to build a machine learning model in python. I used pytorch and sklearn to make the model. My model is a bit complicated: I have one input feature but several target variables. My target variables are values making a curve and I used each value of the curve as a different feature. I showed five different curves in the upladed figure.
I used algorithms like DecisionTreeRegressor and RandomeForestRegressor to fit the only input variable to several target variables. But the prediction of trained model is not so well for extrapolation. The trained model can create the a series of data but not so accure. Does anyone know such trained model in Python? I tried hyperparameter tuning using GridSearchCV but it did not help me.
In advance I do appreciate your help and feedback.
I am working on a project involving neural machine translation (translating English to French).
I have worked through some examples online, and have now finished the model. Once a model is trained using Keras, how do I then get a prediction of a translation without training the entire model again, because with the large dataset I am using, each epoch takes some time and of course, I can't train the model every time I want a translation.
So what is the correct way of then generating predictions on new inputs without training the whole model again?
Thanks
You need to save your model the model and its weights when the fit ends using :
keras.model.save(model_name)
At any time, you can load your trained model using
model = keras.load(model_name)
then perform predictions as
y_pred = model.predict(x_test)
Hope this will be helpful
You can use the .predict() function which you can pass new inputs into it and it give you a prediction. The docs for this function are here: keras
I am following this tutorial:
https://www.tensorflow.org/tutorials/keras/text_classification_with_hub
It only goes up to fitting the model but I couldn't find how to use the model on a new dataset to classify unlabeled data. I tried following other tutorials but I couldn't get them to work since they might not be text based.
model.add(tf.keras.layers.Dense(1))
I run into an issue where I try to set the layer to 2 for positive, negative but that doesn't work either.
I think you misunderstood the purpose of that tutorial. That tutorial is applying the use of what is known as "transfer learning". Transfer Learning is when you take an already trained model, and train it with other data. What you are doing is creating an entirely new model, which is not the purpose of that specific tutorial. Furthermore, for that model you need a labeled dataset, which is provided in the tutorial using the Tensorflow Datasets library. To accomplish what you are trying to do, you must look at a different tutorial explaining how to train an LSTM model for text classification from scratch.
First of all thank you for taking your time to read my question. I have done a Machine Learning model with a dataset (The famous one about Cancer) and I want to know how can I do to predict the results for new variables. I think that I have to keep training the data (often) to have more accured data to use in my prediction but for predicting new data, ¿Is as simple as changing the test data (y variable) to the new data?
Thank you so much for taking your time and any help would be appreciate it.
You are probably using the SVC class from sklearn.svm.
After fitting the model with the fit method you can predict new data with the predict method. See here.
By the way: For Support Vector Machines you don't have to fit your data multiple times. Maybe you are confusing that with neural networks.
If you are talking in the sense that you are changing the number of features in your test data then you cannot do that.
The number of features has to be the same in training and test set.
However, if your test data have some class of categorical variable which was not there in training data then its better you train your model with one extra category as "NONE" of "Others" for all your features.
This way when you encounter new class of categorical variable in your test data then you changed it to "NONE" or "Others" and do prediction on your trained model.
This way it will not break your model.
I hope I understand your question correctly.
I have a model which is trained on the large training corpus. I also have a feedback loop which is providing me the feedback from the users. Model is built on top of Theano and Python.
How can I add this feedback into my model? Right now I am thinking about two approaches :
Add mini-batch to the training corpus and training it again. This is straight forward but it will take a lot of time to train.
Use the saved state of trained model and just train on the mini-batch. This looks promising but right now stuck in how to do it with Theano.
Can someone help me for the second case?