I am trying to update my already trained model so that it can correct the errors it is making. To do that, I partially fit the model on the samples it mislabeled, using the corrected labels.
I saved my naive Bayes model to a file like this:
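import joblib
from sklearn.naive_bayes import MultinomialNB
model1 = MultinomialNB()  # naive Bayes model
model1.partial_fit(features_matrix, label_matrix, classes=[0, 1, 2])  # all classes must be listed on the first call
filename = 'trained_NBmodel.pkl'  # file that stores the trained model
joblib.dump(model1, filename)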
and then load it in another file like this:
loaded_model = joblib.load('trained_NBmodel.pkl')
loaded_model.partial_fit(new_features_matrix, new_label_matrix)
filename = 'trained_NBmodel.pkl' #saving the trained model
joblib.dump(loaded_model, filename)
Now the updated model should be used, and when new_features_matrix is passed to predict, it should return new_label_matrix with high accuracy, but it does not. It returns the same labels as it did before refitting. Could it be that I trained the initial model so many times on similar data with different labels that it cannot learn from a much smaller amount of data?
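One possible explanation, offered as a hedged sketch rather than a diagnosis: MultinomialNB.partial_fit adds the new samples to the per-class counts it has already accumulated, so a handful of corrected samples can be outweighed by the many earlier samples that carried the old label. The example below uses made-up data (the feature patterns, the sample counts, and the weight of 100 are all assumptions) to show the effect, and one way to make the corrections count more through the sample_weight argument of partial_fit.

```python
import copy

import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Made-up training data: pattern_a was seen 200 times with label 0,
# pattern_b 200 times with label 1. Class 2 has not been seen yet.
pattern_a = np.array([5, 1, 0])
pattern_b = np.array([0, 1, 5])
X_old = np.vstack([np.tile(pattern_a, (200, 1)), np.tile(pattern_b, (200, 1))])
y_old = np.array([0] * 200 + [1] * 200)

base = MultinomialNB()
base.partial_fit(X_old, y_old, classes=[0, 1, 2])
pred_before = int(base.predict([pattern_a])[0])

# Correction: pattern_a should really be class 2, but only 5 samples exist.
X_fix = np.tile(pattern_a, (5, 1))
y_fix = np.full(5, 2)

# Plain partial_fit: 5 new samples compete with 200 old ones.
plain = copy.deepcopy(base)
plain.partial_fit(X_fix, y_fix)
pred_plain = int(plain.predict([pattern_a])[0])

# Same correction, but each corrected sample is weighted 100 times.
weighted = copy.deepcopy(base)
weighted.partial_fit(X_fix, y_fix, sample_weight=np.full(5, 100.0))
pred_weighted = int(weighted.predict([pattern_a])[0])

print(plain.class_count_)  # per-class sample counts after the plain update
print(pred_before, pred_plain, pred_weighted)
```

Even with weighting, the old counts for the wrong label stay in the model. If the original training data is still available, refitting from scratch with the corrected labels avoids keeping those contradictory counts.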
Related
I'm building a statistical algorithm. I preprocess the data using PCA (sklearn.decomposition.PCA) and then apply a classification model (for example an MLP, sklearn.neural_network.MLPClassifier) to predict the category. I first fit the model and test it, and it works well. Then I save the models using the pickle module:
import pickle
with open(path+'/methodes/PCA_fitted_model.sav','wb') as file:
    pickle.dump(pca, file)  # the with block closes the file automatically
with open(path+'/methodes/MLP_fitted_model.sav','wb') as file:
    pickle.dump(mlp, file)
I have a problem when I reload the model to predict the category of new data. I know its true category (it is test data), and the predicted category is the exact opposite of the true one (binary classification). I have checked that the PCA preprocessing of the data is correct. Is this due to pickle, or is it something else?
I reload the classifier using:
with open(path+'/classifier/MLP_trained_model.sav','rb') as file:
    MLP = pickle.load(file)
and then make the prediction using:
prediction = MLP.predict(pca_data)
where pca_data is the data after the PCA preprocessing.
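Pickle normally restores a fitted scikit-learn estimator unchanged when the library version is the same, so one thing worth ruling out is a mismatch between the preprocessing used at training time and at prediction time, for example refitting the PCA on the new data or loading a different file than the one that was saved. A minimal sketch, assuming synthetic data from make_classification and arbitrary hyperparameters: wrapping both steps in a Pipeline keeps the fitted PCA and the classifier together in one pickled object.

```python
import pickle

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# Synthetic binary classification data (stand-in for the real dataset).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, y_train, X_test = X[:200], y[:200], X[200:]

# One object holds both the fitted PCA and the fitted classifier.
pipe = Pipeline([
    ("pca", PCA(n_components=5)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)),
])
pipe.fit(X_train, y_train)
pred_original = pipe.predict(X_test)

# Round-trip through pickle (in memory here; a file works the same way).
restored = pickle.loads(pickle.dumps(pipe))

# predict() applies the already fitted PCA via transform(); nothing is refitted.
pred_restored = restored.predict(X_test)
print(np.array_equal(pred_original, pred_restored))
```

If the restored pipeline reproduces the original predictions on the same raw input, the serialization step is not the cause, and the label encoding or the preprocessing of the new data is the next place to look.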
I trained a PyTorch model, recorded the test error, and saved the complete model using torch.save(model, 'model.pt').
I loaded the model to test it on another dataset and found the error to be higher, so I tested it on the exact same dataset as before and found the results to be different:
The predicted values do not differ by much, which tells me that the model is correct, but somehow different.
One difference is that the model was originally trained on GPUs with nn.DataParallel, whereas after loading I am evaluating it on CPU.
model = torch.load('model.pt')
model = model.module # To remove the DataParallel module
model.eval()
with torch.no_grad():
    x = test_loader.dataset.tensors[0].cuda()
    pred = model(x)
    mae_loss = torch.nn.L1Loss(reduction='mean')
    mae = mae_loss(pred, y)
What could be causing this difference in model evaluation? Thank you in advance
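Two things may be worth checking here, as assumptions rather than a confirmed diagnosis: that the model, x, and y all live on the same device (the snippet calls .cuda() on x although evaluation is meant to run on CPU), and that y is the target tensor from the same dataset. A more portable way to move a DataParallel-trained model to CPU is to save the state_dict of the wrapped module and load it with map_location. The sketch below uses a small made-up network (build_model is hypothetical) and an in-memory buffer instead of model.pt.

```python
import io

import torch
import torch.nn as nn


def build_model():
    # Hypothetical architecture; the real one must match the trained model.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Dropout(0.5), nn.Linear(8, 1))


torch.manual_seed(0)
trained = nn.DataParallel(build_model())  # stands in for the trained model

# Save the weights of the wrapped module, so keys carry no "module." prefix.
buffer = io.BytesIO()
torch.save(trained.module.state_dict(), buffer)
buffer.seek(0)

# Load on CPU regardless of the device the weights were saved from.
cpu_model = build_model()
cpu_model.load_state_dict(torch.load(buffer, map_location="cpu"))
cpu_model.eval()  # disables dropout so repeated evaluations agree

x = torch.randn(16, 4)  # made-up inputs, kept on the same device as the model
source = trained.module.cpu().eval()  # original weights, moved to CPU for comparison
with torch.no_grad():
    reference = source(x)
    pred = cpu_model(x)

print(torch.allclose(reference, pred))
```

Small remaining differences between GPU and CPU results can also come from floating-point arithmetic being carried out in a different order on each device.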
I have a pre-trained TensorFlow model, saved in h5 format, that classifies images.
Here is the block of code:
import os
import tensorflow as tf
model_version = "1"
model_name = "fresh-rotten-model"
model_path = os.path.join(model_name, model_version)
tf.saved_model.save(model, model_path)
model.save("fresh-rotten-model.h5")
I built a back end that uploads new images to a Node server every week on a schedule.
Is there any way to add these images as new training data and build a new model without having to train on the whole data set again?
You can't just add data to a model file: a model file contains only the weights, not the data used to train it. Something (presumably your back-end server) will have to run training every time you want to update the model. You can update the model by loading it and training it for more epochs on the extended dataset, but other than that, there's not much you can do.
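The "load it and train some more" approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the asker's pipeline: the tiny dense model, the random arrays standing in for images, the file name demo-model.h5, and the epoch counts are all made up.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

# Stand-in for the original training run (tiny model, random data).
x_old = rng.random((64, 8)).astype("float32")
y_old = rng.integers(0, 2, size=(64, 1)).astype("float32")
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(x_old, y_old, epochs=2, verbose=0)

path = os.path.join(tempfile.mkdtemp(), "demo-model.h5")
model.save(path)

# Later, for example in the weekly job: reload and continue on the new data.
restored = tf.keras.models.load_model(path)
weights_before = [w.copy() for w in restored.get_weights()]

x_new = rng.random((32, 8)).astype("float32")
y_new = rng.integers(0, 2, size=(32, 1)).astype("float32")
restored.fit(x_new, y_new, epochs=2, verbose=0)
weights_after = restored.get_weights()

restored.save(path)  # overwrite the file with the updated model
```

Training only on the new batch can make the model drift away from what it learned earlier, so mixing in a sample of the older data is a common precaution.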
I have already trained a deep learning model with some data and it performs well with the test data. Now, how do I retrain this model when I get new data?
You can save your model using
keras.models.save_model(yourModel, 'fileName.hdf5')
Once you have the new data, you can load your saved model and keep training it
model = keras.models.load_model('fileName.hdf5')
model.fit(new_x, new_y)  # new_x, new_y: placeholder names for the new training data
Training continues from the last saved weights and optimizer state, with the same loss.
I have trained a CNN model using tf.estimator and tf.data.TFRecordDataset, with the model defined in a model_fn function and the input in an input_fn function. I also use a one-shot iterator to get one batch of examples at a time.
Now I have the trained model files (ckpt, meta, index) in a directory. What I want to do is predict an image's label with the trained model, without training or evaluating again. The image can be a NumPy array but cannot be a TFRecords file (which is what was used for training).
I couldn't find an effective solution after trying all day. I can only get the values of the weights and biases, and I don't know how to make my input image compatible with the model.
FYI, my training code is here.
A similar question is "Prediction from model saved with tf.estimator.Estimator in Tensorflow", but it has no accepted answer and my model's input uses the Dataset API.
So I really need help. Thanks.
I have answered a similar question here.
To make predictions using a custom input, you need to use the built-in predict method of Estimators:
estimator = tf.estimator.Estimator(model_fn, ...)
predict_input_fn = ... # define this using tf.data
predict_results = estimator.predict(predict_input_fn)
for idx, prediction in enumerate(predict_results):
    print(idx)
    for key in prediction:
        print("...{}: {}".format(key, prediction[key]))
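The predict_input_fn left open above could be built from in-memory NumPy arrays with tf.data. A minimal sketch, assuming the images are already preprocessed the same way as during training; make_predict_input_fn, the image shape, and the batch size are hypothetical, and if model_fn reads its features from a dict, the array would need to be wrapped under the key it expects.

```python
import numpy as np
import tensorflow as tf


def make_predict_input_fn(images, batch_size=32):
    """Return an input_fn that feeds in-memory images, with no labels."""
    def input_fn():
        dataset = tf.data.Dataset.from_tensor_slices(images)
        return dataset.batch(batch_size)
    return input_fn


# Made-up stand-in for real images: 5 RGB images of 28x28 pixels.
images = np.zeros((5, 28, 28, 3), dtype=np.float32)
predict_input_fn = make_predict_input_fn(images, batch_size=2)

# Inspect what the estimator would receive (eager mode, TF 2.x).
batch_shapes = [tuple(batch.shape) for batch in predict_input_fn()]
print(batch_shapes)
```

The resulting function can be passed as estimator.predict(predict_input_fn), provided the estimator is created with the same model_fn and a model_dir pointing at the checkpoint directory.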