How to use a trained model in the future [closed] - python

I am new to data science and I am still learning machine learning. I know we can use Regression, Classification, Clustering, ANN, CNN, RNN models and so on, depending on the application.
We code these models, train them, and predict some data on a PC. Some models also take too much time to train. After that, we shut down the PC.
If I want the same model with the same data set after some days, I have to open the PC again and train the same model from scratch.
I want to know how to use a trained model in the future without training it again and again every time the PC is opened. I am asking mostly about ANN, CNN, and RNN models.
Also, I want to know where the weight values of a model are stored, given that the weights are not kept in a variable. How can I find them, and can I use those trained weights for the ANN in the future?

Usually, simple models (e.g. Logistic Regression, Decision Trees) don't take a huge amount of time to train, depending of course on the size of the data you train them on.
Deep learning models, on the other hand, tend to have long training times. A common technique is to save the trained model(s) using the HDF5 file format. In case you are interested, you can check this link for further info on the format.
Probably the simplest way to achieve this is by using Keras's built-in function model.save:
from keras.models import load_model

model = train_neural_network()  # train your model (train_neural_network is your own code)
model.save('my_model.h5')  # creates an HDF5 file 'my_model.h5'
model = load_model('my_model.h5')  # later: load the saved model and use it on any data you want
Since you are a data science beginner, if you have some basic knowledge of the area and want to jump directly into deep learning, I recommend using Google's Colaboratory (link).
Each user is assigned a virtual machine with hardware built specifically for deep learning tasks, and it contains most of the dependencies you need to run neural networks.

Saving a fully functional model is very useful: you can load it again later without retraining. In TensorFlow, you can save the entire model to a single file that contains the weight values, the model's configuration, and even the optimizer's configuration.
You may do:
model.save('ModelName.model')
Also, Keras provides a basic save format using the HDF5 standard.
# Save the entire model to an HDF5 file
model.save('my_model.h5')
For more details, check the documentation.
As for weights: if, for example, you have a binary classifier whose data contain twice as many 0 labels as 1 labels, you can set class weights when fitting the model like this:
class_weights = {0: 1., 1: 2.}  # class 1 weighted double: twice as many 0s as 1s
# fit the model and pass the class weights
model.fit(X, y, class_weight=class_weights,
          batch_size=20, epochs=5, validation_split=0.3)

If you use plain TensorFlow, you can use the SavedModel API (this is Tom's answer). You can also find an example on GitHub by Wen.
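As a minimal sketch of the SavedModel API (assuming `model` is a trained Keras model; the directory name is arbitrary):
import tensorflow as tf

tf.saved_model.save(model, 'saved_model_dir')  # writes a SavedModel directory
restored = tf.saved_model.load('saved_model_dir')  # load it back later for inference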

Related

How to split a CNN model into two and merge them? [closed]

I want to create two separate CNN models from a single CNN model. Let me name them CNN-A and CNN-B.
I.e., original CNN model = CNN-A model + CNN-B model.
During prediction, the raw input dataset is fed to CNN-A, the output of CNN-A is fed as input to CNN-B, and the output of the original model is the output of CNN-B.
To implement the above architecture, I would like to get your suggestions and ideas, please.
The implementation seems redundant. The reason is that the input to each CNN should be an image: if the output of the first CNN is an image and you feed it to the second CNN, this is the same as stacking multiple convolution layers (with additional dropouts and whatnot) in one CNN model.
So after all, implementing one deep CNN will mimic the architecture you want.
You can also take a look at transfer learning, which utilizes a pre-trained model's layers and lets you add your own final layers and make adjustments. This is also similar to what you are describing.
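That said, if you do want to split an existing Keras model into two halves, here is a minimal sketch. It assumes a linear (non-branching) model and a split index `split_at` of your choosing; the layer objects are shared, so both halves reuse the trained weights:
from tensorflow import keras

def split_model(model, split_at):
    # CNN-A: the original input up to (and including) layer `split_at`
    cnn_a = keras.Model(inputs=model.input,
                        outputs=model.layers[split_at].output)
    # CNN-B: replay the remaining layers on a fresh input of matching shape
    inp = keras.Input(shape=cnn_a.output_shape[1:])
    x = inp
    for layer in model.layers[split_at + 1:]:
        x = layer(x)
    cnn_b = keras.Model(inputs=inp, outputs=x)
    return cnn_a, cnn_b

# sanity check: cnn_b.predict(cnn_a.predict(batch)) should match model.predict(batch)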
================ After Comment =====================
You could use a model architecture like MobileNet for a model that is to be deployed on a mobile device.
You could also apply transfer learning to existing pre-trained MobileNet models, which will save a lot of time and resources.
Lastly, you could deploy the (desktop-grade) model on a server using Flask, and create an API that returns predictions when you send the relevant data to the server via a POST request (a minimal sketch is shown below). This is commonly used to reduce the load on mobile devices, and it is the approach I would prefer: it is relatively efficient and easily scalable.
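A minimal Flask sketch, assuming a Keras model saved as 'my_model.h5' and a JSON request body of the form {"inputs": [...]} (both are placeholder choices):
import numpy as np
from flask import Flask, request, jsonify
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model('my_model.h5')  # placeholder path to your saved model

@app.route('/predict', methods=['POST'])
def predict():
    # expects a JSON body like {"inputs": [[0.1, 0.2, ...], ...]}
    data = np.array(request.get_json()['inputs'])
    preds = model.predict(data)
    return jsonify(predictions=preds.tolist())

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)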

Using TensorFlow classification for feature extraction

I am currently working on a system that extracts certain features from 3D objects (voxel grids, to be precise), and I would like to compare those features against automatically learned features in terms of classification performance in a TensorFlow CNN with some other data, but that is not the point here, just background.
My idea is to take a dataset (ModelNet10), train a TensorFlow CNN to classify its objects, and then use what the network learned there on my own dataset, not to classify, but to extract features.
So I want to throw away everything the CNN does except for what it extracts from the objects.
Is there any way to get at these features, and how do I do that? I honestly have no idea.
Yes, it is possible to use trained models exclusively for feature extraction. This falls under transfer learning: you can either train your own model and then extract the features from it, or extract features from a pre-trained model and use them in your task, provided your task is similar in nature to what the pre-trained model was trained for. You can of course find a lot of material online on these topics, but the links below give details on how to go about it, and a minimal code sketch follows them:
https://keras.io/api/applications/
https://keras.io/guides/transfer_learning/
https://machinelearningmastery.com/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/
https://www.pyimagesearch.com/2019/05/27/keras-feature-extraction-on-large-datasets-with-deep-learning/
https://www.kaggle.com/angqx95/feature-extractor-fine-tuning-with-keras
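As a sketch: once your ModelNet10 classifier is trained, you can truncate it just before the classification head and use the truncated model as the feature extractor. The file name and the input batch below are placeholders for your own model and data:
from tensorflow import keras

model = keras.models.load_model('modelnet10_cnn.h5')  # placeholder: your trained classifier

# keep everything except the final classification layer
feature_extractor = keras.Model(inputs=model.input,
                                outputs=model.layers[-2].output)

features = feature_extractor.predict(voxel_batch)  # voxel_batch: your own voxel grids
# `features` has shape (n_samples, n_features) and can be compared
# against your hand-crafted features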

TensorFlow model saving and calculating the average of models [closed]

I am trying to implement and reproduce the results of federated BERT pretraining from the paper "Federated pretraining and fine-tuning of BERT using clinical notes from multiple silos". I prefer to use the TensorFlow code for BERT pretraining.
To train in a federated way, I first divided the dataset into 3 different silos (each containing the discharge summaries of 50 patients, using MIMIC-III data), and then pretrained a BERT model on each silo using the TensorFlow implementation of BERT pretraining from the official BERT release.
Now I have three models, each pretrained on a different dataset. For model aggregation, I need to take the average of all three models; since the number of notes in each silo is equal, averaging means summing all the models' weights and dividing by three.
How do I take the average of the models as done in the paper? Please give me some insights on how to code this correctly. The idea of averaging the model weights is taken from the paper "Federated Learning: Strategies for Improving Communication Efficiency".
I am very new to deep learning and TensorFlow, so please help me figure this out and suggest some reading material for TensorFlow.
The paper mentions that this is a good option for overcoming privacy and regulatory issues when sharing clinical data. My question is: is it possible to recover sensitive data from the model.ckpt files? If so, how?
Any help would be appreciated. Thanks...
Model averaging can be done in many ways. The simplest is to keep a complete copy of the architecture in each silo and take a (weighted) average of their parameter values, then use the result as the parameters of the full model. However, there are a number of practical issues (latency, network speed, computational power of the devices) which may prohibit this, so more complex solutions, in which silos are only trained on subsets of the variables etc., are used (as in the paper you cite).
It is not generally possible to retrieve information (sensitive or otherwise) about a dataset purely from the parameter updates of a model fine-tuned on it.
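For the simple case of full-model averaging, here is a minimal Keras sketch. It assumes all three models are built Keras models with identical architectures; the function and variable names are my own:
import numpy as np
from tensorflow import keras

def average_models(models, weights=None):
    # uniform FedAvg-style weights unless the silo sizes differ
    weights = weights or [1.0 / len(models)] * len(models)
    averaged = keras.models.clone_model(models[0])  # same architecture, fresh variables
    new_params = [
        np.sum([w * p for w, p in zip(weights, layer_params)], axis=0)
        for layer_params in zip(*(m.get_weights() for m in models))
    ]
    averaged.set_weights(new_params)
    return averaged

# the silos are equal in size, so a plain average works:
global_model = average_models([model_a, model_b, model_c])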

Choosing layers to be trained and adding skip connections in the trained Inception model in Keras [closed]

I want to train a model on CT grayscale images.
For certain classes of disease my training set is limited, e.g. 2,000 positives against 98,000 negatives.
I am thinking of using transfer learning to avoid overfitting and to boost the effectiveness of my model, but I also realize that I should fine-tune it, since the images I am feeding it are very different from the images the Inception model was originally trained on.
My problem is that I am not sure how many layers to keep frozen and how many to make trainable.
I am therefore thinking of using skip connections to apply stochastic depth, letting the network learn how many layers it truly needs.
So I am considering the following architecture: I will add skip connections between the layers of the pretrained Inception model that ships with Keras (TensorFlow 2.0).
I would welcome advice on how to implement these ideas, in particular how to split the network into three parts, leave the first part untouched (untrainable), and train the second part after adding the skip connections. The implementation should be in Keras.
Transfer learning is indeed the right approach. It lets you reuse the trained weights for the 'generic' tasks of deep-learning image processing, such as shape recognition and edge detection, and, in a manner of speaking, conserve your data (inputs and labels) for retooling the existing network to your specific task.
As a rule of thumb, the closer the weights are to the input, the more generic their function and the less you want to retrain them. Conversely, the closer the weights are to the output, the more task-specific their function and the more retraining they require.
I suggest training the endpoint classifier before retraining any existing weights. One or two fully connected layers (read: Dense) with your favorite activation function, plus one fully connected layer with softmax activation (since we want the output to be the predicted probability of each disease), should do the trick. While training this endpoint classifier, be sure to freeze all other layers (or use bottleneck features; see how in the link below).
Only then should you retrain the existing weights; this is called fine-tuning. I suggest unfreezing the first inception module*, then allowing it (together with the already-trained endpoint classifier from the last step) to retrain. Then perhaps unfreeze the next inception module and allow it to train as well, again while also allowing the first module and the new endpoint classifier to keep training. When retraining a segment, always allow the weights downstream of it to retrain as well.
Be advised that fine-tuning should use a low learning rate.
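As a rough sketch of this freeze-then-fine-tune workflow (the head sizes, number of classes, unfreeze cutoff, and input shape are all my own placeholder choices; note also that ImageNet weights expect 3-channel input, so grayscale CT slices would need to be replicated across channels):
from tensorflow import keras
from tensorflow.keras import layers

# pre-trained Inception backbone without its classifier head
base = keras.applications.InceptionV3(include_top=False, weights='imagenet',
                                      input_shape=(299, 299, 3), pooling='avg')
base.trainable = False  # step 1: freeze the backbone, train only the new head

n_classes = 5  # placeholder: number of disease classes
model = keras.Sequential([
    base,
    layers.Dense(256, activation='relu'),
    layers.Dense(n_classes, activation='softmax'),
])
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss='categorical_crossentropy', metrics=['accuracy'])
# ... train the head here, then unfreeze the top of the backbone:
base.trainable = True
for layer in base.layers[:-30]:  # placeholder cutoff: keep all but the topmost layers frozen
    layer.trainable = False
model.compile(optimizer=keras.optimizers.Adam(1e-5),  # low learning rate for fine-tuning
              loss='categorical_crossentropy', metrics=['accuracy'])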
To the best of my understanding, skip connections DO NOT "let the network learn how many layers it truly needs". They mostly allow circumvention of the vanishing-gradient problem. The "skipped" layers do not become optional; they still participate in generating the output. And since the existing weights do not account for the inputs that would be added via skip connections, I believe adding them would render the trained weights irrelevant, requiring retraining of the entire network and thus defeating transfer learning.
It would be an interesting experiment, but try it at your own peril.
If you really want skip connections, I suggest you make use of a model that already has them (and whose weights are therefore adjusted to it), such as ResNet.
Please see this link for more ideas on using transfer learning, bottleneck features, and fine-tuning (and augmentation, while you're at it):
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
*By that I mean the first one from the perspective of the output, that is, the one highest in your image.

Is it possible to add another dataset on my trained model in tensorflow?

I already trained a custom person detector which doesn't work well at detecting persons in aerial footage, because my dataset lacks aerial images of people. Can I continue the training from the latest checkpoint with another dataset (a different tfrecord file), or do I have to restart the training with the updated dataset?
I retrained the Inception model to detect persons only, since there is no other way to remove the other 89 objects from the pretrained model.
You can definitely start from a checkpoint using another dataset. However, it might not be a good idea to train only on a subset of your data, due to the tendency of neural nets to forget what they've already learned (a problem known as catastrophic forgetting). It's probably a better idea to create a new dataset that includes both your old and new data, and either pick up from the checkpoint using that data (similar to how you fine-tuned the Inception model) or start the training process over.
