How to split a CNN model into two and merge them? [closed] - python

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
I want to create two separate CNN models from a single CNN model. Let me call them CNN-A and CNN-B,
i.e., original CNN model = CNN-A model + CNN-B model.
During prediction, the raw input data is fed to CNN-A, and the output of CNN-A is fed as input to CNN-B. The output of CNN-B is then the output of the original model.
I would appreciate any suggestions or ideas on how to implement this architecture.

The implementation seems redundant. A CNN takes an image-like tensor as input, so if the output of the first network is an image (or feature map) that you feed to the second CNN, the result is the same as stacking more convolution layers, with additional dropout and so on, in one CNN model.
So, in the end, a single deep CNN reproduces the architecture you want.
You can also take a look at transfer learning, which reuses a pre-trained model's layers and lets you add your own final layers and make adjustments. This is similar to what you are describing.
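If two separate models are still wanted, one possible sketch in Keras is to cut a Sequential model at a layer index and reuse the trained layers in two new models. The architecture, the input shape (28, 28, 1), and the cut point `split_at` are illustrative assumptions, not taken from the question.

```python
import numpy as np
from tensorflow import keras

# Hypothetical "original" CNN; the layers and shapes are assumptions.
original = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Conv2D(8, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(16, 3, activation="relu"),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(10, activation="softmax"),
])

split_at = 2  # assumed cut point: CNN-A gets the first two layers

# CNN-A reuses the first layers (and therefore their weights).
cnn_a = keras.Sequential(
    [keras.Input(shape=(28, 28, 1))] + original.layers[:split_at]
)

x = np.random.rand(4, 28, 28, 1).astype("float32")
features = cnn_a.predict(x, verbose=0)

# CNN-B takes CNN-A's output as its input and reuses the remaining layers.
cnn_b = keras.Sequential(
    [keras.Input(shape=features.shape[1:])] + original.layers[split_at:]
)

full_pred = original.predict(x, verbose=0)
split_pred = cnn_b.predict(features, verbose=0)
print(np.allclose(full_pred, split_pred, atol=1e-5))
```

Because the layer objects are shared, the chained prediction should match the original model, which is the point the answer makes: the pair behaves like the single deep CNN.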
================ After Comment =====================
You could use a model architecture such as MobileNet for a model that is to be deployed on a mobile device.
You could also apply transfer learning to existing pre-trained MobileNet models, which will save a lot of time and resources.
Lastly, you could deploy the full-size model (the one used on computers) on a server using Flask, then create an API that returns predictions when you send the relevant data to the server via a POST request. This is commonly done to reduce the load on mobile devices, and it is the approach I would prefer. It is relatively efficient and scales easily.
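A minimal sketch of the Flask approach could look like the following. The route name `/predict`, the JSON field `pixels`, and the `predict` function are hypothetical; `predict` is a stand-in for a real model call such as `model.predict`.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict(pixels):
    # Hypothetical stand-in for a real model; returns the mean pixel value.
    return {"score": sum(pixels) / len(pixels)}


@app.route("/predict", methods=["POST"])
def predict_route():
    payload = request.get_json(silent=True)
    # Reject bodies that are not JSON objects with a non-empty "pixels" list.
    if not isinstance(payload, dict) or not payload.get("pixels"):
        return jsonify(error="expected a JSON body with a 'pixels' list"), 400
    return jsonify(predict(payload["pixels"]))


# Exercise the endpoint without starting a server.
client = app.test_client()
response = client.post("/predict", json={"pixels": [0.0, 1.0]})
print(response.status_code, response.get_json())
```

The mobile client then only needs to send a POST request and read the JSON reply, so the heavy model never has to run on the device.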

Related

Tensor flow Model saving and Calculating Average of Models [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I am trying to implement and reproduce the results of the federated BERT pretraining described in the paper "Federated pretraining and fine-tuning of BERT using clinical notes from multiple silos". I prefer to use the TensorFlow code for BERT pretraining.
To train in a federated way, I first divided the dataset into 3 silos (each containing the discharge summaries of 50 patients, from MIMIC-III data) and then pretrained a BERT model on each silo using the TensorFlow implementation of BERT pretraining from the official BERT release.
Now I have three models, each pretrained on a different dataset. For model aggregation, I need to average all three models. Since the number of notes in each silo is equal, averaging means summing the models and dividing by three.
How do I average the models as done in the paper? Could somebody please give me some insight into coding this correctly? The idea of averaging the model weights is taken from the paper "Federated Learning: Strategies for Improving Communication Efficiency".
I am very new to deep learning and TensorFlow, so please help me figure out the issue and suggest some reading material for TensorFlow.
The paper mentions that this is a good option for overcoming privacy and regulatory issues when sharing clinical data. My question is: is it possible to get sensitive data from these model.ckpt files? If so, how?
Any help would be appreciated. Thanks.
Model averaging can be done in many ways. The simplest is to keep a complete copy of the architecture in each silo, take a (weighted) average of the parameter values, and use the result as the parameters of the full model. However, a number of practical issues (latency, network speed, computational power of the devices) may prohibit this, so more complex solutions are used in which, for example, silos are only trained on subsets of the variables (as in the paper you cite).
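The simple (weighted) average can be sketched with NumPy alone, assuming each silo's parameters are available as a list of arrays with matching shapes. The function name `average_weights` and the toy values are illustrative; with Keras models such lists could come from `model.get_weights()`, and reading them out of BERT checkpoint files is a separate step not shown here.

```python
import numpy as np


def average_weights(silo_weights, silo_sizes=None):
    """Average per-layer parameters across silos.

    silo_weights: one list of arrays per silo, all with matching shapes.
    silo_sizes: optional number of samples per silo, used as weights.
    """
    if silo_sizes is None:
        silo_sizes = [1] * len(silo_weights)  # equal silos: plain mean
    coeffs = np.asarray(silo_sizes, dtype=np.float64)
    coeffs = coeffs / coeffs.sum()

    averaged = []
    for layer_copies in zip(*silo_weights):
        stacked = np.stack(layer_copies)  # shape: (n_silos, *layer_shape)
        averaged.append(np.tensordot(coeffs, stacked, axes=1))
    return averaged


# Toy example: three silos, each with one 2x2 kernel and one bias vector.
silo_a = [np.full((2, 2), 1.0), np.full(2, 1.0)]
silo_b = [np.full((2, 2), 2.0), np.full(2, 2.0)]
silo_c = [np.full((2, 2), 6.0), np.full(2, 6.0)]

equal_avg = average_weights([silo_a, silo_b, silo_c])
weighted_avg = average_weights([silo_a, silo_b, silo_c], silo_sizes=[1, 1, 2])
print(equal_avg[0])
print(weighted_avg[1])
```

With equal silo sizes, as in the question, this reduces to summing the three models and dividing by three.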
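Averaging shares only parameters, never the raw notes, which is why it is attractive for clinical data. Recovering the training data (sensitive or otherwise) purely from the parameters or parameter updates of a model is difficult in general, but it is not impossible: membership-inference and model-inversion attacks show that trained models can leak information about their training data, so the checkpoint files should still be treated as sensitive.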

How to generate new image using deep learning, from new features [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
If I have a dataset consisting of a list of images, each associated with a series of features, is there a model that, once trained, generates new images when given a new list of features?
I think you are looking for a GAN (Generative Adversarial Network), which was proposed in this paper.
A GAN is a type of algorithm that contains two different models: one, the Discriminator, learns to determine whether its input comes from the data set or not, and the other, the Generator, learns to generate data that the Discriminator wrongly recognizes as coming from the data set.
You can find more details from the following links:
generative adversarial network (GAN)
Generative Adversarial Networks (GANs): Engine and Applications
GAN by Example using Keras on Tensorflow Backend
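To connect this to the question's list of features, one common variant (a conditional GAN, which is an assumption here rather than something the answer specifies) feeds the features to both models. The sketch below only builds the two networks and runs one forward pass; the sizes `NOISE_DIM`, `N_FEATURES`, and `IMG_SHAPE` are hypothetical and the adversarial training loop is omitted.

```python
import numpy as np
from tensorflow import keras

NOISE_DIM, N_FEATURES, IMG_SHAPE = 16, 4, (28, 28, 1)  # assumed sizes


def build_generator():
    # Generator: (noise, features) -> image
    noise = keras.Input(shape=(NOISE_DIM,))
    feats = keras.Input(shape=(N_FEATURES,))
    x = keras.layers.Concatenate()([noise, feats])
    x = keras.layers.Dense(128, activation="relu")(x)
    x = keras.layers.Dense(int(np.prod(IMG_SHAPE)), activation="sigmoid")(x)
    img = keras.layers.Reshape(IMG_SHAPE)(x)
    return keras.Model([noise, feats], img)


def build_discriminator():
    # Discriminator: (image, features) -> probability that the pair is real
    img = keras.Input(shape=IMG_SHAPE)
    feats = keras.Input(shape=(N_FEATURES,))
    x = keras.layers.Flatten()(img)
    x = keras.layers.Concatenate()([x, feats])
    x = keras.layers.Dense(128, activation="relu")(x)
    out = keras.layers.Dense(1, activation="sigmoid")(x)
    return keras.Model([img, feats], out)


generator = build_generator()
discriminator = build_discriminator()

# Untrained forward pass: two feature vectors in, two images out.
noise = np.random.normal(size=(2, NOISE_DIM)).astype("float32")
features = np.random.rand(2, N_FEATURES).astype("float32")
fake_images = generator.predict([noise, features], verbose=0)
scores = discriminator.predict([fake_images, features], verbose=0)
print(fake_images.shape, scores.shape)
```

After training, only the generator is needed: a new feature vector plus random noise yields a new image.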

Tensorflow Object Detection - Best practice [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
As mentioned in my other thread (Tensorflow Object Detection - Avoid overlapping boxes), I'm new to machine learning and I have to implement an algorithm for detecting traffic lights.
Regarding TensorFlow and its possibilities, I have a whole bunch of questions that I don't know where to ask except Stack Overflow.
For a quick start, I downloaded a pre-trained model and started training it on the Bosch Traffic Lights Dataset. Using a pre-trained model is fine, but every now and then I wonder whether it's possible to modify this model (add or remove a layer), or whether it would be better to use Keras on top of TensorFlow for better customization.
Additionally, I wonder how I can find out which configurations are possible in the pipeline.config file that comes with every pre-trained model. Is there any documentation, or do I have to dig into the Python files? In other words, does it even make sense to change the configuration?
For documentation purposes we're using TensorBoard. Unfortunately, only the loss is logged out of the box, not the accuracy. How do we get the accuracy displayed as an additional graph?
You should use the configuration to tune all of these aspects. As mentioned in the TensorFlow object detection config files documentation, the configuration parameters can be browsed in the protocol buffer message definitions. For example, for the model, if you are using Faster R-CNN, have a look at the different fields of the FasterRcnn message. You could export a trained model, load it in a regular TensorFlow script and add anything you want to it for whatever purpose, but the object detection framework is meant to be configuration-driven.
For the metrics, have a look at Supported object detection evaluation protocols. In the EvalConfig message, there is a metrics_set field that you can set to different values for different evaluation metrics.

How to use trained model in future [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am new to data science and I am still learning machine learning. I know we can use regression, classification, clustering, ANN, CNN, RNN models and so on, depending on the application.
We code these models, train them and predict on some data on a PC. Some models also take a long time to train. After that, we shut down the PC.
If I want the same model with the same data set some days later, I have to start the PC and train the same model again.
I want to know how to use a trained model in the future without training it again every time the PC is started. I am asking mostly about ANN, CNN and RNN models.
Also, I want to know where the weight values of the model are stored, because the weights are not stored in a variable of mine. How can I find them, and can I reuse those trained weights in an ANN in the future?
Usually, simple models (e.g. logistic regression, decision trees) don't take a huge amount of time to train, although that depends on the size of the data you train them on.
Deep learning models, on the other hand, tend to have long training times. A common technique is to save the trained model(s) using the HDF5 file format. In case you are interested, you can check this link for further information on the format.
Probably the simplest way to achieve this is to use Keras's built-in function model.save:
from keras.models import load_model

model = train_neural_network()  # placeholder: build and train your model here
model.save('my_model.h5')  # creates an HDF5 file 'my_model.h5'
model = load_model('my_model.h5')  # load the saved model and use it on whatever data you want
Since you are a data science beginner, if you have some basic knowledge of the area and want to jump directly into deep learning, I recommend using Google's Colaboratory (link).
Each user is assigned a virtual machine with hardware built specifically for tasks involving deep learning. It contains most dependencies you need to run neural networks.
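As for where the weights live: in Keras they are held by the model's layers and can be read as NumPy arrays with `model.get_weights()`. The sketch below, which uses a tiny made-up network and random data purely for illustration, shows that and checks that a model reloaded from disk gives the same predictions. Loading with `compile=False` is a choice made here because only prediction is needed.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Tiny illustrative network trained briefly on random data.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")
X = np.random.rand(32, 4).astype("float32")
y = X.sum(axis=1, keepdims=True)
model.fit(X, y, epochs=2, verbose=0)

# The trained weights: one kernel and one bias array per Dense layer.
weights = model.get_weights()
print([w.shape for w in weights])

# Save to disk, reload later, and predict without retraining.
path = os.path.join(tempfile.mkdtemp(), "my_model.h5")
model.save(path)
restored = keras.models.load_model(path, compile=False)
same = bool(np.allclose(model.predict(X, verbose=0),
                        restored.predict(X, verbose=0)))
print(same)
```

The reverse direction also exists: `model.set_weights(weights)` copies such a list of arrays back into a model with the same architecture.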
Saving a fully functional model is very useful, because you can load it later without retraining. In TensorFlow, you can save the entire model to a file that contains the weight values, the model's configuration, and even the optimizer's configuration.
You may do the following (in TensorFlow 2.x with Keras 2 this writes a SavedModel directory; Keras 3 instead requires a file name ending in .keras or .h5):
model.save('ModelName.model')
Also, Keras provides a basic save format using the HDF5 standard.
# Save the entire model to an HDF5 file
model.save('my_model.h5')
For more details, check the documentation.
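Class weights are a different thing from the trained weights above: they tell the loss how much each class should count. For example, if you have a binary classifier with twice as many samples in the 0 label as in the 1 label, you may set them when fitting the model like this:
class_weights = {0: 1.0, 1: 2.0}  # twice as many 0 samples as 1, so class 1 counts double
# fit the model and pass the class weights
model.fit(X, y, class_weight=class_weights,
          batch_size=20, epochs=5, validation_split=0.3)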
If you use plain TensorFlow, you can use the SavedModel API. This is Tom's answer. Also, you can find an example on GitHub by Wen.

what are the best methods to classify the user gender based on names? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
If you check my GitHub, you will see that I have successfully implemented a CNN and KNN for classifying signal faults. For that, I took the signal, applied a little preprocessing for dimensionality reduction and provided it to the network. I trained the network using the class information, then tested the trained network on test samples to determine the class and computed the accuracy.
My question is how to input text information to a CNN or any other network. As input, I took the Twitter database from Kaggle and selected the 2 columns that hold the name and gender information. I have gone through some algorithms that classify gender based on blog data, but it wasn't clear to me how to apply them to my data (in my case, I want to classify using the names alone).
In some examples that I understood, a sparse matrix was computed for the text, but for 20,000 samples the sparse matrix is too large to give as input. I have no problem implementing CNN architectures (I want to use one because no hand-crafted features are required) or any other network. I am stuck on how to input the data to the network. What kind of conversions can I apply so that the names and gender information can be used to train the network?
If my way of thinking is wrong, please suggest which algorithm would be best. Deep learning or any other method is OK!
You could use character-level embeddings (i.e. your input classes are the different characters, so 'a' is class 1, 'b' is class 2, etc.). Passing these class indices through an embedding layer (which is equivalent to one-hot encoding them and multiplying by a learned matrix) yields a unique representation for each character. A string can then be treated as a character sequence (or, equally, a vector sequence), which can be used as input for either a recurrent or a convolutional network. If you feel like reading, this paper by Kim et al. provides all the necessary theoretical background.
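The conversion from names to network input can be sketched with NumPy alone. The alphabet, the maximum length of 12, and the reserved ids for padding and unknown characters are illustrative assumptions; the resulting integer matrix is the kind of input an embedding layer such as `keras.layers.Embedding` expects.

```python
import string

import numpy as np

ALPHABET = string.ascii_lowercase + " -'"  # assumed character set
PAD_ID, UNK_ID = 0, 1                      # reserved ids (assumption)
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(ALPHABET)}
VOCAB_SIZE = len(ALPHABET) + 2
MAX_LEN = 12                               # assumed maximum name length


def encode_name(name, max_len=MAX_LEN):
    """Map a name to a fixed-length list of character ids."""
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in name.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


def encode_names(names, max_len=MAX_LEN):
    """Encode a list of names as an int32 matrix of shape (n, max_len)."""
    return np.array([encode_name(n, max_len) for n in names], dtype=np.int32)


batch = encode_names(["Anna", "José", "Mary-Ann"])
print(batch.shape)
print(batch[0])
```

This keeps the input at `MAX_LEN` integers per name instead of a large sparse matrix; the gender column would be converted to 0/1 labels separately.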
