How to incrementally train saved Theano models? - python

I have a model trained with Theano, and there is new training data that I want to use to update the model. How can I do this?

Initialize the model with the pre-trained weights and perform gradient updates on the new examples, but you do have to take care of the learning rate and other hyperparameters (depending on your optimizer). You may also try storing the optimizer's parameters as well and initializing the optimizer with those values, to make sure the new training data does not drastically change the trained model.
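For example, here is a minimal sketch in plain Theano (the file name, the logistic-regression setup, and the variable names are illustrative assumptions, not from the question): load the pickled parameters into shared variables and keep training on the new data with a reduced learning rate.

import pickle
import theano
import theano.tensor as T

# Load the previously trained parameters (assumed to be pickled numpy arrays).
with open('model_params.pkl', 'rb') as f:
    W_vals, b_vals = pickle.load(f)

W = theano.shared(W_vals, name='W')  # initialize with the pre-trained weights
b = theano.shared(b_vals, name='b')

x = T.matrix('x')
y = T.ivector('y')
p_y = T.nnet.softmax(T.dot(x, W) + b)
loss = -T.mean(T.log(p_y)[T.arange(y.shape[0]), y])

# Use a smaller learning rate than in the original run so the new data
# nudges the existing weights instead of overwriting them.
lr = 0.001
grads = T.grad(loss, [W, b])
updates = [(p, p - lr * g) for p, g in zip([W, b], grads)]
train = theano.function([x, y], loss, updates=updates)

# for x_batch, y_batch in new_data_batches:
#     train(x_batch, y_batch)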

Related

How to prevent Weights & Biases from saving unnecessary parameters

I am using Weights & Biases (link) to manage hyperparameter optimization and log the results. I am training using Keras with a Tensorflow backend, and I am using the out-of-the-box logging functionality of Weights & Biases, in which I run
wandb.init(project='project_name', entity='username', config=config)
and then add a WandbCallback() to the callbacks of classifier.fit(). By default, Weights & Biases appears to save the model parameters (i.e., the model's weights and biases) and store them in the cloud. This eats up my account's storage quota, and it is unnecessary --- I only care about tracking the model loss/accuracy as a function of the hyperparameters.
Is it possible for me to train a model and log the loss and accuracy using Weights & Biases, but not store the model parameters in the cloud? How can I do this?
In order not to save the trained model weights during hyperparameter optimization, you can do something like this:
classifier.fit(..., callbacks=[WandbCallback(..., save_model=False)])
This will only track the metrics (train/validation loss/acc, etc.).
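As a concrete sketch with Keras and the wandb callback (the project name, entity, and the toy data/model below are placeholders, not taken from the question):

import numpy as np
import wandb
from wandb.keras import WandbCallback
from tensorflow import keras

config = {'epochs': 5, 'batch_size': 32}
wandb.init(project='project_name', entity='username', config=config)

# Toy data and model, just to make the example self-contained.
x_train = np.random.rand(100, 10)
y_train = np.random.randint(0, 2, size=100)
classifier = keras.Sequential([
    keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    keras.layers.Dense(1, activation='sigmoid'),
])
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# save_model=False logs the per-epoch metrics to W&B without uploading the
# model weights, so the storage quota is left untouched.
classifier.fit(x_train, y_train,
               epochs=config['epochs'],
               batch_size=config['batch_size'],
               callbacks=[WandbCallback(save_model=False)])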

Tensorflow 2.x: What exactly does the parameter include_optimizer affect in tensorflow.keras.save_model

I have been browsing the documentation for the tensorflow.keras.save_model() API and I came across the parameter include_optimizer and I am wondering what would be the advantage of not including the optimizer, or perhaps what problems could arise if the optimizer isn't saved with the model?
To give more context for my specific use-case, I want to save a model and then use the generated .pb file with Tensorflow Serving. Is there any reason I would need to save the optimizer state, would not saving it reduce the overall size of the resultant file? If I don't save it is it possible that the model will not work correctly in TF serving?
Saving the optimizer state will require more space, as the optimizer has parameters that are adjusted during training. For some optimizers, this space can be significant, as several meta-parameters are saved for each tuned model parameter.
Saving the optimizer parameters allows you to restart training in exactly the same state as when you saved the checkpoint, whereas without the optimizer state, the same model parameters combined with freshly initialized optimizer parameters can lead to different training outcomes.
Thus, if you plan on continuing to train your model from the saved checkpoint, you'd probably want to save the optimizer's state as well. However, if you're instead saving the model state for future use only for inference, you don't need the optimizer state for anything. Based on your description of wanting to deploy the model on TF Serving, it sounds like you'll only be doing inference with the saved model, so are safe to exclude the optimizer.
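For instance, a minimal sketch of exporting a Keras model for TF Serving without the optimizer state (the toy model and export path are assumptions for illustration):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

# include_optimizer=False drops the optimizer slot variables (e.g. Adam's
# moment estimates), so the SavedModel only contains what inference needs.
tf.keras.models.save_model(model, 'export/my_model/1', include_optimizer=False)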

How to partial training on the additional data for pre-trained model?

In my case, I would like to weekly tune/adjust the model parameters value.
I have pre-trained the model by using the 100K data rows, Keras, and saved the model.
Then, as the new data collection (10K data rows), I need to tune the model parameter but don't want to retrain the whole dataset (110K).
How can I just partially fit the data on the model? load model -> model.fit(10K_data)?
Yes, that is correct: you will train only on the new dataset (10K) with model.fit(10K_data). I would recommend reducing the learning rate for the retraining, since you only want to make a minor update to the parameters while keeping the earlier learning intact (and leveraging it).
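A hedged sketch of that workflow in Keras (the file names and the new-data variables are placeholders for your saved model and the 10K new rows):

from tensorflow import keras

# Load the model that was pre-trained on the 100K rows.
model = keras.models.load_model('pretrained_100k.h5')

# Recompile with a smaller learning rate so the 10K new rows only nudge the
# existing weights instead of overwriting what was already learned.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss='binary_crossentropy', metrics=['accuracy'])

# x_new_10k / y_new_10k stand in for the newly collected data.
model.fit(x_new_10k, y_new_10k, epochs=3, batch_size=32)
model.save('updated_model.h5')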

tf.estimator.LinearClassifier output weights interpretation

I am new to tensorflow and machine learning and I am training a tf.estimator.LinearClassifier on the classic MNIST data set.
After the training process, I read the output weights and biases using classifier.get_variable_names() and I get:
"['global_step', 'linear/linear_model/bias_weights', 'linear/linear_model/bias_weights/part_0/Adagrad', 'linear/linear_model/pixels/weights', 'linear/linear_model/pixels/weights/part_0/Adagrad']"
My question is: what is the difference between linear/linear_model/bias_weights and linear/linear_model/bias_weights/part_0/Adagrad? They are both of the same size.
The only explanation I can imagine is that linear/linear_model/bias_weights and linear/linear_model/bias_weights/part_0/Adagrad represent respectively the weights at the beginning and at the end of the training process.
However, I'm not sure about that and I can't find anything online.
linear/linear_model/bias_weights are your trained model weights.
linear/linear_model/bias_weights/part_0/Adagrad comes from you using the AdaGrad optimizer. The special feature of this optimizer is that it keeps a "memory" of past gradients and uses this to rescale the gradients at each training step. See the AdaGrad paper if you want to know more (very mathy).
The important part is that linear/linear_model/bias_weights/part_0/Adagrad stores this "memory". It is returned because it is technically a tf.Variable in your program; however, it is not an actual variable/weight of your model. Only linear/linear_model/bias_weights is. Of course, the same holds for linear/linear_model/pixels/weights.
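If you want to check this yourself, a small sketch (assuming the classifier from the question) that reads both variables after training:

# The trained bias terms of the model.
weights = classifier.get_variable_value('linear/linear_model/bias_weights')

# The AdaGrad accumulator: the per-weight sum of squared past gradients that
# the optimizer uses to scale its learning rate; it is not a model weight.
adagrad_acc = classifier.get_variable_value(
    'linear/linear_model/bias_weights/part_0/Adagrad')

print(weights.shape, adagrad_acc.shape)  # both have the same shape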

training the same model with different data sets in tensorflow

The problem:
I have a model that I would like to train with independent data sets. Afterwards, I would like to extract the weights of each model (the model is the same for each instance but trained on different datasets) and finally compute an average of these weights. Basically, my intention is to mimic tensorflow running on multiple devices and then average their weights so that they are used by one model.
My solution:
I added this model multiple times to the tensorflow graph and am currently training each copy separately on its own dataset, but this is using GBs of memory, and I am wondering if there is a better way to do this?
One possible solution is to fine-tune your network weights starting from another similar network trained on a similar dataset (for example, if your dataset consists of images, you can start from AlexNet weights). Don't worry if your network does not have the same architecture; you can load the weights of only as many layers as you need with the 'load_with_skip' function from
https://github.com/joelthchao/tensorflow-finetune-flickr-style/blob/master/network.py
Fine-tuning takes much less time than training a network from scratch.
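A hedged sketch of the same idea in Keras rather than the linked load_with_skip helper (build_my_model and the weight file are hypothetical): load pre-trained weights only for the layers whose names match, then fine-tune on your own data.

from tensorflow import keras

model = build_my_model()  # hypothetical constructor for your own architecture

# by_name=True copies weights layer by layer where the names match, and
# skip_mismatch=True skips layers whose shapes differ, so the two
# architectures do not have to be identical.
model.load_weights('pretrained_weights.h5', by_name=True, skip_mismatch=True)

model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4), loss='mse')
# model.fit(x_new, y_new, epochs=3)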
