Missed setting `training=True` in the call method in tensorflow, any problem? - python

I have trained a model in TensorFlow for 4 days and achieved good test and train loss; it converged well. But I later realised that I hadn't forwarded the training=True argument in the call() method of my custom-written Keras model code during training. I have batch_normalization layers in my neural network.
So, with what mean and variance were my batch_normalization layers trained when training=None?
Will the mean and variance that these batch_normalization layers use at prediction time now be random/dynamic?
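For reference, a minimal sketch of the pattern in question, assuming a simple subclassed Keras model (the layer sizes and names are illustrative). The key detail is whether call() forwards its training argument to the BatchNormalization layer:

import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(64, activation='relu')
        self.bn = tf.keras.layers.BatchNormalization()
        self.out_layer = tf.keras.layers.Dense(10)

    def call(self, inputs, training=None):
        x = self.dense(inputs)
        # Forwarding `training` decides whether BatchNormalization normalizes
        # with the current batch statistics (training=True) or with its
        # moving averages (training=False / inference).
        x = self.bn(x, training=training)
        return self.out_layer(x)

One quick diagnostic: inspect bn.moving_mean and bn.moving_variance after training; if they are still at their initial values (zeros and ones), the moving statistics were never updated during training.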

Related

Passing `training=true` when doing tensorflow training

TensorFlow's official tutorial says that we should pass base_model(training=False) during training so that the BN layers do not update their mean and variance. My question is: why? Why don't we need to update the mean and variance? I mean, BN has ImageNet's mean and variance, so why is it useful to keep ImageNet's mean and variance and not update them on the new data? Even during fine-tuning, the whole model updates its weights in this case, but the BN layers will still have ImageNet's mean and variance.
Edit: I am using this tutorial: https://www.tensorflow.org/tutorials/images/transfer_learning
When a model is trained from initialization, batchnorm should be enabled so that it tunes its mean and variance, as you mentioned. Fine-tuning or transfer learning is a bit different: you already have a model that can do more than you need, and you want to specialize this pre-trained model to your task / data set. In this case part of the weights are frozen and only some layers closest to the output are changed. Since BN layers are used all around the model, you should freeze them as well (see the sketch after the quoted note). Check this explanation again:
Important note about BatchNormalization layers: Many models contain tf.keras.layers.BatchNormalization layers. This layer is a special case and precautions should be taken in the context of fine-tuning, as shown later in this tutorial.
When you set layer.trainable = False, the BatchNormalization layer will run in inference mode, and will not update its mean and variance statistics.
When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training = False when calling the base model. Otherwise, the updates applied to the non-trainable weights will destroy what the model has learned.
Source: transfer learning, details regarding freeze
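A minimal sketch of the pattern the quoted note describes, assuming MobileNetV2 as the frozen base (the input size and classification head are illustrative):

import tensorflow as tf

# Pre-trained base; trainable=False freezes its weights and BN statistics
base_model = tf.keras.applications.MobileNetV2(input_shape=(160, 160, 3),
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False

inputs = tf.keras.Input(shape=(160, 160, 3))
# training=False keeps the BatchNormalization layers in inference mode even
# later, when base_model.trainable is flipped back to True for fine-tuning
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)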

How to make a neural network generalize better?

I designed a neural network model with a large number of outputs predicted by a softmax function. However, I want to group all the outputs into 5 outputs without modifying the architecture of the other layers. The model performs well in the first case, but when I decrease the number of outputs it loses accuracy and generalizes badly. My question is: is there a method to make my model perform well even with just 5 outputs? For example: adding a dropout layer before the output layer, using another activation function, etc.
If it is a plain neural network then yes, definitely use the ReLU activation function in the hidden layers and add a dropout layer after each hidden layer. You can also normalize your data before feeding it to the network.
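A minimal sketch of those suggestions, assuming a generic dense classifier with 5 output classes (the layer sizes and dropout rate are illustrative):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    # ReLU in the hidden layers, dropout after each one for regularization
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    # 5-way softmax output
    tf.keras.layers.Dense(5, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Normalizing the inputs (e.g. to zero mean and unit variance) happens outside the model, before the data is fed in.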

Naming layers in keras

I am using a pre-trained Keras model (a convolutional network) and I am retraining this model on my dataset.
Now I need to get the output of some layers to visualize the gradient activation. I just found out that every trained model has different layer names; for example, the input layer in one model is input_7 (InputLayer) and in another model it is input_5 (InputLayer).
Do you know how to prevent this behaviour? How can I keep the same naming without having to manually name all the layers, as I have more than 53 convolutional layers?
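No answer is attached here, but the usual culprit is Keras's global, per-session name counter, which keeps incrementing as you build models. A minimal sketch of a common workaround, clearing the session before building a model so the auto-generated names start from scratch (assuming tf.keras; the architecture is illustrative):

import tensorflow as tf

def build_model():
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = tf.keras.layers.Conv2D(32, 3, activation='relu')(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
    return tf.keras.Model(inputs, outputs)

# Resetting the global Keras state restarts the automatic name counters,
# so the auto-generated names ('input_1', 'conv2d', ...) are reproducible.
# Note: this also discards any other Keras state built earlier in the session.
tf.keras.backend.clear_session()
model = build_model()
print([layer.name for layer in model.layers])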

Multi-class multi-label classification in Keras

I am trying to train a multi-task multi-label classifier using Keras. The output layer is a fork of two outputs, and the task of each output layer is to predict the categories of its task. The y vectors are one-hot encoded.
I am using a custom generator for my data that yields the y arrays in a list to the fit_generator function.
I am using a categorical_crossentropy loss function at each output layer:
fork1.compile(loss={'O1': 'categorical_crossentropy', 'O2': 'categorical_crossentropy'},
              optimizer=optimizers.Adam(lr=0.001),
              metrics=['accuracy'])
The problem: the loss doesn't decrease with this setup. However, if I train each task separately, I get low loss and high accuracy. So what could be the problem?
To perform multilabel categorical classification (where each sample can have several classes), end your stack of layers with a Dense layer whose number of units equals the number of classes and whose activation is sigmoid, and use binary_crossentropy as the loss. Your targets should be k-hot encoded.
Regarding the multi-output model, training such a model requires the ability to specify different loss functions for the different heads of the network, which in turn requires a different training procedure.
You should provide more info to give a clearer indication of what you want to achieve.
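A minimal sketch of the multilabel setup recommended above, assuming a single head with 20 classes and k-hot targets (the input size and class count are illustrative):

import tensorflow as tf

num_classes = 20  # illustrative

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512,)),
    tf.keras.layers.Dense(128, activation='relu'),
    # One sigmoid per class: each class becomes an independent yes/no decision
    tf.keras.layers.Dense(num_classes, activation='sigmoid'),
])
# binary_crossentropy scores every class independently, which is what
# multilabel classification needs; targets are k-hot vectors of length num_classes
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])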

Unset trainable attributes for parameters in lasagne / nolearn neural networks

I'm implementing a convolutional neural network using lasagne / nolearn.
I'd like to fix some parameters that were pre-learned.
How can I set some layers to be untrainable?
Actually, even though I removed the 'trainable' attribute from some layers, the number shown in the layer information before fitting, namely
Neural Network with *** learnable parameters
never changes.
Besides, I'm afraid that the greeting function in 'handlers.py',

def _get_greeting(nn):
    shapes = [param.get_value().shape for param in
              nn.get_all_params() if param]

should be

              nn.get_all_params(trainable=True) if param]

but I'm not sure how that would affect training.
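There's no accepted answer attached, but in plain lasagne the usual way to freeze a layer is to remove the 'trainable' tag from its parameters, so that get_all_params(trainable=True) (which the update rules are built from) skips them. A minimal sketch, with an illustrative network:

import lasagne

# Illustrative network
l_in = lasagne.layers.InputLayer(shape=(None, 1, 28, 28))
l_conv = lasagne.layers.Conv2DLayer(l_in, num_filters=16, filter_size=3)
l_out = lasagne.layers.DenseLayer(l_conv, num_units=10,
                                  nonlinearity=lasagne.nonlinearities.softmax)

# Freeze the conv layer: drop the 'trainable' tag from each of its parameters
for param in l_conv.params:
    l_conv.params[param].discard('trainable')

# Only the DenseLayer's W and b remain in the trainable set now
params = lasagne.layers.get_all_params(l_out, trainable=True)

As the question observes, the "Neural Network with *** learnable parameters" greeting counts all parameters regardless of tags, so the printed number may not change even when the freezing itself works.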
