I am using Keras 2.0.8 with TensorFlow 1.3.0 on Ubuntu 16.04 with CUDA 8.0 and cuDNN 6.
I am using two BatchNormalization layers (Keras layers) in my model and training it with a TensorFlow pipeline.
I am facing two problems here:
First, the BatchNorm layers' population parameters (moving mean and variance) are not being updated during training, even after setting K.learning_phase to True. As a result, inference fails completely. I need some advice on how to update these parameters manually between training steps.
Second, after saving the trained model with the TensorFlow saver op and loading it back, the results cannot be reproduced; the weights seem to change. Is there a way to keep the weights the same across a save-load operation?
I ran into the same problem a few weeks ago. Internally, Keras layers can add additional update operations to a model (e.g. batch normalization), so you need to run these additional ops explicitly. For batch norm, these updates are just some assign ops that swap the current moving mean/variance for the new values. If you do not create a Keras model, this might work; assuming x is a tensor you would like to normalize:
bn = keras.layers.BatchNormalization()
x = bn(x)
# ... build the rest of the graph (loss, minimizer_op, ...) ...
sess.run([minimizer_op, bn.updates], feed_dict={K.learning_phase(): 1})
In my workflow, I am creating a Keras model (without compiling it) and then running the following:
model = keras.Model(inputs=inputs, outputs=prediction)
sess.run([minimizer_op, model.updates], feed_dict={K.learning_phase(): 1})
where inputs can be something like
inputs = [keras.layers.Input(tensor=input_variables)]
and outputs is a list of TensorFlow tensors. The model seems to aggregate all additional update operations between inputs and outputs automatically.
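For completeness, here is a minimal end-to-end sketch of that workflow under TF 1.x. The shapes, loss, and optimizer are made up for illustration, and keras.models.Model is used for compatibility with older Keras versions:
import numpy as np
import tensorflow as tf
import keras
import keras.backend as K

# Wrap an existing TF placeholder as a Keras Input (illustrative shape)
input_variables = tf.placeholder(tf.float32, shape=(None, 64))
inputs = [keras.layers.Input(tensor=input_variables)]

# Uncompiled Keras model containing a BatchNormalization layer
x = keras.layers.BatchNormalization()(inputs[0])
prediction = keras.layers.Dense(1)(x)
model = keras.models.Model(inputs=inputs, outputs=prediction)

# Plain TensorFlow loss and minimizer
labels = tf.placeholder(tf.float32, shape=(None, 1))
loss = tf.losses.mean_squared_error(labels, prediction)
minimizer_op = tf.train.AdamOptimizer().minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x_batch = np.random.rand(32, 64).astype('float32')
    y_batch = np.random.rand(32, 1).astype('float32')
    # Running model.updates alongside the minimizer keeps the BN
    # moving mean/variance in sync with training
    sess.run([minimizer_op, model.updates],
             feed_dict={input_variables: x_batch,
                        labels: y_batch,
                        K.learning_phase(): 1})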
I am trying to use torch.utils.tensorboard to log my neural network's structure, but I get the following error when I use the writer's add_graph function:
Cannot insert a Tensor that requires grad as a constant. Consider making a parameter or input, or detaching the gradient.
It then prints a tensor of shape (512, 512), which matches the input and output dimensions of one of the model's hidden layers.
The code I was using is as follows:
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
input_tensor = torch.tensor(...., requires_grad=False)  # placeholder for the actual input data
writer.add_graph(model, input_tensor)
I Googled around but found only posts with the same error message and completely different causes.
The versions of the libraries are:
Python 3.8.12
pytorch 1.10.1 py3.8_cuda11.3_cudnn8.2.0_0 pytorch
cudatoolkit 11.3.1
tensorboard 2.7.0
CUDA Version: 11.2
It is hard to answer this without seeing your model. What I believe is happening is that your model holds loose tensors (for example, convolutional weights created as plain tensor attributes) that are not registered as parameters, and so are not moved to the GPU along with the rest of the model.
Try running:
model.cuda()
print(list(model.parameters()))
to see whether you get the model you expect.
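If that is the cause, the hypothetical model below reproduces the situation: a plain tensor attribute that requires grad is not registered as a parameter, so it is invisible to model.cuda()/model.parameters(), and add_graph's tracer tries to bake it into the graph as a constant:
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(512, 512)
        # A plain tensor attribute: requires grad, but is NOT a parameter
        self.w = torch.randn(512, 512, requires_grad=True)
        # Registered alternative that tracing can handle:
        # self.w = nn.Parameter(torch.randn(512, 512))

    def forward(self, x):
        return self.fc(x) @ self.w

model = Net()
print([name for name, _ in model.named_parameters()])  # only fc.weight, fc.bias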
TensorFlow's official tutorial says that we should pass base_model(training=False) during training so that the BN layer does not update its mean and variance. My question is: why? Why don't we need to update the mean and variance? I mean, BN has ImageNet's mean and variance, so why is it useful to keep ImageNet's statistics instead of updating them on the new data? Even during fine-tuning, when the whole model updates its weights, the BN layer will still have ImageNet's mean and variance.
Edit: I am using this tutorial: https://www.tensorflow.org/tutorials/images/transfer_learning
When a model is trained from initialization, batch norm should be enabled so that it tunes its mean and variance, as you mentioned. Fine-tuning or transfer learning is a slightly different thing: you already have a model that can do more than you need, and you want to specialize this pre-trained model to your task and your data set. In this case, part of the weights are frozen and only some layers closest to the output are changed. Since BN layers are used all around the model, you should freeze them as well. Check this explanation again:
Important note about BatchNormalization layers: Many models contain tf.keras.layers.BatchNormalization layers. This layer is a special case and precautions should be taken in the context of fine-tuning, as shown later in this tutorial.
When you set layer.trainable = False, the BatchNormalization layer will run in inference mode, and will not update its mean and variance statistics.
When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training = False when calling the base model. Otherwise, the updates applied to the non-trainable weights will destroy what the model has learned.
Source: transfer learning, details regarding freezing
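Condensed from the linked tutorial, the pattern looks like this (MobileNetV2 and the 160x160 input size are the tutorial's choices):
import tensorflow as tf

# The frozen base is always called with training=False, so its
# BatchNormalization layers keep using the ImageNet moving statistics
# even while the new head (and later the unfrozen layers) are trained.
base_model = tf.keras.applications.MobileNetV2(input_shape=(160, 160, 3),
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False  # freeze; flip to True later for fine-tuning

inputs = tf.keras.Input(shape=(160, 160, 3))
x = base_model(inputs, training=False)  # BN stays in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)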
I want to train a model in a sequential manner. That is, I want to train the model initially with a simple architecture, and once it is trained, add a couple of layers and continue training. Is it possible to do this in Keras? If so, how?
I tried modifying the model architecture, but until I compile, the changes do not take effect. And once I compile, all the weights are re-initialized and I lose all the trained information.
All the questions on the web and on SO that I found are either about loading a pre-trained model and continuing training, or about modifying the architecture of a pre-trained model and then only testing it. I didn't find anything related to my question. Any pointers are highly appreciated.
PS: I'm using Keras from the TensorFlow 2.0 package.
Without knowing the details of your model, the following snippet might help:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input

# Train your initial model
def get_initial_model():
    ...
    return model

model = get_initial_model()
model.fit(...)
model.save_weights('initial_model_weights.h5')

# Use the Model API to create another model, built on your initial model
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')

nn_input = Input(...)
x = initial_model(nn_input)
x = Dense(...)(x)  # This is the additional layer, connected to your initial model
nn_output = Dense(...)(x)

# Combine into the full model
full_model = Model(inputs=nn_input, outputs=nn_output)

# Compile and train as usual
full_model.compile(...)
full_model.fit(...)
Basically, you train your initial model and save it, then reload it and wrap it together with your additional layers using the Model API. If you are not familiar with the Model API, you can check out the Keras documentation here (AFAIK the API remains the same in tensorflow.keras for TensorFlow 2.0).
Note that you need to check that your initial model's final output shape is compatible with the additional layers (e.g. you might want to remove the final Dense layer from your initial model if you are only doing feature extraction).
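To make this concrete, here is a toy instance of the same pattern with made-up shapes and random data:
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input

# Stage 1: train a small initial model (toy shapes for illustration)
def get_initial_model():
    inp = Input(shape=(10,))
    out = Dense(8, activation='relu')(inp)
    return Model(inp, out)

x = np.random.rand(100, 10)
y = np.random.rand(100, 8)
model = get_initial_model()
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, epochs=1, verbose=0)
model.save_weights('initial_model_weights.h5')

# Stage 2: reload the trained weights and extend with new layers
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')

nn_input = Input(shape=(10,))
h = initial_model(nn_input)          # trained sub-network, weights preserved
h = Dense(4, activation='relu')(h)   # newly added layer
nn_output = Dense(1)(h)

full_model = Model(inputs=nn_input, outputs=nn_output)
full_model.compile(optimizer='adam', loss='mse')
full_model.fit(x, np.random.rand(100, 1), epochs=1, verbose=0)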
I'm trying to freeze some layers in a neural network so that their weights do not change. How can I confirm that the weights are not being updated?
Is there some way I can view the weights or plot them? Perhaps in TensorBoard?
I'm using Python and TensorFlow 1.x, by the way.
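One self-contained way to check is to snapshot the frozen weights around a training step and compare. This is a sketch for TF 1.x; the tiny graph below is hypothetical, with the frozen variable kept out of the optimizer by trainable=False:
import numpy as np
import tensorflow as tf

# A "frozen" variable (trainable=False) is excluded from the optimizer's
# default var_list, so a training step should leave it untouched.
x = tf.placeholder(tf.float32, shape=(None, 4))
frozen_w = tf.Variable(tf.random_normal([4, 4]), trainable=False, name='frozen_w')
train_w = tf.Variable(tf.random_normal([4, 1]), name='train_w')
loss = tf.reduce_mean(tf.square(tf.matmul(tf.matmul(x, frozen_w), train_w)))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    before = sess.run(frozen_w)
    sess.run(train_op, feed_dict={x: np.random.rand(8, 4)})
    after = sess.run(frozen_w)
    assert np.array_equal(before, after), "the frozen weight changed!"
    print("frozen weights unchanged")
For a visual check in TensorBoard, logging each weight tensor with tf.summary.histogram should show the frozen layers' distributions staying flat across steps.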
Please help me. I am using TensorFlow 2.0 on GPU.
I train the model and save it in .h5 format:
model = keras.Sequential()
model.add(layers.Bidirectional(layers.CuDNNLSTM(self._window_size, return_sequences=True),
                               input_shape=(self._window_size, x_train.shape[-1])))
model.add(layers.Dropout(rate=self._dropout, seed=self._seed))
model.add(layers.Bidirectional(layers.CuDNNLSTM((self._window_size * 2), return_sequences=True)))
model.add(layers.Dropout(rate=self._dropout, seed=self._seed))
model.add(layers.Bidirectional(layers.CuDNNLSTM(self._window_size, return_sequences=False)))
model.add(layers.Dense(units=1))
model.add(layers.Activation('linear'))
model.summary()

model.compile(
    loss='mean_squared_error',
    optimizer='adam'
)

# train the model
history = model.fit(
    x_train,
    y_train,
    epochs=self._epochs,
    batch_size=self._batch_size,
    shuffle=False,
    validation_split=0.1
)
model.save('rts.h5')
Then I load this model and use it for forecasting, and everything works:
model = keras.models.load_model('rts.h5')
y_hat = model.predict(x_test)
But then the question arose of using the trained model in TensorFlow Serving, and a model in .h5 format is not accepted there.
I run:
sudo docker run --gpus 1 -p 8501:8501 --mount type=bind,source=/home/alex/PycharmProjects/TensorflowServingTestData/RtsModel,target=/models/rts_model -e MODEL_NAME=rts_model -t tensorflow/serving:latest-gpu
And I get the error:
tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:267] No versions of servable rts_model found under base path /models/rts_model
I tried to save the trained model as described here, https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators:
And I get the error:
ValueError: Layer has 2 states but was passed 0 initial states.
I tried to save the model as follows, https://www.tensorflow.org/api_docs/python/tf/keras/models/save_model:
And got the same error:
ValueError: Layer has 2 states but was passed 0 initial states.
The only thing that works for saving the model in a format for TensorFlow Serving is:
keras.experimental.export_saved_model(model, 'saved_model/1/')
The saved model works in Serving, but I get a warning that this method is deprecated and will be removed in a future version:
Instructions for updating:
Please use `model.save(..., save_format="tf")` or `tf.keras.models.save_model(..., save_format="tf")`.
And this puts me in a loop: when I try to use the recommended methods, I get an error; when I use what works, I'm told it is deprecated.
Please help. How do I save a trained model in TensorFlow 2.0 so that it can be used with TensorFlow Serving?
I was trying to fix this too!
According to the answer here, the normal LSTM (i.e. tf.keras.layers.LSTM) will use the GPU, and should be preferred in general over the CuDNNLSTM class unless you specifically need the original implementation (not sure why you would).
According to the docs, the normal LSTM will use the cuDNN implementation if certain requirements are met (see below).
When using this LSTM layer, I could successfully save to the tf output type just by using model.save('output_path', save_format='tf'). A sketch putting this together follows the requirements list below.
The requirements for the LSTM to use cuDNN are as follows (note that all of them are met by the defaults):
If a GPU is available and all the arguments to the layer meet the requirement of the CuDNN kernel (see below for details), the layer will use a fast cuDNN implementation.
The requirements to use the cuDNN implementation are:
activation == tanh
recurrent_activation == sigmoid
recurrent_dropout == 0
unroll is False
use_bias is True
Inputs are not masked or strictly right padded.
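Here is that sketch (layer sizes are illustrative, loosely mirroring the question's model): the same stack built with tf.keras.layers.LSTM, whose default arguments satisfy all the cuDNN requirements above, saved as a versioned SavedModel directory, which is the layout TensorFlow Serving scans for:
import tensorflow as tf
from tensorflow.keras import layers

window_size = 30
model = tf.keras.Sequential([
    layers.Bidirectional(layers.LSTM(window_size, return_sequences=True),
                         input_shape=(window_size, 1)),
    layers.Bidirectional(layers.LSTM(window_size * 2, return_sequences=True)),
    layers.Bidirectional(layers.LSTM(window_size, return_sequences=False)),
    layers.Dense(units=1, activation='linear'),
])
model.compile(loss='mean_squared_error', optimizer='adam')

# TF Serving expects <base_path>/<version>/, hence the trailing "1/"
model.save('saved_model/1/', save_format='tf')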