I'm using Keras to build a model and writing the optimization code and everything else in TensorFlow. When I was using fairly simple layers like Dense or Conv2D, everything was straightforward, but adding a BatchNormalization layer to my Keras model complicates things.
Since the BatchNormalization layer behaves differently in the training and testing phases, I figured out that I need K.learning_phase(): True in my feed_dict. But the following code is not working well: it runs without errors, yet the model's performance doesn't get any better.
import keras.backend as K
...
x_train, y_train = get_data()
sess.run(train_op, feed_dict={x: x_train, y: y_train, K.learning_phase(): True})
When I trained the Keras model with its fit function instead, it worked well.
What should I do to train a Keras model with a BatchNormalization layer in TensorFlow?
It turns out I duplicated a question that I hadn't seen.
I found the answer here; it simply consists of passing a special argument to the BatchNormalization layer call.
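For reference, a minimal sketch of what that looks like, assuming a functional-API model (the layer sizes here are made up). Calling the layer with training=1 pins it to training mode, independent of the global learning phase:
import keras
from keras.layers import Input, Dense, BatchNormalization

inputs = Input(shape=(64,))
h = Dense(32, activation='relu')(inputs)
# training=1 in the call forces batch statistics to be used and updated,
# regardless of K.learning_phase().
h = BatchNormalization()(h, training=1)
outputs = Dense(1)(h)
model = keras.models.Model(inputs=inputs, outputs=outputs)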
Should the inference model in a chatbot built with a Keras LSTM have the same number of layers as the main model, or doesn't it matter?
I don't know exactly what you mean by 'inference model'.
The number of layers in a model is a hyperparameter that you tune during training. Say you train an LSTM model with 3 layers; the model used for inference must then have the same number of layers and use the weights resulting from training.
Otherwise, if you add untrained layers at inference time, the results won't make any sense.
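As a minimal sketch of this idea (the architecture, shapes, and file name below are made up for illustration), the inference model rebuilds the exact training architecture and then loads the trained weights:
from keras.models import Sequential
from keras.layers import LSTM, Dense

def build_model():
    # The same architecture must be used for training and for inference.
    model = Sequential()
    model.add(LSTM(64, return_sequences=True, input_shape=(20, 100)))
    model.add(LSTM(64))
    model.add(Dense(100, activation='softmax'))
    return model

# Training
train_model = build_model()
train_model.compile(loss='categorical_crossentropy', optimizer='adam')
# train_model.fit(...)
train_model.save_weights('lstm_weights.h5')

# Inference: identical layer stack, loaded trained weights
inference_model = build_model()
inference_model.load_weights('lstm_weights.h5')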
Hope this helps
I am working on a multi-target regression problem in Keras and I would like the predicted values in the last layer to be sorted. I am currently implementing something like this:
# Lambda layer
import tensorflow as tf
def sort_layer(tensor):
    return tf.sort(tensor)
# Training model on train set
from keras.models import Sequential
from keras.layers import Dense,Lambda
model = Sequential()
model.add(Dense(100, input_dim=X_train.shape[1], activation="relu"))
model.add(Dense(150, activation="relu"))
model.add(Dense(50, activation="relu"))
model.add(Dense(y_train.shape[1], activation="linear"))
model.add(Lambda(sort_layer))
model.compile(loss="mse", optimizer="adam")
model.fit(X_train, y_train, epochs=100, batch_size=10, verbose=0)
This doesn't seem to be working every time as some predictions don't come out sorted. Can anyone explain what I am doing wrong and suggest a good fix?
Thank you!
I want to train a model in a sequential manner. That is, I want to train the model initially with a simple architecture and, once it is trained, add a couple of layers and continue training. Is it possible to do this in Keras? If so, how?
I tried modifying the model architecture, but the changes don't take effect until I compile. And once I compile, all the weights are re-initialized and I lose all the trained information.
All the questions I found on the web and on SO are either about loading a pre-trained model and continuing training, or about modifying the architecture of a pre-trained model and then only testing it. I didn't find anything related to my question. Any pointers are highly appreciated.
PS: I'm using Keras from the TensorFlow 2.0 package.
Without knowing the details of your model, the following snippet might help:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
# Train your initial model
def get_initial_model():
    ...
    return model
model = get_initial_model()
model.fit(...)
model.save_weights('initial_model_weights.h5')
# Use Model API to create another model, built on your initial model
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')
nn_input = Input(...)
x = initial_model(nn_input)
x = Dense(...)(x) # This is the additional layer, connected to your initial model
nn_output = Dense(...)(x)
# Combine your model
full_model = Model(inputs=nn_input, outputs=nn_output)
# Compile and train as usual
full_model.compile(...)
full_model.fit(...)
Basically, you train your initial model and save it, then reload it and wrap it together with your additional layers using the Model API. If you are not familiar with the Model API, you can check out the Keras documentation here (afaik the API remains the same in tensorflow.keras 2.0).
Note that you need to check if your initial model's final layer's output shape is compatible with the additional layers (e.g. you might want to remove the final Dense layer from your initial model if you are just doing feature extraction).
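To make this concrete, here is a minimal runnable sketch with made-up shapes (every layer size, the file name, and the loss/optimizer choice below are illustrative assumptions):
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input

def get_initial_model():
    inp = Input(shape=(10,))
    out = Dense(16, activation='relu')(inp)
    return Model(inputs=inp, outputs=out)

# Stage 1: train the simple model (with whatever head/loss you need), save weights
initial_model = get_initial_model()
initial_model.save_weights('initial_model_weights.h5')  # after fitting, in practice

# Stage 2: reload and extend
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')
nn_input = Input(shape=(10,))
x = initial_model(nn_input)          # reuses the trained weights
x = Dense(8, activation='relu')(x)   # new layer, randomly initialized
nn_output = Dense(1)(x)
full_model = Model(inputs=nn_input, outputs=nn_output)
full_model.compile(loss='mse', optimizer='adam')
full_model.fit(np.random.rand(32, 10), np.random.rand(32, 1),
               epochs=1, verbose=0)  # dummy data, just to show the call shape
If you want the new layers to train without disturbing the pre-trained weights at first, you can additionally set initial_model.trainable = False before compiling full_model; that is a standard Keras pattern rather than something the approach above requires.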
I'm a newbie to neural networks. I'm doing my university NN project in Keras. I assembled and trained a one-layer sequential model using the SGD optimizer:
[...]
nn_model = Sequential()
nn_model.add(Dense(32, input_dim=X_train.shape[1], activation='tanh'))
nn_model.add(Dense(1, activation='tanh'))
sgd = keras.optimizers.SGD(lr=0.001, momentum=0.25)
nn_model.compile(loss='mean_squared_error', optimizer=sgd, metrics=['accuracy'])
history = nn_model.fit(X_train, Y_train, epochs=2000, verbose=2, validation_data=(X_test, Y_test))
[...]
I've tried various learning rates, momentum values, and numbers of neurons, and got satisfying accuracy and error results. But I need to know how Keras actually works. Could you please explain exactly how fitting in Keras works? I can't find it in the Keras documentation.
How does Keras update weights? Does it use a backpropagation algorithm? (I'm 95% sure.)
How is the SGD algorithm implemented in Keras? Is it similar to the Wikipedia explanation?
How exactly does Keras calculate a gradient?
Thank you kindly for any information.
Let's try to break it down; I'll cover only the Keras-specific bits:
How does Keras update weights? Using an Optimizer, which is the base class for the different optimizers. Each optimizer calculates the new weights in a get_updates function, which returns a list of update ops that apply the updates when run.
Back-propagation? Yes, but Keras doesn't implement it directly; it leaves automatic differentiation to the backend tensor libraries. For example, K.gradients calls tf.gradients in the TensorFlow backend.
SGD algorithm? It is implemented in the SGD class as described on Wikipedia, with basic extensions such as momentum. You can easily follow the code and see how it calculates the updates; a simplified sketch of the core update follows this list.
How is the gradient calculated? Using back-propagation.
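As a rough sketch of what the SGD class's get_updates computes (simplified from the Keras source, without the Nesterov variant; a NumPy illustration, not the actual implementation):
import numpy as np

def sgd_momentum_step(params, grads, velocities, lr=0.001, momentum=0.25):
    # Keras's SGD update rule per parameter: v = momentum * v - lr * g; p = p + v
    for p, g, v in zip(params, grads, velocities):
        v[...] = momentum * v - lr * g  # update the velocity in place
        p[...] = p + v                  # apply the velocity to the parameter

# Tiny usage example with a single parameter vector
w = np.array([1.0, -2.0])
g = np.array([0.5, 0.5])
v = np.zeros_like(w)
sgd_momentum_step([w], [g], [v])
The gradients g themselves come from the backend's automatic differentiation (K.gradients / tf.gradients), as noted above.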
I am using Keras 2.0.8 with TensorFlow 1.3.0 on Ubuntu 16.04 with CUDA 8.0 and cuDNN 6.
I am using two BatchNormalization layers (Keras layers) in my model and training it using a TensorFlow pipeline.
I am facing two problems here:
First, the BatchNorm layers' population parameters (moving mean and variance) are not being updated during training, even after setting K.learning_phase to True. As a result, inference fails completely. I need some advice on how to update these parameters manually between training steps.
Secondly, after saving the trained model using the TensorFlow saver op, when I try to load it the results cannot be reproduced; it seems the weights are changing. Is there a way to keep the weights the same across the save/load operation?
I ran into the same problem a few weeks ago. Internally, Keras layers can add additional update operations to a model (e.g. for batchnorm), so you need to run these additional ops explicitly. For batchnorm, these updates seem to be just some assign ops which swap the current mean/variance with the new values. If you do not create a Keras model, this might work; assuming x is a tensor you would like to normalize:
bn = keras.layers.BatchNormalization()
x = bn(x)
...
sess.run([minimizer_op, bn.updates], feed_dict={K.learning_phase(): 1})
In my workflow, I am creating a Keras model (without compiling it) and then running the following:
model = keras.Model(inputs=inputs, outputs=prediction)
sess.run([minimizer_op, model.updates], feed_dict={K.learning_phase(): 1})
where inputs can be something like
inputs = [keras.layers.Input(tensor=input_variables)]
and outputs is a list of TensorFlow tensors. The model seems to aggregate all additional update operations between inputs and outputs automatically.
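Putting the pieces together, here is a minimal self-contained sketch of such a training loop (the graph, loss, optimizer, and data below are illustrative assumptions, not the original setup):
import numpy as np
import tensorflow as tf
import keras
import keras.backend as K

sess = tf.Session()
K.set_session(sess)

# Hypothetical graph: placeholders fed through Keras layers with batchnorm
x = tf.placeholder(tf.float32, shape=(None, 32))
y = tf.placeholder(tf.float32, shape=(None, 1))

inputs = [keras.layers.Input(tensor=x)]
h = keras.layers.Dense(16, activation='relu')(inputs[0])
h = keras.layers.BatchNormalization()(h)
prediction = keras.layers.Dense(1)(h)
model = keras.models.Model(inputs=inputs, outputs=prediction)

loss = tf.reduce_mean(tf.square(prediction - y))
minimizer_op = tf.train.AdamOptimizer().minimize(loss)

sess.run(tf.global_variables_initializer())
x_train = np.random.rand(256, 32).astype('float32')  # dummy data for the sketch
y_train = np.random.rand(256, 1).astype('float32')
for i in range(0, len(x_train), 32):
    # Fetching model.updates alongside the train op keeps the batchnorm
    # moving mean/variance in sync with training.
    sess.run([minimizer_op, model.updates],
             feed_dict={x: x_train[i:i+32], y: y_train[i:i+32],
                        K.learning_phase(): 1})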