I'm working on supporting automatic model detection/logging for Keras/Tensorflow models for our Machine learning platform https://iko.ai and I have some questions:
1. What are the different ways we can define a tf/keras model?
   tf.keras.Model
   tf.estimator.Estimator
   tensorflow_estimator
2. Any other ways I'm not aware of? Why are there so many ways to do the same thing?
3. What are the proper functions to save/load them?
4. How could we differentiate TF/Keras model instances from other non-model objects? I want to be able to write a function that checks if an object is a TF/Keras model, something like
def is_tf_or_keras_model(obj):
    # check somehow if the obj is a TF/Keras model
    pass
Regarding questions 1 and 2:
Another way to represent a neural network model is by using tf.keras.Sequential. It allows you to easily create a model that follows a sequential structure, for example:
import tensorflow as tf
# tf.__version__ == 2.4
model_seq = tf.keras.Sequential()
model_seq.add(tf.keras.Input(shape=(64,1)))
model_seq.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
model_seq.add(tf.keras.layers.Dense(16, activation='relu'))
model_seq.add(tf.keras.layers.Dense(4, activation='softmax'))
The same model can be built with the tf.keras functional API. The functional API also lets you build more complex networks that do not follow a sequential structure, but of course you can build sequential models with it too:
i = tf.keras.Input(shape=(64,1))
x = tf.keras.layers.Conv1D(32, 2, activation='relu')(i)
x = tf.keras.layers.Dense(16, activation='relu')(x)
x = tf.keras.layers.Dense(4, activation='softmax')(x)
model_func = tf.keras.Model(inputs=i, outputs=x)
tf.estimator.Estimator is the base class of the Estimator API. Its premade subclasses (for example tf.estimator.DNNClassifier) wrap prebuilt networks, so you only have to set some hyperparameter values and train the model out of the box.
In conclusion, the method used to build a model depends on how complex/ad-hoc the network is.
Regarding question 3:
The tf.keras implementation allows you to either save only the weights of a model (model.save_weights) or save the whole model (model.save). To load the weights afterwards (model.load_weights), you first need to create a model with the same architecture as the one used to train those weights. A model saved with model.save can be restored with tf.keras.models.load_model.
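To make the two options concrete, here is a minimal sketch of both round trips. The tiny architecture, the helper name build_model and the file names are made up for illustration; the .h5 suffix is used on the assumption that h5py is available (it is normally installed with TensorFlow).

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model():
    # Hypothetical toy architecture, only here so that there is something to save.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(4, activation="relu"),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])


model = build_model()
x = np.random.random((3, 8)).astype("float32")
original_out = model.predict(x, verbose=0)
tmp_dir = tempfile.mkdtemp()

# Option 1: weights only. The architecture has to be rebuilt before loading.
weights_path = os.path.join(tmp_dir, "demo.weights.h5")
model.save_weights(weights_path)
clone = build_model()
clone.load_weights(weights_path)
clone_out = clone.predict(x, verbose=0)

# Option 2: whole model (architecture + weights) in a single file.
model_path = os.path.join(tmp_dir, "demo_model.h5")
model.save(model_path)
restored = tf.keras.models.load_model(model_path)
restored_out = restored.predict(x, verbose=0)
```

Both clone and restored should reproduce the predictions of the original model on the same input.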
Regarding question 4:
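model_classes = (tf.keras.Model, tf.estimator.Estimator)  # isinstance needs a class or a tuple of classes
# If the standalone keras package is also installed, add keras.Model to the tuple (after `import keras`)
def is_keras_or_tf_model(obj, model_classes=model_classes):
    return isinstance(obj, model_classes)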
All I have said here is in the TensorFlow documentation.
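For automatic detection on a platform, one possible variation avoids importing TensorFlow at all and instead inspects the object's class hierarchy by name. This is only a heuristic sketch: the package and class-name sets below are assumptions, not an official list, and they may need adjusting for the TensorFlow/Keras versions you support.

```python
# Top-level packages that define Keras/TF model base classes (assumed list).
_MODEL_PACKAGES = {"tensorflow", "tensorflow_estimator", "keras", "tf_keras"}
# Names of the base classes treated as "a model" (assumed list).
_MODEL_CLASS_NAMES = {"Model", "Estimator"}


def is_tf_or_keras_model(obj):
    """Heuristically check whether obj derives from a Keras Model or a TF Estimator."""
    for cls in type(obj).__mro__:
        # First component of the defining module, e.g. "tensorflow" or "keras".
        package = (getattr(cls, "__module__", "") or "").split(".")[0]
        if package in _MODEL_PACKAGES and cls.__name__ in _MODEL_CLASS_NAMES:
            return True
    return False


print(is_tf_or_keras_model([1, 2, 3]))  # False
```

Because it walks the whole MRO, this should also match tf.keras.Sequential, functional models and user subclasses, and it does not fail when TensorFlow is not installed in the environment doing the check.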
Related
I want to use ResNet model architecture and want to change last few layers; how can I only use model architecture from model zoo in Tensorflow?
To use a ResNet model, you can choose from several variants in tensorflow.keras.applications, including ResNet50, ResNet101, and ResNet152. Then, you will need to change a few of the default arguments. Since you only want the architecture, set the weights parameter to None; otherwise, the 'imagenet' weights are loaded. Also, set include_top to False, since the number of classes for your problem will likely be different from ImageNet. Finally, provide the shape of your data in input_shape. This would look something like this.
base = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=shape)
To get a summary of the model, you can do
base.summary()
To add your own head, you can use the Functional API. You will need to add an Input layer and your own Dense layer which will correspond to your task. This could be
inputs = tf.keras.layers.Input(shape=shape)
features = base(inputs)  # keep `base` bound to the ResNet model itself
out = tf.keras.layers.Dense(num_classes, activation='softmax')(features)
Finally, to construct a model, you can do
model = tf.keras.models.Model(inputs, out)
The Model constructor takes 2 arguments: the first is the inputs to your model, and the second is the outputs. Note that calling model.summary() will show the ResNet base as a single layer. To view all layers of the ResNet base, you can do model.layers[1].summary(), or you can change how you build the model. The second way would be
out = tf.keras.layers.Dense(num_classes, activation='softmax')(base.output)
model = tf.keras.models.Model(base.input, out)
Now you can view all layers with just model.summary().
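Putting the pieces together, a complete sketch could look like the following. The input shape and number of classes are placeholder values. The GlobalAveragePooling2D layer is my addition: with include_top=False and no pooling, the base outputs a 4D feature map, so a Dense layer applied directly would produce one prediction per spatial location rather than one per image.

```python
import numpy as np
import tensorflow as tf

shape = (64, 64, 3)  # placeholder input shape
num_classes = 5      # placeholder number of classes

# Architecture only: no pretrained weights, no ImageNet classification head.
base = tf.keras.applications.ResNet50(include_top=False, weights=None, input_shape=shape)

# Collapse the spatial dimensions, then attach a task-specific head.
pooled = tf.keras.layers.GlobalAveragePooling2D()(base.output)
out = tf.keras.layers.Dense(num_classes, activation="softmax")(pooled)
model = tf.keras.models.Model(base.input, out)

# One forward pass on a dummy batch to check the output shape.
probs = model.predict(np.zeros((1,) + shape, dtype="float32"), verbose=0)
```

Passing pooling='avg' to ResNet50 should have the same effect as the explicit pooling layer.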
I want to train a model in a sequential manner. That is I want to train the model initially with a simple architecture and once it is trained, I want to add a couple of layers and continue training. Is it possible to do this in Keras? If so, how?
I tried to modify the model architecture. But until I compile, the changes are not effective. Once I compile, all the weights are re-initialized and I lose all the trained information.
All the questions in web and SO I found are either about loading a pre-trained model and continuing training or modifying the architecture of pre-trained model and then only test it. I didn't find anything related to my question. Any pointers are also highly appreciated.
PS: I'm using Keras in tensorflow 2.0 package.
Without knowing the details of your model, the following snippet might help:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
# Train your initial model
def get_initial_model():
    ...
    return model
model = get_initial_model()
model.fit(...)
model.save_weights('initial_model_weights.h5')
# Use Model API to create another model, built on your initial model
initial_model = get_initial_model()
initial_model.load_weights('initial_model_weights.h5')
nn_input = Input(...)
x = initial_model(nn_input)
x = Dense(...)(x) # This is the additional layer, connected to your initial model
nn_output = Dense(...)(x)
# Combine your model
full_model = Model(inputs=nn_input, outputs=nn_output)
# Compile and train as usual
full_model.compile(...)
full_model.fit(...)
Basically, you train your initial model and save its weights. Then you rebuild it, reload the weights, and wrap it together with your additional layers using the Model API. If you are not familiar with the Model API, you can check out the Keras documentation (afaik the API remains the same for tensorflow.keras in TensorFlow 2.0).
Note that you need to check if your initial model's final layer's output shape is compatible with the additional layers (e.g. you might want to remove the final Dense layer from your initial model if you are just doing feature extraction).
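Here is a concrete sketch of the same idea without the save/reload step, reusing the trained layers directly through the functional API. The layer sizes, layer names and random data are made up for illustration. As far as I know, compile() only configures the optimizer, loss and metrics and does not re-initialize existing layer weights, so weights that appear reset usually mean the layers themselves were re-created.

```python
import numpy as np
import tensorflow as tf

# Random toy data, purely illustrative.
x = np.random.random((32, 8)).astype("float32")
y = np.random.random((32, 1)).astype("float32")

# Stage 1: train a simple architecture.
inp = tf.keras.Input(shape=(8,))
hidden = tf.keras.layers.Dense(16, activation="relu", name="hidden")(inp)
out_v1 = tf.keras.layers.Dense(1, name="head_v1")(hidden)
small = tf.keras.Model(inp, out_v1)
small.compile(optimizer="adam", loss="mse")
small.fit(x, y, epochs=1, verbose=0)
trained_kernel = small.get_layer("hidden").get_weights()[0].copy()

# Stage 2: drop the old head and stack new layers on the trained hidden tensor.
extra = tf.keras.layers.Dense(16, activation="relu", name="extra")(hidden)
out_v2 = tf.keras.layers.Dense(1, name="head_v2")(extra)
bigger = tf.keras.Model(inp, out_v2)
bigger.compile(optimizer="adam", loss="mse")

# The shared layer keeps its trained weights after compiling the new model.
kernel_after_compile = bigger.get_layer("hidden").get_weights()[0].copy()

# Continue training; both the old and the new layers are updated.
bigger.fit(x, y, epochs=1, verbose=0)
```

If the old layers should stay fixed while the new ones warm up, set their trainable attribute to False before calling bigger.compile(...).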
I am working on a binary classification model in Tensorflow 2.0. I have a few different types of model, but I was thinking about using a Boosted Tree model to do the classification. However, I need my model to output class probabilities. So basically the probability that an output is 0 or 1, based upon the tree. The usual in such a case might be logistic regression, but in this case I wanted to try a tree based method.
I know that I can use a boosted tree model in Tensorflow 2.0 using the tf.estimator API, but to get the class probabilities I usually need a final tf.keras Dense layer with a sigmoid activation. So in theory I would like something like the code below. The code below does not work and is essentially pseudocode, since I don't believe Keras models work with tf.estimator, but this is what I would like to do.
import tensorflow as tf
from tensorflow import keras
model = keras.Sequential([
    tf.estimator.BoostedTreesClassifier(feature_columns, **params), # <-- HOW TO DO THIS?
    keras.layers.Dense(1, activation='sigmoid',
                       bias_initializer=output_bias),
])
Does anyone know how to do this in Tensorflow 2.0?
I have a subclassed model that instantiates a few custom layers via subclassing. I tried using keras.utils.plot_model() but all it does is print the model block, none of the layers appeared.
Can a Tensorflow expert comment on this? Will this feature ever be implemented in the future? If not, what is the next best alternative to examine the computation graph? Note that model.summary() only gives a summary of the parameters of the custom layer, which itself contains two dense layers. Ideally, I would like to see all the computations, if that is not asking too much...
Update: I dug into the source, and it looks like plot_model() first checks for the _is_graph_network attribute. Graph Networks are used in Functional and Sequential APIs. From the source:
Two types of Networks exist: Graph Networks and Subclass Networks. Graph
networks are used in the Keras Functional and Sequential APIs. Subclassed
networks are used when a user subclasses the Model class. In general,
more Keras features are supported with Graph Networks than with Subclassed
Networks, specifically:
Model cloning (keras.models.clone())
Serialization (model.get_config()/from_config(), model.to_json()/to_yaml())
Whole-model saving (model.save())
(custom graph component)
Naturally, I would like to know if I can build a graph network component, so my subclassed model/layer can work with these features. Does that involve a lot of effort?
(tf.function graph visualization)
Can someone let me know if graph visualization via Tensorboard works with Tensorflow2 tf.functions? In Tensorflow 1.x, one defines a name scope for a logical group of ops (e.g. generator/discriminator in GAN, encoder/decoder in VAE and loss/metrics), they are then displayed as a high-level block in the graph visualization. Can I define something similar for tf.functions?
According to the official documentation (https://www.tensorflow.org/tensorboard/graphs), you can "use TensorFlow Summary Trace API to log autographed functions for visualization in TensorBoard".
Here is a simple example of visualizing a subclassed model:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
class MyModel(tf.keras.Model):
    def __init__(self, num_classes=10):
        super(MyModel, self).__init__(name='my_model')
        self.num_classes = num_classes
        self.dense_1 = layers.Dense(32, activation='relu')
        self.dense_2 = layers.Dense(num_classes, activation='sigmoid')
    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)
model = MyModel(num_classes=10)
model.compile(optimizer=tf.keras.optimizers.RMSprop(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))
# Wrap the forward pass in a tf.function so that it is traced into a graph
@tf.function
def trace():
    data = np.random.random((1, 32))
    model(data)
logdir = "trace_log"
writer = tf.summary.create_file_writer(logdir)
tf.summary.trace_on(graph=True, profiler=True)
# Forward pass
trace()
with writer.as_default():
    tf.summary.trace_export(name="model_trace", step=0, profiler_outdir=logdir)
Then you can use Tensorboard to examine the computation graph:
tensorboard --logdir trace_log
I'm trying to implement an adversarial loss in keras.
The model consists of two networks, one auto-encoder (the target model) and one discriminator. The two models share the encoder.
I created the adversarial loss of the auto-encoder by setting a keras variable
def get_adv_loss(d_loss):
    def loss(y_true, y_pred):
        return some_loss(y_true, y_pred) - d_loss
    return loss
discriminator_loss = K.variable()
L = get_adv_loss(discriminator_loss)
autoencoder.compile(..., loss=L)
and during training I interleave train_on_batch of discriminator and autoencoder to update discriminator_loss
d_loss = discriminator.train_on_batch(x, y_domain)
discriminator_loss.assign(d_loss)
a_loss, ... = self.segmenter.train_on_batch(x, y_target)
However, I found out that the value of these variables is frozen when the model is compiled. I tried to recompile the model during training, but that raises the error
Node 'IsVariableInitialized_13644': Unknown input node
'training_12/Adam/Variable'
which I guess means I can't recompile during training? Any suggestions on how I can inject the discriminator loss into the autoencoder?
Keras models support multiple outputs. So just include your discriminator in your Keras model and freeze the discriminator layers if the discriminator should not be trained.
The next question would be how to combine autoencoder loss and discriminator loss. Luckily keras model.compile supports loss weights. If autoencoder is your first output and discriminator is your second you could do something like loss_weights=[1, -1]. So a better discriminator is worse for the autoencoder.
Edit: Here is an example of how to implement an adversarial network:
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
# Build your architecture
auto_encoder_input = Input((5,))
auto_encoder_net = Dense(10)(auto_encoder_input)
auto_encoder_output = Dense(5)(auto_encoder_net)
discriminator_net = Dense(20)(auto_encoder_output)
discriminator_output = Dense(5)(discriminator_net)
# Define outputs of your model
train_autoencoder_model = Model(auto_encoder_input, [auto_encoder_output, discriminator_output])
train_discriminator_model = Model(auto_encoder_input, discriminator_output)
# Compile the models (compile the first model and then change the trainable attribute for the second)
# Layers 0-2 are the input and the auto-encoder; layers 3-4 are the discriminator
for layer_index, layer in enumerate(train_autoencoder_model.layers):
    layer.trainable = layer_index < 3
train_autoencoder_model.compile('Adam', loss=['mse', 'mse'], loss_weights=[1, -1])
for layer_index, layer in enumerate(train_discriminator_model.layers):
    layer.trainable = layer_index >= 3
train_discriminator_model.compile('Adam', loss='mse')
# A simple example of what training can look like
for i in range(10):
    auto_input = np.random.sample((10,5))
    discrimi_output = np.random.sample((10,5))
    train_discriminator_model.fit(auto_input, discrimi_output, steps_per_epoch=5, epochs=1)
    train_autoencoder_model.fit(auto_input, [auto_input, discrimi_output], steps_per_epoch=1, epochs=1)
As you can see, there is not much magic behind building an adversarial model with Keras.
Unless you decide to go deep in the keras source code, I don't think you can do this easily. Before writing your own adversarial module, you should check the existing works carefully. As far as I know, keras-adversarial is still used by many people. Of course, it only supports old keras versions, e.g. 2.0.8.
Several other things:
Be careful when you freeze your model weights. If you first compile a model and then freeze some weights, these weights are still trainable, because the train function is generated when the model is compiled. So you should freeze the weights first and then compile.
keras-adversarial does this job in a more elegant way. Instead of building two models that share weights but freeze different subsets of them, it creates two train functions, one for each player.
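The freeze-then-compile order from the first point can be checked with a small sketch like the one below. The layer names, sizes and random data are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Random toy data, purely illustrative.
x = np.random.random((16, 4)).astype("float32")
y = np.random.random((16, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, name="shared"),
    tf.keras.layers.Dense(1, name="head"),
])

# Freeze first, then compile, so the generated train step skips the frozen layer.
model.get_layer("shared").trainable = False
model.compile(optimizer="sgd", loss="mse")

frozen_before = model.get_layer("shared").get_weights()[0].copy()
model.fit(x, y, epochs=1, verbose=0)
frozen_after = model.get_layer("shared").get_weights()[0].copy()
```

After training, frozen_before and frozen_after should be identical, while the weights of the head layer are free to change.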