How to make a custom loss function that uses the model in Keras (Python)

I'm trying to write a custom loss function for a Keras NN model.
Normally, loss functions take y_pred and y_true as their arguments.
But I need to call the model inside the custom loss function, as in
y_pred = model(X_train), so that I can use tf.GradientTape.
So what I want to know is how to use the latest model (as it is being fit) inside the custom loss function.
If you have an idea about that, please tell me.

You can create a model class and implement the train_step method:
class YourModel(Model):
    def __init__(self):
        super(YourModel, self).__init__()
        # define your model architecture here as attributes of the class

    def train_step(self, data):
        with tf.GradientTape() as tape:
            # forward pass of the data through the architecture
            # compute loss(y_true, y_pred, any other param)
        # weight update
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        return {
            'loss': loss
            # other losses
        }

    def call(self, x):
        # your forward pass implementation
        return  # output
More information can be found here: https://www.tensorflow.org/tutorials/quickstart/advanced
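To make this concrete, here is a minimal, self-contained sketch of the pattern (the layer sizes and the squared-error loss are made up for the example): because the loss is computed inside train_step from the model's own forward pass, it always uses the current weights during fitting.
import tensorflow as tf
from tensorflow.keras import Model, layers

class YourModel(Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(32, activation='relu')
        self.dense2 = layers.Dense(1)

    def call(self, x):
        return self.dense2(self.dense1(x))

    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # forward pass with the latest weights
            # any custom loss that needs the model itself can be computed here
            loss = tf.reduce_mean(tf.square(y - y_pred))
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        return {'loss': loss}

model = YourModel()
model.compile(optimizer='adam')
model.fit(tf.random.normal((64, 8)), tf.random.normal((64, 1)), epochs=2)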


When are Model call() and train_step() called?

I am going through this tutorial on how to customize the training loop:
https://colab.research.google.com/github/tensorflow/docs/blob/snapshot-keras/site/en/guide/keras/customizing_what_happens_in_fit.ipynb#scrollTo=46832f2077ac
The last example shows a GAN implemented with a custom training loop, where only the __init__, train_step, and compile methods are defined:
class GAN(keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super(GAN, self).__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super(GAN, self).compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn

    def train_step(self, real_images):
        if isinstance(real_images, tuple):
            real_images = real_images[0]
        ...
What happens if my model also has a custom call() function? Does train_step() override call()?
Aren't call() and train_step() both called by fit(), and what is the difference between the two?
Below is another piece of code "I" wrote, where I wonder what is called by fit(): call() or train_step()?
class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units,
                                       return_sequences=True,
                                       return_state=True,
                                       reset_after=True)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, states=None, return_state=False, training=False):
        x = inputs
        x = self.embedding(x, training=training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state=states, training=training)
        x = self.dense(x, training=training)
        if return_state:
            return x, states
        else:
            return x

    @tf.function
    def train_step(self, inputs):
        # unpack the data
        inputs, labels = inputs
        with tf.GradientTape() as tape:
            predictions = self(inputs, training=True)  # forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)
            loss = self.compiled_loss(labels, predictions, regularization_losses=self.losses)
        # Compute the gradients (note: self.trainable_variables, not model.trainable_variables)
        grads = tape.gradient(loss, self.trainable_variables)
        # Update weights
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(labels, predictions)
        # Return a dict mapping metric names to current values
        return {m.name: m.result() for m in self.metrics}
These are different concepts and are used like this:
train_step is called by fit. Basically, fit loops over the dataset and provides each batch to train_step (and then handles metrics, bookkeeping, etc., of course).
call is used when you, well, call the model. To be precise, writing model(inputs) or in your case self(inputs) will use the function __call__, but the Model class has that function defined such that it will in turn use call.
Those are the technical aspects. Intuitively:
call should define the forward pass of your model, i.e. how the input is transformed into the output.
train_step defines the logic of a training step, usually with gradient descent. It will often make use of call since the training step tends to include a forward pass of the model to compute gradients.
As for the GAN tutorial you linked, I would say it can actually be considered incomplete. It works without defining call because the custom train_step calls the generator/discriminator attributes explicitly (as these are predefined models, they can be called as usual). If you tried to call the GAN model like gan(inputs), I would expect an error message (I did not test this). So you would always have to call gan.generator(inputs) to generate, for example.
Finally (this part may be a bit confusing), note that you can subclass a Model to define a custom training step, but then initialize it via the functional API (like model = Model(inputs, outputs)), in which case you can make use of call in the training step without ever defining it yourself because the functional API takes care of that.
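A quick way to see who calls what is a toy model that logs both methods (a minimal sketch; the prints are only for illustration):
import tensorflow as tf

class Probe(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, x):
        tf.print('call() running')
        return self.dense(x)

    def train_step(self, data):
        tf.print('train_step() running')
        # delegate to the default training step, which invokes call() via self(x)
        return super().train_step(data)

model = Probe()
model.compile(optimizer='adam', loss='mse')
x, y = tf.ones((4, 3)), tf.ones((4, 1))
model.fit(x, y, epochs=1, verbose=0)  # fit() -> train_step() -> call()
model(x)                              # prints only 'call() running'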

How can I specify a loss function to be quadratic weighted kappa in Keras?

My understanding is that Keras requires loss functions to have the signature:
def custom_loss(y_true, y_pred):
I am trying to use sklearn.metrics.cohen_kappa_score, which takes
(y1, y2, labels=None, weights=None, sample_weight=None)
If I use it as is:
model.compile(loss=metrics.cohen_kappa_score,
              optimizer='adam', metrics=['accuracy'])
then the weights won't be set. I want to set them to quadratic. Is there some way to pass this through?
There are two steps to implementing a parameterized custom loss function (cohen_kappa_score) in Keras. Since an implementation already exists for your needs, you don't have to write it yourself; however, sklearn.metrics.cohen_kappa_score operates on NumPy arrays rather than tensors, so it cannot be used directly as a Keras loss.
Therefore, I suggest TensorFlow's implementation of cohen_kappa (note that, according to the TensorFlow documentation, it does not yet support a weight matrix). Using a TensorFlow metric in Keras is not that easy, though...
According to this question, they used control_dependencies to use a TensorFlow metric in Keras. Here is an example:
import tensorflow as tf
import keras.backend as K

def _cohen_kappa(y_true, y_pred, num_classes, weights=None,
                 metrics_collections=None, updates_collections=None, name=None):
    kappa, update_op = tf.contrib.metrics.cohen_kappa(
        y_true, y_pred, num_classes, weights,
        metrics_collections, updates_collections, name)
    K.get_session().run(tf.local_variables_initializer())
    with tf.control_dependencies([update_op]):
        kappa = tf.identity(kappa)
    return kappa
Since Keras loss functions take (y_true, y_pred) as parameters, you need a wrapper function that returns another function. Here is some code:
def cohen_kappa_loss(num_classes, weights=None, metrics_collections=None,
                     updates_collections=None, name=None):
    def cohen_kappa(y_true, y_pred):
        return -_cohen_kappa(y_true, y_pred, num_classes, weights,
                             metrics_collections, updates_collections, name)
    return cohen_kappa
Finally, you can use it as follows in Keras:
# get the loss function and set parameters
model_cohen_kappa = cohen_kappa_loss(num_classes=3, weights=weights)
# compile model
model.compile(loss=model_cohen_kappa,
              optimizer='adam', metrics=['accuracy'])
Regarding using the Cohen kappa metric as a loss function: in general it is possible to use weighted kappa as a loss function. Here is a paper using weighted kappa as a loss function for multi-class classification.
You can define it as a custom loss, and yes, you are right that Keras accepts only two arguments in the loss function. Here is how you can define your loss:
def get_cohen_kappa(weights=None):
    def cohen_kappa_score(y_true, y_pred):
        """
        Define your code here. You can now use `weights` directly
        in this function.
        """
        return score
    return cohen_kappa_score
Now you can pass this function to your model as:
model.compile(loss=get_cohen_kappa(weights=weights),
              optimizer='adam')
model.fit(...)
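For completeness, here is a minimal, self-contained sketch of a differentiable quadratic weighted kappa loss in the spirit of the paper mentioned above (an illustrative formulation, not the paper's exact code; it assumes one-hot y_true and softmax y_pred of shape (batch, num_classes)):
import tensorflow as tf

def soft_qwk_loss(num_classes, eps=1e-7):
    # Quadratic penalty matrix: w[i, j] = (i - j)^2 / (num_classes - 1)^2
    idx = tf.range(num_classes, dtype=tf.float32)
    w = tf.square(idx[:, None] - idx[None, :]) / (num_classes - 1) ** 2

    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, tf.float32)
        # Observed penalty: expected disagreement of the actual predictions.
        observed = tf.reduce_sum(tf.matmul(y_true, w) * y_pred)
        # Expected penalty under independence of the two class histograms.
        hist_true = tf.reduce_sum(y_true, axis=0)
        hist_pred = tf.reduce_sum(y_pred, axis=0)
        n = tf.reduce_sum(hist_true)
        expected = tf.reduce_sum(w * hist_true[:, None] * hist_pred[None, :]) / n
        # observed/expected equals 1 - kappa, so perfect agreement gives 0 loss.
        return observed / (expected + eps)

    return loss

model.compile(loss=soft_qwk_loss(num_classes=3), optimizer='adam')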

Custom Accuracy/Loss for each Output in Multiple Output Model in Keras

I am trying to define custom loss and accuracy functions for each output of a two-output neural network model in Keras. Let's call the two outputs A and B.
My objectives are:
Give the accuracy/loss functions for each output names such that they can be reported on the same TensorBoard graphs as the corresponding outputs of older/existing models I have lying around. For example, accuracies and losses for output A of this two-output network should be viewable on the same TensorBoard graph as output A of some older model. More specifically, those older models all output A_output_acc, val_A_output_acc, A_output_loss and val_A_output_loss, so I want the corresponding metric readouts for output A of the new model to have those names as well, so that they are viewable/comparable on the same graph.
Allow for easy configuration of the accuracy/loss functions, so that I can swap different losses/accuracies in for each output at whim without hard-coding them.
I have a Modeler class that constructs and compiles a network. The relevant code follows.
class Modeler(BaseModeler):
    def __init__(self, loss=None, accuracy=None, ...):
        """
        Returns compiled keras model.
        """
        self.loss = loss
        self.accuracy = accuracy
        model = self.build()
        ...
        model.compile(
            loss={  # we are explicit here and name the outputs even though in this case it's not necessary
                "A_output": self.A_output_loss(),  # loss,
                "B_output": self.B_output_loss()   # loss
            },
            optimizer=optimus,
            metrics={  # we need to tie each output to a specific list of metrics
                "A_output": [self.A_output_acc()],
                # self.A_output_loss()],  # redundant since it's already reported via the `loss` param;
                #                         # ends up showing up as `A_output_loss_1` since keras
                #                         # already reports `A_output_loss` via the loss param
                "B_output": [self.B_output_acc()]
                # self.B_output_loss()]   # redundant since it's already reported via the `loss` param;
                #                         # ends up showing up as `B_output_loss_1` since keras
                #                         # already reports `B_output_loss` via the loss param
            })
        self._model = model
    def A_output_acc(self):
        """
        Allows us to output custom train/test accuracy/loss metrics under the desired names, e.g. 'A_output_acc' and
        'val_A_output_acc' respectively, so that they may be plotted on the same tensorboard graph as the accuracies
        from other models that share the same outputs.
        :return: accuracy metric
        """
        acc = None
        if self.accuracy == TypedAccuracies.BINARY:
            def acc(y_true, y_pred):
                return self.binary_accuracy(y_true, y_pred)
        elif self.accuracy == TypedAccuracies.DICE:
            def acc(y_true, y_pred):
                return self.dice_coef(y_true, y_pred)
        elif self.accuracy == TypedAccuracies.JACARD:
            def acc(y_true, y_pred):
                return self.jacard_coef(y_true, y_pred)
        else:
            logger.debug('ERROR: undefined accuracy specified: {}'.format(self.accuracy))
        return acc
    def A_output_loss(self):
        """
        Allows us to output custom train/test accuracy/loss metrics under the desired names, e.g. 'A_output_loss' and
        'val_A_output_loss' respectively, so that they may be plotted on the same tensorboard graph as the losses
        from other models that share the same outputs.
        :return: loss metric
        """
        loss = None
        if self.loss == TypedLosses.BINARY_CROSSENTROPY:
            def loss(y_true, y_pred):
                return self.binary_crossentropy(y_true, y_pred)
        elif self.loss == TypedLosses.DICE:
            def loss(y_true, y_pred):
                return self.dice_coef_loss(y_true, y_pred)
        elif self.loss == TypedLosses.JACARD:
            def loss(y_true, y_pred):
                return self.jacard_coef_loss(y_true, y_pred)
        else:
            logger.debug('ERROR: undefined loss specified: {}'.format(self.loss))
        return loss
    def B_output_acc(self):
        """
        Allows us to output custom train/test accuracy/loss metrics under the desired names, e.g. 'B_output_acc' and
        'val_B_output_acc' respectively, so that they may be plotted on the same tensorboard graph as the accuracies
        from other models that share the same outputs.
        :return: accuracy metric
        """
        acc = None
        if self.accuracy == TypedAccuracies.BINARY:
            def acc(y_true, y_pred):
                return self.binary_accuracy(y_true, y_pred)
        elif self.accuracy == TypedAccuracies.DICE:
            def acc(y_true, y_pred):
                return self.dice_coef(y_true, y_pred)
        elif self.accuracy == TypedAccuracies.JACARD:
            def acc(y_true, y_pred):
                return self.jacard_coef(y_true, y_pred)
        else:
            logger.debug('ERROR: undefined accuracy specified: {}'.format(self.accuracy))
        return acc
    def B_output_loss(self):
        """
        Allows us to output custom train/test accuracy/loss metrics under the desired names, e.g. 'B_output_loss' and
        'val_B_output_loss' respectively, so that they may be plotted on the same tensorboard graph as the losses
        from other models that share the same outputs.
        :return: loss metric
        """
        loss = None
        if self.loss == TypedLosses.BINARY_CROSSENTROPY:
            def loss(y_true, y_pred):
                return self.binary_crossentropy(y_true, y_pred)
        elif self.loss == TypedLosses.DICE:
            def loss(y_true, y_pred):
                return self.dice_coef_loss(y_true, y_pred)
        elif self.loss == TypedLosses.JACARD:
            def loss(y_true, y_pred):
                return self.jacard_coef_loss(y_true, y_pred)
        else:
            logger.debug('ERROR: undefined loss specified: {}'.format(self.loss))
        return loss
    def load_model(self, model_path=None):
        """
        Returns built model from model_path, assuming the default architecture.
        :param model_path: str, path to model file
        :return: defined model with weights loaded
        """
        custom_objects = {'A_output_acc': self.A_output_acc(),
                          'A_output_loss': self.A_output_loss(),
                          'B_output_acc': self.B_output_acc(),
                          'B_output_loss': self.B_output_loss()}
        self.model = load_model(filepath=model_path, custom_objects=custom_objects)
        return self

    def build(self, stuff...):
        """
        Returns model architecture. Instead of just one task, it performs two: A and B.
        :return: model
        """
        ...
        A_conv_final = Conv2D(1, (1, 1), activation="sigmoid", name="A_output")(up_conv_224)
        B_conv_final = Conv2D(1, (1, 1), activation="sigmoid", name="B_output")(up_conv_224)
        model = Model(inputs=[input], outputs=[A_conv_final, B_conv_final], name="my_model")
        return model
The training works fine. However, when I later load the model for inference using the load_model() function above, Keras complains that it doesn't know about the custom metrics I have given it:
ValueError: Unknown loss function:loss
What seems to be happening is that Keras appends the name of the function returned by each custom metric factory above (def loss(...), def acc(...)) to the dictionary key given in the metrics parameter of the model.compile() call.
So, for example, the key is A_output and we call the custom accuracy factory A_output_acc() for it, which returns a function named acc. The result is A_output + acc = A_output_acc. This means I can't name those returned functions acc/loss anything else, because that would mess up the reporting/graphs.
This is all fine and well, BUT I don't know how to write my load function with a properly defined custom_objects parameter (or how to define/name my custom metric functions, for that matter) so that Keras knows which custom accuracy/loss functions are to be loaded with each output head.
More to the point, it seems to want a custom_objects dictionary of the following form in load_model() (which won't work, for obvious reasons: the keys collide):
custom_objects = {'acc': self.A_output_acc(),
                  'loss': self.A_output_loss(),
                  'acc': self.B_output_acc(),
                  'loss': self.B_output_loss()}
instead of:
custom_objects = {'A_output_acc': self.A_output_acc(),
                  'A_output_loss': self.A_output_loss(),
                  'B_output_acc': self.B_output_acc(),
                  'B_output_loss': self.B_output_loss()}
Any insights or work-arounds?
Thanks!
EDIT:
I've confirmed that the reasoning above about key/function-name concatenation IS correct for the metrics parameter of Keras' model.compile() call. HOWEVER, for the loss parameter in model.compile(), Keras just concatenates the key with the word loss, yet expects the name of the custom loss function in the custom_objects parameter of model.load_model()... go figure.
Remove the () at the end of your losses and metrics, so that you pass the functions themselves rather than their return values, and that should be it. It'll look like this instead:
loss={
    "A_output": self.A_output_loss,
    "B_output": self.B_output_loss
}
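If you prefer to keep the factory pattern, one workaround (a sketch with made-up names, not tested against every Keras version) is to give each returned closure an explicit __name__, so that the name Keras serializes into the model file matches the key you later supply in custom_objects:
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.models import load_model

def make_named_loss(fn, name):
    # Wrap `fn` in a closure and rename the closure, so Keras records
    # `name` instead of the generic function name 'loss'.
    def loss(y_true, y_pred):
        return fn(y_true, y_pred)
    loss.__name__ = name
    return loss

A_output_loss = make_named_loss(binary_crossentropy, 'A_output_loss')
B_output_loss = make_named_loss(binary_crossentropy, 'B_output_loss')

# at load time, the custom_objects keys now match what was saved
model = load_model('my_model.h5',
                   custom_objects={'A_output_loss': A_output_loss,
                                   'B_output_loss': B_output_loss})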

How does pytorch compute the gradients for a simple linear regression model?

I am using PyTorch and trying to understand how a simple linear regression model works.
I'm using a simple LinearRegressionModel class:
class LinearRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        out = self.linear(x)
        return out

model = LinearRegressionModel(1, 1)
Next, I instantiate a loss criterion and an optimizer:
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
Finally, to train the model, I use the following code:
for epoch in range(epochs):
    # Convert the numpy training data to (CUDA) tensors
    if torch.cuda.is_available():
        inputs = Variable(torch.from_numpy(x_train).cuda())
        labels = Variable(torch.from_numpy(y_train).cuda())
    else:
        inputs = Variable(torch.from_numpy(x_train))
        labels = Variable(torch.from_numpy(y_train))

    # Clear gradients w.r.t. parameters
    optimizer.zero_grad()

    # Forward pass to get outputs
    outputs = model(inputs)

    # Calculate loss
    loss = criterion(outputs, labels)

    # Get gradients w.r.t. parameters
    loss.backward()

    # Update parameters
    optimizer.step()
My question is how does the optimizer get the loss gradient, computed by loss.backward(), to update the parameters using the step() method? How are the model, the loss criterion and the optimizer tied together?
PyTorch has the concepts of tensors and variables. When you use nn.Linear, the module creates two variables, namely W and b. In PyTorch, a variable is a wrapper that encapsulates a tensor, its gradient, and information about the function that created it. You can access the gradients directly via
w.grad
If you inspect it before calling loss.backward(), you get None; once you call loss.backward(), it contains the gradients. Now you could update these parameters manually with the simple step below:
w.data -= learning_rate * w.grad.data
When you have a complex network, this simple step grows unwieldy, so optimizers like SGD and Adam take care of it. When you create the object for one of these optimizers, you pass in the parameters of your model. nn.Module provides the parameters() function, which returns all the learnable parameters to the optimizer:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss.backward()
calculates the gradients and stores them in the parameters, and you pass in the parameters that need to be tuned here:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
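To make the wiring concrete, here is a small self-contained sketch (with made-up data) showing the .grad fields being filled by backward() and then consumed by step():
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(8, 1)
y = 3 * x + 1

optimizer.zero_grad()
loss = criterion(model(x), y)

print(model.weight.grad)  # None: backward() has not run yet
loss.backward()           # autograd fills p.grad for every parameter in the graph
print(model.weight.grad)  # now a tensor holding the gradient

# step() reads p.grad for each parameter it was given and updates p in place;
# for plain SGD this is equivalent to: p.data -= lr * p.grad
optimizer.step()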

How to store variable in loss function into instance variable

I am using Keras with TensorFlow.
Since I want to create an LSTM-CRF model, I defined my own loss function using tf.contrib.crf.crf_log_likelihood:
def loss(self, y_true, y_pred):
    sequence_lengths = ...  # calculated from y_true
    log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
        y_pred, y_true, sequence_lengths)
    loss = tf.reduce_mean(-log_likelihood)
    self.transition_params = transition_params
    return loss
As you know, a CRF needs the transition params at prediction time, so I stored transition_params in an instance variable, self.transition_params.
The problem is that self.transition_params is never updated during minibatch training. From what I observe, it seems to be stored only once, when the model is compiled.
Is there any way to store a variable from a loss function into an instance variable in Keras?
The problem is the call to tf.contrib.crf.crf_log_likelihood: you need to pass your current transition parameters through its transition_params argument. The following change solves it:
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
    y_pred, y_true, sequence_lengths,
    transition_params=self.transition_params)
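Put together, a minimal sketch of the pattern might look like this (assuming TF 1.x, where tf.contrib is available; the class name and the -1 padding convention are made up for illustration):
import tensorflow as tf  # TF 1.x, where tf.contrib.crf exists

class CRFLoss:
    def __init__(self, num_tags):
        # Create the transition matrix once, as a real variable, so the same
        # tensor is trained through the loss and reused at prediction time.
        self.transition_params = tf.get_variable(
            'transitions', shape=[num_tags, num_tags])

    def loss(self, y_true, y_pred):
        # y_pred: unary scores, shape (batch, max_len, num_tags)
        # y_true: gold tag indices, shape (batch, max_len), padded with -1
        y_true = tf.cast(y_true, tf.int32)
        sequence_lengths = tf.reduce_sum(
            tf.cast(tf.not_equal(y_true, -1), tf.int32), axis=-1)
        y_true = tf.maximum(y_true, 0)  # replace -1 padding with a valid index
        log_likelihood, self.transition_params = tf.contrib.crf.crf_log_likelihood(
            y_pred, y_true, sequence_lengths,
            transition_params=self.transition_params)
        return tf.reduce_mean(-log_likelihood)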
