I have a VAE model that I've broken down into encoder and decoder parts, and I have implemented a custom loss. A simplified example is shown below:
input = Input(shape=(self.image_height, self.image_width, self.image_channel))
encoded = build_encoder(input)
decoded = build_decoder(encoded)
model = Model(input, decoded)
The loss (simplified) is
loss = K.mean(decoded[0] + decoded[1] + encoded[0]**2)
model.add_loss(loss)
model.compile(optimizer=self.optimizer)
My main problem is that I want to use Keras' ModelCheckpoint callback, which I understand would require me to set custom metrics. However, everything I have seen online is similar to https://keras.io/metrics/#custom_metrics. Those metrics only take y_true and y_pred and compute the validation value from them. How would I implement this in my example model, where the loss is calculated from multiple tensors, not only the final output "decoded"?
Well apparently you can still use the variables (keras layers) without passing those into the custom loss function.
So for my example, the loss can be calculated as
def custom_loss(y_true, y_pred):
    return K.mean(decoded[0] + decoded[1] + encoded[0]**2)
model.compile(optimizer=self.optimizer, loss=custom_loss)
y_true and y_pred are never used, but the tensors that are actually needed can still be referenced, as long as they are in the same scope as the custom loss function.
Related
I want to create a custom loss which gets the output of the net and multiple arguments from a data generator.
I found this article, which describes how to calculate one loss from multiple layers with one label. But I want to calculate the loss from a single layer with multiple labels using the fit_generator. My problem is that Keras expects the output and the label to be of the same shape.
example:
Regular custom loss:
def custom_loss(y_pred, y_label):
    return K.mean(y_pred - y_label)
An example for the type of custom loss I want to use:
def custom_loss(y_pred, y_label, y_weights):
    loss = K.mean(y_pred - y_label)
    return tf.compat.v1.losses.compute_weighted_loss(loss, y_weights)
This is just an example; my original code is a little more complicated. I just want to be able to give the loss function two parameters (y_label and y_weights) instead of only one (y_label).
Does anyone know how to solve this problem?
I am not sure what exactly you are asking but maybe you can use this. You can try something like a custom function that returns a loss function.
def custom_loss(y_weights):
    # Create a loss function that calculates what you want
    def example_loss(y_true, y_pred):
        # Keras passes the label as y_true, so use it here
        loss = K.mean(y_pred - y_true)
        return tf.compat.v1.losses.compute_weighted_loss(loss, y_weights)
    # Return a function
    return example_loss
# Compile the model
model.compile(optimizer='adam',
              loss=custom_loss(y_weights),  # Call the loss factory with the preferred weights
              metrics=['accuracy'])
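The closure pattern in this answer does not depend on Keras itself. As a minimal sketch in plain Python, with lists standing in for tensors and with hypothetical names (make_weighted_loss, weighted_abs_error) that are not from the original answer, it could look like this:

```python
# Plain-Python sketch of a loss factory; lists stand in for tensors.
# `make_weighted_loss` and `weighted_abs_error` are hypothetical names.

def make_weighted_loss(weights):
    """Return a two-argument loss that closes over `weights`."""
    def weighted_abs_error(y_true, y_pred):
        # Weighted mean of the element-wise absolute errors.
        total = sum(w * abs(p - t) for w, t, p in zip(weights, y_true, y_pred))
        return total / sum(weights)
    return weighted_abs_error


# The framework only ever sees a function of (y_true, y_pred).
loss_fn = make_weighted_loss([1.0, 3.0])
value = loss_fn([0.0, 1.0], [1.0, 3.0])
print(value)
```

Note that the weights are fixed at the moment the loss is created. If the weights have to change per batch, as with a data generator, this pattern alone is probably not enough; Keras' sample_weight mechanism may be a better fit in that case.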
You can also take a look at this question
I have a CNN that I built using TensorFlow 2.0. I need to access the outputs of the intermediate layers. I went over other similar Stack Overflow questions, but all of them had solutions involving the Keras Sequential model.
I have tried using model.layers[index].output but I get
Layer conv2d has no inbound nodes.
I can post my code here (it is very long), but I am sure that even without it someone can point out how this can be done using just TensorFlow 2.0 in eager mode.
I stumbled onto this question while looking for an answer and it took me some time to figure out as I use the model subclassing API in TF 2.0 by default (as in here https://www.tensorflow.org/tutorials/quickstart/advanced).
If somebody is in a similar situation, all you need to do is assign the intermediate output you want as an attribute of the class. Then keep test_step without the @tf.function decorator and create a decorated copy, say val_step, for efficient internal computation of validation performance during training. As a short example, I have modified a few functions of the tutorial from the link accordingly. I'm assuming we need to access the output after flattening.
def call(self, x):
    x = self.conv1(x)
    x = self.flatten(x)
    self.intermediate = x  # assign it as an object attribute for accessing later
    x = self.d1(x)
    return self.d2(x)
# Remove the @tf.function decorator from test_step for prediction
def test_step(images, labels):
    predictions = model(images, training=False)
    t_loss = loss_object(labels, predictions)
    test_loss(t_loss)
    test_accuracy(labels, predictions)
    return
# Create a decorated val_step for the object's internal use during training
@tf.function
def val_step(images, labels):
    return test_step(images, labels)
Now when you run predictions after training using the un-decorated test_step, you can access the intermediate output through model.intermediate, which is an EagerTensor whose value is obtained simply with model.intermediate.numpy(). However, if you don't remove the @tf.function decorator from test_step, this returns a symbolic Tensor whose value is not so straightforward to obtain.
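Stripped of TensorFlow, the idea is simply that the forward pass leaves a reference to the intermediate value on the object. A minimal plain-Python sketch, where TinyModel and its arithmetic are hypothetical stand-ins for the real layers:

```python
# Plain-Python sketch of the "store it as an attribute" pattern.
# `TinyModel` is hypothetical; its arithmetic only stands in for real layers.

class TinyModel:
    def __init__(self):
        self.intermediate = None  # filled in on every forward pass

    def __call__(self, x):
        hidden = [2 * v for v in x]   # stand-in for conv + flatten
        self.intermediate = hidden    # keep a reference for later inspection
        return sum(hidden)            # stand-in for the dense head


model = TinyModel()
prediction = model([1, 2, 3])
print(prediction, model.intermediate)
```

The attribute is overwritten on every call, so it only ever holds the value from the most recent forward pass.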
Thanks for answering my earlier question. I wrote this simple example to illustrate how what you're trying to do might be done in TensorFlow 2.x, using the MNIST dataset as the example problem.
The gist of the approach:
Build an auxiliary model (aux_model in the example below), which is a so-called "functional model" with multiple outputs. The first output is the output of the original model and will be used for loss calculation and backprop, while the remaining output(s) are the intermediate-layer outputs that you want to access.
Use tf.GradientTape() to write a custom training loop and expose the detailed gradient values on each individual variable of the model. Then you can pick out the gradients that are of interest to you. This requires that you know the ordering of the model's variables. But that should be relatively easy for a sequential model.
import tensorflow as tf
(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
# This is the original model.
model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=[28, 28, 1]),
tf.keras.layers.Dense(100, activation="relu"),
tf.keras.layers.Dense(10, activation="softmax")])
# Make an auxiliary model that exposes the output from the intermediate layer
# of interest, which is the first Dense layer in this case.
aux_model = tf.keras.Model(inputs=model.inputs,
outputs=model.outputs + [model.layers[1].output])
# Define a custom training loop using `tf.GradientTape()`, to make it easier
# to access gradients on specific variables (the kernel and bias of the first
# Dense layer in this case).
cce = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.optimizers.Adam()
with tf.GradientTape() as tape:
    # Do a forward pass on the model, retrieving the intermediate layer's output.
    y_pred, intermediate_output = aux_model(x_train)
    print(intermediate_output)  # Now you can access the intermediate layer's output.
    # Compute loss, to enable backprop.
    loss = cce(tf.one_hot(y_train, 10), y_pred)
# Do backprop. `gradients` here are for all variables of the model.
# But we know we want the gradients on the kernel and bias of the first
# Dense layer, which happens to be the first two variables of the model.
gradients = tape.gradient(loss, aux_model.variables)
# This is the gradient on the first Dense layer's kernel.
intermediate_layer_kernel_gradients = gradients[0]
print(intermediate_layer_kernel_gradients)
# This is the gradient on the first Dense layer's bias.
intermediate_layer_bias_gradients = gradients[1]
print(intermediate_layer_bias_gradients)
# Update the variables of the model.
optimizer.apply_gradients(zip(gradients, aux_model.variables))
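A straightforward solution is to look up the layer by name and build a sub-model that ends at that layer (this assumes the model was built with the Sequential or functional API, so that model.input is defined):
mid_layer = model.get_layer("layer_name")
mid_model = tf.keras.Model(inputs=model.input, outputs=mid_layer.output)
A layer has no predict() method of its own, but the sub-model can be used like any other model, for instance:
mid_model.predict(X)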
Oh, also, to get the name of a hidden layer, you can use this:
model.summary()
this will give you some insights about the layer input/output as well.
I want to develop a neural network with three inputs (pos, anc, neg) and three outputs (pos_out, anc_out, neg_out). While calculating the loss in my custom loss function in Keras, I want to access pos_out, anc_out and neg_out in y_pred. I can access y_pred as a whole, but how do I access the individual parts pos_out, anc_out and neg_out?
I have applied the max function to y_pred, and it calculates the max value correctly. If I pass only one output to Model, as in Model(input=[pos,anc,neg], output=pos_out), it also calculates the max value correctly. But when it comes to accessing the max values from pos_out, anc_out and neg_out separately in the custom function, it does not work.
def testmodel(input_shape):
    pos = Input(shape=(14,300))
    anc = Input(shape=(14,300))
    neg = Input(shape=(14,300))
    model = Sequential()
    model.add(Flatten(batch_input_shape=(1,14,300)))
    pos_out = model(pos)
    anc_out = model(anc)
    neg_out = model(neg)
    model = Model(input=[pos,anc,neg], output=[pos_out,anc_out,neg_out])
    return model
def customloss(y_true,y_pred):
    print((K.int_shape(y_pred)[1]))
    #loss = K.max((y_pred))
    loss = K.max[pos_out]
    return loss
You can create a loss function that contains a closure that lets you access the model and thus the targets and the model layer outputs.
class ExampleCustomLoss(object):
    """ The loss function can access model.inputs, model.targets and the outputs
    of specific layers. These are all tensors and will have the expected results
    for the batch.
    """
    def __init__(self, model):
        self.model = model
    def loss(self, y_true, y_pred, **kwargs):
        ...
        return loss
model = Model(..., ...)
loss_calculator = ExampleCustomLoss(model)
model.compile('adam', loss_calculator.loss)
However, it may be simpler to do the inverse. i.e. have a single model output
out = Concatenate(axis=1)([pos_out, anc_out, neg_out])
And then in the loss function slice y_true and y_pred.
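To make the slicing step concrete, here is a minimal plain-Python sketch for a single sample, with lists standing in for tensors. The names dim and margin, the helper functions, and the choice of a hinge-style triplet loss are all assumptions for illustration and are not taken from the question:

```python
# Plain-Python sketch: one concatenated output, sliced inside the loss.
# Lists stand in for tensors; `dim`, `margin` and the function names are
# hypothetical and not taken from the question.

def concatenate(pos_out, anc_out, neg_out):
    """Mimic Concatenate(axis=1) for a single sample."""
    return list(pos_out) + list(anc_out) + list(neg_out)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(y_pred, dim, margin=1.0):
    # Recover the three embeddings by slicing the concatenated vector.
    pos = y_pred[0:dim]
    anc = y_pred[dim:2 * dim]
    neg = y_pred[2 * dim:3 * dim]
    # Hinge form: pull the anchor toward pos, push it away from neg.
    return max(squared_distance(anc, pos) - squared_distance(anc, neg) + margin, 0.0)


out = concatenate([1.0, 0.0], [1.0, 1.0], [4.0, 1.0])
print(triplet_loss(out, dim=2))
```

With batched Keras tensors the slices would be taken along axis 1 instead, for example y_pred[:, 0:dim], but the bookkeeping is the same.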
From the names of variables, it looks as if you are trying to use a triplet loss. You may find this other question useful:
How to deal with triplet loss when at time of input i have only two files i.e. at time of testing
Your loss function gets two arguments, the true label and the model output. The model output has the shape you specified when defining the network. During training, the loss function needs to return a single value that measures the difference between the model's output and the true label.
Also, please add some trainable layers to your model; without them there is nothing to optimize and the custom loss function is useless.
I am trying to implement an autoencoder in Keras that not only minimizes the reconstruction error but its constructed features should also maximize a measure I define. I don't really have an idea of how to do this.
Here's a snippet of what I have so far:
corrupt_data = self._corrupt(self.data, 0.1)
# define encoder-decoder network structure
# create input layer
input_layer = Input(shape=(corrupt_data.shape[1], ))
encoded = Dense(self.encoding_dim, activation = "relu")(input_layer)
decoded = Dense(self.data.shape[1], activation="sigmoid")(encoded)
# create autoencoder
dae = Model(input_layer, decoded)
# define custom multitask loss with wlm measure
def multitask_loss(y_true, y_pred):
    # extract learned features from hidden layer
    learned_fea = Model(input_layer, encoded).predict(self.data)
    # additional measure I want to optimize from an external function
    wlm_measure = wlm.measure(learned_fea, self.labels)
    cross_entropy = losses.binary_crossentropy(y_true, y_pred)
    return wlm_measure + cross_entropy
# create optimizer
dae.compile(optimizer=self.optimizer, loss=multitask_loss)
dae.fit(corrupt_data, self.data,
        epochs=self.epochs, batch_size=20, shuffle=True,
        callbacks=[tensorboard])
# separately create an encoder model
encoder = Model(input_layer, encoded)
Currently this does not work properly... When I viewed the training history the model seems to ignore the additional measure and train only based on the cross entropy loss. Also if I change the loss function to consider only wlm measure, I get the error "numpy.float64" object has no attribute "get_shape" (I don't know if changing my wlm function's return type to a tensor will help).
There are a few places that I think may have gone wrong. I don't know if I am extracting the outputs of the hidden layer correctly in my custom loss function. Also I don't know if my wlm.measure function is outputting correctly—whether it should output numpy.float32 or a 1-dimensional tensor of type float32.
Basically a conventional loss function only cares about the output layer's predicted labels and the true labels. In my case, I also need to consider the hidden layer's output (activation), which is not that straightforward to implement in Keras.
Thanks for the help!
You don't want to define your learned_fea Model inside your custom loss function. Rather, you could define a single model upfront with two outputs: the output of the decoder (the reconstruction) and the output of the encoder (the feature representation):
multi_output_model = Model(inputs=input_layer, outputs=[decoded, encoded])
Now you can write a custom loss function that only applies to the output of the encoder:
def custom_loss(y_true, y_pred):
    return wlm.measure(y_pred, y_true)
Upon compiling the model, you pass a list of loss functions, one per output (or a dictionary if you name your tensors):
multi_output_model.compile(loss=['binary_crossentropy', custom_loss], optimizer=...)
And fit the model by passing a list of targets, one per output:
multi_output_model.fit(x=X, y=[data_to_be_reconstructed, labels_for_wlm_measure])
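When a list of losses is passed, Keras minimizes the weighted sum of the individual losses (the weights come from the loss_weights argument of compile and default to 1). A minimal plain-Python sketch of that bookkeeping, using hypothetical names and lists in place of tensors, with a mean absolute error standing in for the wlm measure:

```python
# Plain-Python sketch of how per-output losses combine into one objective.
# All names are hypothetical; lists stand in for tensors.

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_abs_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def total_loss(loss_fns, targets, outputs, loss_weights=None):
    """Pair the i-th loss with the i-th target/output and sum the results."""
    if loss_weights is None:
        loss_weights = [1.0] * len(loss_fns)
    return sum(w * fn(t, o)
               for w, fn, t, o in zip(loss_weights, loss_fns, targets, outputs))


reconstruction_target, feature_target = [1.0, 0.0], [0.5]
reconstruction, features = [0.0, 0.0], [1.5]
combined = total_loss([mean_squared_error, mean_abs_error],
                      [reconstruction_target, feature_target],
                      [reconstruction, features])
print(combined)
```

Because the two terms are simply added, their relative scale matters; if the feature term is much smaller than the reconstruction term, it may appear to be ignored during training unless loss_weights is adjusted.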
I am using Keras with Tensorflow.
Since I want to create an LSTM-CRF model, I defined my own loss function using tf.contrib.crf.crf_log_likelihood:
def loss(self, y_true, y_pred):
    sequence_lengths = ... # calc from y_true
    log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(y_pred, y_true, sequence_lengths)
    loss = tf.reduce_mean(-log_likelihood)
    self.transition_params = transition_params
    return loss
As you know, a CRF needs the transition params in the prediction phase, so I stored transition_params in an instance variable, self.transition_params.
The problem is that self.transition_params is never updated during minibatch training. From what I observe, it seems to be stored only once, when the model is compiled.
Is there any way to store a variable from the loss function in an instance variable in Keras?
The problem is in how tf.contrib.crf.crf_log_likelihood is called: you need to pass your current transition params through the transition_params argument. The following change should solve it:
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
    y_pred, y_true, sequence_lengths,
    transition_params=self.transition_params)