TensorFlow custom loss: Implement a mix of L1 loss and SSIM loss - python

I want to implement a loss function similar to the one in this paper: https://arxiv.org/pdf/1511.08861.pdf
They combine the L1 loss (mean absolute error) and the MS-SSIM loss as in the following equation:
L_Mix = α · L_MS-SSIM + (1 − α) · G_σ · L_1, where G_σ is a Gaussian filter.
There is a caffe code available on GitHub: https://github.com/NVlabs/PL4NN/blob/master/src/loss.py
But I don't know how to use this in TF. Is there already similar existing code for TF?
I started trying this:
def ms_ssim_loss(y_true, y_pred):
    ms_ssim = tf.image.ssim_multiscale(y_true, y_pred, 1.0)
    loss = 1 - ms_ssim
    return loss
def mix_loss(y_true, y_pred):
    alpha = 0.84
    ms_ssim = ms_ssim_loss(y_true, y_pred)
    l1 = tf.keras.losses.MeanAbsoluteError()(y_true, y_pred)
    gauss = gaussian(...)
    loss = ms_ssim * alpha + (1 - alpha) * gauss * l1
    return loss
But I don't know how to implement and use the Gaussian filter here.
Thanks in advance and best regards!
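One way to approximate the Gaussian-weighted L1 term is to blur the per-pixel absolute error with a Gaussian kernel via a depthwise convolution and then average. Below is a minimal sketch, assuming TF 2.x, NHWC image tensors scaled to [0, 1], and inputs large enough for the default five MS-SSIM scales. The helper name gaussian_kernel and the 11x11 / sigma 1.5 kernel are placeholders (the paper uses the Gaussian of the coarsest MS-SSIM scale), so treat this as a starting point rather than the paper's exact loss.

import tensorflow as tf

def gaussian_kernel(size=11, sigma=1.5, channels=1):
    # 2-D Gaussian kernel shaped [size, size, channels, 1] for a depthwise conv.
    ax = tf.range(-(size // 2), size // 2 + 1, dtype=tf.float32)
    xx, yy = tf.meshgrid(ax, ax)
    kernel = tf.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    kernel = kernel / tf.reduce_sum(kernel)
    kernel = tf.reshape(kernel, [size, size, 1, 1])
    return tf.tile(kernel, [1, 1, channels, 1])

def mix_loss(y_true, y_pred, alpha=0.84, channels=1):
    # MS-SSIM term: 1 - MS-SSIM, averaged over the batch.
    ms_ssim = 1.0 - tf.reduce_mean(tf.image.ssim_multiscale(y_true, y_pred, 1.0))
    # Gaussian-weighted L1 term: smooth the absolute error map, then average.
    kernel = gaussian_kernel(channels=channels)
    abs_err = tf.abs(y_true - y_pred)
    weighted_l1 = tf.nn.depthwise_conv2d(abs_err, kernel, strides=[1, 1, 1, 1], padding='SAME')
    l1 = tf.reduce_mean(weighted_l1)
    return alpha * ms_ssim + (1.0 - alpha) * l1

The resulting mix_loss(y_true, y_pred) can then be passed as the loss argument of model.compile like any other custom loss.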

Related

How to create a Hybrid loss consisting of dice loss and focal loss [Python]

I'm trying to implement the Multiclass Hybrid loss function in Python from the following article https://arxiv.org/pdf/1808.05238.pdf for my semantic segmentation problem using an imbalanced dataset. I managed to get my implementation correct enough to start training the model, but the results are very poor. The model architecture is U-Net and the learning rate in the Adam optimizer is 1e-5. The mask shape is (None, 512, 512, 3), with 3 classes (in my case forest, deforestation, other). The formula I used to implement my loss is L_hybrid = L_Tversky + λ · L_focal.
The code I created:
def build_hybrid_loss(_lambda_=1, _alpha_=0.5, _beta_=0.5, smooth=1e-6):
    def hybrid_loss(y_true, y_pred):
        C = 3
        tversky = 0
        # Calculate Tversky Loss
        for index in range(C):
            inputs_fl = tf.nest.flatten(y_pred[..., index])
            targets_fl = tf.nest.flatten(y_true[..., index])
            # True Positives, False Positives & False Negatives
            TP = tf.reduce_sum(tf.math.multiply(inputs_fl, targets_fl))
            FP = tf.reduce_sum(tf.math.multiply(inputs_fl, 1 - targets_fl[0]))
            FN = tf.reduce_sum(tf.math.multiply(1 - inputs_fl[0], targets_fl))
            tversky_i = (TP + smooth) / (TP + _alpha_ * FP + _beta_ * FN + smooth)
            tversky += tversky_i
        tversky += C
        # Calculate Focal loss
        loss_focal = 0
        for index in range(C):
            f_loss = - (y_true[..., index] * (1 - y_pred[..., index])**2 * tf.math.log(y_pred[..., index]))
            # Average over each data point/image in batch
            axis_to_reduce = range(1, 3)
            f_loss = tf.math.reduce_mean(f_loss, axis=axis_to_reduce)
            loss_focal += f_loss
        result = tversky + _lambda_ * loss_focal
        return result
    return hybrid_loss
The prediction of the model after the end of an epoch is mostly forest and not deforestation (I have a problem with swapped colors, so the red in the prediction is actually green, which means forest).
The question is what is wrong with my hybrid loss implementation, what needs to be changed to make it work?
To simplify things a little, I have divided the Hybrid loss into four separate functions: Tversky's loss, Dice coefficient, Dice loss, Hybrid loss. You can see the code below.
def TverskyLoss(targets, inputs, alpha=0.5, beta=0.5, smooth=1e-16, numLabels=3):
    tversky = 0
    for index in range(numLabels):
        inputs_fl = tf.nest.flatten(inputs[..., index])
        targets_fl = tf.nest.flatten(targets[..., index])
        # True Positives, False Positives & False Negatives
        TP = tf.reduce_sum(tf.math.multiply(inputs_fl, targets_fl))
        FP = tf.reduce_sum(tf.math.multiply(inputs_fl, 1 - targets_fl[0]))
        FN = tf.reduce_sum(tf.math.multiply(1 - inputs_fl[0], targets_fl))
        tversky_i = (TP + smooth) / (TP + alpha * FP + beta * FN + smooth)
        tversky += tversky_i
    return numLabels - tversky

def dice_coef(y_true, y_pred, smooth=1e-16):
    y_true_f = tf.nest.flatten(y_true)
    y_pred_f = tf.nest.flatten(y_pred)
    intersection = tf.math.reduce_sum(tf.math.multiply(y_true_f, y_pred_f))
    return (2. * intersection + smooth) / (tf.math.reduce_sum(y_true_f) + tf.math.reduce_sum(y_pred_f) + smooth)

def dice_coef_multilabel(y_true, y_pred, numLabels=3):
    dice = 0
    for index in range(numLabels):
        dice -= dice_coef(y_true[..., index], y_pred[..., index])
    return numLabels + dice

def build_hybrid_loss(_lambda_=0.5, _alpha_=0.5, _beta_=0.5, smooth=1e-16, C=3):
    def hybrid_loss(y_true, y_pred):
        tversky = TverskyLoss(y_true, y_pred, alpha=_alpha_, beta=_beta_)
        dice = dice_coef_multilabel(y_true, y_pred)
        result = tversky + _lambda_ * dice
        return result
    return hybrid_loss
Adding loss=build_hybrid_loss() during model compilation sets the Hybrid loss as the loss function of the model.
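For reference, a minimal compile call with this loss (the Adam learning rate comes from the question and _lambda_ = 0.2 from the experiment described below; the metric is only an illustration):

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss=build_hybrid_loss(_lambda_=0.2),
              metrics=['accuracy'])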
After a short investigation, I came to the conclusion that in my particular case a Hybrid loss with _lambda_ = 0.2, _alpha_ = 0.5, _beta_ = 0.5 is not much better than a single Dice loss or a single Tversky loss. Neither IoU (intersection over union) nor the standard accuracy metric is much better with the Hybrid loss. But I don't believe it is a rule that such a Hybrid loss will be worse than, or only on par with, a single loss in all cases.
link to Accuracy graph
link to IoU graph

No gradients provided for any variable for custom loss function

I have created a custom loss function in Keras as follows:
import tensorflow as tf
import numpy as np
def custom_loss(y_true, y_pred):
    cce = tf.keras.losses.CategoricalCrossentropy()
    loss = cce(y_true, y_pred).numpy()
    epsilon = np.finfo(np.float32).eps
    confidence = np.clip(y_true.numpy(), epsilon, 1. - epsilon)
    sample_entropy = -1. * np.sum(np.multiply(confidence, np.log(confidence) / np.log(np.e)), axis=-1)
    entropy = np.mean(sample_entropy)
    penalty = 0.1 * -entropy
    return loss + penalty
When I use this custom loss function I'm getting the error message
ValueError: No gradients provided for any variable
Somehow no gradients can be calculated. How does the loss function need to be changed so that gradients can be calculated?
TensorFlow needs tensors to store the dependency information that lets gradients flow backwards. If you convert a tensor to a NumPy array inside the loss function, you break this dependency, which is why no gradients are provided for any variable. So you need to change every NumPy operation in the loss function to the corresponding tf (or Keras backend) operation, e.g.:
import tensorflow as tf
import numpy as np
cce = tf.keras.losses.CategoricalCrossentropy()
epsilon = np.finfo(np.float32).eps
def custom_loss(y_true, y_pred):
    loss = cce(y_true, y_pred)
    confidence = tf.clip_by_value(y_true, epsilon, 1. - epsilon)
    sample_entropy = -1. * tf.reduce_sum(tf.math.multiply(confidence, tf.math.log(confidence) / tf.math.log(np.e)), axis=-1)
    entropy = tf.reduce_mean(sample_entropy)
    penalty = 0.1 * -entropy
    return loss + penalty

Dynamically combining loss functions in tensorflow keras

I'm working on a multi-label classification problem where instead of each target index representing a distinct class, it represents some amount of time into the future. On top of wanting my predicted label to match the target label, I want an extra term to enforce some temporal aspect of the learning.
E.g.:
y_true = [1., 1., 1., 0.]
y_pred = [0.75, 0.81, 0.93, 0.65]
Above, the truth label implies something occurring during the first three indices.
I want to easily be able to mix and match loss functions.
I have a couple custom loss functions for overall accuracy, each wrapped within functions for adjustable arguments:
def weighted_binary_crossentropy(pos_weight=1):
    def weighted_binary_crossentropy_(Y_true, Y_pred):
        ...
        return tf.reduce_mean(loss, axis=-1)
    return weighted_binary_crossentropy_

def mean_squared_error(threshold=0.5):
    def mean_squared_error_(Y_true, Y_pred):
        ...
        return tf.reduce_mean(loss, axis=-1)
    return mean_squared_error_
I also have a custom loss function to enforce the predicted label ending at the same time as the truth label (I haven't made use of the threshold argument here yet):
def end_time_error(threshold=0.5):
    def end_time_error_(Y_true, Y_pred):
        _, n_times = K.int_shape(Y_true)
        weights = K.arange(1, n_times + 1, dtype=float)
        Y_true = tf.multiply(Y_true, weights)
        argmax_true = K.argmax(Y_true, axis=1)
        argmax_pred = K.argmax(Y_pred, axis=1)
        loss = tf.math.squared_difference(argmax_true, argmax_pred)
        return tf.reduce_mean(loss, axis=-1)
    return end_time_error_
Sometimes I might want to combine end_time_error with weighted_binary_crossentropy, sometimes with mean_squared_error, and I have plenty of other loss functions to experiment with. I don't want to have to code a new combined loss function for each pair.
Attempt at solution 1
I've tried making a meta-loss function that combines loss functions (globally defined in the same script).
def combination_loss(loss_dict, combine='add', weights=[]):
    losses = []
    if not weights:
        weights = [1] * len(loss_dict)
    for (loss_func, loss_args), weight in zip(loss_dict.items(), weights):
        assert loss_func in globals().keys()
        loss_func = eval(loss_func)
        loss = loss_func(loss_args)
        losses.append(loss * weight)
    if combine == 'add':
        loss = sum(losses)
    elif combine == 'multiply':
        loss = np.prod(losses)
    return loss
To use this:
loss_args = {'loss_dict':
                 {'weighted_binary_crossentropy': {'pos_weight': 1},
                  'end_time_error': {}},
             'combine': 'add',
             'weights': [0.75, 0.25]}

model.compile(loss=combination_loss(**loss_args), ...)
Error:
File "C:\...\losses.py", line 165, in combination_loss
losses.append(loss * weight)
TypeError: unsupported operand type(s) for *: 'function' and 'float'
I'm playing loose with functions, so I'm not surprised this failed. But I'm not sure how to get what I want.
How can I combine functions with weights in combination_loss?
Or should I be doing that directly in the model.compile() call using a lambda function?
--EDIT
Attempt at solution 2
Ditching combination_loss:
losses = []
for loss_, loss_args_ in loss_args['loss_dict'].items():
    losses.append(get_loss(loss_)(**loss_args_))

loss = lambda y_true, y_pred: [l(y_true, y_pred) * w for l, w
                               in zip(losses, loss_args['weights'])]

model.compile(loss=loss, ...)
Error:
File "C:\...\losses.py", line 139, in end_time_error_
weights = K.arange(1, n_times + 1, dtype=float)
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
Probably because y_true, y_pred won't work as arguments for wrapped loss functions.
Let's simplify your use case to only two losses:
loss = alpha * loss1 + (1-alpha) * loss2
Then you can do:
def generate_loss(alpha):
    def combination_loss(y_true, y_pred):
        return alpha * loss1(y_true, y_pred) + (1 - alpha) * loss2(y_true, y_pred)
    return combination_loss
Obviously, loss1 and loss2 would be your respective loss functions.
You can use this to generate different loss functions for different alphas:
alpha = 0.7
combination_loss = generate_loss(alpha)
model.compile(loss=combination_loss, ...)
If alpha is supposed to be static, you can also get rid of the outer function generate_loss.
Finally, you can also define this as a lambda function:
model.compile(loss=lambda y_true, y_pred: alpha * loss1(y_true, y_pred) + (1-alpha) * loss2(y_true, y_pred), ...)
I'm not sure where exactly your bug is (I assume it's the eval, but I can't debug it), but if you simplify it like this, or use this as a working example to introduce your own losses and weights, it should work.
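If you still want a generic combiner for more than two losses, here is a small sketch in the same spirit (combine_losses is a made-up name; it takes already-built loss functions and applies the weights inside the returned closure, which avoids the eval lookup and the "function * float" error from attempt 1):

def combine_losses(loss_fns, weights=None):
    # loss_fns: callables of the form f(y_true, y_pred) -> loss tensor.
    if weights is None:
        weights = [1.0] * len(loss_fns)
    def combined_loss(y_true, y_pred):
        total = 0.0
        for fn, w in zip(loss_fns, weights):
            # Weight each loss value (a tensor), not the function object.
            total += w * fn(y_true, y_pred)
        return total
    return combined_loss

loss = combine_losses([weighted_binary_crossentropy(pos_weight=1), end_time_error()],
                      weights=[0.75, 0.25])
model.compile(loss=loss, ...)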

Beta-Variational AutoEncoder can't disentangle

I am working on a dummy example with generated heartbeats, and want to first use a VAE to encode the heartbeats and afterwards a simple classifier.
The problem is that when I increase beta above 0.01, the reconstructions become nonsense (see the first image).
And when beta is low, I get a normal autoencoder output with no disentanglement (second image).
I believe the problem may be in my KL divergence or VAE loss function, but I can't seem to find it.
In my encoder I do the reparameterization like this:
enc = self.encoder(x,batch_size, x_lenghts)
mu = self.enc2mean(enc)
logv = self.enc2logv(enc)
std = torch.exp(0.5*logv)
z = torch.randn([batch_size,1, self.encoder_hidden_sizes[-1] * (int(self.bidirectional)+1)]).to(self.device)
z = z * std + mu
And i define the VAE loss as:
def VAE_loss(x, reconstruction, mu, logvar, batch_size, latent_dim, beta=0):
    mse = F.mse_loss(x, reconstruction)
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    KLD /= (batch_size * latent_dim)
    return mse + beta * KLD
Full standalone code to reproduce the results is here.
Any insights are appreciated!

Tensorflow - adding L2 regularization loss simple example

I am familiar with machine learning, but I am learning Tensorflow on my own by reading some slides from universities. Below I'm setting up the loss function for linear regression with only one feature. I'm adding an L2 loss to the total loss, but I am not sure if I'm doing it correctly:
# Regularization
reg_strength = 0.01
# Create the loss function.
with tf.variable_scope("linear-regression"):
    W = tf.get_variable("W", shape=(1, 1), initializer=tf.contrib.layers.xavier_initializer())
    b = tf.get_variable("b", shape=(1,), initializer=tf.constant_initializer(0.0))
    yhat = tf.matmul(X, W) + b
    error_loss = tf.reduce_sum(((y - yhat)**2)/number_of_examples)
    #reg_loss = reg_strength * tf.nn.l2_loss(W) # reg 1
    reg_loss = reg_strength * tf.reduce_sum(W**2) # reg 2
    loss = error_loss + reg_loss

# Set up the optimizer.
opt_operation = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
My specific questions are:
I have two lines (commented as reg 1 and reg 2) that compute the L2 loss of the weight W. The line marked with reg 1 uses the Tensorflow built-in function. Are these two L2 implementations equivalent?
Am I adding the regularization loss reg_loss correctly to the final loss function?
Almost
According to the L2Loss operation code
output.device(d) = (input.square() * static_cast<T>(0.5)).sum();
It also multiplies by 0.5 (or, in other words, it divides by 2).
Are these two L2 implementations equivalent?
Almost. As #fabrizioM pointed out, see here for the introduction to l2_loss in the TensorFlow docs.
Am I adding the regularization loss reg_loss correctly to the final loss function?
So far so good : )
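Concretely, tf.nn.l2_loss(W) computes 0.5 * sum(W**2), so the line marked reg 1 gives half of reg 2; to make them equal you would either double reg_strength for reg 1 or halve the manual sum. A tiny eager-mode check (TF 2.x assumed, arbitrary values):

import tensorflow as tf

W = tf.constant([[1.0], [2.0], [3.0]])
print(tf.nn.l2_loss(W).numpy())               # 0.5 * (1 + 4 + 9) = 7.0
print(tf.reduce_sum(W ** 2).numpy())          # 1 + 4 + 9 = 14.0
print((0.5 * tf.reduce_sum(W ** 2)).numpy())  # 7.0, matches tf.nn.l2_loss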
