Weight different misclassifications differently keras - python

I want my model to increase the loss for a false positive prediction when training by creating a custom loss function.
The class_weight parameter in model.fit() does not work for this issue. The class_weight is already set to { 0: 1, 1:23 } as I have skewed training data where there are 23 times as many non-true labels as there are true labels.
I am not too experienced when working with the keras backend. I have mostly worked with the functional model.
What I want to create is:
def weighted_binary_crossentropy(y_true, y_pred):
    # where y_true == 0 and y_pred == 1:
    #     weight this loss and make it 50 times larger
    # return loss
I can do simple things with tensors, such as computing the mean squared error, but I have no idea how to do logical operations.
I have tried a hacky solution, which doesn't work and feels totally wrong:
def weighted_binary_crossentropy(y_true, y_pred):
    false_positive_weight = 50
    thresh = 0.5
    y_pred_true = K.greater_equal(thresh,y_pred)
    y_not_true = K.less_equal(thresh,y_true)
    false_positive_tensor = K.equal(y_pred_true,y_not_true)
    loss_weights = K.ones_like(y_pred) + false_positive_weight*false_positive_tensor
    return K.binary_crossentropy(y_true, y_pred)*loss_weights
I am using python 3 with keras 2 and tensorflow as backend.
Thanks in advance!

I think you're almost there...
from keras import backend as K
from keras.losses import binary_crossentropy
def weighted_binary_crossentropy(y_true, y_pred):
    false_positive_weight = 50
    thresh = 0.5
    # mind the argument order: K.greater_equal(a, b) computes a >= b
    # predicted positive: y_pred >= thresh
    y_pred_true = K.cast(K.greater_equal(y_pred, thresh), K.floatx())
    # actually negative: y_true < thresh
    y_not_true = K.cast(K.less(y_true, thresh), K.floatx())
    # a false positive is predicted positive AND actually negative;
    # the product of two 0/1 tensors acts as a logical AND
    false_positive_tensor = y_pred_true * y_not_true
    # and let's create its complement (the non false positives)
    complement = 1 - false_positive_tensor
    # now we're going to separate two groups
    falsePosGroupTrue = y_true * false_positive_tensor
    falsePosGroupPred = y_pred * false_positive_tensor
    nonFalseGroupTrue = y_true * complement
    nonFalseGroupPred = y_pred * complement
    # let's calculate one crossentropy loss for each group
    # (directly from the keras loss functions imported above)
    falsePosLoss = binary_crossentropy(falsePosGroupTrue, falsePosGroupPred)
    nonFalseLoss = binary_crossentropy(nonFalseGroupTrue, nonFalseGroupPred)
    # return them weighted:
    return (false_positive_weight * falsePosLoss) + nonFalseLoss
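The false-positive condition from the question (label 0, prediction at or above the threshold) can be checked outside Keras before it goes into a loss. The following is a minimal NumPy sketch, not Keras code; the function name `weighted_bce_np` and the clipping epsilon are assumptions made for illustration.

```python
import numpy as np

def weighted_bce_np(y_true, y_pred, false_positive_weight=50.0, thresh=0.5, eps=1e-7):
    """Element-wise binary cross-entropy with false positives up-weighted."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # clip predictions so that log() never sees exactly 0 or 1
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    # 1.0 where the prediction is positive but the label is negative
    false_positive = ((y_pred >= thresh) & (y_true < thresh)).astype(np.float64)
    # weight is 1 everywhere except on false positives
    weights = 1.0 + (false_positive_weight - 1.0) * false_positive
    bce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return weights * bce

y_true = np.array([0.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.1, 0.9, 0.1])
losses = weighted_bce_np(y_true, y_pred)
plain = weighted_bce_np(y_true, y_pred, false_positive_weight=1.0)
# only the first element (label 0, prediction 0.9) is a false positive
print(losses / plain)
```

Comparing the weighted and unweighted losses element by element makes it easy to see whether the mask also catches false negatives by accident.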


Why PyTorch optimizer might fail to update its parameters?

I am trying to do a simple loss-minimization for a specific variable coeff using PyTorch optimizers. This variable is supposed to be used as an interpolation coefficient for two vectors w_foo and w_bar to find a third vector, w_target.
w_target = w_foo + coeff * (w_bar - w_foo)
With w_foo and w_bar set as constant, at each optimization step I calculate w_target for the given coeff. Loss is determined from w_target using a fairly complex process beyond the scope of this question.
# w_foo.shape = [1, 16, 512]
# w_bar.shape = [1, 16, 512]
# num_layers = 16
# num_steps = 10000
vgg_loss = VGGLoss()
coeff = torch.randn([num_layers, ])
optimizer = torch.optim.Adam([coeff], lr=initial_learning_rate)
for step in range(num_steps):
    w_target = w_foo + torch.matmul(coeff, (w_bar - w_foo))
    target_image = generator.synthesis(w_target)
    processed_target_image = process(target_image)
    loss = vgg_loss(processed_target_image, source_image)
However, when I run this optimizer, query_opt does not change from one step to the next, so the optimizer is essentially useless. I would like some advice on what I am doing wrong here.
As suggested, I will try to elaborate on the loss function. Essentially, w_target is used to generate an image, and VGGLoss uses VGG feature extractor to compare this synthetic image with a certain exemplar source image.
class VGGLoss(torch.nn.Module):
    def __init__(self, device, vgg):
        for param in self.parameters():
            param.requires_grad = True
        self.vgg = vgg  # VGG16 in eval mode
    def forward(self, source, target):
        loss = 0
        source_features = self.vgg(source, resize_images=False, return_lpips=True)
        target_features = self.vgg(target, resize_images=False, return_lpips=True)
        loss += (source_features - target_features).square().sum()
        return loss
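One thing worth checking in the loop above is the shape produced by the matmul. The sketch below uses NumPy as a stand-in (np.matmul and torch.matmul follow the same rule for a 1-D first argument) and assumes the intent is one interpolation coefficient per layer, which is read off the formula in the question rather than stated there.

```python
import numpy as np

num_layers, dim = 16, 512
rng = np.random.default_rng(0)
w_foo = rng.standard_normal((1, num_layers, dim))
w_bar = rng.standard_normal((1, num_layers, dim))
coeff = rng.standard_normal(num_layers)

# matmul contracts over the layer axis: (16,) @ (1, 16, 512) -> (1, 512),
# so every layer would receive the same mixed offset
mixed = np.matmul(coeff, w_bar - w_foo)

# per-layer interpolation: reshape coeff to (1, 16, 1) so it broadcasts
w_target = w_foo + coeff[None, :, None] * (w_bar - w_foo)

print(mixed.shape, w_target.shape)
```

Separately, the loop as posted shows no loss.backward() or optimizer.step() call, and coeff is created without requires_grad=True; if those are also missing from the real code, the optimizer has no gradient to apply.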

Class Weight not supported for 3+ dimensional targets - Python Tensorflow [duplicate]

Here's the code I'm working with (pulled from Kaggle mostly):
outputs = Conv2D(4, (1, 1), activation='sigmoid') (c9)
model = Model(inputs=[inputs], outputs=[outputs])
model.compile(optimizer='adam', loss='dice', metrics=[mean_iou])
results = model.fit(X_train, Y_train, validation_split=0.1, batch_size=8, epochs=30, class_weight=class_weights)
I have 4 classes that are very imbalanced: class A is 70%, class B is 15%, class C is 10%, and class D is 5%. However, I care most about class D. So I calculated the weights as ratios: D_weight = A/D = 70/5 = 14, and so on for the other classes. (If there are better methods to select these weights, feel free to suggest them.)
In the last line, I'm trying to properly set class_weights and I'm doing it as so: class_weights = {0: 1.0, 1: 6, 2: 7, 3: 14}.
However, when I do this, I get the following error.
class_weight not supported for 3+ dimensional targets.
Is it possible that I add a dense layer after the last layer and just use it as a dummy layer so I can pass the class_weights and then only use the output of the last conv2d layer to do the prediction?
If this is not possible, how would I modify the loss function (I'm aware of this post, however, just passing in the weights in to the loss function won't cut it, because the loss function is called separately for each class) ? Currently, I'm using the following loss function:
def dice_coef(y_true, y_pred):
    smooth = 1.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
def bce_dice_loss(y_true, y_pred):
    return 0.5 * binary_crossentropy(y_true, y_pred) - dice_coef(y_true, y_pred)
But I don't see any way in which I can input class weights. If someone wants the full working code see this post. But remember to change the final conv2d layer's num classes to 4 instead of 1.
You can always apply the weights yourself.
The originalLossFunc below you can import from keras.losses.
The weightsList is your list with the weights ordered by class.
def weightedLoss(originalLossFunc, weightsList):
    def lossFunc(true, pred):
        axis = -1  # if channels last
        # axis = 1  # if channels first
        # argmax returns the index of the element with the greatest value
        # done in the class axis, it returns the class index
        classSelectors = K.argmax(true, axis=axis)
        # if your loss is sparse, use only true as classSelectors
        # considering weights are ordered by class, for each class
        # true(1) if the class index is equal to the weight index
        classSelectors = [K.equal(i, classSelectors) for i in range(len(weightsList))]
        # casting boolean to float for calculations
        # each tensor in the list contains 1 where ground true class is equal to its index
        # if you sum all these, you will get a tensor full of ones.
        classSelectors = [K.cast(x, K.floatx()) for x in classSelectors]
        # for each of the selections above, multiply their respective weight
        weights = [sel * w for sel, w in zip(classSelectors, weightsList)]
        # sums all the selections
        # result is a tensor with the respective weight for each element in predictions
        weightMultiplier = weights[0]
        for i in range(1, len(weights)):
            weightMultiplier = weightMultiplier + weights[i]
        # make sure your originalLossFunc only collapses the class axis
        # you need the other axes intact to multiply the weights tensor
        loss = originalLossFunc(true, pred)
        loss = loss * weightMultiplier
        return loss
    return lossFunc
For using this in compile:
model.compile(loss=weightedLoss(keras.losses.categorical_crossentropy, weights),
              optimizer=..., ...)
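To see what the weight tensor built inside lossFunc looks like, here is a small NumPy sketch of the same idea: pick the weight that corresponds to the true class of each element. The helper name `weight_map` and the 2x2 example labels are hypothetical; the weights are the ones from the question.

```python
import numpy as np

def weight_map(y_true_onehot, weights_list, axis=-1):
    """Return the per-element weight selected by the true class index."""
    # argmax over the class axis gives the class index of each element
    class_index = np.argmax(y_true_onehot, axis=axis)
    # fancy indexing looks up the weight of that class
    return np.asarray(weights_list, dtype=np.float64)[class_index]

# hypothetical 2x2 "image" with 4 one-hot classes, channels last
labels = np.array([[0, 1], [2, 3]])
y_true = np.eye(4)[labels]           # shape (2, 2, 4)
weights = [1.0, 6.0, 7.0, 14.0]      # class weights from the question
wmap = weight_map(y_true, weights)
print(wmap)
```

The result has the shape of the targets without the class axis, which is why the wrapped loss must collapse only the class axis before the multiplication.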
Changing the class balance directly on the input data
You can change the balance of the input samples too.
For instance, if you have 5 samples from class 1 and 10 samples from class 2, pass the samples for class 1 twice in the input arrays.
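A minimal NumPy sketch of that duplication, using the 5-versus-10 example; the helper name `oversample` and the toy arrays are assumptions for illustration, and in practice the result would usually be shuffled before training.

```python
import numpy as np

def oversample(x, y, minority_class, repeats=2):
    """Repeat every sample of `minority_class` so it appears `repeats` times."""
    x, y = np.asarray(x), np.asarray(y)
    idx = np.flatnonzero(y == minority_class)
    # indices of the extra copies (the originals are already in the array)
    extra = np.tile(idx, repeats - 1)
    keep = np.concatenate([np.arange(len(y)), extra])
    return x[keep], y[keep]

x = np.arange(15).reshape(15, 1)
y = np.array([1] * 5 + [2] * 10)     # 5 samples of class 1, 10 of class 2
x_bal, y_bal = oversample(x, y, minority_class=1, repeats=2)
print(np.bincount(y_bal)[1:])        # sample counts for class 1 and class 2
```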
Using the sample_weight argument.
Instead of working "by class", you can also work "by sample".
Create an array of weights for each sample in your input array: len(x_train) == len(weights)
And fit passing this array to the sample_weight argument.
(If it's fit_generator, the generator will have to return the weights along with the train/true pairs: return/yield inputs, targets, weights)
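As a sketch of the per-sample approach, the weight array can be derived from per-class weights. This assumes one integer label per sample; `y_train_labels` is a hypothetical array, and the class weights are the ones from the question.

```python
import numpy as np

class_weights = {0: 1.0, 1: 6.0, 2: 7.0, 3: 14.0}   # from the question
y_train_labels = np.array([0, 0, 3, 1, 2, 0, 3])     # hypothetical integer labels

# one weight per sample, looked up from the class of that sample
sample_weights = np.array([class_weights[int(c)] for c in y_train_labels])
print(sample_weights)

# The array would then be passed to fit (not run here):
# model.fit(x_train, y_train, sample_weight=sample_weights)
```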

How can I get accuracy from confusion matrix in tensorflow or Keras in the form of a tensor?

I want to get UAR (unweighted accuracy) from a confusion matrix so that I can monitor the UAR of the validation data. However, I find tensors difficult to work with.
I referred to this site and tried to create my own metric in Keras.
I am using the first method described there so that the metric works with both ModelCheckpoint and EarlyStopping, which Keras supports.
model.compile(loss='categorical_crossentropy',optimizer=adam, metrics=['accuracy', uar_accuracy])
However, I don't know how to define the uar_accuracy function.
def uar_accuracy(y_true, y_pred):
    # Calculate the label from one-hot encoding
    pred_class_label = K.argmax(y_pred, axis=-1)
    true_class_label = K.argmax(y_true, axis=-1)
    cf_mat = tf.confusion_matrix(true_class_label, pred_class_label)
    diag = tf.linalg.tensor_diag_part(cf_mat)
    uar = K.mean(diag)
    return uar
This returns the average number of correctly classified samples per class.
However, I do not want the average count of correct samples; I want the average of the per-class accuracies (the fraction of each class that is classified correctly).
How can I implement it?
I have implemented the following for the numpy type, not the Tensor type using sklearn.metrics and collections library
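import collections
import numpy as np
from sklearn.metrics import confusion_matrix
def get_accuracy_and_cnf_matrix(label, predict):
    uar = 0
    accuracy = []
    cnf_matrix = confusion_matrix(label, predict)
    # assumed reconstruction: the post uses `diag` without defining it and the
    # loop has no body; per-class correct count / class size is the likely intent
    diag = np.diagonal(cnf_matrix)
    class_counts = collections.Counter(label)
    for index, i in enumerate(diag):
        uar += i / class_counts[index]
    # cnf_matrix (number of corrects -> accuracy in percent, per true class)
    cnf_matrix = np.transpose(cnf_matrix)
    cnf_matrix = cnf_matrix * 100 / cnf_matrix.sum(axis=0)
    cnf_matrix = np.transpose(cnf_matrix).astype(float)
    cnf_matrix = np.around(cnf_matrix, decimals=2)
    test_weighted_accuracy = np.sum(label == predict) / len(label) * 100
    test_unweighted_accuracy = uar / len(cnf_matrix) * 100
    # assumed: both accuracies are collected so the returned array is not empty
    accuracy.append(test_weighted_accuracy)
    accuracy.append(test_unweighted_accuracy)
    return np.around(np.array(accuracy), decimals=2), cnf_matrix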
You can use tf.reduce_sum to compute the sum of each row in your confusion matrix. This corresponds to the total number of data points for each class. Then you divide the diagonal elements with this row sum to compute the ratio of correctly predicted examples per class.
def non_nan_average(x):
    # Computes the average of all elements that are not NaN in a rank 1 tensor
    nan_mask = tf.debugging.is_nan(x)
    x = tf.boolean_mask(x, tf.logical_not(nan_mask))
    return K.mean(x)
def uar_accuracy(y_true, y_pred):
    # Calculate the label from one-hot encoding
    pred_class_label = K.argmax(y_pred, axis=-1)
    true_class_label = K.argmax(y_true, axis=-1)
    cf_mat = tf.confusion_matrix(true_class_label, pred_class_label)
    diag = tf.linalg.tensor_diag_part(cf_mat)
    # Calculate the total number of data examples for each class
    total_per_class = tf.reduce_sum(cf_mat, axis=1)
    acc_per_class = diag / tf.maximum(1, total_per_class)
    uar = non_nan_average(acc_per_class)
    return uar
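A NumPy reference can help confirm that the tensor metric returns what is expected on a small batch. The sketch below follows the row-sum idea described above; the function name `uar_np` and the toy labels are assumptions, and classes with no examples are skipped rather than counted as zero.

```python
import numpy as np

def uar_np(true_labels, pred_labels, num_classes):
    """Mean of per-class recall, ignoring classes that have no examples."""
    cf = np.zeros((num_classes, num_classes), dtype=np.int64)
    # rows index the true class, columns the predicted class
    np.add.at(cf, (true_labels, pred_labels), 1)
    total_per_class = cf.sum(axis=1)
    present = total_per_class > 0
    acc_per_class = np.diag(cf)[present] / total_per_class[present]
    return acc_per_class.mean()

true_labels = np.array([0, 0, 0, 0, 1, 1, 2])
pred_labels = np.array([0, 0, 0, 1, 1, 0, 2])
uar = uar_np(true_labels, pred_labels, num_classes=3)
# per-class accuracies are 3/4, 1/2 and 1/1
print(uar)  # 0.75
```

Plain accuracy on the same data is 5/7, which shows how the two measures diverge when class sizes differ.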

Only evaluate non-zero values of tf.Tensor

I am trying to train a neural network using Keras, and I am using my own metric function as the loss function. The reason is that the actual values in the test set contain many NaN values. Let me give an example of the actual values in the test set:
In the preprocessing of my data, I replaced all the NaN values with zeros, so the above example contains zeros on each NaN row.
The Neural Network produces an output like this:
I only want to calculate the root mean squared error between the non-zero values. So for the example above, it should only calculate the RMSE for rows 1, 5 and 8. To do this, I created the following function:
import numpy as np
from sklearn.metrics import mean_squared_error
from math import sqrt
def evaluation_metric(y_true, y_pred):
    # compute the non-zero positions once, before y_true is overwritten
    nonzero = np.nonzero(y_true)
    y_true = y_true[nonzero]
    y_pred = y_pred[nonzero]
    error = sqrt(mean_squared_error(y_true, y_pred))
    return error
When I test the function by hand, feeding it the actual values from the test set and an output from a neural network initialized with random weights, it works well and produces an error value. I am able to optimize the weights using an evolutionary approach, minimizing this error measure by adjusting the weights of the network.
Now, I want to train the network with evaluation_metric as the loss function using the model.compile function from Keras. When I run:
model.compile(loss=evaluation_metric, optimizer='rmsprop', metrics=[evaluation_metric])
I get the following error:
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
I think this has to do with the usage of np.nonzero. Since I am working with Keras, I should probably use a function of the Keras Backend, or using something like tf.cond to check for the non zero values of y_true.
Can someone help me with this?
The code works after applying the following fix:
def evaluation_metric(y_true, y_pred):
    y_true = y_true * (y_true != 0)
    y_pred = y_pred * (y_true != 0)
    error = root_mean_squared_error(y_true, y_pred)
    return error
Along with the following function for calculating the RMSE of a tf object:
def root_mean_squared_error(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
Yes, indeed the problem lies in using numpy function. Here is a quick fix:
def evaluation_metric(y_true, y_pred):
    y_true = y_true * (y_true != 0)
    y_pred = y_pred * (y_true != 0)
    error = sqrt(mean_squared_error(y_true, y_pred))
    return error
I would write the metric in tensorflow on my own like:
import tensorflow as tf
import numpy as np
data = np.array([0, 1, 2, 0, 0, 3, 7, 0]).astype(np.float32)
pred = np.random.randn(8).astype(np.float32)
gt = np.random.randn(8).astype(np.float32)
data_op = tf.convert_to_tensor(data)
pred_op = tf.convert_to_tensor(pred)
gt_op = tf.convert_to_tensor(gt)
# NumPy reference: RMSE over the positions where data is non-zero
expected = np.sqrt(((gt[data != 0] - pred[data != 0]) ** 2).mean())
def nonzero_mean(gt_op, pred_op, data_op):
    # 1.0 where data is non-zero, 0.0 elsewhere
    mask_op = 1 - tf.cast(tf.equal(data_op, 0), tf.float32)
    actual_op = ((gt_op - pred_op) * mask_op)**2
    # divide by the number of non-zero entries, not by the full length
    actual_op = tf.reduce_sum(actual_op) / tf.cast(tf.count_nonzero(mask_op), tf.float32)
    actual_op = tf.sqrt(actual_op)
    return actual_op
# TensorFlow 1.x session API
with tf.Session() as sess:
    actual = sess.run(nonzero_mean(gt_op, pred_op, data_op))
    print(actual, expected)
The expression y_true != 0 is not possible in plain TensorFlow. I am not sure whether Keras does some magic here.
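For checking a tensor implementation against known numbers, a NumPy version of the masked RMSE is handy. This is a sketch under two assumptions that are not in the thread: the function name `masked_rmse`, and returning 0.0 when no target is non-zero.

```python
import numpy as np

def masked_rmse(y_true, y_pred):
    """RMSE over the positions where y_true is non-zero."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    mask = y_true != 0                 # computed once, from the original targets
    n = mask.sum()
    if n == 0:
        return 0.0                     # assumption: no labelled values means zero error
    sq_err = np.where(mask, (y_true - y_pred) ** 2, 0.0)
    # divide by the number of non-zero targets, not by the array length
    return float(np.sqrt(sq_err.sum() / n))

y_true = np.array([0.0, 1.0, 2.0, 0.0, 0.0, 3.0, 7.0, 0.0])
y_pred = np.array([5.0, 1.0, 4.0, 9.0, 9.0, 3.0, 4.0, 2.0])
result = masked_rmse(y_true, y_pred)
print(result)  # sqrt((0 + 4 + 0 + 9) / 4)
```

Note that multiplying both arrays by the mask and then taking a plain mean divides by the full length instead, which gives a smaller value whenever some targets are zero.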

Custom loss function implementation

I'm trying to implement a new loss function of my own.
When I tried to debug it (or print inside it), I noticed it is called only once, in the model-creation section of the code.
How can I know what y_pred and y_true contain (shapes, data, etc.) if I cannot step into this function while fitting the model?
I wrote this loss function:
def my_loss(y_true, y_pred):
    # run over the sequence, jump by 3
    # calc the label
    # if the label incorrect punish
    y_pred = K.reshape(y_pred, (1, 88, 3))
    y_pred = K.argmax(y_pred, axis=1)
    zero_count = K.sum(K.clip(y_pred, 0, 0))
    one_count = K.sum(K.clip(y_pred, 1, 1))
    two_count = K.sum(K.clip(y_pred, 2, 2))
    zero_punish = 1 - zero_count / K.count_params(y_true)
    one_punish = 1 - one_count / K.count_params(y_true)
    two_punish = 1 - two_count / K.count_params(y_true)
    false_arr = K.not_equal(y_true, y_pred)
    mask0 = K.equal(y_true, K.zeros_like(y_pred))
    mask0_miss = K.dot(false_arr, mask0) * zero_punish
    mask1 = K.equal(y_true, K.ones_like(y_pred))
    mask1_miss = K.dot(false_arr, mask1) * one_punish
    mask2 = K.equal(y_true, K.zeros_like(y_pred)+2)
    mask2_miss = K.dot(false_arr, mask2) * two_punish
    return K.sum(mask0_miss) + K.sum(mask1_miss) + K.sum(mask2_miss)
It fails on:
theano.gof.fg.MissingInputError: A variable that is an input to the graph was
neither provided as an input to the function nor given a value. A chain of
variables leading from this input to an output is [/dense_1_target, Shape.0].
This chain may not be unique
Backtrace when the variable is created:
How can I fix it?
You have to understand that Theano is a symbolic language. For example, when we define the following loss function in Keras:
def myLossFn(y_true, y_pred):
    return K.mean(K.abs(y_pred - y_true), axis=-1)
Theano is just making a symbolic rule in a computational graph, which would be executed when it gets values i.e. when you train the model with some mini-batches.
As far as your question on how to debug your model goes, you can use theano.function for that. Now, you want to know if your loss calculation is correct. You do the following.
You can implement the python/numpy version of your loss function. Pass two random vectors to your numpy-loss-function and get a number. To verify if theano gives nearly identical result, define something as follows.
import numpy as np
import theano
from theano import tensor as T
from keras import backend as K
# both symbolic inputs are float32 vectors of the same length
Y_true = T.fvector('Y_true')
Y_pred = T.fvector('Y_pred')
out = K.mean(K.abs(Y_pred - Y_true), axis=-1)
f = theano.function([Y_true, Y_pred], out)
# creating some values (cast to float32 to match the symbolic inputs)
y_true = np.random.random((10,)).astype('float32')
y_pred = np.random.random((10,)).astype('float32')
numpy_loss_result = np.mean(np.abs(y_true - y_pred))
theano_loss_result = f(y_true, y_pred)
# check if both are close enough
print(abs(numpy_loss_result - theano_loss_result))  # should be less than 1e-5
Basically, theano.function is a way to put values and evaluate those symbolic expressions. I hope this helps.
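The NumPy side of that comparison can be kept as a small reference function that mirrors the axis handling of the symbolic loss. A minimal sketch, with the function name `mae_loss_np` and the batch shape chosen for illustration only:

```python
import numpy as np

def mae_loss_np(y_true, y_pred):
    """NumPy reference for K.mean(K.abs(y_pred - y_true), axis=-1)."""
    return np.mean(np.abs(np.asarray(y_pred) - np.asarray(y_true)), axis=-1)

rng = np.random.default_rng(0)
y_true = rng.random((4, 10)).astype(np.float32)   # hypothetical batch of 4 samples
y_pred = rng.random((4, 10)).astype(np.float32)

reference = mae_loss_np(y_true, y_pred)
print(reference.shape)   # one loss value per sample

# `backend_result` would come from evaluating the symbolic loss on the same arrays:
# assert np.allclose(reference, backend_result, atol=1e-5)
```

Keeping the reduction on the last axis only means the reference returns one value per sample, which is the shape a Keras loss is expected to produce before averaging over the batch.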
