I am trying to train a U-Net with binary targets. The usual binary cross-entropy loss does not perform well, since the labels are very imbalanced (many more 0 pixels than 1s), so I want to punish false negatives more. Keras doesn't ship a ready-made weighted binary cross-entropy loss, and since I didn't want to write one from scratch, I'm trying to use tf.nn.weighted_cross_entropy_with_logits. To be able to feed the loss to the model.compile function easily, I'm writing this wrapper:
def loss_wrapper(y, x):
    x = tf.cast(x, 'float32')
    loss = tf.nn.weighted_cross_entropy_with_logits(y, x, pos_weight=10)
    return loss
However, even after casting x to float, I'm still getting the error:
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
when the tf loss is called. Can someone explain what's happening?
If x represents your predictions, it probably already has type float32. I think you need to cast y, which is presumably your labels. So:
loss = tf.nn.weighted_cross_entropy_with_logits(tf.cast(y, dtype=tf.float32),x,pos_weight=10)
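Putting it together, a minimal sketch of the fixed wrapper (assuming y_true holds the integer masks and y_pred the model's raw logits, i.e. no sigmoid on the final layer, since this function applies the sigmoid internally):

import tensorflow as tf

def loss_wrapper(y_true, y_pred):
    # Cast the integer labels to float32 to match the logits' dtype.
    y_true = tf.cast(y_true, tf.float32)
    # pos_weight > 1 up-weights the positive class, punishing false negatives.
    return tf.nn.weighted_cross_entropy_with_logits(
        labels=y_true, logits=y_pred, pos_weight=10.0)

# model.compile(optimizer='adam', loss=loss_wrapper)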
I wanted to add class weights to my 3-class classification problem. I tried just passing the weights directly, which gives me an error when passing my model output and the labels to my loss:
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1,2,2]))
The error:
loss = criterion(out, labels)
expected scalar type Float but found Long
So I printed the dtypes and changed the labels to float, but it still gives me the same error:
labels = labels.float()
print("Labels Training", labels, labels.dtype)
print("Out Training ", out, out.dtype)
loss = criterion(out, labels)
>>Labels Training tensor([2.]) torch.float32
>>Out Training tensor([[ 0.0540, -0.1439, -0.0070]], grad_fn=<AddmmBackward0>) torch.float32
>>expected scalar type Float but found Long
I also tried changing it with .float64(), but that tells me the Tensor object has no attribute float64.
Second problem: I haven't tried this one yet, but I have seen that the more common approach would be the WeightedRandomSampler. My problem is that I use K-fold cross-validation with a SubsetRandomSampler. Is it possible to use both? I haven't found anything related to that.
For the first problem: nn.CrossEntropyLoss requires the output to be of type float, the labels to be of type long, and the weights to be of type float. Therefore, you should make the optional weight parameter of nn.CrossEntropyLoss float and keep the labels long:
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0,2.0,2.0]))
loss = criterion(out, labels.long())
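A minimal self-contained sketch of those dtype requirements (shapes assumed for illustration: 3 classes, batch of 4):

import torch
import torch.nn as nn

out = torch.randn(4, 3)                  # model output: float logits
labels = torch.tensor([2, 0, 1, 2])      # targets: long class indices, shape [4]
weights = torch.tensor([1.0, 2.0, 2.0])  # per-class weights: float

criterion = nn.CrossEntropyLoss(weight=weights)
loss = criterion(out, labels)            # float output + long labels works
print(loss.item())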
I am trying to train a very simple model which has only one convolution layer.
from tensorflow.keras.layers import Conv1D, Input
from tensorflow.keras.models import Model

def kernel_model(filters=1, kernel_size=3):
    input_layer = Input(shape=(250, 1))
    conv_layer = Conv1D(filters=filters, kernel_size=kernel_size,
                        padding='same', use_bias=False)(input_layer)
    model = Model(inputs=input_layer, outputs=conv_layer)
    return model
But the input (X), the predicted output (y_pred), and the true output (y_true) are all complex numbers. When I call model.fit(X, y_true), there is the error:
TypeError: Gradients of complex tensors must set grad_ys (y.dtype = tf.complex64)
Does that mean I have to write the back-propagation by hand?
What should I do to solve this problem? Thanks.
Your DNN needs to minimize the loss function through back-propagation. To minimize something, it naturally needs an ordering; the complex numbers are not ordered, while the reals are.
So you generally need a loss function L: Complex -> Reals.
Change your complex-valued loss function from the simple square:

error = K.cast(K.mean(K.square(y_pred - y_true)), tf.complex64)

to the real-valued squared magnitude |.|^2 of the complex error:
error = K.mean(K.square(K.abs(y_true-y_pred)))
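A minimal self-contained sketch (assuming TF 2.x) showing that this loss maps complex tensors to a real scalar, which the optimizer can then descend:

import tensorflow as tf
from tensorflow.keras import backend as K

def magnitude_mse(y_true, y_pred):
    # K.abs returns the complex modulus, so the result is real and non-negative.
    return K.mean(K.square(K.abs(y_true - y_pred)))

y_true = tf.constant([1 + 2j, 3 - 1j], dtype=tf.complex64)
y_pred = tf.constant([1 + 1j, 2 - 1j], dtype=tf.complex64)
print(magnitude_mse(y_true, y_pred))  # tf.Tensor(1.0, ...) -- a real float32 scalar

A loss defined this way can be passed to model.compile(loss=magnitude_mse) like any built-in loss.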
I am trying to build two neural networks for classification, one binary and the second multi-class. I am trying to use torch.nn.CrossEntropyLoss() as the loss function, but when I try to train my first neural network I get the following error:
multi-target not supported at /opt/conda/conda-bld/pytorch_1565272271120/work/aten/src/THNN/generic/ClassNLLCriterion.c:22
From my analysis, I found that my dataset has two problems that could have caused the error.
My dataset is one-hot encoded. I used one-hot encoding to preprocess my dataset. The first target variable Y_binary has the shape torch.Size([125973, 1]), full of 0s and 1s indicating the classes 'No' and 'Yes'.
My data may have the wrong dimensions. I found that I can't use a simple vector with the cross-entropy loss function. Some people used the following code to reshape their output before feeding it to the loss function:
out = out.permute(0, 2, 3, 1).contiguous().view(-1, class_number)
But I didn't really understand the reasoning behind this code. It seems to me that I need to keep track of the following variables: class number, batch size, and output dimension. For my code, here are the dimensions:
X_train.shape: (125973, 122)
Y_train2.shape: (125973, 1)
batch_size = 64
K = len(set(Y_train2))  # binary classification; for multi-class use K = len(set(Y_train5))
Should the target value be one-hot encoded? If not, how can I feed a nominal feature to the loss function?
If I need to reshape the output, can you help me do this for my code?
I am trying to use this loss function for both my neural networks.
Thank you in advance.
The error is due to the usage of torch.nn.CrossEntropyLoss(), which expects its target to be a vector of class indices (one integer out of N classes per sample), not one-hot or multi-target tensors. If you want to keep one-hot targets, use torch.nn.BCEWithLogitsLoss() instead, which combines a Sigmoid layer and BCELoss in one single class.
With Sigmoid + BCELoss (or BCEWithLogitsLoss) in the multi-class case, the target must be one-hot encoded, i.e. something like this per sample: [0 1 0 0 0 1 0 0 1 0], where the 1s are at the locations of the classes present.
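A minimal sketch contrasting the two target formats (shapes assumed for illustration: batch of 4, 3 classes):

import torch
import torch.nn as nn

out = torch.randn(4, 3)                            # raw logits from the model
onehot = torch.eye(3)[torch.tensor([0, 2, 1, 1])]  # one-hot float targets, [4, 3]

# CrossEntropyLoss wants long class indices, so convert one-hot -> indices:
indices = onehot.argmax(dim=1)                     # long tensor of shape [4]
ce_loss = nn.CrossEntropyLoss()(out, indices)

# BCEWithLogitsLoss accepts the one-hot (or multi-label) float targets directly:
bce_loss = nn.BCEWithLogitsLoss()(out, onehot)
print(ce_loss.item(), bce_loss.item())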
I am trying to use binary cross-entropy for a binary classification problem and keep running into the following error. I have tried type casting as well as reshaping the tensor to shape [-1, 1], but nothing seems to work.
My last two layers are defined as:
dense_fin2 = tf.layers.dense(inputs = dense_fin, units = 128, name = "dense_fin2")
logits = tf.sigmoid(tf.layers.dense(inputs = dense_fin2, units = 1, name = "logits"))
Loss function,
loss = labels * -tf.log(logits) + (1 - labels) * -tf.log(1 - logits)
loss = tf.reduce_mean(loss)
The error thrown by TensorFlow:
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("Neg:0", shape=(?, 1), dtype=float32)'
Extra information:
I am using the Estimator API coupled with the Dataset API. I have integer labels, i.e. 0 or 1; they are NOT one-hot encoded. I understand this is doable by one-hot encoding my labels, but I do not want to take that path.
This error likely comes from trying to multiply the integer-type labels with the float-type network output. You can explicitly cast the labels to float via tf.cast(labels, dtype=tf.float32). Unfortunately your question does not reveal whether you tried this specific cast.
However, for reasons of numerical stability I would advise you to use tf.nn.sigmoid_cross_entropy_with_logits instead (or tf.losses.sigmoid_cross_entropy). Note that these functions expect raw logits, but the tensor you named logits is actually a probability, because you already apply tf.sigmoid to the dense output. So remove the tf.sigmoid, feed the raw dense output to the built-in loss, and let it apply the sigmoid internally; your manual labels * -tf.log(...) formula is mathematically equivalent but numerically far less stable.
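A minimal sketch of that fix in the question's TF 1.x style (the placeholders are assumed stand-ins for the question's dense_fin and labels tensors):

import tensorflow as tf

dense_fin = tf.placeholder(tf.float32, [None, 256])  # assumed stand-in
labels = tf.placeholder(tf.int32, [None, 1])         # integer 0/1 labels

dense_fin2 = tf.layers.dense(inputs=dense_fin, units=128, name="dense_fin2")
# Raw logits: no tf.sigmoid here, the loss applies it internally and stably.
logits = tf.layers.dense(inputs=dense_fin2, units=1, name="logits")
labels_f = tf.cast(labels, tf.float32)  # int32 -> float32 to match the logits
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=labels_f, logits=logits))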
This question is about the tf.losses.huber_loss() function and how it can be applied to scalars rather than vectors. Thank you for your time!
My model is similar to a classification problem like MNIST. I based my code on the TensorFlow layers tutorial and made changes where I saw fit. I do not think the exact code is needed for my question.
I have labels that take integer values in {0,...,8}, which are converted into one-hot labels like this:
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=n_classes)
The last layer in the model is
logits = tf.layers.dense(inputs=dense4, units=n_classes)
which is converted into predictions like this:
predictions = {
    "classes": tf.argmax(input=logits, axis=1),
    "probabilities": tf.nn.softmax(logits, name="softmax_tensor"),
}
From the tutorial, I started with the tf.losses.softmax_cross_entropy() loss function. But in my model, I am predicting in which discretized bin a value will fall, so I started looking for a loss function that expresses that a prediction one bin off is less of a problem than two bins off, something like the absolute_difference or Huber function.
The code
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=n_classes)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
in combination with the optimizer:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=ps.learning_rate)
works without any errors. When changing to the Huber function:
loss = tf.losses.huber_loss(labels=onehot_labels, predictions=logits)
there are still no errors. But at this point I am unsure about what exactly happens. Based on the reduction definition, I expect that the Huber function is applied pairwise to the elements of the label and prediction vectors and then summed up or averaged.
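To check that, here is a tiny sketch (values assumed) confirming the elementwise behaviour:

import tensorflow as tf

onehot = tf.constant([[0., 0., 1.]])
logits = tf.constant([[0.2, 0.1, 0.9]])
# All three components contribute, not just the true-class bin:
# the errors (0.2, 0.1, 0.1) are each below delta=1.0, so the loss is
# mean(0.5 * err^2) = 0.01.
loss = tf.losses.huber_loss(labels=onehot, predictions=logits)
with tf.Session() as sess:
    print(sess.run(loss))  # ~0.01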
I would like to apply the Huber function only on the integer label (in {0,...,8}) and the predicted bin:
preds = tf.argmax(input=logits, axis=1)
So this is what I tried:
loss = tf.losses.huber_loss(labels=indices, predictions=preds)
This is raising the error
ValueError: No gradients provided for any variable
I have found two common causes of this error that I do not think apply in my situation:
The first is when there is no path between the tf.Variable objects and the loss function. But since my prediction code is commonly used and the labels are provided as integers, I do not think this applies here.
The second is that the function is not differentiable. But the Huber function does work when vectors are used as input, so I do not think this is the case.
My question is: what code lets me use the Huber loss function on my two integer tensors (labels and predictions)?