I wanted to add class weights to my 3-class classification problem.
I tried just directly passing the weights, which gives me an error when I pass my model output and the labels to my loss:
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1,2,2]))
The error:
loss = criterion(out, labels)
expected scalar type Float but found Long
So I printed the dtypes and changed the labels to float, but it still gives me the same error:
labels = labels.float()
print("Labels Training", labels, labels.dtype)
print("Out Training ", out, out.dtype)
loss = criterion(out, labels)
>>Labels Training tensor([2.]) torch.float32
>>Out Training tensor([[ 0.0540, -0.1439, -0.0070]], grad_fn=<AddmmBackward0>) torch.float32
>>expected scalar type Float but found Long
I also tried changing it to float64(), but that tells me the Tensor object has no attribute float64.
Second problem: I haven't tried this one out, but I have seen that the more common approach would be the WeightedRandomSampler. My problem is that I use K-Fold cross-validation with a SubsetRandomSampler for that. Is it possible to use both? I haven't found anything related to that.
For the first problem: nn.CrossEntropyLoss requires the output to be float, the labels to be long, and the weights to be float. Therefore, you should change the optional weight parameter of nn.CrossEntropyLoss to float:
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0,2.0,2.0]))
loss = criterion(out, labels.long())
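Putting it together, a minimal runnable sketch (the logits and labels below are dummy tensors standing in for the asker's model output and batch, not their actual code):
import torch
import torch.nn as nn

# Weighted 3-class cross entropy with the dtypes nn.CrossEntropyLoss expects:
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 2.0]))  # float class weights

out = torch.randn(4, 3, requires_grad=True)  # float logits, shape (batch, n_classes)
labels = torch.tensor([2, 0, 1, 2])          # long class indices, shape (batch,)

loss = criterion(out, labels)                # float output, long labels, float weights
loss.backward()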
Related
I am trying to train a U-Net with binary targets. The usual binary cross entropy loss does not perform well, since the labels are very imbalanced (many more 0 pixels than 1s), so I want to punish false negatives more. But TensorFlow doesn't have a ready-made weighted binary cross entropy, and since I didn't want to write a loss from scratch, I'm trying to use tf.nn.weighted_cross_entropy_with_logits. To be able to easily feed the loss to the model.compile function, I'm writing this wrapper:
def loss_wrapper(y, x):
    x = tf.cast(x, 'float32')
    loss = tf.nn.weighted_cross_entropy_with_logits(y, x, pos_weight=10)
    return loss
However, despite casting x to float, I'm still getting the error:
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
when the tf loss is called. Can someone explain what's happening?
If x represents your predictions, it probably already has type float32. I think you need to cast y, which is presumably your labels. So:
loss = tf.nn.weighted_cross_entropy_with_logits(tf.cast(y, dtype=tf.float32),x,pos_weight=10)
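Put together, the wrapper might then look like this (a sketch along the lines of the answer above; y is y_true and x is y_pred in the order Keras passes them to a custom loss):
import tensorflow as tf

def loss_wrapper(y, x):
    # y: ground-truth labels (possibly int), x: model output logits (already float32)
    y = tf.cast(y, tf.float32)  # cast the labels rather than the logits
    return tf.nn.weighted_cross_entropy_with_logits(y, x, pos_weight=10.0)

# model.compile(optimizer='adam', loss=loss_wrapper)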
I am playing around with a keras.io example where a Variational Autoencoder is built, which can be found here: https://keras.io/examples/generative/vae/#variational-autoencoder
There I am trying to replace the binary_crossentropy loss with a MeanSquaredError loss, but I am getting a TypeError. What do I have to do to get it to run?
# reconstruction_loss = tf.reduce_mean(tf.reduce_sum(keras.losses.binary_crossentropy(data, reconstruction), axis=(1, 2)))
reconstruction_loss = tf.reduce_mean(tf.reduce_sum(keras.losses.MeanAbsoluteError(data, reconstruction), axis=(1, 2)))
Error message:
TypeError: Expected float32 passed to parameter 'y' of op 'Equal', got 'auto' of type 'str' instead. Error: Expected float32, got 'auto' of type 'str' instead.
Furthermore, I don't understand why a binary_crossentropy loss is used, because I understood this loss as a classification loss, yet here I am comparing the values of my original data with their reconstruction, which is more like a regression than a classification. So why is it still appropriate to use the cross-entropy loss?
When I run the code with the cross-entropy loss and look at my losses, the KL loss and the reconstruction loss do not sum up to the total loss; the total loss is always unequal to the sum of the KL loss and the reconstruction loss, although in the train_step method this is programmed correctly. So why is there a deviation?
Concerning 3: There is a deviation because each loss is averaged over all samples within a batch. If each sample is considered individually, then the reconstruction_loss and the KL_loss do sum up to the total_loss.
Concerning 1: I think the tf.reduce_sum command is not correct, because it sums up all pixel losses. It should rather be tf.reduce_mean, which takes the average loss over all pixels. I think this is important in order to keep the reconstruction_loss in balance with the KL_loss. Unfortunately this led to bad training results, which I interpret as "posterior collapse" (I am still not sure whether this really is posterior collapse). The following plot shows the badly distributed latent space. For further explanation see: https://keras.io/examples/generative/vae/#variational-autoencoder
[plot of the latent space]
A solution that I have found is to weight the losses differently like this:
total_loss = 1000*reconstruction_loss + kl_loss
This led to correct results.
Are my interpretations about taking the mean rather than the sum and about the posterior collapse correct? Is it appropriate to weight my losses like this?
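For reference, a minimal sketch of how the reconstruction term inside train_step could look with these two changes (variable names follow the keras.io example; the factor of 1000 is the weighting found above, not a universal constant):
import tensorflow as tf
from tensorflow import keras

# Inside train_step: data, reconstruction and kl_loss are computed as in the keras.io example.
reconstruction_loss = tf.reduce_mean(
    keras.losses.mean_squared_error(data, reconstruction))  # mean over pixels instead of a sum
total_loss = 1000.0 * reconstruction_loss + kl_loss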
I am creating a custom loss function to use with Keras in a CNN architecture for segmentation. The loss should be a binary cross-entropy loss with each pixel weighted by its distance to the boundary of the ground-truth (GT) foreground.
This distance is easily calculated with the scipy function scipy.ndimage.morphology.distance_transform_edt, but this function requires a numpy array as input. In the loss function I only have "y_true" and "y_pred", which are both tensors.
I have tried converting "y_true" to a numpy array using np_y_true = y_true.eval(), but get the following error ('conv3d_19_target' is the name of the placeholder for y_true; its shape is unknown to the program at this stage, though it is always (1,64,64,64,2)):
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'conv3d_19_target' with dtype float and shape [?,?,?,?,?]
I have also tried np_y_true = y_true.numpy(), with the following result:
AttributeError: 'Tensor' object has no attribute 'numpy'
I believe there are two issues:
1. y_true is just a placeholder, and is therefore unknown when the loss function is first read.
2. Keras/TensorFlow believes that the gradient should pass through all parts that depend on y_true. This is not necessary here, as the distance field is just a weight to be calculated at each pass.
A first attempt at how I thought my loss function could look:
def DFweighted_entropy():
    def weighted_loss(y_true, y_pred):
        np_ytrue = y_true.numpy()  # OR
        # np_y_true = K.eval(y_true)

        # Calculate distance field:
        df_inside = distance_transform_edt(np_ytrue[:, :, :, 1])   # Background
        df_outside = distance_transform_edt(np_ytrue[:, :, :, 0])  # Foreground
        np_df = np_ytrue[:, :, :, 1] * df_inside + np_ytrue[:, :, :, 0] * df_outside  # Combined

        # Loss:
        df_loss = (K.max(y_pred, 0) - y_pred * y_true + K.log(1 + K.exp((-1) * K.abs(y_pred)))) * np_df
        return df_loss
    return weighted_loss
The loss function is used when the model is compiled:
model.compile(optimizer=keras.optimizers.Adam(lr=1e-4,beta_1=0.9, beta_2=0.999, epsilon=1e-08,decay=0.0),loss = DFweighted_entropy(), metrics=['acc',dice_coefficient])
Any ideas for solutions are appreciated!
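Not from the question itself, but one possible direction as a sketch: run the scipy code on concrete values via tf.numpy_function and block gradients through the resulting weight map (the helper names here are made up for illustration; TF 2.x style):
import numpy as np
import tensorflow as tf
from scipy.ndimage import distance_transform_edt

def make_df_weights(y_true_np):
    # y_true_np: (batch, D, H, W, 2) one-hot ground truth; compute one distance map per sample
    maps = []
    for sample in y_true_np:
        df_inside = distance_transform_edt(sample[..., 1])
        df_outside = distance_transform_edt(sample[..., 0])
        maps.append(sample[..., 1] * df_inside + sample[..., 0] * df_outside)
    return np.stack(maps).astype(np.float32)

def df_weighted_bce(y_true, y_pred):
    weights = tf.numpy_function(make_df_weights, [y_true], tf.float32)  # runs the numpy code on values
    weights = tf.stop_gradient(weights)                                 # the weight map is not trained
    bce = tf.keras.losses.binary_crossentropy(y_true, y_pred)           # per-voxel BCE, shape (batch, D, H, W)
    return tf.reduce_mean(bce * weights)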
I am trying to use binary cross entropy for a binary classification problem and keep running into the following error. I have tried type casting as well as reshaping the tensor to shape [-1, 1], but nothing seems to work.
My last two layers are defined as:
dense_fin2 = tf.layers.dense(inputs = dense_fin, units = 128, name = "dense_fin2")
logits = tf.sigmoid(tf.layers.dense(inputs = dense_fin2, units = 1, name = "logits"))
Loss function:
loss = labels * -tf.log(logits) + (1 - labels) * -tf.log(1 - logits)
loss = tf.reduce_mean(loss)
Error thrown by TensorFlow:
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("Neg:0", shape=(?, 1), dtype=float32)'
Extra information:
I am using the Estimator API coupled with the Dataset API. I have integer labels, i.e. 0 or 1. They are NOT one-hot encoded. I understand this is doable by one-hot encoding my labels, but I do not want to take that path.
This error likely comes from trying to multiply integer-type labels with float-type logits. You can explicitly cast the labels to float via tf.cast(labels, dtype=tf.float32). Unfortunately your question does not reveal whether you tried this specific cast.
However, for reasons of numerical stability I would advise you to use tf.nn.sigmoid_cross_entropy_with_logits instead (or tf.losses.sigmoid_cross_entropy). This is also a good idea for correctness: cross-entropy works with log-probabilities, but you are already putting in logits (which are unnormalized log-probabilities), so the extra tf.log is actually wrong. You could also add a tf.nn.sigmoid activation to the output layer to make the manual version correct; however, the built-in functions are still preferable for stability.
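A rough sketch of what that could look like with the asker's layers (not their exact code), keeping the last layer as raw logits and casting/reshaping the integer labels:
# Last layer outputs raw logits (no tf.sigmoid here):
logits = tf.layers.dense(inputs=dense_fin2, units=1, name="logits")

# Integer {0, 1} labels cast to float and reshaped to match the (?, 1) logits:
labels_float = tf.cast(tf.reshape(labels, [-1, 1]), tf.float32)

loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=labels_float, logits=logits))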
This question is about the tf.losses.huber_loss() function and how it can be applied on scalars rather than vectors. Thank you for your time!
My model is similar to a classification problem like MNIST. I based my code on the TensorFlow layers tutorial and made changes where I saw fit. I do not think the exact code is needed for my question.
I have labels that take integer values in {0,...,8}, which are converted into one-hot labels like this:
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=n_classes)
The last layer in the model is
logits = tf.layers.dense(inputs=dense4, units=n_classes)
which is converted into predictions like this:
predictions = {
    "classes": tf.argmax(input=logits, axis=1),
    "probabilities": tf.nn.softmax(logits, name="softmax_tensor"),
}
From the tutorial, I started with the tf.losses.softmax_cross_entropy() loss function. But in my model, I am predicting in which discretized bin a value will fall. So I started looking for a loss function that reflects that a prediction one bin off is less of a problem than two bins off, something like absolute_difference or the Huber function.
The code
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=n_classes)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
in combination with the optimizer:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=ps.learning_rate)
works without any errors. When changing to the Huber function:
loss = tf.losses.huber_loss(labels=onehot_labels, predictions=logits)
there are still no errors. But at this point I am unsure what exactly happens. Based on the definition of the reduction argument, I expect that the Huber function is applied elementwise to the pairs of label and logit entries and then summed or averaged.
I would like to apply the Huber function only to the integer label (in {0,...,8}) and the predicted value:
preds = tf.argmax(input=logits, axis=1)
So this is what I tried:
loss = tf.losses.huber_loss(labels=indices, predictions=preds)
This is raising the error
ValueError: No gradients provided for any variable
I have found two common causes, which I do not think apply in my situation:
1. There is no path between the tf.Variable objects and the loss function. But since my prediction code is used elsewhere and the labels were provided as integers, I do not think this applies here.
2. The loss function is not differentiable, so no gradient can be computed. But the Huber function does work when vectors are used as input, so I do not think this is the case.
My question is: what code lets me use the Huber loss function on my two integer tensors (labels and predictions)?
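One differentiable direction, offered only as a sketch (tf.argmax has no gradient, so instead of the hard predicted bin the Huber loss can be fed a probability-weighted "soft" bin index):
# Soft-argmax sketch: the expected bin index under the softmax distribution is differentiable.
probs = tf.nn.softmax(logits)                             # shape (batch, n_classes)
bin_values = tf.range(n_classes, dtype=tf.float32)        # [0, 1, ..., n_classes - 1]
expected_bin = tf.reduce_sum(probs * bin_values, axis=1)  # shape (batch,)

labels_float = tf.cast(labels, tf.float32)                # integer bin labels as floats
loss = tf.losses.huber_loss(labels=labels_float, predictions=expected_bin)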