Tensor to numpy conversion without gradient dependence

I am creating a custom loss function to use with Keras in a CNN-architecture for segmentation. The loss should be a binary-cross-entropy-loss with each pixel weighted by the distance to the boundary of the GT foreground.
This distance is easily calculated with the scipy function scipy.ndimage.morphology.distance_transform_edt, but this function requires a numpy array as input. For the loss function I only have "y_true" and "y_pred", which are both tensors.
I have tried converting "y_true" to a numpy array using np_y_true = y_true.eval(), but get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'conv3d_19_target' with dtype float and shape [?,?,?,?,?]
('conv3d_19_target' is the name of the placeholder for y_true. Its shape is unknown to the program at this stage, though it is always (1,64,64,64,2).)
I have also tried np_y_true = y_true.numpy(), with the following result:
AttributeError: 'Tensor' object has no attribute 'numpy'
I believe there are two issues:
1. y_true is just a placeholder, and is therefore unknown when the loss function is first read.
2. Keras/TensorFlow believes that the gradient should pass through all parts that depend on y_true. This is not necessary here, as this is just a weight parameter to be calculated at each pass.
A first attempt on how I thought of my loss function:
def DFweighted_entropy():
    def weighted_loss(y_true, y_pred):
        np_ytrue = y_true.numpy()  # OR
        # np_y_true = K.eval(y_true)
        # Calculate distance-field:
        df_inside = distance_transform_edt(np_ytrue[:,:,:,1])  # Background
        df_outside = distance_transform_edt(np_ytrue[:,:,:,0])  # Foreground
        np_df = np_ytrue[:,:,:,1]*df_inside + np_ytrue[:,:,:,0]*df_outside  # Combined
        # Loss:
        df_loss = (K.max(y_pred,0) - y_pred * y_true + K.log(1 + K.exp((-1)*K.abs(y_pred)))) * np_df
        return df_loss
    return weighted_loss
The loss function is used when the model is compiled:
model.compile(
    optimizer=keras.optimizers.Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0),
    loss=DFweighted_entropy(),
    metrics=['acc', dice_coefficient],
)
Any ideas for solutions are appreciated!
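Since the weights depend only on y_true, one option is to compute them entirely in numpy, outside the differentiable part of the graph. The sketch below is a minimal illustration, not the asker's final code: the function name distance_weight_map is hypothetical, it assumes a one-hot mask of shape (batch, D, H, W, 2) with channel 1 as foreground, and it uses the top-level scipy.ndimage.distance_transform_edt (the same function as the scipy.ndimage.morphology one named in the question).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def distance_weight_map(y_true):
    """Per-voxel distance-to-boundary weights for a one-hot mask.

    Assumes y_true has shape (batch, D, H, W, 2); treating channel 1 as
    foreground and channel 0 as background is an assumption of this sketch.
    """
    y_true = np.asarray(y_true, dtype=np.float32)
    weights = np.zeros(y_true.shape[:-1], dtype=np.float32)
    for i in range(y_true.shape[0]):  # transform each sample separately
        fg = y_true[i, ..., 1]
        bg = y_true[i, ..., 0]
        # distance_transform_edt returns, for each non-zero voxel, the distance
        # to the nearest zero voxel, i.e. to the boundary of that region.
        weights[i] = fg * distance_transform_edt(fg) + bg * distance_transform_edt(bg)
    # Trailing axis of size 1 so the map broadcasts over the channel axis.
    return weights[..., np.newaxis]
```

A function like this could be applied in the data generator so the weights arrive as an extra input, or, in TensorFlow 2, wrapped with tf.numpy_function(distance_weight_map, [y_true], tf.float32) inside the loss. In both cases the weight map enters the loss as a constant, so no gradient is requested through the numpy code.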

Related

Optimizing Values that are on GPU

I am trying to optimize a PyTorch tensor which I am also using as input to a network. Let's call this tensor "shape". My optimizer is as follows:
optimizer = torch.optim.Adam(
    [shape],
    lr=0.0001
)
I am also getting vertex values using this "shape" tensor:
vertices = model(shape)
My loss function computes the difference between the inferred vertices and the ground-truth vertices:
loss = torch.sqrt(((gt_vertices - vertices) ** 2).sum(2)).mean(1).mean()
So what I am doing is actually estimating shape value. I am only interested in shape values. This works perfectly fine when everything is on CPU. However, when I put my shape and models on GPU by calling to("cuda"), I am getting the classic non-leaf Tensor error:
ValueError: can't optimize a non-leaf Tensor
Calling .detach().cpu() on shape inside the optimizer solves the issue, but then gradients cannot flow as they should and my values are not updated. How can I make this work?

When you call .to('cuda'), e.g. shape_p = shape.to('cuda'), you are making a copy of shape. While shape remains a leaf tensor, shape_p is not, because its 'parent' tensor is shape. Therefore shape_p is not a leaf, and you get the error when trying to optimize it.
Sending it to the CUDA device after having set up the optimizer would solve the issue (though there are cases where this is not possible).
>>> optimizer = torch.optim.Adam([shape], lr=0.0001)
>>> shape = shape.cuda()
The best option though, in my opinion, is to send it directly on init:
>>> shape = torch.rand(1, requires_grad=True, device='cuda')
>>> optimizer = torch.optim.Adam([shape], lr=0.0001)
This way it remains a leaf tensor.
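The leaf/non-leaf distinction can be checked directly with is_leaf. The sketch below is only an illustration: it uses a dtype conversion instead of a device move so that it also runs on a machine without a GPU (that substitution is an assumption of the example), but the copy made by .to(...) behaves the same way with respect to autograd.

```python
import torch

# A tensor created directly with requires_grad=True is a leaf.
shape = torch.rand(3, requires_grad=True)

# A copying .to(...) is a differentiable op, so its result is a non-leaf
# tensor whose 'parent' is shape. A dtype change stands in for .to('cuda').
shape_copy = shape.to(torch.float64)

print(shape.is_leaf)       # the optimizer accepts this tensor
print(shape_copy.is_leaf)  # the optimizer rejects this tensor

try:
    torch.optim.Adam([shape_copy], lr=1e-4)
    rejected = False
except ValueError:
    rejected = True

# Gradients still flow back through the copy to the original leaf.
shape_copy.sum().backward()
print(shape.grad)
```

This is why optimizing the original leaf while feeding the moved copy to the model still updates the values.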

Triplet networks using keras for RNN

I am trying to write a custom loss function for triplet loss (using Keras), which takes 3 arguments: anchor, positive and negative. The triplets are generated using a GRU layer, and the arguments for model.fit are provided through data generators.
The problem I am facing occurs during training:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array.
This error may indicate that you're trying to pass a symbolic value to a NumPy
call, which is not supported. Or, you may be trying to pass Keras symbolic
inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically
converting the API call to a lambda layer in the Functional Model.
Implementation of loss function
def batch_hard_triplet_loss(self, anchor_embeddings, pos_embeddings, neg_embeddings, margin):
    def loss(y_true, y_pred):
        '''print(anchor_embeddings)
        print(pos_embeddings)
        print(neg_embeddings)'''
        # distance between the anchor and the positive
        pos_dist = K.sum(K.square(anchor_embeddings - pos_embeddings), axis=-1)
        max_pos_dist = K.max(pos_dist)
        # distance between the anchor and the negative
        neg_dist = K.sum(K.square(anchor_embeddings - neg_embeddings), axis=-1)
        max_neg_dist = K.min(neg_dist)
        # compute loss
        basic_loss = max_pos_dist - max_neg_dist + margin
        tr_loss = K.maximum(basic_loss, 0.0)
        return tr_loss
    #return triplet_loss
    return loss
Could this be because Keras expects an array as the returned loss, but I am providing a scalar value?

Can I train a Tensorflow keras model with complex input/output?

I am trying to train a very simple model which has only one convolution layer.
def kernel_model(filters=1, kernel_size=3):
    input_layer = Input(shape=(250, 1))
    conv_layer = Conv1D(filters=filters, kernel_size=kernel_size, padding='same', use_bias=False)(input_layer)
    model = Model(inputs=input_layer, outputs=conv_layer)
    return model
But the input (X), the predicted output (y_pred) and the true output (y_true) are all complex numbers. When I call model.fit(X, y_true), I get the error
TypeError: Gradients of complex tensors must set grad_ys (y.dtype = tf.complex64)
Does that mean I have to write the back-propagation by hand?
What should I do to solve this problem? Thanks.

Your DNN needs to minimize the loss function through back-propagation. To minimize something, it needs to have an ordering. Complex numbers are not ordered, while reals are.
So you generally need a loss function L: Complex -> Reals
Change your complex-valued loss function from a simple square:
error = K.cast(K.mean(K.square(y_pred_propgation - y_true)),tf.complex64)
to the real-valued squared magnitude ||.||^2 of the complex number:
error = K.mean(K.square(K.abs(y_true-y_pred)))
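The difference between the two expressions can be seen with plain numpy, independent of Keras. The values below are made up purely for illustration.

```python
import numpy as np

# Illustrative complex targets and predictions (not from the original post).
y_true = np.array([1 + 2j, 0 - 1j, 3 + 0j])
y_pred = np.array([1 + 1j, 1 - 1j, 2 + 2j])

diff = y_pred - y_true

# Squaring the complex difference stays complex, so there is no ordering
# and nothing to minimize.
complex_loss = np.mean(np.square(diff))

# The squared magnitude |z|^2 = Re(z)^2 + Im(z)^2 is real and non-negative.
real_loss = np.mean(np.square(np.abs(diff)))

print(complex_loss)
print(real_loss)
```

Only the second quantity is a valid training objective, because an optimizer can compare two of its values.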

Tensorflow loss function no gradient error when provided with scalar

This question is about the tf.losses.huber_loss() function and how it can be applied on scalars rather than vectors. Thank you for your time!
My model is similar to a classification problem like MNIST. I based my code on the TensorFlow layers tutorial and made changes where I saw fit. I do not think the exact code is needed for my question.
I have labels that take integer values in {0,...,8}, which are converted into one-hot labels like this:
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=n_classes)
The last layer in the model is
logits = tf.layers.dense(inputs=dense4, units=n_classes)
which is converted into predictions like this:
predictions = {"classes": tf.argmax(input=logits, axis=1), "probabilities": tf.nn.softmax(logits, name="softmax_tensor")}
From the tutorial, I started with the tf.losses.softmax_cross_entropy() loss function. But in my model, I am predicting in which discretized bin a value will fall. So I started looking for a loss function that would translate that a prediction of one bin off is less of a problem than two bins off. Something like the absolute_difference or Huber function.
The code
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=n_classes)
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
in combination with the optimizer:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=ps.learning_rate)
works without any errors. When changing to the Huber function:
loss = tf.losses.huber_loss(labels=onehot_labels, predictions=logits)
there are still no errors. But at this point I am unsure about what exactly happens. Based on the reduction definition I expect that the Huber function is applied pairwise for elements of the vectors and then summed up or averaged.
I would like to apply the Huber function only on the label integer (in {0,...,9}) and predicted value:
preds = tf.argmax(input=logits, axis=1)
So this is what I tried:
loss = tf.losses.huber_loss(labels=indices, predictions=preds)
This is raising the error
ValueError: No gradients provided for any variable
I have found two common causes that I do not think are happening in my situation:
1. There is no path between the tf.Variable objects and the loss function. But since my prediction code is often used and the labels were provided as integers, I do not think this applies here.
2. The function is not differentiable. But the Huber function does work when vectors are used as input, so I do not think this is the case.
My question is: what code lets me use the Huber loss function on my two integer tensors (labels and predictions)?
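One likely explanation for the error is that tf.argmax is piecewise constant in the logits, so the loss built on it has no usable gradient even though the Huber function itself is differentiable. The numpy sketch below illustrates this with finite differences and contrasts it with one possible differentiable substitute, the expected bin index under the softmax. The helper names and the numbers are hypothetical, and the substitute is a suggestion rather than something from the original post.

```python
import numpy as np


def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def huber(error, delta=1.0):
    abs_err = np.abs(error)
    quadratic = np.minimum(abs_err, delta)
    return np.mean(0.5 * quadratic**2 + delta * (abs_err - quadratic))


def hard_loss(logits, labels):
    # argmax does not change under small perturbations of the logits
    return huber(np.argmax(logits, axis=1) - labels)


def soft_loss(logits, labels):
    # expected bin index under the softmax distribution
    bins = np.arange(logits.shape[1])
    return huber(softmax(logits) @ bins - labels)


logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = np.array([1.0, 2.0])

eps = 1e-4
bumped = logits.copy()
bumped[0, 1] += eps  # nudge one logit slightly

hard_slope = (hard_loss(bumped, labels) - hard_loss(logits, labels)) / eps
soft_slope = (soft_loss(bumped, labels) - soft_loss(logits, labels)) / eps
print(hard_slope, soft_slope)
```

In TensorFlow the same idea would amount to feeding tf.reduce_sum(tf.nn.softmax(logits) * bins, axis=1) as the predictions of the Huber loss, with bins a float tensor holding the bin indices.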

TensorFlow pass gradient unchanged

Say I have some custom operation binarizer used in a neural network. The operation takes a Tensor and constructs a new Tensor. I would like to modify that operation such that it is only used in the forward pass. In the backward pass, when gradients are calculated, it should just pass through the gradients reaching it.
More concretely, say binarizer is:
def binarizer(input):
    prob = tf.truediv(tf.add(1.0, input), 2.0)
    bernoulli = tf.contrib.distributions.Bernoulli(p=prob, dtype=tf.float32)
    return 2 * bernoulli.sample() - 1
and I set up my network:
# ...
h1_before_my_op = tf.nn.tanh(tf.matmul(x, W) + bias_h1)
h1 = binarizer(h1_before_my_op)
# ...
loss = tf.reduce_mean(tf.square(y - y_true))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
How do I tell TensorFlow to skip gradient calculation in the backward pass?
I tried defining a custom operation as described in this answer. However, py_func cannot return Tensors (that is not what it is made for), so I get:
UnimplementedError (see above for traceback): Unsupported object type Tensor

You're looking for tf.stop_gradient(input, name=None):
Stops gradient computation.
When executed in a graph, this op outputs its input tensor as-is.
h1 = binarizer(h1_before_my_op)
h1 = tf.stop_gradient(h1)
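Note that applying tf.stop_gradient to the binarizer output on its own blocks the gradient at that point rather than passing it through. A common way to get the pass-through behaviour described in the question is to combine stop_gradient with the identity, often called a straight-through estimator. The sketch below assumes TensorFlow 2 with eager execution and replaces tf.contrib.distributions.Bernoulli with a uniform-sampling equivalent. The function name binarizer_straight_through is hypothetical.

```python
import tensorflow as tf


def binarizer(x):
    # Stochastic binarization to -1/+1 with P(+1) = (1 + x) / 2.
    prob = (1.0 + x) / 2.0
    sample = tf.cast(tf.random.uniform(tf.shape(x)) < prob, tf.float32)
    return 2.0 * sample - 1.0


def binarizer_straight_through(x):
    # Forward value: binarizer(x). Backward: identity, because the
    # stop_gradient term contributes no gradient and d(x)/dx = 1.
    return x + tf.stop_gradient(binarizer(x) - x)


x = tf.Variable([[0.3, -0.8, 0.0]])
with tf.GradientTape() as tape:
    h = binarizer_straight_through(tf.nn.tanh(x))
    loss = tf.reduce_sum(h)

# The gradient reaching x is that of tanh alone: 1 - tanh(x)^2.
grad = tape.gradient(loss, x)
print(h.numpy(), grad.numpy())
```

With this formulation the layers before the binarizer still receive gradients, as if the binarizer were not there during the backward pass.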
