How do I implement the Triplet Loss in Keras? - python

I'm trying to implement Google's Facenet paper:
First of all, is it possible to implement this paper using the Sequential API of Keras or should I go for the Graph API?
In either case, could you please tell me how to pass the custom loss function tripletLoss to model.compile, and how to receive the anchor embedding, positive embedding and negative embedding as parameters to calculate the loss?
Also, what should the second parameter Y in model.fit() be? I do not have any labels in this case...

This issue explains how to create a custom objective (loss) in Keras:
def dummy_objective(y_true, y_pred):
    return 0.5  # your implementation of tripletLoss here

model.compile(loss=dummy_objective, optimizer='adadelta')
Regarding the y parameter of .fit(): since you are the one handling it in the end (the y_true parameter of the objective function is taken from it), I would say you can pass whatever you need that fits through Keras's plumbing, and maybe a dummy vector to pass the dimension checks if you really don't need any supervision.
Finally, as to how to implement this particular paper: searching for triplet or facenet in the Keras docs didn't return anything, so you'll probably have to either implement it yourself or find someone who has.
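For illustration, here is a minimal sketch (not from the original answer) of how this could look with the functional (Graph-style) API. The input size, embedding dimension, margin and layer choices below are all assumptions, and the dummy y exists only to satisfy Keras's shape checks:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

margin = 0.2      # assumed margin, as in the FaceNet paper
emb_dim = 128     # assumed embedding size

def triplet_loss(y_true, y_pred):
    # y_pred is the concatenation [anchor, positive, negative]; y_true is a dummy.
    anchor, positive, negative = tf.split(y_pred, 3, axis=-1)
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    return tf.maximum(pos_dist - neg_dist + margin, 0.0)

# Shared embedding network (a stand-in architecture).
base = keras.Sequential([
    layers.Dense(256, activation="relu", input_shape=(784,)),
    layers.Dense(emb_dim),
    layers.Lambda(lambda t: tf.math.l2_normalize(t, axis=-1)),
])

anchor_in = keras.Input(shape=(784,))
positive_in = keras.Input(shape=(784,))
negative_in = keras.Input(shape=(784,))
merged = layers.Concatenate(axis=-1)(
    [base(anchor_in), base(positive_in), base(negative_in)])

model = keras.Model([anchor_in, positive_in, negative_in], merged)
model.compile(loss=triplet_loss, optimizer="adadelta")

# The y passed to fit() is a dummy array that only has to pass the shape checks:
# model.fit([anchors, positives, negatives], np.zeros((num_triplets, 3 * emb_dim)))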

Related

Best Practice for Transforming y_pred in Tensorflow's Metric

In my previous project, I needed to frame an image classification task as a regression problem. I implemented the regression model in TensorFlow as a standard Sequential model whose last layer is a single-node Dense layer with no activation function. In order to measure the performance, I need to use standard classification metrics, such as accuracy and Cohen's kappa.
However, I can't use those metrics directly because my model is a regression model, so I need to clip and round the output before feeding it to the metrics. I use a workaround by defining my own metric, but that workaround is not practical. Therefore, I'm thinking about contributing to TensorFlow by implementing a custom transformation_function that transforms y_pred with a tensor lambda function before it is stored in the update_state method. After reading the source code, I have doubts about this idea. So I'm asking you, fellow TensorFlow users/contributors: what is the best practice for transforming y_pred before feeding it to a metric? Is this functionality already implemented in the newest version?
Thank you!
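For reference, a minimal sketch of such a workaround metric (this is not an existing TensorFlow feature; the class name and the number of classes are made up) subclasses an existing metric and transforms y_pred before delegating to the parent's update_state:
import tensorflow as tf

class RoundedAccuracy(tf.keras.metrics.Accuracy):
    """Accuracy for a regression head: clips and rounds y_pred to class ids."""

    def __init__(self, num_classes, name="rounded_accuracy", **kwargs):
        super().__init__(name=name, **kwargs)
        self.num_classes = num_classes

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Map the continuous regression output back onto valid class labels.
        y_pred = tf.round(tf.clip_by_value(y_pred, 0.0, self.num_classes - 1.0))
        return super().update_state(y_true, y_pred, sample_weight=sample_weight)

# model.compile(optimizer="adam", loss="mse",
#               metrics=[RoundedAccuracy(num_classes=5)])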

Visualizing custom loss in double-head model

Using an A2C agent from this article, how do I get the numerical values of value_loss, policy_loss and entropy_loss while the weights are being updated?
The model I'm using is double-headed; both heads share the same trunk. The policy head's output shape is [number of actions, batch size] and the value head's shape is [1, batch_size]. Compiling this model returns a size-incompatibility error when these loss functions are given as metrics:
self.model.compile(optimizer=self.optimizer,
                   metrics=[self._logits_loss, self._value_loss],
                   loss=[self._logits_loss, self._value_loss])
Both self._value_loss and self._policy_loss are executed as graphs, meaning that all variables inside them are only pointers to graph nodes. I found some examples where Tensor objects are evaluated (with eval()) to get the values out of the nodes. I don't understand them, because in order to eval() a Tensor object you need to give it a Session, but in TensorFlow 2.x Sessions are deprecated.
Another lead: when calling train_on_batch() from the Keras Model API to train the model, the method returns losses. I don't understand why, but the only losses it returns are from the policy head. Losses from that head are calculated as policy_loss - entropy_loss, but my goal is to get all three losses separately to visualize them in a graph.
Any help is welcome, I'm stuck.
I found the answer to my problem. In Keras, the built-in metrics functionality provides an interface for measuring the performance and losses of a model, whether custom or standard.
When compiling a model as follows:
self.model.compile(optimizer=ko.RMSprop(lr=lr),
                   metrics=dict(output_1=self._entropy_loss),
                   loss=dict(output_1=self._logits_loss, output_2=self._value_loss))
... self.model.train_on_batch([...]) returns a list of [total_loss, logits_loss, value_loss, entropy_loss]. Computing logits_loss + entropy_loss then gives the value of policy_loss. Beware that this solution results in calling self._entropy_loss() twice.
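As a usage sketch (the batch variable names here are hypothetical; the ordering follows the list above, with output_1 as the policy head and output_2 as the value head):
returns = self.model.train_on_batch(observations, [action_targets, value_targets])
total_loss, logits_loss, value_loss, entropy_loss = returns
policy_loss = logits_loss + entropy_loss  # reconstruct the combined policy loss for plotting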

Difference - tf.gradients vs tf.keras.backend.gradients

Being new to TensorFlow, I am trying to understand the difference between the underlying functionality of tf.gradients and tf.keras.backend.gradients.
The latter computes the gradient of the cost function with respect to the input feature values.
But I couldn't get a clear idea about the former: does it compute the gradient over the cost function or over the output probabilities? (For example, consider binary classification with a simple feed-forward network. The output probability here refers to the sigmoid activation of the final single-neuron layer, and the cost is the binary cross-entropy.)
I have referred to the official documentation for tf.gradients, but it is short and vague (for me), so I did not get a clear picture - it just calls the differentiated tensor 'y'. Is that the cost or the output probability?
Why do I need the gradients?
To implement basic gradient-based feature attribution.
They are basically the same. tf.keras is TensorFlow's high-level API for building and training deep learning models, used for fast prototyping, state-of-the-art research, and production; it uses TensorFlow as its backend. Looking at the tf.keras source code here, we can see that tf.keras.backend.gradients indeed uses tf.gradients:
# Part of keras/backend.py
from tensorflow.python.ops import gradients as gradients_module

@keras_export('keras.backend.gradients')
def gradients(loss, variables):
  """Returns the gradients of `loss` w.r.t. `variables`.

  Arguments:
      loss: Scalar tensor to minimize.
      variables: List of variables.

  Returns:
      A gradients tensor.
  """
  # ========
  # Uses tensorflow's gradient function
  # ========
  return gradients_module.gradients(
      loss, variables, colocate_gradients_with_ops=True)
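In other words, both functions simply differentiate whatever tensor you pass as the first argument, so "cost or output probability" is decided by the caller. A minimal sketch in TF 1.x-style graph mode (the tiny network and placeholders below are stand-ins, not from the question):
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=[None, 3])           # input features
y_true = tf.placeholder(tf.float32, shape=[None, 1])
w = tf.Variable(tf.random.normal([3, 1]))
prob = tf.sigmoid(tf.matmul(x, w))                         # output probability
cost = tf.keras.losses.binary_crossentropy(y_true, prob)   # binary cross-entropy

grad_of_cost_wrt_x = tf.gradients(cost, x)                 # gradient of the cost
grad_of_prob_wrt_x = tf.gradients(prob, x)                 # gradient of the probability
grad_via_backend = tf.keras.backend.gradients(cost, [x])   # wraps tf.gradients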

Train two consecutive models in tensorflow

I am trying to build a model in TensorFlow that uses two consecutive models. Unfortunately, I can't include them within one model. The first model is basically an encoder, the second returns my needed value.
out = Model_a(image_input)
value = Model_b(out)
loss = f(value)
I can train Model_b using the given loss, but I would then need the gradients of the first layer (of Model_b) with respect to the loss in order to continue the gradient calculation in Model_a. Furthermore, I would somehow need a function that calculates gradients based on these gradients, instead of on a loss function. Does anyone know whether TensorFlow already has such functionality, or has anyone tackled similar problems?
Cheers
I found a working solution, for anyone who has similar problems. Using TensorFlow 2.0 and Keras eager mode (with GradientTape), one can construct any loss function as desired, even one that includes consecutive models. Importantly, the predict function will not work here; one needs to call the models directly.
The gradients can then be calculated for each model with respect to that loss function, which seems to work so far. It is important that the models themselves are included in the loss computation, not a copy of their output, or at least that the copy is generated within the tape context. An example is found below:
optimizer = tf.keras.optimizers.Adam(lr=0.1)
with tf.GradientTape(persistent=True) as tape:
    error = model2(model1(x)) - y
    loss_value = tf.reduce_mean(tf.square(error))
gradients1 = tape.gradient(loss_value, model1.variables)
gradients2 = tape.gradient(loss_value, model2.variables)
optimizer.apply_gradients(zip(gradients1, model1.variables))
optimizer.apply_gradients(zip(gradients2, model2.variables))
If anyone finds a more efficient or "prettier" solution, I would be happy if they shared it.
Cheers

TensorFlow - How to minimize function of one variable?

I've been given a fully trained model by another researcher that has its inputs as placeholders. Regarding it as a function f(x), I would like to find the x that minimizes my distance metric (loss function) dist(x, f(x)). This could be something like the Euclidean distance between the two points.
I tried to use TensorFlow's built-in optimizer functions. The issue is that tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=[input_placeholder]) fails, complaining that input_placeholder isn't of a supported type. Thus, I cannot get gradients with respect to my input.
How can I optimize a function in TensorFlow when the inputs have to be specified in this way? Unfortunately, these placeholders are not passed through a Variable first, and I have to treat that model as a black box.
Using the Keras functional API detailed in this question, I created a dense layer with no bias to sit right before the model I was given. Holding its input constant as an all-1's vector, I optimized the joined model using only the Variable in the dense layer, giving me the optimal vector as the output of that layer.
All TensorFlow Optimizer subclasses allow you to minimize while only modifying a particular set of Variables, which I got out of Keras fairly simply.
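A rough sketch of that workaround (every name here is hypothetical, and the small Sequential network merely stands in for the black-box model I was given):
import numpy as np
import tensorflow as tf
from tensorflow import keras

input_dim = 10   # assumed dimensionality of the black-box model's input

# Stand-in for the provided model f(x); in practice this is the researcher's network.
given_model = keras.Sequential(
    [keras.layers.Dense(input_dim, activation="tanh", input_shape=(input_dim,))])
given_model.trainable = False      # treat it as a frozen black box

# A bias-free dense layer fed a constant 1 acts as the trainable "input" x.
ones_in = keras.Input(shape=(1,))
free_input = keras.layers.Dense(input_dim, use_bias=False, name="free_input")
x = free_input(ones_in)
fx = given_model(x)

# Euclidean distance between x and f(x) is the quantity to minimize.
dist = keras.layers.Lambda(
    lambda t: tf.reduce_sum(tf.square(t[0] - t[1]), axis=-1, keepdims=True))([x, fx])

joined = keras.Model(ones_in, dist)
joined.compile(optimizer=keras.optimizers.Adam(1e-4),
               loss=lambda y_true, y_pred: y_pred)   # minimize the distance itself

joined.fit(np.ones((1, 1)), np.zeros((1, 1)), epochs=500, verbose=0)
optimal_x = free_input.get_weights()[0].ravel()       # the optimized input vector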
