TensorFlow - How to minimize a function of one variable? - python

I've been given a fully trained model by another researcher that takes its inputs as placeholders. Regarding it as a function f(x), I would like to find the x that minimizes my distance metric (loss function) dist(x, f(x)), which could be something like the Euclidean distance between the two points.
I tried to use TensorFlow's built-in optimizers. The issue is that tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=[input_placeholder]) fails, complaining that input_placeholder isn't of a supported type. Thus, I cannot get gradients with respect to my input.
How can I optimize a function in TensorFlow when the inputs have to be specified this way? Unfortunately, these placeholders are not passed through a Variable first, and I have to treat the model as a black box.

Using the Keras functional API detailed in this question, I created a dense layer with no bias to sit right before the model I was given. Holding its input constant as an all-ones vector, I optimized the joined model using only the Variable in the dense layer, which gives me the optimal vector as the output of that layer.
All TensorFlow Optimizer subclasses allow you to minimize while only modifying a particular set of Variables, which I got out of Keras fairly simply.
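For illustration, here is a minimal TF 1.x sketch of that setup. black_box below is a stand-in for the given pre-trained model, and input_dim is made up; only the bias-free dense layer's kernel is handed to the optimizer:

import tensorflow as tf

input_dim = 10                                   # assumed dimensionality of x

# Stand-in for the researcher's model; in practice this is the given graph,
# treated as a black box whose weights must stay frozen.
black_box = tf.keras.layers.Dense(input_dim, trainable=False)

ones = tf.ones([1, input_dim])                   # constant all-ones input
dense = tf.keras.layers.Dense(input_dim, use_bias=False)
x = dense(ones)                                  # x is a function of the kernel only
f_x = black_box(x)                               # f(x)

loss = tf.reduce_sum(tf.square(x - f_x))         # dist(x, f(x)): squared Euclidean

# Minimize only over the dense layer's kernel, not the black box's weights.
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss,
                                                 var_list=dense.trainable_weights)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train_op)
    optimal_x = sess.run(x)                      # the recovered optimal input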

Related

Visualizing custom loss in double-head model

Using an A2C agent from this article, how do I get the numerical values of value_loss, policy_loss and entropy_loss while the weights are being updated?
The model I'm using is double-headed: both heads share the same trunk. The policy head's output shape is [number_of_actions, batch_size] and the value head's is [1, batch_size]. Compiling this model returns a size-incompatibility error when these loss functions are given as metrics:
self.model.compile(optimizer=self.optimizer,
                   metrics=[self._logits_loss, self._value_loss],
                   loss=[self._logits_loss, self._value_loss])
Both self._value_loss and self._policy_loss are executed as graphs, meaning that all variables inside them are only pointers to graph nodes. I found some examples where Tensor objects are evaluated (with eval()) to get the values out of those nodes, but I don't understand them: to eval() a Tensor you need to give it a Session, and in TensorFlow 2.x Sessions are deprecated.
Another lead: when calling train_on_batch() from the Model API in Keras, the method returns losses. I don't understand why, but the only losses it returns are from the policy head. Losses from that head are calculated as policy_loss - entropy_loss, but my goal is to get all three losses separately so I can visualize them in a graph.
Any help is welcome, I'm stuck.
I found the answer to my problem. In Keras, the built-in metrics functionality provides an interface for measuring the performance and losses of a model, whether it is a custom or a standard one.
When compiling a model as follows:
self.model.compile(optimizer=ko.RMSprop(lr=lr),
                   metrics=dict(output_1=self._entropy_loss),
                   loss=dict(output_1=self._logits_loss, output_2=self._value_loss))
... self.model.train_on_batch([...]) returns a list of [total_loss, logits_loss, value_loss, entropy_loss]. From these, policy_loss can be recovered as logits_loss + entropy_loss. Beware that this solution results in calling self._entropy_loss() twice.
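As a short sketch of how that list can be consumed (the input names obs, actions_and_advantages and returns are illustrative, not from the original code):

# train_on_batch returns the list described above; unpack it directly.
total_loss, logits_loss, value_loss, entropy_loss = self.model.train_on_batch(
    obs, dict(output_1=actions_and_advantages, output_2=returns))

# Recover the remaining loss; all three can now be collected for plotting.
policy_loss = logits_loss + entropy_loss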

Difference - tf.gradients vs tf.keras.backend.gradients

Being new to TensorFlow, I am trying to understand the difference between the underlying functionality of tf.gradients and tf.keras.backend.gradients.
The latter computes the gradient of the cost function with respect to the input feature values.
But I couldn't get a clear idea about the former: does it compute the gradient of the cost function or of the output probabilities? (For example, consider binary classification with a simple feed-forward network. The output probability here is the sigmoid activation of the final single-neuron layer; the cost is binary cross-entropy.)
I have read the official documentation for tf.gradients, but it is short and (for me) vague, and I did not get a clear picture: it refers to the tensor as just 'y' - is that the cost or the output probability?
Why do I need the gradients?
To implement basic gradient-based feature attribution.
They are basically the same. tf.keras is TensorFlow's high-level API for building and training deep learning models, used for fast prototyping, state-of-the-art research, and production; it uses TensorFlow as its backend. Looking at the tf.keras source code here, we can see that tf.keras.backend.gradients indeed calls tf.gradients:
# Part of Keras.backend.py
from tensorflow.python.ops import gradients as gradients_module

@keras_export('keras.backend.gradients')
def gradients(loss, variables):
    """Returns the gradients of `loss` w.r.t. `variables`.

    Arguments:
        loss: Scalar tensor to minimize.
        variables: List of variables.

    Returns:
        A gradients tensor.
    """
    # ========
    # Uses TensorFlow's gradient function
    # ========
    return gradients_module.gradients(
        loss, variables, colocate_gradients_with_ops=True)
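As for the attribution question: whether the gradient is taken of the cost or of the output probability is entirely up to you, because tf.gradients differentiates whatever tensor you pass as its first argument. A minimal TF 1.x saliency sketch, with all names and shapes illustrative:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])        # input features
logits = tf.layers.dense(x, 1)                   # simple feed-forward head
prob = tf.sigmoid(logits)                        # output probability

# Gradient of the output probability w.r.t. the input features; pass a cost
# tensor instead of prob to differentiate the cost.
saliency = tf.gradients(prob, x)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    grads = sess.run(saliency, feed_dict={x: np.random.rand(2, 4)})
    # The magnitudes of grads rank features by local influence on the output.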

reconstruct inputs from outputs in regression neural networks in tensorflow

Say we train a multilayer NN in TensorFlow for a regression task (i.e. a multi-input, multi-output case). We then receive new instances, apply the trained model, and of course get the corresponding outputs. Is there an easy/efficient way to backpropagate those outputs through the network and reconstruct the inputs in TensorFlow? The idea is to use the difference between the original and the reconstructed inputs of the new instances as a QC measure: if the reconstructed inputs are not close enough to the originals, we have a problem, etc. I hope I am making myself clear.
No, unfortunately you cannot take a trained model and recover its input from an output. The reason is that there are infinitely many possible inputs for each output.
Furthermore, backpropagation is not a matter of passing an output backwards through the network. It is the process of determining to what extent each parameter in the model contributes to the loss function. It will not give you the inputs to the hidden layers, only the extent to which the weights affected the model's decision.

No gradients provided for any variable - optimizer error

I'm computing the cost as follows:
# Compute the cost
cost = tf.reduce_mean(tf.square(y - out))
minimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
Upon running minimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost), I receive this error:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'parameters:0' shape=(15,) dtype=float32_ref>", "<tf.Variable 'weights:0' shape=(6,) dtype=float32_ref>"] and loss Tensor("Mean_1:0", dtype=float32).
Where is this going wrong, and why?
Short version: the error occurs because your model function does not use any TensorFlow variables.
Meaning: the only tf.Variable you define is w, and it is not used in the model function. Thus there are no weights in the model that TensorFlow can optimize with respect to the loss function. If you want TensorFlow to optimize the coefficients, use the variable w instead of the constant coefficients c in your model definition, and make sure they have the same size.
You are also using non-TensorFlow functions in your model definition, such as Python's list append instead of TensorFlow ops like tf.concat or tf.stack. This adds to the problem.
There are a number of further problems in your code. For example, you define the global variable initializer and the session twice.
I suspect the underlying problem is that you have not yet internalized the basic structure of the low-level TensorFlow API, specifically the concepts of graph and session. You first define a graph containing the complete definition of your model, using only TensorFlow functions. Only afterwards do you start a session in which you initialize the weights and train the model.
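To make this concrete, here is a minimal sketch of that structure (TF 1.x, with illustrative shapes). The point is that w is actually used to compute out, so gradients can flow from the cost back to it:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 6])
y = tf.placeholder(tf.float32, [None])

w = tf.Variable(tf.zeros([6]), name='weights')   # used below, not a constant
out = tf.tensordot(x, w, axes=1)                 # model built from TF ops only

cost = tf.reduce_mean(tf.square(y - out))
minimizer = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

with tf.Session() as sess:                       # one session, one initializer
    sess.run(tf.global_variables_initializer())
    sess.run(minimizer, feed_dict={x: np.ones((4, 6)), y: np.zeros(4)})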

How do I implement the Triplet Loss in Keras?

I'm trying to implement Google's FaceNet paper.
First of all, is it possible to implement this paper using the Sequential API of Keras, or should I go for the Graph (functional) API?
In either case, could you please tell me how to pass the custom loss function tripletLoss to model.compile, and how to receive the anchor, positive, and negative embeddings as parameters to calculate the loss?
Also, what should the second parameter y of model.fit() be? I do not have any labels in this case...
This issue explains how to create a custom objective (loss) in Keras:
def dummy_objective(y_true, y_pred):
    return 0.5  # your implementation of tripletLoss here

model.compile(loss=dummy_objective, optimizer='adadelta')
Regarding the y parameter of .fit(): since you are the one handling it in the end (the y_true parameter of the objective function is taken from it), I would say you can pass whatever you need that fits through Keras's plumbing, or even a dummy vector to pass the dimension checks if you really don't need any supervision.
Finally, as to how to implement this particular paper: searching for triplet or facenet in the Keras docs returns nothing, so you'll probably have to either implement it yourself or find someone who has.
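That said, here is a hedged sketch of one common pattern: a shared embedding network applied to anchor/positive/negative inputs, whose concatenated output is split apart again inside the custom loss. All sizes and names are illustrative:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, backend as K, layers

emb_dim, margin = 128, 0.2

def triplet_loss(y_true, y_pred):                # y_true is a dummy
    a = y_pred[:, :emb_dim]                      # anchor embedding
    p = y_pred[:, emb_dim:2 * emb_dim]           # positive embedding
    n = y_pred[:, 2 * emb_dim:]                  # negative embedding
    pos = K.sum(K.square(a - p), axis=-1)
    neg = K.sum(K.square(a - n), axis=-1)
    return K.maximum(pos - neg + margin, 0.0)

base = tf.keras.Sequential([layers.Dense(emb_dim, input_shape=(256,))])

anchor, positive, negative = (layers.Input((256,)) for _ in range(3))
merged = layers.concatenate([base(anchor), base(positive), base(negative)])
model = Model([anchor, positive, negative], merged)
model.compile(loss=triplet_loss, optimizer='adadelta')

# y only has to pass the shape checks, so a zero array matching the
# output's shape works as the dummy target.
a_data, p_data, n_data = (np.random.rand(32, 256) for _ in range(3))
model.fit([a_data, p_data, n_data], np.zeros((32, 3 * emb_dim)), epochs=1)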
