Custom loss in Tensorflow 2.0.0 - python

I want to implement a custom loss that is calculated per sample.
Calculating the loss is a little complicated and requires me to use an external Python file (or one can assume the inputs are passed to a function).
How can I implement this?
Is it possible to use the @tf.function annotation and turn it into a graph?
This is how it is supposed to look:
import numpy as np

def loss(inputs, outputs):
    total = 0.0
    for x, y in zip(inputs, outputs):
        sim = Class(x)  # helper class from the external Python file
        a = sim.GetA()
        b = sim.GetB()
        total = total + np.linalg.norm(np.dot(a, b) + y)
    return total

Implementing the same thing in PyTorch was possible because it supports dynamic computation graphs.
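One possible direction (a sketch, not a verified solution): since the per-sample computation lives in plain Python/NumPy, it could be embedded with tf.py_function, which runs eager Python inside the graph. The catch is that gradients cannot flow through the NumPy calls, so this only helps if the loss does not need to be differentiated through Class, GetA and GetB (all names assumed from the snippet above):
import numpy as np
import tensorflow as tf

def numpy_loss(y_pred, y_true):
    # plain Python/NumPy, free to call into the external file;
    # TensorFlow cannot differentiate through this body
    total = 0.0
    for x, y in zip(y_pred.numpy(), y_true.numpy()):
        sim = Class(x)  # `Class` is the external helper (assumed)
        total += np.linalg.norm(np.dot(sim.GetA(), sim.GetB()) + y)
    return np.float32(total)

def custom_loss(y_true, y_pred):
    # tf.py_function wraps the eager Python call as a graph op
    return tf.py_function(numpy_loss, [y_pred, y_true], tf.float32)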

Related

Can I define two different call functions in Keras, one for a forward map and another for an inverse map?

I'm trying to implement an INN (invertible neural network) with the structure described in this paper.
I was wondering if it is possible to create a block (as proposed in the paper) as a custom Keras layer with two different call functions.
The basic structure would look as follows:
import tensorflow as tf
import tensorflow.keras.layers as layers

class INNBlock(tf.keras.Model):
    # inheriting from Model instead of keras.layers.Layer, because I want to
    # manage the underlying layers as well
    def __init__(self, size):
        super(INNBlock, self).__init__(name='innblock')
        # define layers
        self.denseL1 = layers.Dense(size, activation='relu')

    def call(self, inputs):
        # define the relationship between the layers for a forward call
        out = self.denseL1(inputs)
        return out

    def inverse_call(self, inputs):
        # define the inverse relationship between the layers
        out = -self.denseL1(inputs)  # use the same weights as the forward call
        return out
class INN(tf.keras.Model):
    def __init__(self, size):
        super(INN, self).__init__()
        self.block_1 = INNBlock(size)
        self.block_2 = INNBlock(size)

    def call(self, inputs):
        # forward map through both blocks
        y = self.block_1(inputs)
        y = self.block_2(y)
        # inverse map back through both blocks, reusing the same weights
        x = self.block_2.inverse_call(y)
        x = self.block_1.inverse_call(x)
        return (y, x)
Solutions I already thought of (but don't particularly like):
Creating new layers for the inverse call and giving them the same weights as the layers in the forward call.
Adding another dimension to the inputs with a variable that determines whether the forward or the inverse call is to be executed (but I don't know if Keras would even allow this).
I hope someone knows whether there is a way to implement this.
Thank you in advance :)
There is nothing wrong with your code. You can try it and it will run normally.
The call method is the standard method for when you simply do model_instance(input_tensor) or layer_instance(input_tensor).
But there is nothing wrong if you define another method and use that method inside the model's call method. What will happen is just:
If you use the_block(input_tensor), it will use the_block.call(input_tensor).
If you use the_block.inverse_call(input_tensor) somewhere outside a layer/model, it will fail to build a Keras model (nothing can be outside a layer)
If you use the_block.inverse_call(input_tensor) inside a layer/model (that's what you're doing), it is exactly the same as just writing the operations directly. You just wrapped it inside another function.
For Keras/Tensorflow, there will be nothing special about inverse_call. You can use it anywhere you could use any other keras/tensorflow function.
Will the gradients be updated twice?
Not exactly twice, but the operations will certainly be counted in. When the system calculates the gradient of the loss with respect to the weights, if the loss was built with inverse_call along the way, then those operations will participate in the gradient calculation.
But the update will be once per batch, as usual.
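To make the weight sharing concrete, here is a quick check (assuming the INNBlock class from the question is in scope; the shapes are arbitrary):
block = INNBlock(4)
x = tf.random.normal((2, 4))
out = block(x)               # goes through call()
inv = block.inverse_call(x)  # reuses the very same denseL1 weights
print(len(block.trainable_weights))  # 2: one kernel and one bias, shared by both paths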

How to create a variable that persists over tf.estimator.train_and_evaluate evaluation steps?

TLDR: How to create a variable that holds the confusion matrix used for computing custom metrics, accumulating the values across all of the evaluation steps?
I have implemented custom metrics to use in the tf.estimator.train_and_evaluate pipeline, with a confusion matrix as the crux for them all. My goal is to make this confusion matrix persist over multiple evaluation steps in order to track the learning progress.
Using get_variable in the variable scope did not work, since it does not save the variable to the checkpoint (or so it seems).
This does not work:
@property
def confusion_matrix(self):
    with tf.variable_scope(
        f"{self.name}-{self.metric_type}", reuse=tf.AUTO_REUSE
    ):
        confusion_matrix = tf.get_variable(
            name="confusion-matrix",
            initializer=tf.zeros(
                [self.n_classes, self.n_classes],
                dtype=tf.float32,
                name=f"{self.name}/{self.metric_type}-confusion-matrix",
            ),
            aggregation=tf.VariableAggregation.SUM,
        )
    return confusion_matrix
Just saving the matrix as a class attribute works, but it obviously does not persist over multiple steps:
self.confusion_matrix = tf.zeros(
    [self.n_classes, self.n_classes],
    dtype=tf.float32,
    name=f"{self.name}/{self.metric_type}-confusion-matrix",
)
You can look at the full example here.
I expect to have this confusion matrix persist from start to finish during evaluation, but I do not need to have it in the final SavedModel. Could you please tell me how I can achieve this? Do I need to save the matrix to an external file, or is there a better way?
You can define a custom metric:
def confusion_matrix(labels, predictions):
    matrix = ...  # confusion matrix calculation
    mean, update_op = tf.metrics.mean_tensor(matrix)
    # do something with mean if needed
    return {'confusion_matrix': (mean, update_op)}
and then add it to your estimator:
estimator = tf.estimator.add_metrics(estimator, confusion_matrix)
If you need a sum instead of a mean, you can take insight from the tf.metrics.mean_tensor implementation.
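For reference, a minimal sketch of what the elided matrix calculation could look like, assuming integer labels and an estimator that emits class ids under the 'class_ids' key (both assumptions; N_CLASSES is a hypothetical constant):
def confusion_matrix(labels, predictions):
    class_ids = predictions['class_ids']  # assumed key; adapt to your head
    matrix = tf.math.confusion_matrix(
        tf.reshape(labels, [-1]),
        tf.reshape(class_ids, [-1]),
        num_classes=N_CLASSES,
        dtype=tf.float32,
    )
    # mean_tensor keeps a running accumulator across all evaluation steps
    mean, update_op = tf.metrics.mean_tensor(matrix)
    return {'confusion_matrix': (mean, update_op)}

estimator = tf.estimator.add_metrics(estimator, confusion_matrix)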

Using tf.contrib.opt.ScipyOptimizerInterface with tf.keras.layers, loss not changing

I want to use the external optimizer interface within TensorFlow to use Newton optimizers, as tf.train only has first-order gradient descent optimizers. At the same time, I want to build my network using tf.keras.layers, as it is way easier than using tf.Variables when building large, complex networks. I will illustrate my issue with the following simple 1D linear regression example:
import tensorflow as tf
from tensorflow.keras import backend as K
import numpy as np

# generate data
no = 100
data_x = np.linspace(0, 1, no)
data_y = 2 * data_x + 2 + np.random.uniform(-0.5, 0.5, no)
data_y = data_y.reshape(no, 1)
data_x = data_x.reshape(no, 1)

# make model using Keras layers and train
x = tf.placeholder(dtype=tf.float32, shape=[None, 1])
y = tf.placeholder(dtype=tf.float32, shape=[None, 1])
output = tf.keras.layers.Dense(1, activation=None)(x)
loss = tf.losses.mean_squared_error(data_y, output)
optimizer = tf.contrib.opt.ScipyOptimizerInterface(loss, method="L-BFGS-B")

sess = K.get_session()
sess.run(tf.global_variables_initializer())
tf_dict = {x: data_x, y: data_y}
optimizer.minimize(sess, feed_dict=tf_dict, fetches=[loss],
                   loss_callback=lambda l: print("Loss:", l))
When running this, the loss just does not change at all. When using any other optimizer from tf.train, it works fine. Also, when using tf.layers.Dense() instead of tf.keras.layers.Dense(), it does work with the ScipyOptimizerInterface. So really the question is: what is the difference between tf.keras.layers.Dense() and tf.layers.Dense()? I saw that the variables created by tf.layers.Dense() are of type tf.float32_ref, while the variables created by tf.keras.layers.Dense() are of type tf.float32. As far as I know, _ref indicates that the tensor is mutable. So maybe that's the issue? But then again, any other optimizer from tf.train works fine with Keras layers.
Thanks
After a lot of digging I was able to find a possible explanation.
ScipyOptimizerInterface uses feed_dicts to simulate the updates of your variables during the optimization process. It only does an assign operation at the very end. In contrast, tf.train optimizers always do assign operations. The code of ScipyOptimizerInterface is not that complex, so you can verify this easily.
Now the problem is that assigning variables with feed_dict works mostly by accident. Here is a link where I learnt about this. In other words, assigning variables via feed dict, which is what ScipyOptimizerInterface does, is a hacky way of doing updates.
Now this hack mostly works, except when it does not. tf.keras.layers.Dense uses ResourceVariables to model the weights of the model. This is an improved version of simple Variables with cleaner read/write semantics. The problem is that under the new semantics, the feed dict update happens after the loss calculation. The link above gives some explanations.
Now tf.layers is currently a thin wrapper around tf.keras.layers, so I am not sure why it would work. Maybe there is some compatibility check somewhere in the code.
The solutions to address this are somewhat simple:
Either avoid using components that use ResourceVariables. This can be kind of difficult.
Or patch ScipyOptimizerInterface to always do assignments for variables. This is relatively easy, since all the required code is in one file.
There was some effort to make the interface work with eager mode (which by default uses ResourceVariables). Check out this link.
I think the problem is with the line
output = tf.keras.layers.Dense(1, activation=None)(x)
In this format, output is not a layer but rather the output of a layer, which might be preventing the wrapper from collecting the weights and biases of the layer and feeding them to the optimizer. Try writing it in two lines, e.g.:
output = tf.keras.layers.Dense(1, activation=None)
res = output(x)
If you want to keep the original format, then you might have to manually collect all trainables and feed them to the optimizer via the var_list option:
optimizer = tf.contrib.opt.ScipyOptimizerInterface(loss, var_list = [Trainables], method="L-BFGS-B")
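For example, reusing x and data_y from the question's snippet, the var_list could be filled in like this (a sketch; a Keras layer exposes its kernel and bias via trainable_weights):
dense = tf.keras.layers.Dense(1, activation=None)
output = dense(x)
loss = tf.losses.mean_squared_error(data_y, output)
optimizer = tf.contrib.opt.ScipyOptimizerInterface(
    loss, var_list=dense.trainable_weights, method="L-BFGS-B")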
Hope this helps.

What is the internal mechanism when a Keras custom loss function access a global variable of python?

Does a Keras custom loss function accept a global Python variable?
I am building my own Keras custom loss function, which only accepts y_true and y_pred as arguments. But the loss function is quite complex, and it depends on other variables. Currently in my implementation, the loss function just directly uses global variables from the same Python script. After training the model, if I want to use the model for prediction, those global variables in the Python environment will have changed. My question is: do I need to compile the model again to guarantee that the model uses the latest version of those external global variables?
Rlist = ....

def custom_loss(y_true, y_pred):
    z = 0.0
    # Rlist is the global variable
    for j in Rlist:
        z = z + K.log(K.sum(K.exp(K.gather(y_pred, j[0])))) \
              - K.log(K.sum(K.exp(K.gather(y_pred, j))))
    z = -z
    return z

# below, build the model and compile it with loss=custom_loss
model = ...
model.compile(loss=custom_loss, ...)
model.fit(x=train_x, y=train_y, ...)

# Rlist = ... update Rlist, which is adapted to the test dataset
# Do I need to recompile in the code below, or is Rlist updated
# in custom_loss when it is called?
model.predict(x=test_x, ...)
In my loss function (actually the loss function for a Cox proportional hazards model), the loss is not additive over the per-sample loss values.
Rlist is a global variable in the Python environment of my Keras code.
My question is: after training the model, if I change this Rlist for the test dataset, will Keras automatically pick up the new Rlist, or will it use the old version of the variable from when it compiled and built the computation graph?
Is there an explanation of what happens when TensorFlow builds its computation graph if I directly refer to a global Python variable in the loss function?
I know it's not good practice to use global variables. Better suggestions are also welcome.
What exactly do you mean by "python environment of my Keras code"? If you set the Rlist variable in your code to [1,2,3] while training, and then change it to [3,2,1] in prediction/production mode, your custom loss will see the [3,2,1] variable.
I'm not sure what you are trying to achieve; I suppose one of these could work:
A) Create a real environment variable with RList.
B) Create a JSON file with your RList (that way, you'll be able to use your RList data in production mode on a server or in the cloud).
C) Create a dict in your code like:
RList = {
    'train': [1, 2, 3],
    'test': [3, 2, 1],
    'production': [4, 5, 6],
}
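A common alternative to the global (a sketch, beyond what the answer above proposes): close over the list in a loss factory and recompile whenever it changes, so there is no ambiguity about which list the compiled loss graph uses:
def make_custom_loss(rlist):
    # rlist is captured in the closure; rebuild the loss and recompile
    # whenever the list changes
    def custom_loss(y_true, y_pred):
        z = 0.0
        for j in rlist:
            z = z + K.log(K.sum(K.exp(K.gather(y_pred, j[0])))) \
                  - K.log(K.sum(K.exp(K.gather(y_pred, j))))
        return -z
    return custom_loss

model.compile(loss=make_custom_loss(RList['train']), ...)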

Tensorflow placeholder in Keras custom objective function

I need to implement a custom objective function for Keras, where I need an additional TensorFlow placeholder for the computation. In TensorFlow, I have it as follows:
pre_cost1 = tf.multiply((self.input_R - self.Decoder) , self.input_mask_R)
cost1 = tf.square(self.l2_norm(pre_cost1))
where input_mask_R is the TensorFlow placeholder. input_R and Decoder are the placeholders corresponding to y_true and y_pred of the Keras loss function, respectively. I have the Keras loss function implemented as:
def custom_objective(y_true, y_pred):
    # input_mask_R is missing here, so tf.multiply has only one argument;
    # this is exactly the extra input I don't know how to pass in
    pre_cost1 = tf.multiply((y_true - y_pred))
    cost1 = tf.square(l2_norm(pre_cost1))
    return cost1
I need to add the additional information for the input mask to the loss function for Keras. (It needs to be a TensorFlow placeholder, since it is a mask for the input that is different for each row of the input data.)
Use the Keras backend:
import keras.backend as K
Most functions for tensors are there, such as:
input_mask_R = K.placeholder(shape=(yourshape))
But maybe, since you want a predefined mask, what you need is:
input_mask_R = K.constant(arrayWithValues, shape=(yourshape))
And you can actually multiply and square with K.multiply and K.square as well. That way, if you ever think of changing the backend, everything will still work. (Also, I'm not sure Keras will handle direct calls to TensorFlow functions.)
See the documentation: https://keras.io/backend/
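Putting the pieces together, a minimal sketch (assumptions: mask_array holds the per-row mask values, and the squared L2 norm from the question is written out as a sum of squares):
import keras.backend as K

input_mask_R = K.constant(mask_array)  # mask_array is hypothetical

def custom_objective(y_true, y_pred):
    pre_cost1 = (y_true - y_pred) * input_mask_R  # elementwise masking
    # the square of the L2 norm equals the sum of squared entries
    cost1 = K.sum(K.square(pre_cost1))
    return cost1

model.compile(loss=custom_objective, optimizer='adam')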
