I have a "how can I do that" question about Keras:
Assume I have a first neural network, NNa, which takes 4 inputs (x, y, z, t) and is already trained.
I also have a second neural network, NNb, whose loss function depends on the first network.
The custom loss function of NNb, customLossNNb, calls NNa's prediction on a fixed grid (x, y, z) and only modifies the last variable, t.
Here, in Python-like pseudocode, is what I would like to do to train the second network, NNb:
grid=np.mgrid[0:10:1,0:10:1,0:10:1].reshape(3,-1).T
Y[:,0]=time
Y[:,1]=something
def customLossNNb(NNa,grid):
    def diff(y_true,y_pred):
        for ii in range(y_true.shape[0]):
            currentInput=concatenation of grid and y_true[ii,0]
            toto[ii,:]=NNa.predict(currentInput)
        #some stuff with toto
        return #...
    return diff
Then
NNb.compile(loss=customLossNNb(NNa,K.variable(grid)),optimizer='Adam')
NNb.fit(input,Y)
In fact, the line that causes me trouble is currentInput=concatenation of grid and y_true[ii,0].
I tried passing the grid to customLossNNb as a tensor with K.variable(grid). But I can't define a new tensor inside the loss function, something like currentY with shape (grid.shape[0],1), filled with y_true[ii,0] (i.e. the current t), and then concatenate grid and currentY to build currentInput.
Any ideas?
Thanks
You can include your custom loss function in the graph using the functional API of Keras. In that case the model can be used as a function, something like this:
for l in NNa.layers:
    l.trainable=False
x=Input(size)
y=NNb(x)
z=NNa(y)
The predict method will not work here, since the loss function has to be part of the graph and predict returns an np.array.
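For the specific line that was causing trouble (building grid + t inside the loss), here is a minimal sketch using tensor operations instead of predict. It assumes a tiny one-layer stand-in for the pretrained NNa and a small 3x3x3 grid; the helper name nna_on_grid is made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the pretrained NNa: 4 inputs (x, y, z, t) -> 1 output.
inp = tf.keras.Input(shape=(4,))
NNa = tf.keras.Model(inp, tf.keras.layers.Dense(1)(inp))
NNa.trainable = False  # NNa is only evaluated, never updated

# Fixed (x, y, z) grid; kept small so the sketch runs quickly.
grid = np.mgrid[0:3:1, 0:3:1, 0:3:1].reshape(3, -1).T.astype("float32")
grid_t = tf.constant(grid)  # shape (27, 3)

def nna_on_grid(t):
    """Evaluate NNa on the whole grid for a single scalar tensor t."""
    t_col = tf.fill([grid.shape[0], 1], t)              # (27, 1), filled with t
    current_input = tf.concat([grid_t, t_col], axis=1)  # (27, 4)
    return NNa(current_input)  # calling the model keeps everything as tensors

out = nna_on_grid(tf.constant(0.5))
print(out.shape)
```

Because every step is a tensor op, gradients can flow from the result back to t, which is what a loss function needs.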
First, make NNa untrainable. Notice that you should do this recursively if your model has inner models.
def makeUntrainable(layer):
    layer.trainable = False
    if hasattr(layer, 'layers'):
        for l in layer.layers:
            makeUntrainable(l)
makeUntrainable(NNa)
Then you have two options:
1. Attach NNa to the end of your model (notice that both y_true and y_pred will change). Then change your targets (predict them with NNa) to get correct results, since the combined model now produces the output of NNa, not of NNb.
2. Create a custom loss function that uses NNa inside it, without changing your targets.
Option 1 - Attaching models
inputs = NNb.inputs
outputs = NNa(NNb.outputs) #make sure NNb is outputting 4 tensors to match NNa inputs
fullModel = Model(inputs,outputs)
#changing the targets:
newY_train = NNa.predict(oldY_train)
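A runnable sketch of Option 1, under the assumption that both networks are tiny one-layer placeholders (the dense_model helper and the random data are made up for illustration). It also checks that NNa's weights stay untouched after a training step.

```python
import numpy as np
import tensorflow as tf

def dense_model(n_in, n_out):
    """Tiny one-layer model used as a placeholder."""
    inp = tf.keras.Input(shape=(n_in,))
    return tf.keras.Model(inp, tf.keras.layers.Dense(n_out)(inp))

NNa = dense_model(4, 2)  # stand-in for the pretrained network
NNb = dense_model(3, 4)  # emits 4 values to match NNa's input
NNa.trainable = False    # freeze before compiling

full_model = tf.keras.Model(NNb.input, NNa(NNb.output))
full_model.compile(optimizer="adam", loss="mse")

x = np.random.rand(8, 3).astype("float32")
old_y = np.random.rand(8, 4).astype("float32")  # targets in NNb's output space
new_y = NNa.predict(old_y, verbose=0)           # targets in NNa's output space

weights_before = [w.numpy().copy() for w in NNa.weights]
full_model.fit(x, new_y, epochs=1, verbose=0)
frozen = all(np.allclose(b, w.numpy())
             for b, w in zip(weights_before, NNa.weights))
print("NNa unchanged:", frozen)
```

After training, NNb can still be used on its own, since full_model shares NNb's layers.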
Option 2 - Creating a custom loss
Warning: please test whether NNa's weights are really frozen while training this configuration
from keras.losses import binary_crossentropy
def customLoss(true,pred):
    true = NNa(true)
    pred = NNa(pred)
    #use some of the usual losses or create your own
    return binary_crossentropy(true,pred)
NNb.compile(optimizer=anything, loss = customLoss)
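To act on the warning above, here is a small sketch of Option 2 that trains for one epoch and compares NNa's weights before and after. The one-layer models, the random data and the squared-error loss are placeholders, not the original setup.

```python
import numpy as np
import tensorflow as tf

def dense_model(n_in, n_out):
    """Tiny one-layer model used as a placeholder."""
    inp = tf.keras.Input(shape=(n_in,))
    return tf.keras.Model(inp, tf.keras.layers.Dense(n_out)(inp))

NNa = dense_model(4, 2)  # stand-in for the pretrained network
NNb = dense_model(3, 4)  # the network being trained
NNa.trainable = False

def custom_loss(y_true, y_pred):
    """Compare targets and predictions after mapping both through NNa."""
    diff = NNa(y_true) - NNa(y_pred)
    return tf.reduce_mean(tf.square(diff), axis=-1)  # one value per sample

NNb.compile(optimizer="adam", loss=custom_loss)

x = np.random.rand(8, 3).astype("float32")
y = np.random.rand(8, 4).astype("float32")  # targets stay in NNb's output space

weights_before = [w.numpy().copy() for w in NNa.weights]
NNb.fit(x, y, epochs=1, verbose=0)
frozen = all(np.allclose(b, w.numpy())
             for b, w in zip(weights_before, NNa.weights))
print("NNa unchanged:", frozen)
```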
Is there a way to pass a feature to a keras model as an input only to be accessed by a custom loss function without affecting the model as an input feature? I only need the feature to calculate the loss, not to feed-forward through the hidden layers in the network. (Basically what I want is to feed the feature in as an input and extract it as it is as an output along with y_pred to be accessed in the loss function).
A worked example would be much appreciated.
If you are writing your own custom loss, you can pass the feature as an input and then, using Lambda layers, make it bypass the network and concatenate it directly onto the output. Something like the following:
from tensorflow.keras import layers, Model, utils
inp = layers.Input((11,))
x = layers.Lambda(lambda x: x[:,:-1])(inp)
o2 = layers.Lambda(lambda x: x[:,-1:])(inp)
x = layers.Dense(20)(x)
x = layers.Dense(20)(x)
o1 = layers.Dense(1)(x)
out = layers.concatenate([o1, o2])
model = Model(inp, out)
def custom_loss(y_true, y_pred):  # Keras calls the loss as loss(y_true, y_pred)
    ...
utils.plot_model(model, show_shapes=True, show_layer_names=False)
Here the first 10 features are the ones you want to pass via the network, and the last feature is the one you just want as is, for the custom loss. The final output is going to just be a concatenation of your expected output for the first 10 features via the network + the untouched feature.
If you want to know how to write a custom loss, please check this excellent SO post that explains it.
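As a rough sketch of what the loss body could look like for the concatenated output above: slice y_pred back into the prediction (first column) and the passed-through feature (second column). Using the feature as a per-sample weight on the squared error is only an assumption for illustration.

```python
import tensorflow as tf

def custom_loss(y_true, y_pred):
    pred = y_pred[:, :1]     # network prediction (o1)
    feature = y_pred[:, 1:]  # untouched input feature (o2)
    # Hypothetical use of the feature: a per-sample weight on the squared error.
    return tf.reduce_mean(feature * tf.square(y_true - pred), axis=-1)

y_true = tf.constant([[1.0], [2.0]])
y_pred = tf.constant([[0.0, 2.0], [2.0, 5.0]])
loss_values = custom_loss(y_true, y_pred)
print(loss_values.numpy())  # per-sample losses: [2. 0.]
```

The targets keep their original shape (one column); only y_pred carries the extra column.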
I want to create a custom loss which gets the output of the net and multiple arguments from a data generator.
I found this article, which describes how to calculate one loss from multiple layers with one label. But I want to calculate the loss from a single layer with multiple labels using the fit_generator. My problem is that Keras expects the output and the label to be of the same shape.
example:
Regular custom loss:
def custom_loss(y_pred, y_label):
    return K.mean(y_pred - y_label)
An example for the type of custom loss I want to use:
def custom_loss(y_pred, y_label, y_weights):
    loss = K.mean(y_pred - y_label)
    return tf.compat.v1.losses.compute_weighted_loss(loss, y_weights)
This is just an example; my original code is a little more complicated. I just want to be able to give the loss function two parameters (y_label and y_weights) instead of only one (y_label).
Does anyone know how to solve this problem?
I am not sure what exactly you are asking but maybe you can use this. You can try something like a custom function that returns a loss function.
def custom_loss(y_weights):
    # Create a loss function that calculates what you want
    def example_loss(y_true,y_pred):
        loss = K.mean(y_pred - y_true)
        return tf.compat.v1.losses.compute_weighted_loss(loss, y_weights)
    # Return a function
    return example_loss
# Compile the model
model.compile(optimizer='adam',
              loss=custom_loss(y_weights), # Call the loss function with the preferred weights
              metrics=['accuracy'])
You can also take a look at this question
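Here is a self-contained variant of the closure idea that can be run and checked by hand. Note that it swaps in a weighted absolute error with plain TensorFlow ops rather than compute_weighted_loss, and that the weights are fixed when the loss is created; both are choices made for this sketch.

```python
import tensorflow as tf

def custom_loss(y_weights):
    # The weights are captured once, when the loss function is built.
    w = tf.constant(y_weights, dtype=tf.float32)

    def example_loss(y_true, y_pred):
        # Weighted absolute error, averaged over the last axis.
        return tf.reduce_mean(w * tf.abs(y_pred - y_true), axis=-1)

    return example_loss

loss_fn = custom_loss([1.0, 2.0])
value = loss_fn(tf.constant([[0.0, 0.0]]), tf.constant([[1.0, 1.0]]))
print(value.numpy())  # [1.5]
```

If the weights differ per sample and come from the generator, a closure like this will not see them. Keras's fit accepts a sample_weight argument, and a generator may yield (inputs, targets, sample_weights) tuples, which may be a better fit for that case.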
I have a CNN that I built with TensorFlow 2.0. I need to access the outputs of its intermediate layers. I went over other similar Stack Overflow questions, but all of them had solutions involving the Keras Sequential model.
I have tried using model.layers[index].output but I get
Layer conv2d has no inbound nodes.
I can post my code here (it is very long), but I am sure that even without it someone can point out to me how this can be done using just TensorFlow 2.0 in eager mode.
I stumbled onto this question while looking for an answer and it took me some time to figure out as I use the model subclassing API in TF 2.0 by default (as in here https://www.tensorflow.org/tutorials/quickstart/advanced).
If somebody is in a similar situation, all you need to do is assign the intermediate output you want as an attribute of the class. Then keep test_step without the @tf.function decorator and create a decorated copy, say val_step, for efficient internal computation of validation performance during training. As a short example, I have modified a few functions of the tutorial from the link accordingly. I'm assuming we need to access the output after flattening.
def call(self, x):
    x = self.conv1(x)
    x = self.flatten(x)
    self.intermediate=x #assign it as an object attribute for accessing later
    x = self.d1(x)
    return self.d2(x)
#Remove @tf.function decorator from test_step for prediction
def test_step(images, labels):
    predictions = model(images, training=False)
    t_loss = loss_object(labels, predictions)
    test_loss(t_loss)
    test_accuracy(labels, predictions)
    return
#Create a decorated val_step for object's internal use during training
@tf.function
def val_step(images, labels):
    return test_step(images, labels)
Now when you run model.predict() after training, using the un-decorated test_step, you can access the intermediate output through model.intermediate, which is an EagerTensor whose value is obtained simply with model.intermediate.numpy(). However, if you don't remove the @tf.function decorator from test_step, this returns a symbolic Tensor whose value is not so straightforward to obtain.
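A stripped-down sketch of the attribute trick, assuming a small dense model instead of the tutorial's CNN (the class name MyModel matches the tutorial, the layer sizes and random input are made up). The model is called directly in eager mode, so the stored tensor can be converted with .numpy().

```python
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.d1 = tf.keras.layers.Dense(8, activation="relu")
        self.d2 = tf.keras.layers.Dense(3)

    def call(self, x):
        x = self.flatten(x)
        self.intermediate = x  # keep the flattened activations for later
        x = self.d1(x)
        return self.d2(x)

model = MyModel()
images = tf.random.uniform((2, 4, 4, 1))  # two fake 4x4 single-channel images
predictions = model(images)               # eager call, no tf.function involved
features = model.intermediate.numpy()
print(features.shape)  # (2, 16)
```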
Thanks for answering my earlier question. I wrote this simple example to illustrate how what you're trying to do might be done in TensorFlow 2.x, using the MNIST dataset as the example problem.
The gist of the approach:
Build an auxiliary model (aux_model in the example below), which is a so-called "functional model" with multiple outputs. The first output is the output of the original model and will be used for loss calculation and backprop, while the remaining output(s) are the intermediate-layer outputs that you want to access.
Use tf.GradientTape() to write a custom training loop and expose the detailed gradient values on each individual variable of the model. Then you can pick out the gradients that are of interest to you. This requires that you know the ordering of the model's variables. But that should be relatively easy for a sequential model.
import tensorflow as tf
(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
# This is the original model.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28, 1]),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")])
# Make an auxiliary model that exposes the output from the intermediate layer
# of interest, which is the first Dense layer in this case.
aux_model = tf.keras.Model(inputs=model.inputs,
                           outputs=model.outputs + [model.layers[1].output])
# Define a custom training loop using `tf.GradientTape()`, to make it easier
# to access gradients on specific variables (the kernel and bias of the first
# Dense layer in this case).
cce = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.optimizers.Adam()
with tf.GradientTape() as tape:
    # Do a forward pass on the model, retrieving the intermediate layer's output.
    y_pred, intermediate_output = aux_model(x_train)
    print(intermediate_output)  # Now you can access the intermediate layer's output.
    # Compute loss, to enable backprop.
    loss = cce(tf.one_hot(y_train, 10), y_pred)
# Do backprop. `gradients` here are for all variables of the model.
# But we know we want the gradients on the kernel and bias of the first
# Dense layer, which happens to be the first two variables of the model.
gradients = tape.gradient(loss, aux_model.variables)
# This is the gradient on the first Dense layer's kernel.
intermediate_layer_kernel_gradients = gradients[0]
print(intermediate_layer_kernel_gradients)
# This is the gradient on the first Dense layer's bias.
intermediate_layer_bias_gradients = gradients[1]
print(intermediate_layer_bias_gradients)
# Update the variables of the model.
optimizer.apply_gradients(zip(gradients, aux_model.variables))
The most straightforward solution would go like this:
mid_layer = model.get_layer("layer_name")
A layer has no predict() method of its own, so wrap it in a model that maps the original input to that layer's output, for instance:
mid_model = tf.keras.Model(inputs=model.input, outputs=mid_layer.output)
mid_model.predict(X)
Also, to get the name of a hidden layer, you can use this:
model.summary()
This will give you some insight into each layer's input/output as well.
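A self-contained sketch of reading a named hidden layer's output, assuming a small functional model whose hidden layer is called "hidden" (the model, the layer name and the random input are all placeholders). A layer object itself has no predict method, so the sketch builds a second model that ends at that layer.

```python
import numpy as np
import tensorflow as tf

inp = tf.keras.Input(shape=(5,))
hidden = tf.keras.layers.Dense(3, activation="relu", name="hidden")(inp)
out = tf.keras.layers.Dense(1)(hidden)
model = tf.keras.Model(inp, out)

# Look the layer up by name, then build a model that stops at its output.
mid_layer = model.get_layer("hidden")
mid_model = tf.keras.Model(inputs=model.input, outputs=mid_layer.output)

X = np.random.rand(4, 5).astype("float32")
features = mid_model.predict(X, verbose=0)
print(features.shape)  # (4, 3)
```

mid_model shares its weights with model, so it reflects whatever training model has gone through.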
I want to develop a neural network with three inputs (pos, anc, neg) and three outputs (pos_out, anc_out, neg_out). While calculating the loss in my custom loss function in Keras, I want to access pos_out, anc_out and neg_out in y_pred. I can access y_pred as a whole, but how do I access the individual parts pos_out, anc_out and neg_out?
I have applied the max function to y_pred, and it calculates the max value correctly. If I pass only one output to Model, as in Model(input=[pos,anc,neg], output=pos_out), it also calculates the max value correctly. But accessing the max values from pos_out, anc_out and neg_out separately in the custom loss function does not work.
def testmodel(input_shape):
    pos = Input(shape=(14,300))
    anc = Input(shape=(14,300))
    neg = Input(shape=(14,300))
    model = Sequential()
    model.add(Flatten(batch_input_shape=(1,14,300)))
    pos_out = model(pos)
    anc_out = model(anc)
    neg_out = model(neg)
    model = Model(input=[pos,anc,neg], output=[pos_out,anc_out,neg_out])
    return model
def customloss(y_true,y_pred):
    print((K.int_shape(y_pred)[1]))
    #loss = K.max((y_pred))
    loss = K.max[pos_out]
    return loss
You can create a loss function that contains a closure that lets you access the model and thus the targets and the model layer outputs.
class ExampleCustomLoss(object):
    """ The loss function can access model.inputs, model.targets and the outputs
    of specific layers. These are all tensors and will have the expected results
    for the batch.
    """
    def __init__(self, model):
        self.model = model
    def loss(self, y_true, y_pred, **kwargs):
        ...
        return loss
model = Model(..., ...)
loss_calculator = ExampleCustomLoss(model)
model.compile('adam', loss_calculator.loss)
However, it may be simpler to do the inverse, i.e. have a single model output:
out = Concatenate(axis=1)([pos_out, anc_out, neg_out])
Then, in the loss function, slice y_true and y_pred back into their three parts.
From the names of variables, it looks as if you are trying to use a triplet loss. You may find this other question useful:
How to deal with triplet loss when at time of input i have only two files i.e. at time of testing
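If a triplet loss is indeed the goal, slicing the concatenated output could look roughly like this. The embedding size EMB, the margin and the squared-distance form are assumptions for this sketch, and y_true is ignored because a triplet loss needs no labels.

```python
import tensorflow as tf

EMB = 2  # hypothetical size of each of the three concatenated outputs

def triplet_loss(y_true, y_pred, margin=1.0):
    # Undo the Concatenate(axis=1): [pos_out | anc_out | neg_out]
    pos = y_pred[:, :EMB]
    anc = y_pred[:, EMB:2 * EMB]
    neg = y_pred[:, 2 * EMB:]
    d_pos = tf.reduce_sum(tf.square(anc - pos), axis=-1)
    d_neg = tf.reduce_sum(tf.square(anc - neg), axis=-1)
    return tf.maximum(d_pos - d_neg + margin, 0.0)

y_pred = tf.constant([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 3.0, 4.0]])
y_true = tf.zeros((2, 1))  # unused placeholder targets
losses = triplet_loss(y_true, y_pred)
print(losses.numpy())  # [2. 0.]
```

In the first row the positive is farther from the anchor than the negative, so the loss is positive; in the second row the negative is already far away and the loss is zero.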
Your loss function gets two arguments, the true label and the model output. The model output has the shape you defined when you built the network. During training, the loss function needs to return a single value measuring the difference between the model's output and the true label.
Also, please add some trainable layers to your model, because your custom loss function will be useless otherwise.
I am trying to implement an autoencoder in Keras that not only minimizes the reconstruction error but its constructed features should also maximize a measure I define. I don't really have an idea of how to do this.
Here's a snippet of what I have so far:
corrupt_data = self._corrupt(self.data, 0.1)
# define encoder-decoder network structure
# create input layer
input_layer = Input(shape=(corrupt_data.shape[1], ))
encoded = Dense(self.encoding_dim, activation = "relu")(input_layer)
decoded = Dense(self.data.shape[1], activation="sigmoid")(encoded)
# create autoencoder
dae = Model(input_layer, decoded)
# define custom multitask loss with wlm measure
def multitask_loss(y_true, y_pred):
    # extract learned features from hidden layer
    learned_fea = Model(input_layer, encoded).predict(self.data)
    # additional measure I want to optimize from an external function
    wlm_measure = wlm.measure(learned_fea, self.labels)
    cross_entropy = losses.binary_crossentropy(y_true, y_pred)
    return wlm_measure + cross_entropy
# create optimizer
dae.compile(optimizer=self.optimizer, loss=multitask_loss)
dae.fit(corrupt_data, self.data,
        epochs=self.epochs, batch_size=20, shuffle=True,
        callbacks=[tensorboard])
# separately create an encoder model
encoder = Model(input_layer, encoded)
Currently this does not work properly... When I viewed the training history the model seems to ignore the additional measure and train only based on the cross entropy loss. Also if I change the loss function to consider only wlm measure, I get the error "numpy.float64" object has no attribute "get_shape" (I don't know if changing my wlm function's return type to a tensor will help).
There are a few places that I think may have gone wrong. I don't know if I am extracting the outputs of the hidden layer correctly in my custom loss function. Also I don't know if my wlm.measure function is outputting correctly—whether it should output numpy.float32 or a 1-dimensional tensor of type float32.
Basically a conventional loss function only cares about the output layer's predicted labels and the true labels. In my case, I also need to consider the hidden layer's output (activation), which is not that straightforward to implement in Keras.
Thanks for the help!
You don't want to define your learned_fea Model inside your custom loss function. Rather, you could define a single model upfront with two outputs: the output of the decoder (the reconstruction) and the output of the encoder (the feature representation):
multi_output_model = Model(inputs=input_layer, outputs=[decoded, encoded])
Now you can write a custom loss function that only applies to the output of the encoder:
def custom_loss(y_true, y_pred):
    return wlm.measure(y_pred, y_true)
Upon compiling the model, you pass a list of loss functions (or a dictionary if you name your tensors):
multi_output_model.compile(loss=['binary_crossentropy', custom_loss], optimizer=...)
And fit the model by passing a list of targets, one per output:
multi_output_model.fit(x=X, y=[data_to_be_reconstructed,labels_for_wlm_measure])
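Putting the pieces together, a minimal runnable sketch might look like the following. The feature_loss function is only a made-up stand-in for wlm.measure, written with tensor ops so that gradients can flow; the sizes, the random data and the loss_weights values are assumptions as well.

```python
import numpy as np
import tensorflow as tf

n_features, encoding_dim = 6, 3
input_layer = tf.keras.Input(shape=(n_features,))
encoded = tf.keras.layers.Dense(encoding_dim, activation="relu")(input_layer)
decoded = tf.keras.layers.Dense(n_features, activation="sigmoid")(encoded)
multi_output_model = tf.keras.Model(inputs=input_layer,
                                    outputs=[decoded, encoded])

def feature_loss(y_true, y_pred):
    # Hypothetical stand-in for wlm.measure: pull each sample's mean
    # activation toward its label, using differentiable ops only.
    mean_act = tf.reduce_mean(y_pred, axis=-1, keepdims=True)
    return tf.reduce_mean(tf.square(mean_act - y_true), axis=-1)

multi_output_model.compile(optimizer="adam",
                           loss=["binary_crossentropy", feature_loss],
                           loss_weights=[1.0, 0.1])

data = np.random.rand(16, n_features).astype("float32")
labels = np.random.randint(0, 2, size=(16, 1)).astype("float32")
history = multi_output_model.fit(data, [data, labels],
                                 epochs=1, batch_size=8, verbose=0)
decoded_out, encoded_out = multi_output_model.predict(data, verbose=0)
print(decoded_out.shape, encoded_out.shape)
```

A measure that returns a NumPy float, like the one in the question, cannot contribute gradients, which would explain why the original model seemed to ignore it.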