I'm trying to build an image-denoising ConvNet in Keras and I want to write my own loss function. The network should take a noisy image as input and produce the noise as output. The loss is essentially an MSE loss, but it should make my network learn to remove the clean image, rather than the noise, from the noisy input.
The loss function I want to implement, with y the noisy image, x the clean image and R(y) the predicted image (as written in the code below, the mean of (R(y) - (y - x))^2):
I've tried to write it myself, but I don't know how to give the loss access to my noisy images, since they change with every batch.
def residual_loss(noisy_img):
    def loss(y_true, y_pred):
        return np.mean(np.square(y_pred - (noisy_img - y_true)), axis=-1)
    return loss
Basically, what I need to do is something like this :
input_img = Input(shape=(None,None,3))
c1 = Convolution2D(64, (3, 3))(input_img)
a1 = Activation('relu')(c1)
c2 = Convolution2D(64, (3, 3))(a1)
a2 = Activation('relu')(c2)
c3 = Convolution2D(64, (3, 3))(a2)
a3 = Activation('relu')(c3)
c4 = Convolution2D(64, (3, 3))(a3)
a4 = Activation('relu')(c4)
c5 = Convolution2D(3, (3, 3))(a4)
out = Activation('relu')(c5)
model = Model(input_img, out)
model.compile(optimizer='adam', loss=residual_loss(input_img))
But if I try this, I get :
IndexError: tuple index out of range
What can I do ?
Since it's quite unusual to use the "input" in the loss function (it's not meant for that), I think it's worth saying:
It's not the role of the loss function to separate the noise.
The loss function is just a measure of "how far from right you are".
It's your model that will separate things, and the results you expect from your model are y_true.
You should use a regular loss, with X_training = noisy images and Y_training = noises.
That said...
You can create a tensor for noisy_img outside the loss function and keep it stored. All operations inside a loss function must be tensor functions, so use the keras backend for that:
import keras.backend as K
noisy_img = K.variable(X_training) #you must do this for each batch
But you must take batch sizes into account: because this variable lives outside the loss function, you will need to fit just one batch per epoch.
def loss(y_true,y_pred):
    return K.mean(K.square(y_pred-y_true) - K.square(y_true-noisy_img))
Training one batch per epoch:
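for batch in range(0,totalSamples,size):
    #slice from batch to batch+size
    #note: rebinding the name does not update a loss that is already compiled;
    #K.set_value on a variable of batch shape is one option
    noisy_img = K.variable(X_training[batch:batch+size])
    model.fit(X_training[batch:batch+size],Y_training[batch:batch+size], batch_size=size)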
For using just a mean squared error, organize your data like this:
originalImages = loadYourImages() #without noises
Y_training = createRandomNoises() #without images
X_training = addNoiseToImages(originalImages,Y_training)
Now you just use a "mse", or any other built-in loss.
model.fit(X_training,Y_training,....)
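As a rough check of why a built-in "mse" is enough here: if the targets are the noise itself (noisy minus clean), the plain MSE against those targets is numerically the same as the residual loss from the question. The sketch below verifies this with NumPy on random arrays; the shapes, the noise level and the names clean, noise and pred are assumptions made for the example, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 4 RGB images of 8x8 pixels plus Gaussian noise.
clean = rng.random((4, 8, 8, 3))
noise = rng.normal(0.0, 0.1, size=clean.shape)
noisy = clean + noise

# Stand-in for the network output R(noisy); any array of this shape works.
pred = rng.normal(0.0, 0.1, size=clean.shape)

# Residual loss from the question: mean of (R(y) - (y - x))^2.
residual_loss = np.mean(np.square(pred - (noisy - clean)))

# Plain MSE against noise targets, i.e. what loss='mse' computes
# when Y_training holds the noise.
mse_on_noise = np.mean(np.square(pred - noise))

print(residual_loss, mse_on_noise)
```

Both printed values agree up to floating-point rounding, so no access to the input is needed inside the loss.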
I have 2 loss functions in my model - Cross Entropy and Mean Squared.
I want my model to minimize both the losses but the model is only minimizing mean squared error during training.
def buildGenerator(dmodel, batch=100):
    inputs = Input(shape=(256,256,1))
    x = Conv2D(
        filters = 32,
        kernel_size = 3,
        padding = 'same',
        strides = 1
    )(inputs)
    x = BatchNormalization(momentum = 0.9)(x)
    x = LeakyReLU(alpha=0.2)(x)
    .........................
    ...........................
    outputs1 = Conv2D(
        filters = 2,
        kernel_size = 3,
        padding = 'same',
        strides = 1
    )(x)
    outputs2 = dmodel(outputs1)
    model = Model(inputs = inputs, outputs = [ outputs2, outputs1], name = 'functional_model')
    model.compile(
        loss = ['binary_crossentropy','mse' ],
        optimizer = 'Adam',
        loss_weights = [1.0, 0.6],
        metrics=['accuracy', 'mse']
    )
    return model
In this code, dmodel is another model. I am using dmodel to classify outputs1 generated by the model and then finding cross-entropy between input labels and the output labels.
This is how I am training
dmodel = buildDiscriminator()
dmodel.load_weights('./GAN/discriminator')
dmodel.trainable = False
x, y1 = getGeneratorData()
y2 = np.ones((batch, 1))
model = buildGenerator(dmodel)
model.fit(x,[y2, y1],epochs=1)
I tried a lot of things like changing loss_weights, changing loss functions but nothing is working. My model is only minimizing the MSE function.
I don't understand what I am doing wrong.
I think using the discriminator model inside the generator is the issue but I am not sure.
I do not know whether there is a simple syntax for combining different loss functions, but you can try defining your own loss class. In another thread I found this snippet, which defines a loss class that combines two other loss functions:
import tensorflow as tf
from tensorflow.keras import losses
rho = 0.05
class loss_with_KLD(losses.Loss):
    def __init__(self, rho):
        super(loss_with_KLD, self).__init__()
        self.rho = rho
        self.kl = losses.KLDivergence()
        self.mse = losses.MeanSquaredError(reduction=tf.keras.losses.Reduction.SUM)
    def call(self, y_true, y_pred):
        mse = self.mse(y_true, y_pred)
        kl = self.kl(self.rho, y_pred)
        return mse + kl
If you just replace the KLDivergence with the binary cross-entropy, this should work. Additionally, you would need to alter the call() function: this implementation applies two loss functions to the same predicted y value, but you actually predict two different y values. In your case, y_true and y_pred would both contain two values, and you would need to apply each loss function to only one of them. I do not know whether it is easy to take a single element from a vector (in the style of y_true[0]), but if it is not, you could work around it by applying a "mask" to your vectors, multiplying them by [0, 1] or [1, 0] depending on the value you need. With this done, you can use the reduce_sum() function to get your single value and apply the loss function to your new y_true and y_pred.
This is a little bit more complicated, but it should get the job done.
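For what it's worth, slicing along the last axis (y_true[:, 0]) works on NumPy arrays and on TensorFlow tensors alike, so the masking workaround should not be necessary. The sketch below shows the idea in plain NumPy so it can be run without Keras; the column layout (column 0 for the class probability, the remaining columns for regression values), the helper names and the weights 1.0 and 0.6 are assumptions for the example.

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-7):
    # Binary cross-entropy, averaged over all elements.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1.0 - y_true) * np.log(1.0 - y_pred))

def mse(y_true, y_pred):
    # Mean squared error, averaged over all elements.
    return np.mean(np.square(y_true - y_pred))

def combined_loss(y_true, y_pred, w_bce=1.0, w_mse=0.6):
    # Column 0: class probability. Columns 1..: regression values.
    class_part = bce(y_true[:, 0], y_pred[:, 0])
    value_part = mse(y_true[:, 1:], y_pred[:, 1:])
    return w_bce * class_part + w_mse * value_part

y_true = np.array([[1.0, 0.2, 0.4],
                   [0.0, 0.1, 0.9]])
y_pred = np.array([[0.9, 0.2, 0.5],
                   [0.2, 0.3, 0.9]])

total = combined_loss(y_true, y_pred)
print(total)
```

Inside a real Keras loss the same slicing would be applied to tensors, with backend operations in place of the NumPy calls.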
When you specify 2 loss functions, they apply to your 2 different outputs.
That is, in your example binary_crossentropy applies to outputs2, which has a y_true of all ones and is the output of a non-trainable model.
It seems likely that you want to return a single value from the model, since you do not seem to have labels for outputs2. While you could define your own custom loss function that combines both losses on the same value, I would advise against it. If the output value is a single class prediction (i.e. pixel on/off), then binary_crossentropy makes sense; if it is supposed to be a continuous value, then mse makes sense.
I am trying to devise a custom loss function for a variational auto-encoder in Keras with two parts: a reconstruction loss and a divergence loss. However, instead of using the Gaussian distribution for the divergence loss, I want to sample randomly from the input and then compute the divergence loss based on the sampled inputs. I do not know how to sample inputs from the complete dataset and then compute a loss with respect to them. The encoder model is:
x_input = Input((input_size,))
enc1 = Dense(encoder_size[0], activation='relu')(x_input)
drop = Dropout(keep_prob)(enc1)
enc2 = Dense(encoder_size[1], activation='relu')(drop)
drop = Dropout(keep_prob)(enc2)
mu = Dense(latent_dim, activation='linear', name='encoder_mean')(drop)
encoder = Model(x_input,mu)
The structure of loss should be:
# the input is the placeholder for the complete input
def loss(x, y, input):
    reconstruction_loss = mean_squared_error(x, y)
    sample_num = 100
    sample_input = sample_from_input(input, sample_num)
    sample_encoded = encoder.predict(sample_input)  # <-- this would not work with placeholder
    sample_prior = gaussian(mean=0, std=1)
    # perform KL divergence between sample_encoded and sample_prior
I have not found anything similar. It would be great if somebody could point me in the right direction.
There are a couple of problems in your code. First, a custom loss function in Keras is expected to take only two parameters, y_true and y_pred (or equivalents), so you will not be able to pass input explicitly in your case. If you wish to pass additional parameters, you have to use a nested function.
Second, you will not be able to pass TensorFlow placeholders to the predict function; you have to pass the equivalent Numpy arrays. So I would recommend rewriting your sample_from_input so that it samples from a set of input file paths, reads the files and returns a Numpy array of the file data. As the input_data parameter, pass the file paths where your data is stored.
I have included only the relevant parts of the code.
def custom_loss(input_data):
    def loss(y_true, y_pred):
        reconstruction_loss = mean_squared_error(y_true, y_pred)
        sample_num = 100
        sample_input = sample_from_input(input_data)
        # sample_input is a Numpy array
        sample_encoded = encoder.predict(sample_input)
        sample_prior = gaussian(mean=0, std=1)
        # perform KL divergence between sample_encoded and sample_prior
        divergence_loss = ...  # Your logic returning a numeric value
        return reconstruction_loss + divergence_loss
    return loss
encoder.compile(optimizer='adam', loss=custom_loss('<<input_data_path>>'))
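One possible way to fill in the divergence step, sketched here in NumPy: approximate the sampled encodings by a Gaussian per latent dimension and use the closed-form KL divergence to a standard normal, 0.5 * (var + mu^2 - 1 - ln var). The moment-matching approximation and the function name gaussian_kl_to_standard_normal are assumptions for illustration; the original post does not say which estimator should be used.

```python
import numpy as np

def gaussian_kl_to_standard_normal(samples, eps=1e-8):
    """KL( N(mu, var) || N(0, 1) ), summed over latent dimensions.

    mu and var are estimated from `samples`, shape (n_samples, latent_dim).
    """
    mu = samples.mean(axis=0)
    var = samples.var(axis=0) + eps  # eps avoids log(0)
    return 0.5 * np.sum(var + mu ** 2 - 1.0 - np.log(var))

rng = np.random.default_rng(0)

# Hypothetical encodings: one set close to the prior, one far from it.
close_to_prior = rng.normal(0.0, 1.0, size=(10000, 2))
far_from_prior = rng.normal(3.0, 0.5, size=(10000, 2))

kl_close = gaussian_kl_to_standard_normal(close_to_prior)
kl_far = gaussian_kl_to_standard_normal(far_from_prior)
print(kl_close, kl_far)
```

For the term to contribute gradients during training, the same expression would have to be written with backend tensor operations on the encoder's output rather than on the Numpy result of predict.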
My question is similar to the one posed here:
keras combining two losses with adjustable weights
However, the outputs have different dimensionalities, so they cannot be concatenated and that solution is not applicable. Is there another way to solve this problem?
The question:
I have a keras functional model with two layers with outputs x1 and x2.
x1 = Dense(1,activation='relu')(prev_inp1)
x2 = Dense(2,activation='relu')(prev_inp2)
I need to use x1 and x2 in a weighted loss function like the one in the attached image, and propagate the 'same loss' into both branches. Alpha should be free to vary across iterations.
For this question, a more elaborate solution is necessary. Since we're going to use a trainable weight, we will need a custom layer.
We will also need a different form of training, since our loss doesn't work like the usual ones that take only y_true and y_pred: it joins two different outputs.
Thus, we're going to create two versions of the same model, one for prediction and another for training. The training version will contain the loss in itself, and will be compiled with a dummy Keras loss function.
The prediction model
Let's use a very basic example of model with two outputs and one input:
#any input your true model takes
inp = Input((5,5,2))
#represents the localization output
outImg = Conv2D(1,3,activation='sigmoid')(inp)
#represents the classification output
outClass = Flatten()(inp)
outClass = Dense(2,activation='sigmoid')(outClass)
#the model
predictionModel = Model(inp, [outImg,outClass])
You use this one regularly for predictions. It's not necessary to compile this one.
The losses for each branch
Now, let's create custom loss functions for each branch, one for LossCls and another for LossLoc.
Using dummy examples here, you can elaborate these losses better if necessary. The most important is that they output batches shaped like (batch, 1) or (batch,). Both output the same shape so they can be summed later.
def calcImgLoss(x):
    true,pred = x
    loss = binary_crossentropy(true,pred)
    return K.mean(loss, axis=[1,2])
def calcClassLoss(x):
    true,pred = x
    return binary_crossentropy(true,pred)
These will be used in Lambda layers in the training model.
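To make the shape requirement concrete, here is a small NumPy sketch of the two reductions. It assumes a binary cross-entropy that averages over the last axis, as keras.losses.binary_crossentropy does; the helper name bce_last_axis and the dummy data are made up for the example.

```python
import numpy as np

def bce_last_axis(y_true, y_pred, eps=1e-7):
    # Binary cross-entropy averaged over the last axis only.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_element = -(y_true * np.log(y_pred)
                    + (1.0 - y_true) * np.log(1.0 - y_pred))
    return per_element.mean(axis=-1)

# Dummy batch of 7 samples, with the same shapes as the two model outputs.
true_img, pred_img = np.ones((7, 3, 3, 1)), np.full((7, 3, 3, 1), 0.5)
true_cls, pred_cls = np.ones((7, 2)), np.full((7, 2), 0.5)

# Image branch: (7, 3, 3) after the loss, then averaged over the two spatial axes.
img_loss = bce_last_axis(true_img, pred_img).mean(axis=(1, 2))
# Class branch: already (7,) after the loss.
cls_loss = bce_last_axis(true_cls, pred_cls)

summed = img_loss + cls_loss
print(img_loss.shape, cls_loss.shape, summed.shape)
```

Because both branches end up with one value per sample, they can be added or weighted element-wise.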
The loss weighting layer - (WARNING! EDITED! - See explanation at the end)
Now, let's weight the losses with the trainable alpha. Trainable parameters need custom layers to be implemented.
class LossWeighter(Layer):
    def __init__(self, **kwargs): #kwargs can have 'name' and other things
        super(LossWeighter, self).__init__(**kwargs)
    #create the trainable weight here, notice the constraint between 0 and 1
    def build(self, inputShape):
        self.weight = self.add_weight(name='loss_weight',
                                      shape=(1,),
                                      initializer=Constant(0.5),
                                      constraint=Between(0,1),
                                      trainable=True)
        super(LossWeighter,self).build(inputShape)
    def call(self,inputs):
        #old answer: will always tend to completely ignore the biggest loss
        #return (self.weight * firstLoss) + ((1-self.weight)*secondLoss)
        #problem: alpha tends to 0 or 1, eliminating the biggest of the two losses
        #proposal of working alpha optimization
        #return K.square((self.weight * firstLoss) - ((1-self.weight)*secondLoss))
        #problem: might not train any of the losses, and even increase one of them
        #in order to minimize the difference between the two losses
        #new answer - a mix between the two, applying gradients to the right weights
        loss1, loss2 = inputs #trainable
        static_loss1 = K.stop_gradient(loss1) #non_trainable
        static_loss2 = K.stop_gradient(loss2) #non_trainable
        a1 = self.weight #trainable
        a2 = 1 - a1 #trainable
        static_a1 = K.stop_gradient(a1) #non_trainable
        static_a2 = 1 - static_a1 #non_trainable
        #this trains only alpha to minimize the difference between both losses
        alpha_loss = K.square((a1 * static_loss1) - (a2 * static_loss2))
        #or K.abs (.....)
        #this trains only the original model weights to minimize both original losses
        model_loss = (static_a1 * loss1) + (static_a2 * loss2)
        return alpha_loss + model_loss
    def compute_output_shape(self,inputShape):
        return inputShape[0]
Notice that there is a custom constraint to keep this weight between 0 and 1. This constraint is implemented with:
class Between(Constraint):
    def __init__(self,min_value,max_value):
        self.min_value = min_value
        self.max_value = max_value
    def __call__(self,w):
        return K.clip(w,self.min_value, self.max_value)
    def get_config(self):
        return {'min_value': self.min_value,
                'max_value': self.max_value}
The training model
This model will take the prediction model as base, add the loss calculations and loss weighter at the end and output only the loss value. Because it outputs only a loss, we will use the true targets as inputs, and a dummy loss function defined like:
def ignoreLoss(true,pred):
    return pred #this just tries to minimize the prediction without any extra computation
Model inputs:
#true targets
trueImg = Input((3,3,1))
trueClass = Input((2,))
#predictions from the prediction model
predImg = predictionModel.outputs[0]
predClass = predictionModel.outputs[1]
Model outputs = losses:
imageLoss = Lambda(calcImgLoss, name='loss_loc')([trueImg, predImg])
classLoss = Lambda(calcClassLoss, name='loss_cls')([trueClass, predClass])
weightedLoss = LossWeighter(name='weighted_loss')([imageLoss,classLoss])
Model:
trainingModel = Model([predictionModel.input, trueImg, trueClass], weightedLoss)
trainingModel.compile(optimizer='sgd', loss=ignoreLoss)
Dummy training
inputImages = np.zeros((7,5,5,2))
outputImages = np.ones((7,3,3,1))
outputClasses = np.ones((7,2))
dummyOut = np.zeros((7,))
trainingModel.fit([inputImages,outputImages,outputClasses], dummyOut, epochs = 50)
predictionModel.predict(inputImages)
Necessary imports
import numpy as np
from keras import backend as K
from keras.layers import *
from keras.models import Model
from keras.constraints import Constraint
from keras.initializers import Constant
from keras.losses import binary_crossentropy #or another you need
(EDIT) Explaining the problem with the old answer:
The formula used in the old answer would always drive alpha to 0 or 1, meaning only the smaller of the two losses would ever be trained, which is useless.
A second formula leads alpha to make both losses have the same value. Alpha would be trained properly and would not tend to 0 or 1. Still, the losses would not be trained properly, because "increasing one loss to reach the other" would be a possibility for the model, and once both losses were equal the model would stop training.
The new solution is a mix of the two proposals above: the first actually trains the losses, but with the wrong alpha; the second trains alpha, but with the wrong losses. The mixed solution adds both, and uses K.stop_gradient to prevent the wrong part of the training from happening.
The result will be that the "easiest" loss (not the biggest) gets trained more than the hardest. We may use K.abs or K.square, analogous to "mae" or "mse" between the two losses. The best option has to be found by experiment.
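The behaviour of alpha under the two formulas can be checked by hand with plain gradient descent. The sketch below holds the two losses fixed at hypothetical values (4.0 and 1.0), uses an assumed learning rate of 0.01, and clips alpha to [0, 1] as the Between constraint does; it only illustrates the alpha update, not the training of the model weights.

```python
loss1, loss2 = 4.0, 1.0  # hypothetical, fixed loss values
lr = 0.01                # assumed learning rate

def train_alpha(grad_fn, alpha=0.5, steps=50):
    # Plain gradient descent on alpha, clipped to [0, 1] after every step.
    for _ in range(steps):
        alpha = alpha - lr * grad_fn(alpha)
        alpha = max(0.0, min(1.0, alpha))
    return alpha

def old_grad(a):
    # d/da [ a*loss1 + (1-a)*loss2 ] = loss1 - loss2
    return loss1 - loss2

def new_grad(a):
    # d/da [ (a*loss1 - (1-a)*loss2)^2 ]
    return 2.0 * (a * loss1 - (1.0 - a) * loss2) * (loss1 + loss2)

alpha_old = train_alpha(old_grad)
alpha_new = train_alpha(new_grad)
print(alpha_old, alpha_new)
```

With these numbers the old formula pushes alpha to 0, which switches off the larger loss, while the squared-difference formula settles at loss2 / (loss1 + loss2) = 0.2, where both weighted losses are equal.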
See this table comparing the old and new proposals:
This does not guarantee the best optimization though!!!
Training the easiest loss will not always give the best result, though. It may be better than favoring a huge loss just because its formula is different, but the expected result might still need some manual weighting of the losses.
I fear there is no automatic training for this weight. If you have a target metric, you can try to train on that metric when possible, but metrics that depend on sorting, getting an index, rounding or anything else that breaks backpropagation cannot be turned into losses.
There is no need to concatenate your outputs. To pass multiple arguments to a loss function, you can wrap it as follows:
def custom_loss(x1, x2, y1, y2, alpha):
    def loss(y_true, y_pred):
        return (1-alpha) * loss_cls(y1, x1) + alpha * loss_loc(y2, x2)
    return loss
And then compile your functional model as:
x1 = Dense(1, activation='relu')(prev_inp1)
x2 = Dense(2, activation='relu')(prev_inp2)
y1 = Input((1,))
y2 = Input((2,))
model.compile('sgd',
loss=custom_loss(x1, x2, y1, y2, 0.5),
target_tensors=[y1, y2])
NOTE: Not tested.
I'm wondering if it's possible to add a custom model to a loss function in keras. For example:
def model_loss(y_true, y_pred):
    inp = Input(shape=(128, 128, 1))
    x = Dense(2)(inp)
    x = Flatten()(x)
    model = Model(inputs=[inp], outputs=[x])
    a = model(y_pred)
    b = model(y_true)
    # calculate MSE
    mse = K.mean(K.square(a - b))
    return mse
This is a simplified example. I'll actually be using a VGG net in the loss, so just trying to understand the mechanics of keras.
The usual way of doing that is appending your VGG to the end of your model, making sure all its layers have trainable=False before compiling.
Then you recalculate your Y_train.
Suppose you have these models:
mainModel - the one you want to apply a loss function
lossModel - the one that is part of the loss function you want
Create a new model appending one to another:
from keras.models import Model
lossOut = lossModel(mainModel.output) #you pass the output of one model to the other
fullModel = Model(mainModel.input,lossOut) #you create a model for training following a certain path in the graph.
This model will have the exact same weights of mainModel and lossModel, and training this model will affect the other models.
Make sure lossModel is not trainable before compiling:
lossModel.trainable = False
for l in lossModel.layers:
    l.trainable = False
fullModel.compile(loss='mse',optimizer=....)
Now adjust your data for training:
fullYTrain = lossModel.predict(originalYTrain)
And finally do the training:
fullModel.fit(xTrain, fullYTrain, ....)
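The effect of recalculating the targets can be illustrated without Keras. In the sketch below a fixed random matrix stands in for the frozen lossModel (a loud simplification, a real VGG is of course not linear), and the loss is an MSE taken in that feature space; all names and shapes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen "loss model": a fixed linear feature extractor.
W = rng.normal(size=(16, 4))

def loss_model(images):
    return images @ W

def mse(a, b):
    return np.mean(np.square(a - b))

original_y = rng.random((5, 16))   # the original targets
full_y = loss_model(original_y)    # recomputed targets, as with lossModel.predict

perfect_output = original_y.copy() # main model reproduces the targets exactly
poor_output = rng.random((5, 16))  # main model output unrelated to the targets

loss_perfect = mse(loss_model(perfect_output), full_y)
loss_poor = mse(loss_model(poor_output), full_y)
print(loss_perfect, loss_poor)
```

The loss is zero exactly when the main model's output has the same features as the original target, which is what training fullModel against fullYTrain optimises.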
This is old but I'm going to answer it because no one did directly. You definitely can call another model in a custom loss, and I actually think it's much easier than adding the model to the end of your main model and creating a whole new one and a whole new set of training labels.
Here is an example that calls both a model and an outside function that we define:
def normalize_tensor(in_feat):
    norm_factor = tf.math.sqrt(tf.keras.backend.sum(in_feat**2, axis=-1, keepdims=True))
    return in_feat / (norm_factor + 1e-10)
def VGGLoss(y_true, y_pred):
    true = vgg(preprocess_input(y_true * 255))
    pred = vgg(preprocess_input(y_pred * 255))
    # assumes vgg returns a single activation tensor;
    # index into the result first if it returns a list of layer outputs
    t = normalize_tensor(true)
    p = normalize_tensor(pred)
    vggLoss = tf.math.reduce_mean(tf.math.square(t - p))
    return vggLoss
vgg() just calls the vgg16 model with no head.
preprocess_input is a keras function that normalizes inputs to be used in the vgg model (here we are assuming your model outputs an image in 0-1 range, then we multiply by 255 to get 0-255 range for vgg).
normalize_tensor takes the vgg activations and rescales them along the channel axis (axis=-1), so that the channel vector at each spatial position has a magnitude of 1; otherwise your loss will be massive.
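A NumPy equivalent of normalize_tensor makes the effect easy to verify. The function name normalize_tensor_np and the activation shape (batch of 2, 4x4 positions, 8 channels) are made up for this sketch.

```python
import numpy as np

def normalize_tensor_np(in_feat, eps=1e-10):
    # Divide each channel vector (last axis) by its Euclidean norm.
    norm_factor = np.sqrt(np.sum(in_feat ** 2, axis=-1, keepdims=True))
    return in_feat / (norm_factor + eps)

rng = np.random.default_rng(0)

# Hypothetical activations with a large scale.
feats = rng.random((2, 4, 4, 8)) * 100.0

unit = normalize_tensor_np(feats)
norms = np.linalg.norm(unit, axis=-1)  # one norm per spatial position
print(norms.min(), norms.max())
```

After normalisation every position has a norm of about 1 regardless of the original scale, so the squared differences stay bounded.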
I'm trying to get the activation values for each layer in this baseline autoencoder built using Keras, since I want to add a sparsity penalty to the loss function based on the Kullback-Leibler (KL) divergence, as shown here, p. 14.
In this scenario, I'm going to calculate the KL divergence for each layer and then add all of them to the main loss function, e.g. mse.
I therefore wrote a script in Jupyter to do that, but every time I try to compile I get ZeroDivisionError: integer division or modulo by zero.
This is the code
import numpy as np
from keras.layers import Conv2D, Activation
from keras.models import Sequential
from keras import backend as K
from keras import losses
x_train = np.random.rand(128,128).astype('float32')
kl = K.placeholder(dtype='float32')
beta = K.constant(value=5e-1)
p = K.constant(value=5e-2)
# encoder
model = Sequential()
model.add(Conv2D(filters=16,kernel_size=(4,4),padding='same',
name='encoder',input_shape=(128,128,1)))
model.add(Activation('relu'))
# get the average activation
A = K.mean(x=model.output)
# calculate the value for the KL divergence
kl = K.concatenate([kl, losses.kullback_leibler_divergence(p, A)],axis=0)
# decoder
model.add(Conv2D(filters=1,kernel_size=(4,4),padding='same', name='encoder'))
model.add(Activation('relu'))
B = K.mean(x=model.output)
kl = K.concatenate([kl, losses.kullback_leibler_divergence(p, B)],axis=0)
Here seems the cause
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in _normalize_axis(axis, ndim)
989 else:
990 if axis is not None and axis < 0:
991 axis %= ndim <----------
992 return axis
993
so there might be something wrong in the mean calculation. If I print the value I get
Tensor("Mean_10:0", shape=(), dtype=float32)
which is quite strange, because the weights and biases are initialised to non-zero values. So there might also be something wrong with the way I get the activation values.
I really don't know how to fix it; I'm not much of a skilled programmer.
Could anyone help me in understanding where I'm wrong?
First, you shouldn't be doing calculations outside layers. The model must keep track of all calculations.
If you need a specific calculation to be done in the middle of the model, you should use a Lambda layer.
If you need a specific output to be used in the loss function, you should split your model at that output and do the calculations inside a custom loss function.
Here, I used a Lambda layer to calculate the mean, and a customLoss to calculate the Kullback-Leibler divergence.
import numpy as np
from keras.layers import *
from keras.models import Model
from keras import backend as K
from keras import losses
x_train = np.random.rand(128,128).astype('float32')
kl = K.placeholder(dtype='float32') #you'll probably not need this anymore, since losses will be treated individually in each output.
beta = K.constant(value=5e-1)
p = K.constant(value=5e-2)
# encoder
inp = Input((128,128,1))
lay = Convolution2D(filters=16,kernel_size=(4,4),padding='same', name='encoder',activation='relu')(inp)
#apply the mean using a lambda layer:
intermediateOut = Lambda(lambda x: K.mean(x),output_shape=(1,))(lay)
# decoder
finalOut = Convolution2D(filters=1,kernel_size=(4,4),padding='same', name='decoder',activation='relu')(lay)
#but from that, let's also calculate a mean output for loss:
meanFinalOut = Lambda(lambda x: K.mean(x),output_shape=(1,))(finalOut)
#Now, you have to create a model taking one input and those three outputs:
splitModel = Model(inp,[intermediateOut,meanFinalOut,finalOut])
And finally, compile your model with your custom loss function (we will define that later). But since I don't know if you're actually using the final output (not mean) for training, I'll suggest creating one model for training and another for predicting:
trainingModel = Model(inp,[intermediateOut,meanFinalOut])
trainingModel.compile(...,loss=customLoss)
predictingModel = Model(inp,finalOut)
#you don't need to compile the predicting model since you're only training the trainingModel
#both will share the same weights, you train one, and predict in the other
Our custom loss function should then deal with the Kullback-Leibler divergence.
def customLoss(p,mean):
    return #your own kullback expression (I don't know how it works, but maybe keras' one can be used with single values?)
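For the expression itself, the sparsity penalty usually used with sparse autoencoders is the KL divergence between two Bernoulli distributions, with target mean p and observed mean activation p_hat. The NumPy sketch below shows that formula; it assumes the mean activation lies in (0, 1), which holds for sigmoid units but is not guaranteed for the relu layers above, and the function name kl_sparsity is made up.

```python
import numpy as np

def kl_sparsity(p, p_hat, eps=1e-7):
    # KL( Bernoulli(p) || Bernoulli(p_hat) )
    p_hat = np.clip(p_hat, eps, 1.0 - eps)  # keep the logs finite
    return (p * np.log(p / p_hat)
            + (1.0 - p) * np.log((1.0 - p) / (1.0 - p_hat)))

p = 0.05  # target average activation, as in the question

penalty_at_target = kl_sparsity(p, 0.05)   # mean activation equals the target
penalty_off_target = kl_sparsity(p, 0.5)   # mean activation far from the target
print(penalty_at_target, penalty_off_target)
```

Inside customLoss the same expression would be written with K.log and K.clip on the mean tensor, and multiplied by beta before being added to the main loss.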
Alternatively, if you want a single loss function to be called instead of two:
summedMeans = Add()([intermediateOut,meanFinalOut])
trainingModel = Model(inp, summedMeans)