I am training a VGG-16 model for a multi-class classification task with TensorFlow 2.4 and Keras 2.4.0. The y-true labels are one-hot encoded. I use a couple of custom loss functions, individually, to train the model. First, I used a custom Cauchy-Schwarz divergence loss function as shown below:
from math import sqrt
from math import log
from scipy.stats import gaussian_kde
from scipy import special
def cs_divergence(p1, p2):
    """p1 (numpy array): first pdfs, p2 (numpy array): second pdfs.

    Returns:
        float: CS divergence
    """
    r = range(0, p1.shape[0])
    # estimate each density with a Gaussian KDE and evaluate it over r
    p1_kernel = gaussian_kde(p1)
    p2_kernel = gaussian_kde(p2)
    p1_computed = p1_kernel(r)
    p2_computed = p2_kernel(r)
    numerator = sum(p1_computed * p2_computed)
    denominator = sqrt(sum(p1_computed ** 2) * sum(p2_computed ** 2))
    return -log(numerator / denominator)
Then, I used a negative log likelihood custom loss function as shown below:
def nll(y_true, y_pred):
    loss = -special.xlogy(y_true, y_pred) - special.xlogy(1 - y_true, 1 - y_pred)
    return loss
and compiled the model as shown below when training it with each loss individually:
sgd = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)
model_vgg16.compile(optimizer=sgd,
                    loss=[cs_divergence],
                    metrics=['accuracy'])
and
sgd = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)
model_vgg16.compile(optimizer=sgd,
                    loss=[nll],
                    metrics=['accuracy'])
I got the following errors when training the model with these loss functions:
With cs_divergence, I got the following error:
TypeError: 'NoneType' object cannot be interpreted as an integer
With the nll custom loss, I got the following error:
NotImplementedError: Cannot convert a symbolic Tensor (IteratorGetNext:1) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported
I downgraded NumPy to 1.19.5, as discussed in "NotImplementedError: Cannot convert a symbolic Tensor (2nd_target:0) to a numpy array", but it didn't help.
Try maybe:
loss=cs_divergence
that is, without the brackets.
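Note also that both losses run NumPy/SciPy code on the symbolic tensors Keras passes in, which is what both tracebacks point to. As a minimal sketch (my rewrite, assuming one-hot labels, not the asker's exact code), the nll loss can be expressed with TensorFlow ops only, so it can run inside the compiled graph:

import tensorflow as tf

def nll_tf(y_true, y_pred):
    # same algebra as nll above, but with tf.math ops so it accepts
    # the symbolic tensors that Keras passes to a loss function
    y_true = tf.cast(y_true, y_pred.dtype)
    loss = -tf.math.xlogy(y_true, y_pred) - tf.math.xlogy(1.0 - y_true, 1.0 - y_pred)
    return tf.reduce_mean(loss, axis=-1)  # one scalar per sample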
I am trying to use a custom loss function for calculating a weighted MSE in a regression task (values in the task: -1, -0.5, 0, 0.5, 1, 1.5, 3, etc.). Here is my implementation of the custom loss function:
import tensorflow
import tensorflow.keras.backend as kb
def weighted_mse(y, yhat):
    ind_losses = tensorflow.keras.losses.mean_squared_error(y, yhat)
    weights_ind = kb.map_fn(lambda yi: weight_dict[kb.get_value(yi)], y, dtype='float32')
    # average loss over weighted sum of the batch
    return tensorflow.math.divide(tensorflow.math.reduce_sum(tensorflow.math.multiply(ind_losses, weights_ind)), len(y))
Run standalone, the following example works:
weight_dict = {-1.0: 70.78125, 0.0: 1.7224334600760458, 0.5: 4.58502024291498, 1.0: 7.524916943521595, 1.5: 32.357142857142854, 2.0: 50.33333333333333, 2.5: 566.25, 3.0: 566.25}
y_true = tensorflow.convert_to_tensor([[0.5],[3]])
y_pred = tensorflow.convert_to_tensor([[0.5],[0]])
weighted_mse(y_true, y_pred)
But when inputted into my model, it throws the following error:
AttributeError: 'Tensor' object has no attribute '_numpy'
Here is how I use the custom loss function:
model.compile(
    optimizer=opt,
    loss={
        "predicted_class": weighted_mse
    })
EDIT:
When changing weight_dict[kb.get_value(yi)] to weight_dict[float(yi)], I get the following error:
TypeError: float() argument must be a string or a number, not 'builtin_function_or_method'
What you want is basically the idea of sample weights. When using the training API of Keras, alongside your data you can pass another array containing the weight of each sample, which is used to determine the contribution of each sample to the loss function.
To use it, pass the sample_weight argument to the fit method:
model.fit(X, y, sample_weight=X_weight, ...)
Note that X_weight should be an array of the same length as X (i.e. one weight value for each training sample). Further, if X is a tf.data.Dataset instance or a generator, this argument does not work; instead, you need to pass the sample weight as the third element of the tuple returned by X.
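For example, with the weight_dict from the question, a minimal sketch (assuming y is a plain NumPy array of the labels listed above) could look like:

import numpy as np

# map each label to its weight outside the graph, then let Keras
# weight the per-sample losses during training
y_weights = np.array([weight_dict[float(label)] for label in y.ravel()],
                     dtype='float32')
model.fit(X, y, sample_weight=y_weights)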
This usually happens with an old version of TensorFlow. There are two things you can try:
Add this line to your Jupyter notebook when importing TensorFlow, like so:
import tensorflow as tf
tf.enable_eager_execution()
Upgrade tensorflow with the following command in prompt:
pip install tensorflow --upgrade
This is most probably because of eager execution. See the docs for more info.
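In TF 2.x there is also a per-model switch (my addition, not part of the answer above): compiling with run_eagerly=True forces the loss to run eagerly, which may let NumPy-style code such as kb.get_value work, at the cost of speed:

model.compile(
    optimizer=opt,
    loss={"predicted_class": weighted_mse},
    run_eagerly=True)  # loss runs eagerly: slower, but tensors behave like NumPy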
I have been given custom metric code on which my model is going to be evaluated, but it uses sklearn's metrics. I know that if I have a metric I can use it in callbacks like
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy', custom_metric])
ModelCheckpoint(monitor='val_custom_metric',
                save_best_only=True,
                save_weights_only=True,
                mode='max',
                verbose=1)
It is a multi-output problem with 3 labels,
Submissions are evaluated using a hierarchical macro-averaged recall. First, a standard macro-averaged recall is calculated for each component (label_1, label_2, or label_3). The final score is the weighted average of those three scores, with label_1 given double weight. You can replicate the metric with the following Python snippet:
and I am unable to comprehend how to implement the code given below in Keras:
import numpy as np
import sklearn.metrics
scores = []
for component in ['label_1', 'label_2', 'label_3']:
    y_true_subset = solution[solution[component] == component]['target'].values
    y_pred_subset = submission[submission[component] == component]['target'].values
    scores.append(sklearn.metrics.recall_score(
        y_true_subset, y_pred_subset, average='macro'))
final_score = np.average(scores, weights=[2, 1, 1])
How can I convert it into a form I can use as a metric? Or, more precisely, how can I use keras.backend to implement this code?
You can only implement the metric itself; the rest (the DataFrame filtering) is specific to the submission format and will not be part of the Keras metric.
from keras import backend as K

threshold = 0.5  # you can work this threshold for better results

# considering y_true is made of 0 and 1 only
# considering output shape is (batch, 3)

def custom_metric(y_true, y_pred):
    weights = K.constant([2, 1, 1])                            # shape (3,)
    y_pred = K.cast(K.greater(y_pred, threshold), K.floatx())  # shape (batch, 3)
    true_positives = K.sum(y_pred * y_true, axis=0)            # shape (3,)
    false_negatives = K.sum((1 - y_pred) * y_true, axis=0)     # shape (3,)
    # K.epsilon() guards against 0/0 when a label is absent from the batch
    recall = true_positives / (true_positives + false_negatives + K.epsilon())
    # divide by the weight sum so this matches np.average(scores, weights=[2,1,1])
    return K.sum(recall * weights) / K.sum(weights)
Notice that this will be calculated batchwise, and since the denominator depends on the predictions, the batchwise value will not match the metric computed over the entire dataset.
You may need big batch sizes to avoid metric instability. And it might be interesting to apply the metric on the entire data with a callback to get the exact result.
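A minimal sketch of that callback idea (my assumptions: the model has three outputs, one per label, and val_x / val_y_list are hypothetical names for in-memory validation data):

import numpy as np
import sklearn.metrics
from tensorflow.keras.callbacks import Callback

class HierarchicalRecall(Callback):
    def __init__(self, val_x, val_y_list):  # val_y_list: three integer label arrays
        super().__init__()
        self.val_x = val_x
        self.val_y_list = val_y_list

    def on_epoch_end(self, epoch, logs=None):
        preds = self.model.predict(self.val_x)  # list of three (n, classes) arrays
        scores = [sklearn.metrics.recall_score(y_true, np.argmax(p, axis=1),
                                               average='macro')
                  for y_true, p in zip(self.val_y_list, preds)]
        final_score = np.average(scores, weights=[2, 1, 1])
        if logs is not None:
            logs['val_hierarchical_recall'] = final_score
        print(f' - val_hierarchical_recall: {final_score:.4f}')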
I am fairly new to tensorflow and I was following the answer to the question below in order to build a custom loss function in Keras that considers only the top 20 predictions.
How can I sort the values in a custom Keras / Tensorflow Loss Function?
However, when I try to compile my model using this code, I get the following error about dimensions:
InvalidArgumentError: input must have last dimension >= k = 20 but is 1 for 'loss_21/dense_65_loss/TopKV2' (op: 'TopKV2') with input shapes: [?,1], [] and with computed input tensors: input[1] = <20>.
A simplified version of the code that reproduces the error is the following.
import tensorflow as tf
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense
from tensorflow.python.keras.optimizers import SGD
top = 20
def top_loss(y_true, y_pred):
    y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred, top)
    loss_per_sample = tf.reduce_mean(tf.reduce_sum(y_pred_top_k, axis=-1))
    return loss_per_sample
model = Sequential()
model.add(Dense(50, input_dim=201))
model.add(Dense(1))
sgd = SGD(lr=0.01, decay=0, momentum=0.9)
model.compile(loss=top_loss, optimizer=sgd)
and the error is thrown at the following line of the top_loss function when the model is compiled.
y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred, top)
It seems that y_pred at compile time is by default of shape [?, 1], while tf.nn.top_k expects its last dimension to be at least k (i.e. 20).
Do I have to cast y_pred to something so that tf.nn.top_k knows it is of the correct dimensions?
Use:
y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred[:,0], top)
y_pred[:,0] gets the predicted values of the full batch as a rank 1 tensor.
Another Problem:
However, you will still end up with a problem with the last batch. Say your batch size is 32 and your training data is of size 100; then the last batch will have fewer than 20 samples, and tf.nn.top_k will raise a runtime error for that batch. You could make sure your last batch size is >= 20 to avoid this issue, but a much better way is to check whether the current batch is smaller than 20 and, if so, adjust the k value used in top_k:
Code
import tensorflow as tf
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense
from tensorflow.python.keras.optimizers import SGD
top = tf.constant(20)

def top_loss(y_true, y_pred):
    # fall back to the batch size as k when the batch is smaller than 20
    result = tf.cond(tf.math.greater(top, tf.shape(y_true)[0]),
                     lambda: tf.shape(y_true)[0], lambda: top)
    y_pred_top_k, y_pred_ind_k = tf.nn.top_k(y_pred[:, 0], result)
    loss_per_sample = tf.reduce_mean(tf.reduce_sum(y_pred_top_k, axis=-1))
    return loss_per_sample

model = Sequential()
model.add(Dense(50, input_dim=201))
model.add(Dense(1))
sgd = SGD(lr=0.01, decay=0, momentum=0.9)
model.compile(loss=top_loss, optimizer=sgd)
Google Colab to reproduce the error None_for_gradient.ipynb
I need a custom loss function whose value is calculated from the model inputs; these inputs are not the default pair (y_true, y_pred). The predict method works for the generated architecture, but when I try to use train_on_batch, the following error appears.
ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
My custom loss function (below) was based on this example image_ocr.py#L475; the Colab link contains another example based on this solution Custom loss function y_true y_pred shape mismatch #4781, and it generates the same error:
import numpy as np
from keras import backend as K
from keras import losses
import keras
from keras.layers import TimeDistributed, Dense, Dropout, LSTM
def my_loss(args):
    input_y, input_y_pred, y_pred = args
    return keras.losses.binary_crossentropy(input_y, input_y_pred)

def generator2():
    input_noise = keras.Input(name='input_noise', shape=(40, 38), dtype='float32')
    input_y = keras.Input(name='input_y', shape=(1,), dtype='float32')
    input_y_pred = keras.Input(name='input_y_pred', shape=(1,), dtype='float32')
    lstm1 = LSTM(256, return_sequences=True)(input_noise)
    drop = Dropout(0.2)(lstm1)
    lstm2 = LSTM(256, return_sequences=True)(drop)
    y_pred = TimeDistributed(Dense(38, activation='softmax'))(lstm2)
    loss_out = keras.layers.Lambda(my_loss, output_shape=(1,), name='my_loss')([input_y, input_y_pred, y_pred])
    model = keras.models.Model(inputs=[input_noise, input_y, input_y_pred], outputs=[y_pred, loss_out])
    model.compile(loss={'my_loss': lambda y_true, y_pred: y_pred}, optimizer='adam')
    return model

g2 = generator2()
noise = np.random.uniform(0, 1, size=[10, 40, 38])
g2.train_on_batch([noise, np.ones(10), np.zeros(10)], noise)
I need help identifying which operation is generating this error, because as far as I know keras.losses.binary_crossentropy is differentiable.
I think the reason is that input_y and input_y_pred are both Keras Inputs. Your loss function is calculated only from these two tensors, which are not connected to the model's parameters, so the loss gives no gradient to your model.
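To illustrate the point with a hypothetical sketch (not the asker's intended loss): the Lambda loss must depend on a tensor produced by the trainable layers, e.g. y_pred, for a gradient to exist. Using the imports from the question:

def my_loss(args):
    input_y, input_y_pred, y_pred = args
    # collapse y_pred (batch, 40, 38) to one score per sample, shape (batch, 1),
    # so the loss is differentiable w.r.t. the LSTM/Dense weights producing it
    y_score = K.expand_dims(K.mean(y_pred, axis=[1, 2]), axis=-1)
    return keras.losses.binary_crossentropy(input_y, y_score)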
I am trying to use Huber loss in a Keras model (writing a DQN), but I am getting bad results; I think I am doing something wrong. My code is below.
model = Sequential()
model.add(Dense(output_dim=64, activation='relu', input_dim=state_dim))
model.add(Dense(output_dim=number_of_actions, activation='linear'))
loss = tf.losses.huber_loss(delta=1.0)
model.compile(loss=loss, opt='sgd')
return model
I came here with the exact same question. The accepted answer uses logcosh, which may have similar properties, but it isn't exactly Huber loss. Here's how I implemented Huber loss for Keras (note that I'm using Keras from TensorFlow 1.5).
import numpy as np
import tensorflow as tf
'''
' Huber loss.
' https://jaromiru.com/2017/05/27/on-using-huber-loss-in-deep-q-learning/
' https://en.wikipedia.org/wiki/Huber_loss
'''
def huber_loss(y_true, y_pred, clip_delta=1.0):
    error = y_true - y_pred
    cond = tf.keras.backend.abs(error) < clip_delta
    squared_loss = 0.5 * tf.keras.backend.square(error)
    linear_loss = clip_delta * (tf.keras.backend.abs(error) - 0.5 * clip_delta)
    return tf.where(cond, squared_loss, linear_loss)
'''
' Same as above but returns the mean loss.
'''
def huber_loss_mean(y_true, y_pred, clip_delta=1.0):
    return tf.keras.backend.mean(huber_loss(y_true, y_pred, clip_delta))
Depending on whether you want the elementwise loss or its mean, use the corresponding function above.
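For instance, compiling with the mean variant would look like this (a sketch; the model comes from your own code):

model.compile(optimizer='sgd', loss=huber_loss_mean)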
You can wrap Tensorflow's tf.losses.huber_loss in a custom Keras loss function and then pass it to your model.
The reason for the wrapper is that Keras will only pass y_true and y_pred to the loss function, and you likely want to also use some of the many parameters of tf.losses.huber_loss. So you'll need some kind of closure like:
def get_huber_loss_fn(**huber_loss_kwargs):

    def custom_huber_loss(y_true, y_pred):
        return tf.losses.huber_loss(y_true, y_pred, **huber_loss_kwargs)

    return custom_huber_loss

# Later...
model.compile(
    loss=get_huber_loss_fn(delta=0.1)
    ...
)
I was looking through the losses of Keras. Apparently logcosh has similar properties to Huber loss; more details on their similarity can be seen here.
How about:
loss=tf.keras.losses.Huber(delta=100.0)
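As a usage sketch (my example; the Huber class is built into tf.keras in TF 2.x, and delta defaults to 1.0):

import tensorflow as tf

model.compile(optimizer='sgd', loss=tf.keras.losses.Huber(delta=1.0))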