I am working on a multi-class classification problem using CNNs in Keras. My precision and recall scores are always over 1, which does not make any sense at all. Attached below is my code; what am I doing wrong?
from keras import backend as K

def recall(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision
model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy',recall,precision])
I was able to figure this out. The above code works perfectly once you one-hot encode all the categorical labels. Also, make sure you do NOT use sparse_categorical_crossentropy as your loss function; just use categorical_crossentropy instead.
If you wish to convert your categorical values to one-hot encoded values in Keras, you can just use this code:
from keras.utils import to_categorical
y_train = to_categorical(y_train)
The reason you have to do the above is noted in Keras documentation:
"when using the categorical_crossentropy loss, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all-zeros except for a 1 at the index corresponding to the class of the sample). In order to convert integer targets into categorical targets, you can use the Keras utility to_categorical"
Related
I have a time-series model in TensorFlow predicting vectors of size N with binary variables, e.g. [1 0 0 1 0].
For these vectors I have a recall metric which is calculated correctly, as seen below:
def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall
What I would like is, given y_true and y_pred, to also calculate a metric as follows:
if y_true contains at least one '1' and y_pred also contains at least one '1', output 1
if y_true has all variables set to '0' and y_pred also has all variables set to '0', output 1
otherwise, output 0
1's are considered attack instances while 0's are normal cases, so this metric predicts the presence or absence of an attack in the next N time steps.
What I have done so far:
def AttackAcc(y_true, y_pred):
    has_one = K.sum(K.round(K.clip(y_true, 0, 1)))
    has_one_pred = K.sum(K.round(K.clip(y_pred, 0, 1)))
    if has_one > tf.constant(0.0):
        has_one = tf.constant(1.0)
    else:
        has_one = tf.constant(0.0)
    if has_one_pred > tf.constant(0.0):
        has_one_pred = tf.constant(1.0)
    else:
        has_one_pred = tf.constant(0.0)
    return tf.math.equal(has_one, has_one_pred)
However, this produces unreasonable results, e.g. outputting 1 when the conditions described above are not met.
Any idea what I am doing wrong?
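A likely culprit: Python if statements do not branch on symbolic tensor values once the metric is compiled into a graph, so the conditional assignments above can misbehave. A minimal sketch of the same check written entirely with tensor ops (assuming the same K and tf imports as the snippets above):

def attack_acc(y_true, y_pred):
    # 1.0 if the tensor contains at least one positive after rounding, else 0.0
    has_one = K.cast(K.any(K.greater(K.round(K.clip(y_true, 0, 1)), 0)), 'float32')
    has_one_pred = K.cast(K.any(K.greater(K.round(K.clip(y_pred, 0, 1)), 0)), 'float32')
    # 1.0 when both agree on the presence or absence of an attack, else 0.0
    return K.cast(K.equal(has_one, has_one_pred), 'float32')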
Is there a way to get the precision for class 0 in a binary classification model using tf.keras.metrics.Precision?
I tried setting class_id to 0, but it still gives the precision of class 1.
I would like to save the model with the best class 0 precision value using a callback, which is why I need a metric for this precision in compile.
I use tf.keras.preprocessing.image_dataset_from_directory to create my dataset; the code looks the same for the train/validation/test sets (of course the train and val sets are shuffled):
ds_test = tf.keras.preprocessing.image_dataset_from_directory(
    directory = test_path,
    batch_size = my_batch_size,
    image_size = (img_height, img_width),
    shuffle = False
)
The precision metric is added in the compile method:
model.compile(loss = tf.keras.losses.BinaryCrossentropy(),
              optimizer = tf.keras.optimizers.Adam(...),
              metrics = ["accuracy",
                         tf.keras.metrics.Precision(class_id = 0, name = "precision_0")
                        ]
             )
When evaluating the model with TensorFlow's model.evaluate I get the precision of class 1 instead of class 0:
precision_0: 0.9556
Using sklearn.metrics.classification_report I got the precision for both classes:
   precision
0     0.9723
1     0.9556
I would like to get precision for class 0 in tensorflow too, in this case 0.9723. Any ideas?
Thanks in advance!
I found a workaround which can simply solve my case:
I use the class_names parameter of tf.keras.utils.image_dataset_from_directory to define the order of the classes, adding my "class 0" as the second class:
"class_names: Only valid if "labels" is "inferred". This is the explicit list of class names (must match names of subdirectories). Used to control the order of the classes (otherwise alphanumerical order is used)."
The modified code to create the dataset:
ds_test = tf.keras.preprocessing.image_dataset_from_directory(
    directory = test_path,
    class_names = ["my_class_1", "my_class_0"],
    batch_size = my_batch_size,
    image_size = (img_height, img_width),
    shuffle = False
)
Not as elegant as defining a custom metric, but it works.
Important note: This only works with binary classification!
You can write a custom metric for this. If you are using sigmoid activation, then as a prediction result you get the probability of being class 1.
By subclassing tf.keras.metrics.Metric you can invert this, so that low sigmoid outputs count as positive predictions for class 0:
class my_precision_class_0(tf.keras.metrics.Metric):
    def __init__(self, threshold, name='my_precision_class_0', **kwargs):
        super(my_precision_class_0, self).__init__(name=name, **kwargs)
        self.true_positives = self.add_weight(name='tp', initializer='zeros')
        self.false_positives = self.add_weight(name='fp', initializer='zeros')
        self.threshold = threshold

    def update_state(self, y_true, y_pred, sample_weight=None):
        # class 0 is treated as the positive class here: true label 0,
        # prediction at or below the threshold
        y_true_cls = tf.cast(tf.equal(y_true[:, 0], 0), tf.int64)
        y_pred_cls = tf.cast(tf.less_equal(y_pred[:, 0], self.threshold), tf.int64)
        true_positives = tf.math.count_nonzero(y_true_cls * y_pred_cls)
        false_positives = tf.math.count_nonzero(y_pred_cls * (1 - y_true_cls))
        self.true_positives.assign_add(tf.cast(true_positives, tf.float32))
        self.false_positives.assign_add(tf.cast(false_positives, tf.float32))

    def result(self):
        return self.true_positives / (self.true_positives + self.false_positives)

    def reset_states(self):
        self.true_positives.assign(0)
        self.false_positives.assign(0)
Here's what happens in y_pred_cls when using tf.less_equal; the same idea applies to y_true_cls:
x = tf.constant([0.4, 4.0, 6.0])
y = tf.constant([0.5])
r = tf.math.less_equal(x, y) # --> [True, False, False]
tf.cast(r, tf.int64) # --> [1, 0, 0]
We can use this metric in compile:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy', my_precision_class_0(threshold=0.5),
                       tf.keras.metrics.Precision()])

model.fit(X, y, epochs=16)
model.evaluate(X, y, batch_size=1)
# --> loss: 0.3370 - accuracy: 0.8790 - my_precision_class_0: 0.8983 - precision: 0.8617
from sklearn.metrics import classification_report
y_hat = (model.predict(X) > 0.5).astype(int)
print(classification_report(y, y_hat, digits=4))
              precision    recall  f1-score   support

           0     0.8983    0.8531    0.8751       497
           1     0.8617    0.9046    0.8826       503

    accuracy                         0.8790      1000
   macro avg     0.8800    0.8788    0.8789      1000
weighted avg     0.8799    0.8790    0.8789      1000
I'm using the following custom metrics for Keras:
def mcor(y_true, y_pred):
    # Matthews correlation coefficient
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos
    tp = K.sum(y_pos * y_pred_pos)
    tn = K.sum(y_neg * y_pred_neg)
    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)
    numerator = (tp * tn - fp * fn)
    denominator = K.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / (denominator + K.epsilon())
def precision(y_true, y_pred):
    """Precision metric.

    Only computes a batch-wise average of precision.
    Computes the precision, a metric for multi-label classification of
    how many selected items are relevant.
    """
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def recall(y_true, y_pred):
    """Recall metric.

    Only computes a batch-wise average of recall.
    Computes the recall, a metric for multi-label classification of
    how many relevant items are selected.
    """
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall
def f1(y_true, y_pred):
    def recall(y_true, y_pred):
        """Recall metric.

        Only computes a batch-wise average of recall.
        Computes the recall, a metric for multi-label classification of
        how many relevant items are selected.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        """Precision metric.

        Only computes a batch-wise average of precision.
        Computes the precision, a metric for multi-label classification of
        how many selected items are relevant.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision

    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2 * ((precision * recall) / (precision + recall + K.epsilon()))
This is the compilation statement:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', precision, recall, f1])
Using ModelCheckpoint, the Keras model is saved automatically as the best model is found. The classification categories have been one-hot encoded.
However, when the saved model is loaded back using:
# load model
from keras.models import load_model
custom_obj = {'accuracy':accuracy, 'Loss':Loss, 'precision':precision, 'recall':recall, 'f1':f1}
model = load_model('Asset_3_best_model.h5', custom_objects=custom_obj)
Custom objects from the previously defined custom Keras functions are listed here.
I observe the following error when the model is loaded back from memory:
ValueError: ('Could not interpret metric function identifier:',
0.8701059222221375)
I've tried many different custom functions, but I couldn't find a solution to re-load my saved model. This is a multi-class classification time-series challenge, and I hope to learn whether there is an easier method to solve this metric calculation.
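One workaround sketch that often helps with this particular error (an assumption on my part, not taken from the answer below): load the model without compiling, then recompile with the custom metric functions in scope:

from keras.models import load_model

# Skip restoring the compiled state, which is what trips over the stored
# metric value, then recompile with the custom metrics available in scope.
model = load_model('Asset_3_best_model.h5', compile=False)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy', precision, recall, f1])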
I'm also working on finding a way to calculate the F1 score for my binary classification problem. I came across a TensorFlow tutorial and it worked for me: https://www.tensorflow.org/tutorials/structured_data/imbalanced_data
It is not a custom implementation, but a direct one.
METRICS = [
    keras.metrics.TruePositives(name='tp'),
    keras.metrics.FalsePositives(name='fp'),
    keras.metrics.TrueNegatives(name='tn'),
    keras.metrics.FalseNegatives(name='fn'),
    keras.metrics.BinaryAccuracy(name='accuracy'),
    keras.metrics.Precision(name='precision'),
    keras.metrics.Recall(name='recall'),
    keras.metrics.AUC(name='auc'),
]
After this, you will have to add a parameter to the compile function:
model.compile(...,metrics=METRICS)
I commented out tp, fp, tn, and fn for my code and got the output below:
Train on 2207 samples, validate on 552 samples
Epoch 1/6
- 7s - loss: 1.2502 - accuracy: 0.6357 - precision: 0.4252 - recall: 0.0688 - auc: 0.5138 - val_loss: 0.6229 - val_accuracy: 0.6667 - val_precision: 0.8000 - val_recall: 0.0214 - val_auc: 0.6800
Epoch 2/6
- 7s - loss: 0.6451 - accuracy: 0.6461 - precision: 0.7500 - recall: 0.0076 - auc: 0.5735 - val_loss: 0.6368 - val_accuracy: 0.6685 - val_precision: 0.8333 - val_recall: 0.0267 - val_auc: 0.7144
...
Check if this fixes your problem. If I have missed anything, please let me know.
I am working on a multi-label image classification problem with the evaluation being conducted in terms of F1-score between system predicted and ground truth labels.
Given that, should I use loss="binary_crossentropy" or loss=keras_metrics.f1_score(), where keras_metrics.f1_score() is taken from here: https://pypi.org/project/keras-metrics/? I am a bit confused because all of the tutorials I have found on the Internet regarding multi-label classification are based on the binary_crossentropy loss function, but here I have to optimize against the F1-score.
Furthermore, should I set metrics=["accuracy"], or maybe metrics=[keras_metrics.f1_score()], or should I leave this completely empty?
Based on user706838's answer ...
use the f1_loss from https://www.kaggle.com/rejpalcz/best-loss-function-for-f1-score-metric
import tensorflow as tf
import keras.backend as K

def f1_loss(y_true, y_pred):
    tp = K.sum(K.cast(y_true * y_pred, 'float'), axis=0)
    tn = K.sum(K.cast((1 - y_true) * (1 - y_pred), 'float'), axis=0)
    fp = K.sum(K.cast((1 - y_true) * y_pred, 'float'), axis=0)
    fn = K.sum(K.cast(y_true * (1 - y_pred), 'float'), axis=0)

    p = tp / (tp + fp + K.epsilon())
    r = tp / (tp + fn + K.epsilon())

    f1 = 2 * p * r / (p + r + K.epsilon())
    f1 = tf.where(tf.math.is_nan(f1), tf.zeros_like(f1), f1)  # tf.is_nan in TF 1.x
    return 1 - K.mean(f1)
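A minimal usage sketch, assuming a multi-label model with sigmoid outputs (the compile arguments here are illustrative, not prescribed by the kernel):

model.compile(optimizer='adam', loss=f1_loss, metrics=['binary_accuracy'])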
My problem is that I don't want the weights to be adjusted if y_true takes certain values. I do not want to simply remove those examples from the training data because of the nature of the RNN I am trying to use.
Is there a way to write a conditional loss function in Keras with this behavior?
For example: if y_true is negative, apply a zero gradient so that the parameters in the model do not change; if y_true is positive, loss = losses.mean_squared_error(y_true, y_pred).
You can define a custom loss function and simply use K.switch to conditionally get zero loss:
from keras import backend as K
from keras import losses

def custom_loss(y_true, y_pred):
    loss = losses.mean_squared_error(y_true, y_pred)
    return K.switch(K.flatten(K.equal(y_true, 0.)), K.zeros_like(loss), loss)
Test:
import numpy as np
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(1, input_shape=(1,)))
model.compile(loss=custom_loss, optimizer='adam')

weights, bias = model.layers[0].get_weights()

x = np.array([1, 2, 3])
y = np.array([0, 0, 0])
model.train_on_batch(x, y)

# check that the parameters have not changed after training on the batch
>>> (weights == model.layers[0].get_weights()[0]).all()
True
>>> (bias == model.layers[0].get_weights()[1]).all()
True
Since the y's come in batches, you need to select those from the batch which are non-zero inside the custom loss function:
import tensorflow as tf
from keras import losses

def myloss(y_true, y_pred):
    idx = tf.not_equal(y_true, 0)
    y_true = tf.boolean_mask(y_true, idx)
    y_pred = tf.boolean_mask(y_pred, idx)
    return losses.mean_squared_error(y_true, y_pred)
Then it can be used as such:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(32, input_shape=(2,)), Dense(1)])
model.compile('adam', loss=myloss)

x = np.random.randn(2, 2)
y = np.array([1, 0])
model.fit(x, y)
But you might need extra logic in the loss function in case all y_true values in the batch are zero; in that case, the loss function can be modified as follows:
def myloss2(y_true, y_pred):
    idx = tf.not_equal(y_true, 0)
    y_true = tf.boolean_mask(y_true, idx)
    y_pred = tf.boolean_mask(y_pred, idx)
    loss = tf.cond(tf.equal(tf.shape(y_pred)[0], 0),
                   lambda: tf.constant(0, dtype=tf.float32),
                   lambda: losses.mean_squared_error(y_true, y_pred))
    return loss
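A quick hedged sanity check of the all-zero branch (assuming TF 2.x eager execution and the same imports as above; the tensor values are illustrative):

y_true = tf.constant([[0.0], [0.0]])
y_pred = tf.constant([[0.3], [-1.2]])
# every label is masked out, so tf.cond takes the constant-zero branch
print(myloss2(y_true, y_pred).numpy())  # expected: 0.0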