I want to evaluate a classification TensorFlow model.
To compute the accuracy, I have the following code:
predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
accuracy = tf.metrics.accuracy(labels=label_ids, predictions=predictions)
It works well for single-label classification, but now I want to do multi-label classification, where my labels are arrays of integers instead of single integers.
Here is an example label, [0, 1, 1, 0, 1, 0], as stored in label_ids, and an example prediction, [0.1, 0.8, 0.9, 0.1, 0.6, 0.2], from the tensor logits.
What function should I use instead of argmax to do this? (My labels are arrays of 6 integers, each either 0 or 1.)
If needed, we can assume a threshold of 0.5.
It is probably better to do this type of post-processing evaluation outside of tensorflow, where it is more natural to try several different thresholds.
If you want to do it in tensorflow, you can consider:
predictions = tf.math.greater(logits, tf.constant(0.5))
This will return a tensor of the original logits shape with True for all entries greater than 0.5. You can then calculate accuracy as before. This is suitable for cases where many labels can be simultaneously true for a given sample.
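If you stay in TensorFlow, a minimal sketch of that comparison under TF2 eager execution (the snippets in this thread use the TF1 tf.metrics API, so adapt as needed) could look like the following, using the example label and logits from the question. Note that for multi-label data you have to choose between element-wise accuracy and exact-match accuracy:
import tensorflow as tf

# Example values from the question; in practice these come from your data/model.
label_ids = tf.constant([[0, 1, 1, 0, 1, 0]], dtype=tf.int32)
logits = tf.constant([[0.1, 0.8, 0.9, 0.1, 0.6, 0.2]], dtype=tf.float32)

predictions = tf.cast(tf.math.greater(logits, 0.5), tf.int32)

# Element-wise accuracy: fraction of individual labels predicted correctly.
elementwise_acc = tf.reduce_mean(tf.cast(tf.equal(predictions, label_ids), tf.float32))

# Exact-match accuracy: a sample counts only if all 6 labels match.
exact_match_acc = tf.reduce_mean(
    tf.cast(tf.reduce_all(tf.equal(predictions, label_ids), axis=-1), tf.float32))

print(elementwise_acc.numpy(), exact_match_acc.numpy())  # 1.0 1.0 for this example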
Use the code below to calculate accuracy for multiclass classification:
tf.argmax returns the index along the class axis where the value is largest, for both y_pred and y_true (the actual y).
tf.equal is then used to find the matches (it returns True/False per sample).
Convert the booleans to floats (i.e. 0 or 1) and use tf.reduce_mean to calculate the accuracy.
correct_mask = tf.equal(tf.argmax(y_pred,1), tf.argmax(y_true,1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
Edit
Example with data:
import numpy as np
y_pred = np.array([[0.1,0.5,0.4], [0.2,0.6,0.2], [0.9,0.05,0.05]])
y_true = np.array([[0,1,0],[0,0,1],[1,0,0]])
correct_mask = tf.equal(tf.argmax(y_pred,1), tf.argmax(y_true,1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
with tf.Session() as sess:
    # print(sess.run([correct_mask]))
    print(sess.run([accuracy]))
Output:
[0.6666667]
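The session-based snippet above is TF1 style; if you are on TF2 with eager execution, the same computation can be written without a Session (a small sketch with the same example data):
import numpy as np
import tensorflow as tf

y_pred = np.array([[0.1, 0.5, 0.4], [0.2, 0.6, 0.2], [0.9, 0.05, 0.05]])
y_true = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])

correct_mask = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
print(accuracy.numpy())  # 0.6666667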
Related
I have y_test with shape (2005, 1). I built a model and made a prediction as follows: prediction = model1.predict(np.array(test_x))
The prediction has shape (2005, 7).
I wanted to get the model accuracy, but due to the different shapes I can't perform the accuracy calculation. The way the prediction works, each class has its own column, and there is a 1 in the column of the class the picture belongs to. Is there a way to turn this into a vector where each row holds the class number?
Also, is there another way to get the accuracy of the prediction if I have the labelled test set and want to compare it with the prediction?
Assuming prediction is the predicted probability of test_x for each class, you may use prediction.argmax(axis=1) to get a size-2005 vector containing the index of the predicted class (from 0 to 6). The prediction loss can also be computed:
pred_class = prediction.argmax(axis=1)
l2loss = ((pred_class - y_test[:, 0]) ** 2).mean()
y_test[:, 0] merely changes the (2005, 1) matrix into a vector of size 2005. Or you may use the 0/1 loss:
binary_loss = (pred_class != y_test[:, 0]).mean()
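For the accuracy part of the question, the same pred_class vector can be compared to the labels directly; accuracy is simply 1 minus the 0/1 loss above. A minimal sketch, assuming y_test[:, 0] holds the integer class indices (0 to 6):
accuracy = (pred_class == y_test[:, 0]).mean()

# or, equivalently, with scikit-learn
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test[:, 0], pred_class)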
I want to build a custom accuracy metric with tolerance. Instead of counting elements of y_true and y_pred that are exactly equal, this accuracy regards two elements as consistent if their difference is within a given tolerance value. For example, if the differences between predicted degrees and true degrees are smaller than 5 degrees, we can consider the results correct and calculate the accuracy based on this rule. I want to use this metric in model.compile, so it should be a callable function.
I wrote a function as follows.
# imports assumed by the snippet (not shown in the original post)
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.python.ops import math_ops

def accuracy_with_tolerence(y_true, y_pred):
    """
    y_true/y_pred: batch of samples; (BatchSize, 1)
    """
    threshold = 5
    differnece = tf.abs(tf.subtract(y_true, y_pred)) - threshold
    boolean_results = [True if i < 0 else False for i in differnece]
    return K.mean(math_ops.cast(boolean_results, K.floatx()))
It can return the correct accuracy value.
x = tf.constant([1, 2, 3], dtype=tf.float32)
y = tf.constant([5, 8, 10], dtype=tf.float32)
acc = accuracy_with_tolerence(x,y)
print(acc)
tf.Tensor(0.33333334, shape=(), dtype=float32)
But when I want to use it in compile, there is an error:
# Initialize ResNet50
model = resnet50()
model.compile(optimizer='adam',loss='mse',metrics=[accuracy_with_tolerence])
model.load_weights(checkpoint_filepath_0)
model.evaluate(x_test,y_test)
OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
It seems I cannot iterate the Tensor. So how can I get element-wise boolean comparison results in the metric function? How can I realize this accuracy function?
Thank you in advance.
You can't iterate over a tensor in a list comprehension once the metric runs in graph mode (which is why it works eagerly but fails inside compile). The operation you're looking for is tf.where, and you can use it as follows:
def accuracy_with_tolerence(y_true, y_pred):
    threshold = 5
    differnece = tf.abs(tf.subtract(y_true, y_pred)) - threshold
    boolean_results = tf.where(differnece < 0, True, False)
    return K.mean(math_ops.cast(boolean_results, K.floatx()))
Note that you can simplify the code further:
...
    boolean_results = tf.where(tf.abs(tf.subtract(y_true, y_pred)) - threshold < 0, 1., 0.)
    return K.mean(boolean_results)
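As a side note, an equivalent formulation skips the subtract-then-check-sign pattern and thresholds the absolute difference directly. This is a minimal sketch of my own (the function name is made up), and it reproduces the 0.33333334 value from the question; it can be passed to model.compile(metrics=[...]) like any other callable metric:
import tensorflow as tf
from tensorflow.keras import backend as K

def accuracy_with_tolerance_v2(y_true, y_pred):
    # A prediction counts as correct when |y_true - y_pred| < threshold.
    threshold = 5.0
    within_tol = tf.abs(tf.cast(y_true, tf.float32) - tf.cast(y_pred, tf.float32)) < threshold
    return K.mean(tf.cast(within_tol, K.floatx()))

x = tf.constant([1, 2, 3], dtype=tf.float32)
y = tf.constant([5, 8, 10], dtype=tf.float32)
print(accuracy_with_tolerance_v2(x, y))  # tf.Tensor(0.33333334, shape=(), dtype=float32)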
I have found an implementation of Monte Carlo Dropout in PyTorch. The main idea of this method is to set the dropout layers of the model to train mode at test time, which allows different dropout masks to be used during the various forward passes.
The implementation illustrates how the predictions from the multiple forward passes are stacked together and used for computing different uncertainty metrics.
import sys
import numpy as np
import torch
import torch.nn as nn


def enable_dropout(model):
    """ Function to enable the dropout layers during test-time """
    for m in model.modules():
        if m.__class__.__name__.startswith('Dropout'):
            m.train()


def get_monte_carlo_predictions(data_loader,
                                forward_passes,
                                model,
                                n_classes,
                                n_samples):
    """ Function to get the monte-carlo samples and uncertainty estimates
    through multiple forward passes

    Parameters
    ----------
    data_loader : object
        data loader object from the data loader module
    forward_passes : int
        number of monte-carlo samples/forward passes
    model : object
        pytorch model
    n_classes : int
        number of classes in the dataset
    n_samples : int
        number of samples in the test set
    """
    dropout_predictions = np.empty((0, n_samples, n_classes))
    softmax = nn.Softmax(dim=1)
    for i in range(forward_passes):
        predictions = np.empty((0, n_classes))
        model.eval()
        enable_dropout(model)
        for i, (image, label) in enumerate(data_loader):
            image = image.to(torch.device('cuda'))
            with torch.no_grad():
                output = model(image)
                output = softmax(output)  # shape (n_samples, n_classes)
            predictions = np.vstack((predictions, output.cpu().numpy()))
        dropout_predictions = np.vstack((dropout_predictions,
                                         predictions[np.newaxis, :, :]))
    # dropout_predictions - shape (forward_passes, n_samples, n_classes)

    # Calculating mean across multiple MCD forward passes
    mean = np.mean(dropout_predictions, axis=0)  # shape (n_samples, n_classes)

    # Calculating variance across multiple MCD forward passes
    variance = np.var(dropout_predictions, axis=0)  # shape (n_samples, n_classes)

    epsilon = sys.float_info.min
    # Calculating entropy across multiple MCD forward passes
    entropy = -np.sum(mean * np.log(mean + epsilon), axis=-1)  # shape (n_samples,)

    # Calculating mutual information across multiple MCD forward passes
    mutual_info = entropy - np.mean(np.sum(-dropout_predictions * np.log(dropout_predictions + epsilon),
                                           axis=-1), axis=0)  # shape (n_samples,)
What I'm trying to do is calculate accuracy across the different forward passes. Can anyone please help me with how to get the accuracy, and with any changes needed to the dimensions used in this implementation?
I am using the CIFAR10 dataset and would like to use dropout at test time. The code for the data_loader:
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=test_transform)
# loading the test set
data_loader = torch.utils.data.DataLoader(testset, batch_size=n_samples,
                                          shuffle=False, num_workers=4)
Accuracy is the percentage of correctly classified samples. You can create a boolean array that indicates whether a certain prediction is equal to its corresponding reference value, and you can get the mean of these values to calculate accuracy. I have provided a code example of this below.
import numpy as np
# 2 forward passes, 4 samples, 3 classes
# shape is (2, 4, 3)
dropout_predictions = np.asarray([
    [[0.2, 0.1, 0.7], [0.1, 0.5, 0.4], [0.9, 0.05, 0.05], [0.25, 0.74, 0.01]],
    [[0.1, 0.5, 0.4], [0.2, 0.6, 0.2], [0.8, 0.10, 0.10], [0.25, 0.01, 0.74]]
])
# Get the predicted value for each sample in each forward pass.
# Shape of output is (2, 4).
classes = dropout_predictions.argmax(-1)
# array([[2, 1, 0, 1],
# [1, 1, 0, 2]])
# Test equality among the reference values and predicted classes.
# Shape is unchanged.
y_true = np.asarray([2, 1, 0, 1])
elementwise_equal = np.equal(y_true, classes)
# array([[ True, True, True, True],
# [False, True, True, False]])
# Calculate the accuracy for each forward pass.
# Shape is (2,).
elementwise_equal.mean(axis=1)
# array([1. , 0.5])
In the example above, you can see that the accuracy for the first forward pass was 100%, and the accuracy for the second forward pass was 50%.
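If you also want a single accuracy number for the MC-dropout ensemble rather than one per pass, a common choice (an assumption on my part, since the question does not specify it) is to average the softmax outputs over the forward passes first and then take the argmax of the mean; continuing with the arrays above:
# Average the class probabilities over the forward passes, then classify.
mean_prediction = dropout_predictions.mean(axis=0)    # shape (4, 3)
ensemble_classes = mean_prediction.argmax(axis=-1)    # shape (4,)
ensemble_accuracy = np.equal(y_true, ensemble_classes).mean()  # single scalar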
@jakub's answer is correct. However, I wanted to propose an alternate approach that may be better, especially for newer researchers.
Scikit-learn comes with many built-in performance measurement functions, including accuracy. To get those to work with PyTorch, you only need to convert your torch tensors to numpy arrays:
x = torch.Tensor(...)  # Fill-in as needed
x_np = x.numpy()  # Convert to numpy (use x.cpu().numpy() if the tensor lives on the GPU)
Then, you simply import scikit-learn:
from sklearn.metrics import accuracy_score
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]
accuracy_score(y_true, y_pred)
This simply returns 0.5. Easy peasy and less likely to have a bug.
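Tying the two steps together for the MC-dropout setting, a small sketch (with made-up tensors standing in for one batch of output and label from the loop in the question) might look like this:
import torch
from sklearn.metrics import accuracy_score

# Hypothetical stand-ins for one batch of softmax outputs and integer labels.
output = torch.tensor([[0.2, 0.1, 0.7],
                       [0.1, 0.5, 0.4],
                       [0.9, 0.05, 0.05]])
label = torch.tensor([2, 1, 1])

pred_classes = output.argmax(dim=1).cpu().numpy()   # predicted class per sample
print(accuracy_score(label.cpu().numpy(), pred_classes))  # 0.666... (2 of 3 correct)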
I'm trying to segment data where the label can be quite sparse. Therefore I want to only calculate gradients in columns that have at least one nonzero value.
I've tried some methods where I apply an extra input which is the mask of these nonzero columns, but given that all the necessary information already is contained in y_true, a method which only looks at y_true to find the mask would definitely be preferable.
If I would implement it with numpy, it would probably look something like this:
def loss(y_true, y_pred):
    indices = np.where(np.sum(y_true, axis=1) > 0)
    return binary_crossentropy(y_true[indices], y_pred[indices])
y_true and y_pred are in this example vectorized 2D images.
How could this be "translated" to a differentiable Keras loss function?
Use tf-compatible operations, via tf and keras.backend:
import tensorflow as tf
import keras.backend as K
from keras.losses import binary_crossentropy
def custom_loss(y_true, y_pred):
    indices = K.squeeze(tf.where(K.sum(y_true, axis=1) > 0), axis=-1)
    y_true_sparse = K.cast(K.gather(y_true, indices), dtype='float32')
    y_pred_sparse = K.cast(K.gather(y_pred, indices), dtype='float32')
    return binary_crossentropy(y_true_sparse, y_pred_sparse)  # returns a tensor
I'm unsure about the exact dimensionality specs of your problem, but loss must evaluate to a single value - which above doesn't, since you're passing multi-dimensional predictions and labels. To reduce dims, wrap the return above with e.g. K.mean. Example:
y_true = np.random.randint(0,2,(10,2))
y_pred = np.abs(np.random.randn(10,2))
y_pred /= np.max(y_pred) # scale between 0 and 1
print(K.get_value(custom_loss(y_true, y_pred))) # get_value evaluates returned tensor
print(K.get_value(K.mean(custom_loss(y_true, y_pred))))
>> [1.1489482 1.2705883 0.76229745 5.101402 3.1309896] # sparse; 5 / 10 results
>> 2.28284 # single value, as required
(Lastly, note that this sparsity will bias the loss by excluding all-zero columns from the total label/pred count; if undesired, you can average via K.sum and K.shape or K.size)
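An alternative formulation (my own sketch, not part of the answer above) uses tf.boolean_mask to select the non-empty rows, which avoids the where/squeeze/gather chain; the imports follow the ones already shown:
import tensorflow as tf
from keras.losses import binary_crossentropy

def custom_loss_masked(y_true, y_pred):
    # Keep only the rows of y_true that contain at least one nonzero label.
    mask = tf.reduce_sum(tf.cast(y_true, 'float32'), axis=1) > 0
    y_true_kept = tf.boolean_mask(tf.cast(y_true, 'float32'), mask)
    y_pred_kept = tf.boolean_mask(tf.cast(y_pred, 'float32'), mask)
    return tf.reduce_mean(binary_crossentropy(y_true_kept, y_pred_kept))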
I am learning AI with Python and have this situation: I created a deep learning model that has 10 neurons in its input layer and 3 neurons in the output layer. I split my data into 80% for training and 20% for testing.
The trained model is ready for testing.
Until now, I have always had the situation of only one neuron in the output layer, so I tested the accuracy this way:
classifier = Sequential()
# ...
classifier.add(Dense(units = 3, kernel_initializer = 'uniform', activation = 'sigmoid'))
# ...
y_pred = classifier.predict(np.array(X_test))
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
which works great when the output layer has only ONE value in each prediction.
In my case, I have 3 values in each prediction.
y_pred = array([[3.142904686503911194e-11, 1.000000000000000000e+00, 1.729809626091548085e-16],
                [7.398544450698540942e-12, 1.000000000000000000e+00, 1.776427415878292515e-22],
                [4.224535246066807304e-07, 1.000000000000000000e+00, 7.929732391553923065e-12]])
And I want to compare it to my expected values, which are:
y_test = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
So, I have the option to make this work manually:
Put 1 in the highest value in the prediction value. Other values are getting 0.
Compare the two vectors row by row.
It seems like there must be a better way to do it?
You want to measure how "close" the prediction vector is to the expected vector. A good formula that describes the "amount of difference" between two vectors is to check the magnitude (or square magnitude) of the delta vector (prediction - expected).
In this case, you can do something like this:
def square_magnitude(vector):
    return sum(x*x for x in vector)

def inaccuracy(pred, test):  # should only get equal-length items
    return square_magnitude([pred[i] - test[i] for i in range(len(pred))]) / len(pred)
Since you have three samples:
total_inaccuracy = sum(inaccuracy(y_pred[i], y_test[i]) for i in range(len(y_pred))) / len(y_pred)
This should be 0 when it's perfectly accurate and higher (positive) when it's less accurate.
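For completeness, the manual route described in the question (put a 1 at the highest value in each prediction, then compare row by row) is a one-liner with numpy's argmax, and it gives a conventional classification accuracy rather than a distance:
import numpy as np

y_pred = np.array([[3.14e-11, 1.0, 1.73e-16],
                   [7.40e-12, 1.0, 1.78e-22],
                   [4.22e-07, 1.0, 7.93e-12]])
y_test = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])

# Compare the index of the largest entry in each row.
matches = np.argmax(y_pred, axis=1) == np.argmax(y_test, axis=1)
accuracy = matches.mean()  # 1.0 for this example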