I have a question about roc_curve from scikit-learn in a deep learning exercise. My data has 1 as the positive label. After training, the test accuracy comes out around 74%, but the ROC area under the curve (AUC) score comes out at only 0.24.
from sklearn import metrics

# scores predicted by the model for the two test inputs
y_pred = model.predict([x_test_real[:, 0], x_test_real[:, 1]])
fpr, tpr, thresholds = metrics.roc_curve(y_test_real, y_pred, pos_label=1)
roc_auc = metrics.auc(fpr, tpr)
print("roc_auc: %0.2f" % roc_auc)
If I change pos_label to 0, the AUC score becomes 0.76 (obviously; a quick sanity check of this complementary behaviour is sketched after the snippet below).
y_pred = model.predict([x_test_real[:, 0], x_test_real[:, 1]])
fpr, tpr, thresholds = metrics.roc_curve(y_test_real, y_pred, pos_label=0)
roc_auc = metrics.auc(fpr, tpr)
print("roc_auc: %0.2f" % roc_auc)
Then I ran a small experiment: I flipped my training and testing labels (this is a binary classification problem),
y_train_real = 1 - y_train_real
y_test_real = 1 - y_test_real
which should swap the positive and negative labels, turning the 1s into 0s. Then I ran my code again, this time expecting the ROC AUC behaviour to flip as well. But no!
fpr, tpr, thresholds = metrics.roc_curve(y_test_real, y_pred,pos_label=0)
still gives about 0.80, and with pos_label=1 it gives about 0.20. This is confusing me.
If I flip the positive label in my training target, should it not affect the roc_curve AUC values?
Which case is the correct analysis?
Does the output have anything to do with the loss function used? I am solving a binary classification problem (match vs. non-match) using a contrastive loss.
Can anyone help me here? :)
I am trying to figure out the best threshold for turning probability predictions (from logistic regression) into hard labels in binary classification. I have read that Youden's J statistic (calculating the True Positive Rate minus the False Positive Rate at different thresholds and picking the one with the highest TPR - FPR value) is a good way to do this. So I have put together the following:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_curve

model = LogisticRegression()
model.fit(X, y)
y_pred = model.predict(X)
y_pred_proba = model.predict_proba(X)
y_pred_proba = y_pred_proba[:, 1]  # probability of the positive class
fpr, tpr, thresholds = roc_curve(y, y_pred_proba, drop_intermediate=True)
Then, I calculated the best threshold with the following:
best_J_index = np.argmax(tpr-fpr)
best_threshold = thresholds[best_J_index]
This indeed gives a better result than the default 0.5 probability threshold, but when I check all the thresholds in thresholds against the F1-score with the following:
for thr in thresholds:
    y_pred_hard = np.where(y_pred_proba > thr, 1, 0)
    print(f"Threshold: {thr}, F1: {f1_score(y, y_pred_hard)}")
I get a different best threshold! Is there a calculation error in my script, or do F1 and Youden's J not necessarily agree on the threshold (and if so, why not)?
I am running a Convolutional Neural Network. After it finishes running, I use some metrics to evaluate the performance of the model. Two of the metrics are auc and roc_auc_score from sklearn:
AUC function: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.auc.html?highlight=auc#sklearn.metrics.auc
AUROC function: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html#sklearn.metrics.roc_auc_score
The code I am using is the following:
print(pred)
fpr, tpr, thresholds = metrics.roc_curve(true_classes, pred, pos_label=1)
print("-----AUC-----")
print(metrics.auc(fpr, tpr))
print("----ROC AUC-----")
print(metrics.roc_auc_score(true_classes, pred))
where true_classes is an array of the form [0 1 0 1 1 0], in which 1 is the positive label and 0 the negative,
and pred holds the predictions of the model:
prediction = classifier.predict(test_final)
prediction1 = []
for preds in prediction:
    prediction1.append(preds[0])  # keep the single output value per sample
pred = prediction1
However, I am getting the same AUC and ROC AUC value no matter how many times I run the test. (What I mean is that the AUC and ROC AUC values within each test are the same, not that they stay the same across tests. For example, for test 1 I get AUC = 0.987 and ROC_AUC = 0.987, and for test 2 I get AUC = 0.95 and ROC_AUC = 0.95.) Am I doing something wrong, or is this normal?
As per the documentation linked, metrics.auc is a general method that calculates the area under a curve from the points of that curve.
metrics.roc_auc_score is a specific method used to calculate the area under the ROC curve.
You would not expect to see different results if you are using the same data to calculate both, since metrics.roc_auc_score does the same thing as metrics.auc and, most likely, uses metrics.auc itself under the hood (i.e. it applies the general method to the specific task of calculating the area under the ROC curve).
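A quick way to convince yourself of this (made-up labels and scores, purely for illustration):

import numpy as np
from sklearn import metrics

y_true = np.array([0, 1, 0, 1, 1, 0])
y_score = np.array([0.2, 0.8, 0.4, 0.9, 0.6, 0.3])

fpr, tpr, thresholds = metrics.roc_curve(y_true, y_score, pos_label=1)
print(metrics.auc(fpr, tpr))                    # area under the ROC points
print(metrics.roc_auc_score(y_true, y_score))   # same number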
I have data with 3 class labels (0, 1, 2). I tried to make a ROC curve and did it by using the pos_label parameter.
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label = 0)
By changing pos_label to 0, 1 and 2, I get 3 graphs. Now I am having an issue calculating the AUC score.
How can I average the 3 graphs, plot 1 graph from them, and then calculate the ROC AUC score?
I get an error from this:
metrics.roc_auc_score(Ytest, y_pred_prob)
ValueError: multiclass format is not supported
Please help me.
import matplotlib.pyplot as plt
from sklearn import metrics

# store the predicted probabilities for class 0
y_pred_prob = cls.predict_proba(Xtest)[:, 0]
# first argument is true values, second argument is predicted probabilities
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label=0)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.grid(True)
# store the predicted probabilities for class 1
y_pred_prob = cls.predict_proba(Xtest)[:, 1]
# first argument is true values, second argument is predicted probabilities
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label=1)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.grid(True)
# store the predicted probabilities for class 2
y_pred_prob = cls.predict_proba(Xtest)[:, 2]
# first argument is true values, second argument is predicted probabilities
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label=2)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.grid(True)
From the above code, 3 ROC curves are generated, due to the multiple classes.
I want to get one ROC curve from the above 3 by taking their average or mean, and then one roc_auc score from that.
Highlights of multi-class AUC:
You cannot calculate one common AUC for all classes directly; you must calculate the AUC for each class separately, just as you calculate recall and precision separately for each class in multi-class classification.
THE SIMPLEST method of calculating the AUC for individual classes:
We choose a classifier
from sklearn.linear_model import LogisticRegression
LRE = LogisticRegression(solver='lbfgs')
LRE.fit(X_train, y_train)
I make a list of the classes:
d = y_test.unique()
class_name = list(d.flatten())
class_name
Now calculate the AUC for each class separately
for p in class_name:
    # use the probability column that corresponds to class p
    class_column = list(LRE.classes_).index(p)
    fpr, tpr, thresholds = metrics.roc_curve(y_test,
                                             LRE.predict_proba(X_test)[:, class_column],
                                             pos_label=p)
    auroc = round(metrics.auc(fpr, tpr), 2)
    print('LRE', p, '--AUC--->', auroc)
For multiclass, it is often useful to calculate the AUROC for each class. For example, here's an excerpt from some code I use to calculate AUROC for each class separately, where label_meanings is a list of strings describing what each label is, and the various arrays are formatted such that each row is a different example and each column corresponds to a different label:
import sklearn.metrics

for label_number in range(len(label_meanings)):
    which_label = label_meanings[label_number]  # descriptive string for the label
    true_labels = true_labels_array[:, label_number]
    pred_probs = pred_probs_array[:, label_number]
    # AUROC and AP (sliding across multiple decision thresholds)
    fpr, tpr, thresholds = sklearn.metrics.roc_curve(y_true=true_labels,
                                                     y_score=pred_probs,
                                                     pos_label=1)
    auc = sklearn.metrics.auc(fpr, tpr)
If you want to plot an average AUC curve across your three classes: This code https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html includes parts that calculate the average AUC so that you can make a plot (if you have three classes, it will plot the average AUC for the three classes.)
If you just want an average AUC across your three classes: once you have calculated the AUC of each class separately you can average the three numbers to get an overall AUC.
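As a rough sketch of that averaging (assuming the fitted classifier cls, Xtest and Ytest from the question; recent scikit-learn versions can also compute the same macro average in one call):

import numpy as np
from sklearn import metrics
from sklearn.preprocessing import label_binarize

classes = [0, 1, 2]
y_score = cls.predict_proba(Xtest)                 # shape (n_samples, 3)
y_onehot = label_binarize(Ytest, classes=classes)  # one column per class

# one-vs-rest AUC per class, then the plain mean of the three numbers
per_class_auc = []
for i in range(len(classes)):
    fpr, tpr, _ = metrics.roc_curve(y_onehot[:, i], y_score[:, i])
    per_class_auc.append(metrics.auc(fpr, tpr))
print(per_class_auc, np.mean(per_class_auc))

# equivalent macro-averaged score in a single call (scikit-learn >= 0.22)
print(metrics.roc_auc_score(Ytest, y_score, multi_class='ovr', average='macro'))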
If you want more background on AUROC and how it is calculated for single class versus multi class you can see this article, Measuring Performance: AUC (AUROC).
I have a problem understanding how to use the ROC libraries.
I want to plot a ROC curve with Python:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
I am writing a program which evaluates detectors (Haar cascade, neural networks).
So I already have the data saved in a file in the following format:
0.5 TP
0.43 FP
0.72 FN
0.82 TN
...
where TP means True Positive, FP False Positive, FN False Negative, and TN True Negative.
I parse it and fill 4 arrays with this data set.
Then I want to put this in
fpr, tpr, thresholds = sklearn.metrics.roc_curve(y_true, y_score, sample_weight=None)
But how do I do this? What are y_true and y_score in my case?
Afterwards, I put fpr and tpr into
auc = sklearn.metrics.auc(fpr, tpr)
Quoting Wikipedia:
The ROC is created by plotting the FPR (false positive rate) vs the TPR (true positive rate) at various threshold settings.
In order to compute FPR and TPR, you must provide the true binary values and the target scores to the function sklearn.metrics.roc_curve.
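Each line of your file already gives you both pieces of information: the number is the detector's score (y_score), and a TP or FN entry means the ground truth was positive, while FP or TN means it was negative (y_true). A rough sketch of the parsing, assuming the file format shown above (the filename is made up):

y_true = []   # 1 if the ground truth was positive (TP or FN), 0 otherwise (FP or TN)
y_score = []  # the detector's score from the first column

with open('detections.txt') as f:
    for line in f:
        score, outcome = line.split()
        y_score.append(float(score))
        y_true.append(1 if outcome in ('TP', 'FN') else 0)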
So in your case, I would do something like this:
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve
from sklearn.metrics import auc

# Compute fpr, tpr, thresholds and roc auc
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)

# Plot ROC curve
plt.plot(fpr, tpr, label='ROC curve (area = %0.3f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')  # random predictions curve
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel('False Positive Rate or (1 - Specificity)')
plt.ylabel('True Positive Rate or (Sensitivity)')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
If you want a deeper understanding of how the false positive rate and the true positive rate are computed for all the possible threshold values, I suggest you read this article.
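If it helps, here is a rough sketch of that computation done by hand and checked against roc_curve (the labels and scores are made up):

import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# recompute TPR and FPR manually at each threshold returned by roc_curve:
# a sample is predicted positive when its score is >= the threshold
for thr, f, t in zip(thresholds, fpr, tpr):
    pred_pos = y_score >= thr
    tp = np.sum(pred_pos & (y_true == 1))
    fp = np.sum(pred_pos & (y_true == 0))
    manual_tpr = tp / np.sum(y_true == 1)
    manual_fpr = fp / np.sum(y_true == 0)
    print(thr, (manual_fpr, manual_tpr), (f, t))  # the pairs should match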
I have difficulty plotting an AUC plot for OneClassSVM in Python (I am using sklearn, which generates a confusion matrix like [[tp, fp], [fn, tn]] with fn = tn = 0).
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

fpr, tpr, thresholds = roc_curve(y_test, y_nb_predicted)
roc_auc = auc(fpr, tpr)  # this generates ValueError [1]
print("Area under the ROC curve : %f" % roc_auc)
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
I want to handle error [1] and plot AUC for OneClassSVM.
[1] ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Please see my answer on a similar question. The gist is:
OneClassSVM fundamentally doesn't support converting a decision into a probability score, so you cannot pass the necessary scores into functions that require varying a score threshold, such as for ROC or Precision-Recall curves and scores.
You can approximate this type of score by computing the max value of your OneClassSVM's decision function across your input data, call it MAX, and then score the prediction for a given observation y by computing y_score = MAX - decision_function(y).
Use these scores to pass as y_score to functions such as average_precision_score, etc., which will accept non-thresholded scores instead of probabilities.
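A minimal sketch of that idea, with invented variable names (X_train, X_test, and y_true_outlier, where y_true_outlier is 1 for outliers and 0 for normal points):

from sklearn.svm import OneClassSVM
from sklearn.metrics import average_precision_score, roc_auc_score

clf = OneClassSVM().fit(X_train)

# decision_function is larger for "normal" points, so flip it as described above:
# MAX is the maximum decision value over the input data, and the score of an
# observation is MAX minus its decision value (larger score = more outlier-like)
MAX = clf.decision_function(X_train).max()
y_score = MAX - clf.decision_function(X_test)

print(average_precision_score(y_true_outlier, y_score))
print(roc_auc_score(y_true_outlier, y_score))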
Finally, keep in mind that ROC will make less physical sense for OneClassSVM specifically because OneClassSVM is intended for situations where there is an expected and huge class imbalance (outliers vs. non-outliers), and ROC will not accurately up-weight the relative success on the small amount of outliers.
Use predict_proba to calculate the scores or probabilities asked for in auc(y_true, y_score); the issue is with y_score. You can convert it as shown in the following code:
from sklearn import svm
from sklearn.metrics import accuracy_score, roc_curve, auc

# Classifier - Algorithm - SVM
# fit the training dataset on the classifier
SVM = svm.SVC(C=1.0, kernel='linear', degree=3, gamma='auto', probability=True)
SVM.fit(Train_X_Tfidf, Train_Y)

# predict the labels on the validation dataset
predictions_SVM = SVM.predict(Test_X_Tfidf)
# Use accuracy_score function to get the accuracy
print("SVM Accuracy Score -> ", accuracy_score(predictions_SVM, Test_Y))

# predict_proba gives class probabilities; take the positive-class column as the score
probs = SVM.predict_proba(Test_X_Tfidf)
preds = probs[:, 1]
fpr, tpr, threshold = roc_curve(Test_Y, preds)
print("SVM Area under curve -> ", auc(fpr, tpr))
See the difference between accuracy_score and auc(): for the AUC you need the scores of the predictions.