Scikit classification report - change the format of displayed results - python

Scikit's classification report shows precision and recall scores with only two digits. Is it possible to make it display 4 digits after the decimal point, i.e. 0.6783 instead of 0.67?
from sklearn.metrics import classification_report
print(classification_report(testLabels, p, labels=list(set(testLabels)), target_names=['POSITIVE', 'NEGATIVE', 'NEUTRAL']))
precision recall f1-score support
POSITIVE 1.00 0.82 0.90 41887
NEGATIVE 0.65 0.86 0.74 19989
NEUTRAL 0.62 0.67 0.64 10578
Also, should I worry about a precision score of 1.00? Thanks!

I just came across this old question.
It is indeed possible to get more decimal places in classification_report; you just need to pass in the digits argument.
classification_report(y_true, y_pred, target_names=target_names, digits=4)
From the documentation:
digits : int
Number of digits for formatting output floating point values
Demonstration:
from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))
Output:
precision recall f1-score support
class 0 0.50 1.00 0.67 1
class 1 0.00 0.00 0.00 1
class 2 1.00 0.67 0.80 3
avg / total 0.70 0.60 0.61 5
With 4 digits:
print(classification_report(y_true, y_pred, target_names=target_names, digits=4))
Output:
precision recall f1-score support
class 0 0.5000 1.0000 0.6667 1
class 1 0.0000 0.0000 0.0000 1
class 2 1.0000 0.6667 0.8000 3
avg / total 0.7000 0.6000 0.6133 5

No, it is not possible to display more digits with classification_report; the format string is hardcoded in the source (see here).
Edit: there is an update, see CentAu's answer.

Related

How to disable seqeval label formatting for POS-tagging

I am trying to evaluate my POS-tagger using Hugging Face's implementation of the seqeval metric but, since my tags are not made for NER, they are not formatted the way the library expects. Consequently, when I try to read the results of my classification report, the labels for the class-specific results consistently lack their first character (or the last one if I pass suffix=True).
Is there a way to disable entity recognition in the labels, or do I have to pass all my labels with a starting space to work around this issue? (Given that the library is supposed to be suitable for POS-tagging, I hope there is a built-in solution.)
SSCCE:
from seqeval.metrics import accuracy_score
from seqeval.metrics import classification_report
from seqeval.metrics import f1_score
y_true = [['INT', 'PRO', 'PRO', 'VER:pres'], ['ADV', 'PRP', 'PRP', 'ADV']]
y_pred = [['INT', 'PRO', 'PRO', 'VER:pres'], ['ADV', 'PRP', 'PRP', 'ADV']]
print(classification_report(y_true, y_pred))
Output:
precision recall f1-score support
DV 1.00 1.00 1.00 2
ER:pres 1.00 1.00 1.00 1
NT 1.00 1.00 1.00 1
RO 1.00 1.00 1.00 1
RP 1.00 1.00 1.00 1
micro avg 1.00 1.00 1.00 6
macro avg 1.00 1.00 1.00 6
weighted avg 1.00 1.00 1.00 6
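One possible workaround (a sketch of my own, not a documented seqeval feature): flatten the tag sequences and score the raw tags with scikit-learn's classification_report, which treats each tag as an opaque label and does not strip IOB-style prefixes:
# Sketch of a workaround: bypass seqeval's entity parsing by flattening the
# sentences and scoring the raw tags with scikit-learn instead. The alias
# avoids clashing with seqeval's classification_report imported above.
from sklearn.metrics import classification_report as sk_classification_report

y_true = [['INT', 'PRO', 'PRO', 'VER:pres'], ['ADV', 'PRP', 'PRP', 'ADV']]
y_pred = [['INT', 'PRO', 'PRO', 'VER:pres'], ['ADV', 'PRP', 'PRP', 'ADV']]

flat_true = [tag for sent in y_true for tag in sent]  # one flat list of tags
flat_pred = [tag for sent in y_pred for tag in sent]

print(sk_classification_report(flat_true, flat_pred))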

Is there a way to find the average precision and recall of each class in the custom Tensorflow model?

I have trained a model using TensorFlow and SSD MobileNet. I was able to find the mean average precision of the model. Is there a way to find the average precision of each class in the model?
I am using TensorFlow 2.5.
Thanks in advance.
You can use sklearn as shown in the code below. For both the confusion matrix and the classification report you need to provide y_predict and y_true. After you train your model, run predictions on the test set. I assume you have the true class labels somewhere in your code; I will assume they are in a list called y_true and are in the SAME order as your inputs to model.predict. I will also assume you have a list called classes containing the names of your classes in label order.
For example, if cats is label 0 and dogs is label 1, then classes=['cats', 'dogs'].
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report

preds = model.predict(...)  # etc. -- predictions on the test set
# convert each probability vector into a predicted class index
y_predict = []
for p in preds:
    index = np.argmax(p)
    y_predict.append(index)
y_true = np.array(y_true)
y_predict = np.array(y_predict)
# create a confusion matrix
cm = confusion_matrix(y_true, y_predict)
# code below formats the confusion matrix plot
length = len(classes)
if length < 8:
    fig_width = 8
    fig_height = 8
else:
    fig_width = int(length * .5)
    fig_height = int(length * .5)
plt.figure(figsize=(fig_width, fig_height))
sns.heatmap(cm, annot=True, vmin=0, fmt='g', cmap='Blues', cbar=False)
plt.xticks(np.arange(length) + .5, classes, rotation=90)
plt.yticks(np.arange(length) + .5, classes, rotation=0)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
clr = classification_report(y_true, y_predict, target_names=classes)
print("Classification Report:\n----------------------\n", clr)
Below is an example of a classification report
Classification Report:
----------------------
precision recall f1-score support
Banana 1.00 1.00 1.00 15
Bread 1.00 1.00 1.00 15
Eggs 1.00 1.00 1.00 15
Milk 1.00 1.00 1.00 15
Mixed 1.00 1.00 1.00 12
Potato 1.00 1.00 1.00 15
Spinach 1.00 1.00 1.00 15
Tomato 1.00 1.00 1.00 15
accuracy 1.00 117
macro avg 1.00 1.00 1.00 117
weighted avg 1.00 1.00 1.00 117
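If you only need the per-class precision and recall values as arrays rather than the formatted report, precision_recall_fscore_support with average=None returns them directly. A minimal sketch, assuming the y_true, y_predict and classes variables from the code above (with labels 0..n-1 all present, in order):
# Sketch: per-class precision/recall as numpy arrays, assuming y_true,
# y_predict and classes from the code above.
from sklearn.metrics import precision_recall_fscore_support

precision, recall, f1, support = precision_recall_fscore_support(y_true, y_predict, average=None)
for name, p, r in zip(classes, precision, recall):
    print(f"{name}: precision={p:.4f}  recall={r:.4f}")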

sklearn: multi-class problem and reporting sensitivity and specificity

I have a three-class problem and I'm able to report precision and recall for each class with the below code:
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
which gives me the precision and recall nicely for each of the 3 classes in a table format.
My question is how can I now get sensitivity and specificity for each of the 3 classes? I looked at sklearn.metrics and I didn't find anything for reporting sensitivity and specificity.
If we check the help page for classification report:
Note that in binary classification, recall of the positive class is
also known as “sensitivity”; recall of the negative class is
“specificity”.
So we can convert the predictions into a binary one-vs-rest problem for each class, and then read the recall values returned by precision_recall_fscore_support.
Using an example:
from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))
Looks like:
precision recall f1-score support
class 0 0.50 1.00 0.67 1
class 1 0.00 0.00 0.00 1
class 2 1.00 0.67 0.80 3
accuracy 0.60 5
macro avg 0.50 0.56 0.49 5
weighted avg 0.70 0.60 0.61 5
Using sklearn:
import numpy as np
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

res = []
for l in [0, 1, 2]:
    prec, recall, _, _ = precision_recall_fscore_support(np.array(y_true) == l,
                                                         np.array(y_pred) == l,
                                                         pos_label=True, average=None)
    # recall[1] is the recall of the positive class (sensitivity);
    # recall[0] is the recall of the negative class (specificity)
    res.append([l, recall[1], recall[0]])
Put the results into a dataframe:
pd.DataFrame(res, columns=['class', 'sensitivity', 'specificity'])
class sensitivity specificity
0 0 1.000000 0.75
1 1 0.000000 0.75
2 2 0.666667 1.00
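In recent scikit-learn versions (0.21 and later), multilabel_confusion_matrix offers a shortcut: it returns one 2x2 matrix per class, from which sensitivity and specificity fall out directly. A minimal sketch with the same y_true and y_pred:
# Sketch: per-class sensitivity/specificity via multilabel_confusion_matrix
# (scikit-learn >= 0.21); each 2x2 matrix is laid out as [[tn, fp], [fn, tp]].
from sklearn.metrics import multilabel_confusion_matrix

mcm = multilabel_confusion_matrix(y_true, y_pred)  # shape (n_classes, 2, 2)
tn, fp, fn, tp = mcm[:, 0, 0], mcm[:, 0, 1], mcm[:, 1, 0], mcm[:, 1, 1]
sensitivity = tp / (tp + fn)   # recall of the positive class, per class
specificity = tn / (tn + fp)   # recall of the negative class, per class
print(sensitivity, specificity)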
Classification report's output is a formatted string. This code snippet extracts the required values and stores them in a 2-D list.
Note: To understand the code better, add print statements to check the variable values.
y = classification_report(y_test, y_pred)  # classification report's output is a string
lines = y.split('\n')  # extract every line and store in a list
res = []  # list to store the cleaned results
for i in range(len(lines)):
    line = lines[i].split(" ")  # values are separated by blanks; split at the blank spaces
    line = [j for j in line if j != '']  # keep only the non-empty values
    if len(line) != 0:
        # empty lines get added as empty lists; skip those
        res.append(line)
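In more recent scikit-learn releases (0.20 and later) the string parsing can be avoided altogether: classification_report accepts output_dict=True and returns the same numbers as a nested dictionary. A minimal sketch with the same y_test and y_pred:
# Sketch: get the report as a dict instead of a formatted string
# (scikit-learn >= 0.20).
from sklearn.metrics import classification_report

report = classification_report(y_test, y_pred, output_dict=True)
print(report['macro avg']['recall'])        # macro-averaged recall
print(report['weighted avg']['f1-score'])   # weighted-average F1
# Per-class entries are keyed by the class label / target name, e.g. report['0']['precision'].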

all classifiers are predicting "bad" positives

I'm classifying my data using several algorithms, including
KNN, LogisticRegression, RandomForest, DecisionTreeClassifier, GaussianNB, etc.
After fitting my data I am analyzing results using the following:
from sklearn.metrics import confusion_matrix, classification_report
classification_report(y_test, predicted)
I'm not totally clear on the semantics of "predicted positive / negative" and so on with respect to which label is being predicted.
Also, and maybe more importantly, I don't understand, and am trying to analyze, why all of the various algorithms predict relatively well for the "Predicted Negative / True Negative" and "Predicted Negative / True Positive" portions but very badly for the "Predicted Positive" portion.
In other words, from my understanding they are quite good at saying "not something" but basically toss a coin when predicting "is something" (around 50-50).
Here are some example classification reports I generated for the different techniques:
confusion matrix (knn)
Predicted Negative Predicted Positive
True Negative 14776 5442
True Positive 2367 6337
precision recall f1-score support
f 0.73 0.86 0.79 17143
t 0.73 0.54 0.62 11779
avg / total 0.73 0.73 0.72 28922
confusion matrix (SVM)
Predicted Negative Predicted Positive
True Negative 14881 4947
True Positive 2262 6832
precision recall f1-score support
f 0.75 0.87 0.81 17143
t 0.75 0.58 0.65 11779
avg / total 0.75 0.75 0.74 28922
confusion matrix (logistic regression)
Predicted Negative Predicted Positive
True Negative 14881 4947
True Positive 2262 6832
precision recall f1-score support
f 0.75 0.87 0.81 17143
t 0.75 0.58 0.65 11779
avg / total 0.75 0.75 0.74 28922
confusion matrix (decision tree)
Predicted Negative Predicted Positive
True Negative 14852 4941
True Positive 2291 6838
precision recall f1-score support
f 0.75 0.87 0.80 17143
t 0.75 0.58 0.65 11779
avg / total 0.75 0.75 0.74 28922
confusion matrix (naive_bayes)
Predicted Negative Predicted Positive
True Negative 13435 4759
True Positive 3708 7020
precision recall f1-score support
f 0.74 0.78 0.76 17143
t 0.65 0.60 0.62 11779
avg / total 0.70 0.71 0.70 28922
confusion matrix (random_forest)
Predicted Negative Predicted Positive
True Negative 13287 5248
True Positive 3856 6531
precision recall f1-score support
f 0.72 0.78 0.74 17143
t 0.63 0.55 0.59 11779
avg / total 0.68 0.69 0.68 28922
confusion matrix (gradient_boost)
Predicted Negative Predicted Positive
True Negative 15071 5583
True Positive 2072 6196
precision recall f1-score support
f 0.73 0.88 0.80 17143
t 0.75 0.53 0.62 11779
avg / total 0.74 0.74 0.72 28922
confusion matrix (neural network MLPClassifier)
Predicted Negative Predicted Positive
True Negative 10789 3653
True Positive 6354 8126
precision recall f1-score support
f 0.75 0.63 0.68 17143
t 0.56 0.69 0.62 11779
avg / total 0.67 0.65 0.66 28922
The only one that seems to handle the "Predicted Positive" portion reasonably well is the MLPClassifier.
Sorry, I don't know what the dataset you used looks like. But let's say there is a coin-flipping experiment with two kinds of results, either head (1) or tail (0), and we train a classifier to predict the result from a bunch of possible features.
If the prediction is correct (the same as the class label), we count it as a true one. If not, it is a false record.
If the algorithm outputs a "head" prediction, it is regarded as a positive result, and a negative one for "tail".
On its own, the "True Positive" count has little value, but if we add it to "False Negative", the sum is the total number of positive cases.
And if we divide "True Positive" by the total number of positive cases, which is normally called "recall" or the TP rate, we get the accuracy of the model at predicting positive (head) cases.
We can compare the TP rate (TP/P) with the FP rate (FP/N) to analyze the performance of a given model.
There are other combinations and uses of these positive/negative, true/false counts and rates, such as sensitivity and specificity, etc.
If you want to know more, I would recommend looking at the ROC curve.
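As a concrete illustration of the ratios described above (a sketch for a binary problem, using made-up labels), the counts can be read straight off sklearn's confusion matrix:
# Sketch: TP rate (recall/sensitivity) and FP rate for a binary problem;
# for binary inputs, confusion_matrix(...).ravel() gives tn, fp, fn, tp.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tp_rate = tp / (tp + fn)   # TP / P, a.k.a. recall or sensitivity
fp_rate = fp / (fp + tn)   # FP / N, a.k.a. 1 - specificity
print(f"TP rate = {tp_rate:.2f}, FP rate = {fp_rate:.2f}")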

Scipy/Numpy/scikits - calculating precision/recall scores based on two arrays

I fit a logistic regression model and train it on the training dataset using the following:
import sklearn
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression(C=0.1, penalty='l1')
model = lr.fit(training[:, 0:-1], training[:, -1])
I have a cross-validation dataset whose labels (associated with the input matrix) can be accessed as
cv[:,-1]
I run my cross-validation dataset against the trained model, which returns a list of 0s and 1s based on the predictions:
cv_predict = model.predict(cv[:,0:-1])
Question
I want to calculate the precision and recall scores based on the actual labels and the predicted labels. Is there a standard method to do it using numpy/scipy/scikits?
Thank you
Yes, there are; see the documentation: http://scikit-learn.org/stable/modules/classes.html#classification-metrics
You should also have a look at the sklearn.metrics.classification_report utility:
>>> from sklearn.metrics import classification_report
>>> from sklearn.linear_model import SGDClassifier
>>> from sklearn.datasets import load_digits
>>> digits = load_digits()
>>> n_samples, n_features = digits.data.shape
>>> n_split = n_samples // 2
>>> clf = SGDClassifier().fit(digits.data[:n_split], digits.target[:n_split])
>>> predictions = clf.predict(digits.data[n_split:])
>>> expected = digits.target[n_split:]
>>> print(classification_report(expected, predictions))
precision recall f1-score support
0 0.90 0.98 0.93 88
1 0.81 0.69 0.75 91
2 0.94 0.98 0.96 86
3 0.94 0.85 0.89 91
4 0.90 0.93 0.91 92
5 0.92 0.92 0.92 91
6 0.92 0.97 0.94 91
7 1.00 0.85 0.92 89
8 0.71 0.89 0.79 88
9 0.89 0.83 0.86 92
avg / total 0.89 0.89 0.89 899
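If only the raw numbers are needed rather than the formatted report, precision_score and recall_score work directly on the two label arrays; with average=None they return one value per class. A minimal sketch using the expected and predictions arrays from the snippet above:
# Sketch: per-class precision and recall as arrays, reusing the expected and
# predictions variables from the snippet above.
from sklearn.metrics import precision_score, recall_score

print(precision_score(expected, predictions, average=None))  # one value per class
print(recall_score(expected, predictions, average=None))     # one value per class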
