I have a multi-class classification problem that I solved using XGBoost. There are 7 unique classes.
I got a classification report with precision, recall and F1 score for each class.
I have no idea how to code this in Python.
I need the mean per-class accuracy. Is there a mathematical formula to calculate per-class accuracy?
Update:
Test data per class samples:
Class # samples
0 13
1 16
2 9
SVM predictions per class samples:
Class # samples
0 13
1 15
2 10
SVM Classification Report is:
svm precision recall f1-score support
0 1.00 1.00 1.00 13
1 1.00 0.94 0.97 16
2 0.90 1.00 0.95 9
micro avg 0.97 0.97 0.97 38
macro avg 0.97 0.98 0.97 38
weighted avg 0.98 0.97 0.97 38
Can you please suggest an approach based on this?
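Per-class recall = (members of class identified correctly) / (number of members of class)
Multiply each per-class recall value by the number of samples that are actually in the class to get the number of samples of each class classified correctly, add these up to get the total number of correct predictions, and then divide by the total number of samples. This gives the mean per-class accuracy weighted by class size, which is the same as the overall accuracy (0.97 in the SVM report above). The unweighted mean of the per-class recall values is the macro-average recall (0.98 in that report).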
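As a rough sketch of the calculation above, using the SVM numbers from the question's update. The per-class correct counts (13, 15, 9) are an assumption inferred from the reported recall and support values; the question does not state them directly.

```python
# Support (true samples per class), taken from the question's update.
support = {0: 13, 1: 16, 2: 9}
# Correctly classified samples per class: ASSUMED, inferred from the
# reported recall values (1.00, 0.94, 1.00) and the supports above.
correct = {0: 13, 1: 15, 2: 9}

# Per-class recall = correct members of the class / members of the class.
per_class_recall = {c: correct[c] / support[c] for c in support}

# Support-weighted mean of the recalls, i.e. the overall accuracy.
total = sum(support.values())
overall_accuracy = sum(per_class_recall[c] * support[c] for c in support) / total

# Unweighted mean of the recalls (macro-average recall).
macro_recall = sum(per_class_recall.values()) / len(per_class_recall)

print(per_class_recall)
print(round(overall_accuracy, 2), round(macro_recall, 2))
```

With these counts the two rounded values agree with the micro avg and macro avg recall rows of the report.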
Related
I have code for checking the accuracy metrics of the model, but I don't understand how I can use this model on a single string in a practical application.
Currently, I have this LogisticRegression code for sentiment analysis, using the GoEmotions dataset.
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer(max_features=2500, min_df=7, max_df=0.8, stop_words=cached_stopwords)
train_bow = count_vect.fit_transform(train_data[0])
test_bow = count_vect.transform(test_data[0])
from sklearn.linear_model import LogisticRegression
lr_model = LogisticRegression(n_jobs=-1, max_iter=150)
lr_model.fit(train_bow, train_data[1])
pred_lr = lr_model.predict(test_bow)
and I can see the accuracy of the model using the "metrics(prediction, actual)" function:
Confusion_matrix
[[ 240 12 2 99 309 22 19]
[ 28 24 2 7 17 2 4]
[ 8 1 39 12 23 5 2]
[ 57 1 5 1529 407 21 34]
[ 101 6 5 225 1186 23 60]
[ 22 1 1 54 105 124 10]
[ 25 1 1 102 303 15 126]]
Accuracy: 0.6021743136170997
classification_report
precision recall f1-score support
anger 0.50 0.34 0.41 703
disgust 0.52 0.29 0.37 84
fear 0.71 0.43 0.54 90
joy 0.75 0.74 0.75 2054
neutral 0.50 0.74 0.60 1606
sadness 0.58 0.39 0.47 317
surprise 0.49 0.22 0.30 573
accuracy 0.60 5427
macro avg 0.58 0.45 0.49 5427
weighted avg 0.61 0.60 0.59 5427
Now, how can I predict on a single input sample?
I've tried using the model with a different preprocessed text data,
specifically with the 'Amazon Review Data (2018)' but only using 'reviewText'.
This is the code I've tried.
import joblib
with open('lr_joblib.pkl', 'rb') as f:
lr_model = joblib.load(f)
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer(max_features=2500, min_df=7, max_df=0.8, stop_words=cached_stopwords)
count_vect.fit(data['reviewText'])
test_bow = count_vect.transform(data['reviewText'])
lr_model.predict(test_bow)
I was able to produce an array of emotion predictions. This is the output of print(data.iloc[0]) after adding the model's output as new columns. The new columns are named "emotion" and "sentiment".
overall 5
verified True
reviewText looks better person careful drop phone rhinest...
emotion joy
sentiment positive
Name: 0, dtype: object
I can use the model with single-column lists, I think,
but I'm not sure how, or whether, I can use this model on a single input.
For example, I have a string variable like this:
x = "dont want spend lot cash want great deal this shop buy"
and I want to predict the "emotion" and "sentiment" of this 'x' variable with the model I have.
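One possible approach, sketched under assumptions: the vectorizer has to be the same fitted object that was used at training time. Refitting it on new text, as in the attempt above, builds a different vocabulary, so the feature columns no longer mean what the model learned. `transform` expects an iterable of documents, so a single string is wrapped in a list. The tiny training set, its labels, and the default `CountVectorizer()` settings below are made up for illustration; in practice the fitted `count_vect` would be saved next to the model (for example with `joblib.dump`) and reloaded.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the real training data (hypothetical texts and labels).
train_texts = [
    "love this great deal", "great shop happy buy",
    "hate this waste cash", "awful shop angry refund",
]
train_labels = ["joy", "joy", "anger", "anger"]

# Fit the vectorizer ONCE, on the training texts only.
count_vect = CountVectorizer()
train_bow = count_vect.fit_transform(train_texts)

lr_model = LogisticRegression(max_iter=150)
lr_model.fit(train_bow, train_labels)

# Single input: wrap the string in a list and reuse the fitted vectorizer
# (transform, not fit), so the columns match what the model was trained on.
x = "dont want spend lot cash want great deal this shop buy"
x_bow = count_vect.transform([x])

pred = lr_model.predict(x_bow)          # array holding one label
proba = lr_model.predict_proba(x_bow)   # one row of class probabilities
print(pred[0], dict(zip(lr_model.classes_, proba[0].round(3))))
```

Words in `x` that were never seen during training are simply ignored by `transform`, which is why the vocabulary must come from the training data.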
I have trained a model using TensorFlow and SSD MobileNet. I was able to find the mean average precision of the model. Is there a way to find the average precision of each class in the model?
I am using TensorFlow 2.5.
Thanks in advance
You can use sklearn per the code below. For both the confusion matrix and the classification report you need to provide y_predict and y_true. After you train your model, run predictions on the test set. I assume that somewhere in your code you have the class labels for the test set. I will assume they are in a list called y_true and are in the SAME order as your inputs to model.predict. I will also assume you have a list called classes that holds the names of your classes in label order.
For example, if cats is label 0 and dogs is label 1 then classes=['cats', 'dogs']
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report
preds = model.predict(...)  # arguments elided ("---etc") in the original answer
# convert each row of class scores to the index of the highest-scoring class
y_predict = []
for p in preds:
    index = np.argmax(p)
    y_predict.append(index)
y_true = np.array(y_true)
y_predict = np.array(y_predict)
# create a confusion matrix
cm = confusion_matrix(y_true, y_predict)
# code below formats the confusion matrix plot
length = len(classes)
if length < 8:
    fig_width = 8
    fig_height = 8
else:
    fig_width = int(length * .5)
    fig_height = int(length * .5)
plt.figure(figsize=(fig_width, fig_height))
sns.heatmap(cm, annot=True, vmin=0, fmt='g', cmap='Blues', cbar=False)
plt.xticks(np.arange(length) + .5, classes, rotation=90)
plt.yticks(np.arange(length) + .5, classes, rotation=0)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
# per-class precision, recall and F1 (uses the same y_predict as above)
clr = classification_report(y_true, y_predict, target_names=classes)
print("Classification Report:\n----------------------\n", clr)
Below is an example of a classification report
Classification Report:
----------------------
precision recall f1-score support
Banana 1.00 1.00 1.00 15
Bread 1.00 1.00 1.00 15
Eggs 1.00 1.00 1.00 15
Milk 1.00 1.00 1.00 15
Mixed 1.00 1.00 1.00 12
Potato 1.00 1.00 1.00 15
Spinach 1.00 1.00 1.00 15
Tomato 1.00 1.00 1.00 15
accuracy 1.00 117
macro avg 1.00 1.00 1.00 117
weighted avg 1.00 1.00 1.00 117
x_test,x_val,y_test,y_val = train_test_split(x_test,y_test,test_size=0.5)
print(x_train.shape)
#(1413, 3) <----Result
print(x_val.shape)
#(472, 3) <----Result
print(x_test.shape)
#(471, 3) <----Result
I split the data and got the shapes shown above.
from sklearn.tree import DecisionTreeClassifier
dTree = DecisionTreeClassifier(max_depth=2,random_state=0).fit(x_train,y_train)
print("train score : {}".format(dTree.score(x_train, y_train)))
#train score : 1.0 <----Result
print("val score : {}".format(dTree.score(x_val, y_val)))
#val score : 1.0 <----Result
I then fit a decision tree and printed the train and validation scores; both were 1.
predict_y = dTree.predict(x_test)
from sklearn.metrics import classification_report
print(classification_report(y_test, dTree.predict(x_test)))
print("test score : {}".format(dTree.score(x_test, y_test)))
precision recall f1-score support
A 1.00 1.00 1.00 235
B 1.00 1.00 1.00 236
accuracy 1.00 471
macro avg 1.00 1.00 1.00 471
weighted avg 1.00 1.00 1.00 471
test score : 0.9978768577494692
Finally, classification_report showed the results above. Is something wrong with my data split? Or does a value of 1 mean all the data was classified perfectly? If I'm wrong, I'd like to know the right approach.
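A small arithmetic check may help here. The sketch below assumes exactly one of the 471 test samples was misclassified, which is the count that reproduces the printed test score. The classification report rounds to two decimals, so a near-perfect score is displayed as 1.00.

```python
n_test = 471   # size of the test split reported in the question
n_wrong = 1    # ASSUMPTION: one misclassified test sample

accuracy = (n_test - n_wrong) / n_test
print(accuracy)            # full precision, comparable to the printed test score
print(f"{accuracy:.2f}")   # two decimals, as shown in classification_report
```

So a 1.00 in the report does not necessarily mean every sample was classified correctly; the unrounded score from `dTree.score` is the more precise figure.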
I'm classifying my data using several algorithms including
KNN, LogisticRegression, RandomForest, DecisionTreeClassifier, GaussianNB etc.
After fitting my data I am analyzing results using the following:
from sklearn.metrics import confusion_matrix, classification_report
classification_report(y_test, predicted)
I'm not totally clear on the semantics of "predicted positive / negative" etc. with respect to which label it is trying to predict.
Also, maybe more importantly, I'm trying to understand why all of the algorithms do relatively well in the "Predicted Negative / True Negative vs Predicted Negative / True Positive" portions but very badly in the "Predicted Positive" portion.
In other words, as I understand it, they are quite good at saying "not something" but are basically tossing a coin (around 50-50) when predicting "is something".
Here are some example classification reports I generated for the different techniques:
confusion matrix (knn)
Predicted Negative Predicted Positive
True Negative 14776 5442
True Positive 2367 6337
precision recall f1-score support
f 0.73 0.86 0.79 17143
t 0.73 0.54 0.62 11779
avg / total 0.73 0.73 0.72 28922
confusion matrix (SVM)
Predicted Negative Predicted Positive
True Negative 14881 4947
True Positive 2262 6832
precision recall f1-score support
f 0.75 0.87 0.81 17143
t 0.75 0.58 0.65 11779
avg / total 0.75 0.75 0.74 28922
confusion matrix (logistic regression)
Predicted Negative Predicted Positive
True Negative 14881 4947
True Positive 2262 6832
precision recall f1-score support
f 0.75 0.87 0.81 17143
t 0.75 0.58 0.65 11779
avg / total 0.75 0.75 0.74 28922
confusion matrix (decision tree)
Predicted Negative Predicted Positive
True Negative 14852 4941
True Positive 2291 6838
precision recall f1-score support
f 0.75 0.87 0.80 17143
t 0.75 0.58 0.65 11779
avg / total 0.75 0.75 0.74 28922
confusion matrix (naive_bayes)
Predicted Negative Predicted Positive
True Negative 13435 4759
True Positive 3708 7020
precision recall f1-score support
f 0.74 0.78 0.76 17143
t 0.65 0.60 0.62 11779
avg / total 0.70 0.71 0.70 28922
confusion matrix (random_forest)
Predicted Negative Predicted Positive
True Negative 13287 5248
True Positive 3856 6531
precision recall f1-score support
f 0.72 0.78 0.74 17143
t 0.63 0.55 0.59 11779
avg / total 0.68 0.69 0.68 28922
confusion matrix (gradient_boost)
Predicted Negative Predicted Positive
True Negative 15071 5583
True Positive 2072 6196
precision recall f1-score support
f 0.73 0.88 0.80 17143
t 0.75 0.53 0.62 11779
avg / total 0.74 0.74 0.72 28922
confusion matrix (neural network MLPClassifier)
Predicted Negative Predicted Positive
True Negative 10789 3653
True Positive 6354 8126
precision recall f1-score support
f 0.75 0.63 0.68 17143
t 0.56 0.69 0.62 11779
avg / total 0.67 0.65 0.66 28922
The only one that seems to handle "Predicted Positive" reasonably is the MLPClassifier.
Sorry, I don't know what the dataset you used looks like. But let's say there is a coin-flipping experiment with 2 possible results, either head (1) or tail (0). Now we apply a regression algorithm to predict the result from a set of candidate features.
If a prediction is correct (the same as the class label), we count it as true. If not, it is counted as false.
If the algorithm outputs a "head" prediction, it is regarded as a positive result; a "tail" prediction is a negative result.
On its own, the "True Positive" count has little value. But if we add the "False Negative" count to it, the sum is the total number of actual positive cases.
And if we divide "True Positive" by the total number of positive cases, which is normally called "recall" or the TP rate, we get the accuracy of the model in predicting positive (head) cases.
We can compare the TP rate (TP/P) with the FP rate (FP/N) to analyze the performance of a given model.
There are also other combinations and uses of these positive, negative, true, false and rate quantities, such as sensitivity and specificity.
If you want to know more, I would recommend looking at the ROC curve.
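To make the rates above concrete, here is a minimal sketch using the KNN counts from the question. It assumes the table's columns hold the actual classes and its rows the predicted classes: the column sums (17143 and 11779) equal the support values in the report, which suggests the row and column labels in the question are swapped. It also assumes `t` is the positive class.

```python
# Counts from the KNN table in the question, read column-wise
# (ASSUMPTION: columns = actual class, rows = predicted class).
tn, fp = 14776, 2367   # actual negatives (f): predicted f, predicted t
fn, tp = 5442, 6337    # actual positives (t): predicted f, predicted t

recall = tp / (tp + fn)        # TP rate (sensitivity): TP / P
fp_rate = fp / (fp + tn)       # FP rate: FP / N
specificity = tn / (tn + fp)   # TN rate, equal to 1 - FP rate
precision = tp / (tp + fp)     # correct share of the predicted positives

print(f"recall={recall:.2f} fp_rate={fp_rate:.2f} "
      f"specificity={specificity:.2f} precision={precision:.2f}")
```

Under this reading the values match the KNN report: recall 0.54 and precision 0.73 for class t, and recall 0.86 for class f (the specificity).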
I fit a Logistic Regression model on a training dataset using the following:
from sklearn.linear_model import LogisticRegression
# the 'l1' penalty needs a solver that supports it in current scikit-learn, e.g. liblinear
lr = LogisticRegression(C=0.1, penalty='l1', solver='liblinear')
model = lr.fit(training[:, 0:-1], training[:, -1])
I have a cross-validation dataset that contains the labels in the last column of the input matrix, which can be accessed as
cv[:,-1]
I run my cross-validation dataset against the trained model, which returns a list of 0s and 1s as predictions
cv_predict = model.predict(cv[:,0:-1])
Question
I want to calculate the precision and recall scores based on the actual labels and predicted labels. Is there a standard method to do it using numpy/scipy/scikits?
Thank you
Yes there are, see the documentation: http://scikit-learn.org/stable/modules/classes.html#classification-metrics
You should also have a look at the sklearn.metrics.classification_report utility:
>>> from sklearn.metrics import classification_report
>>> from sklearn.linear_model import SGDClassifier
>>> from sklearn.datasets import load_digits
>>> digits = load_digits()
>>> n_samples, n_features = digits.data.shape
>>> n_split = n_samples // 2
>>> clf = SGDClassifier().fit(digits.data[:n_split], digits.target[:n_split])
>>> predictions = clf.predict(digits.data[n_split:])
>>> expected = digits.target[n_split:]
>>> print(classification_report(expected, predictions))
precision recall f1-score support
0 0.90 0.98 0.93 88
1 0.81 0.69 0.75 91
2 0.94 0.98 0.96 86
3 0.94 0.85 0.89 91
4 0.90 0.93 0.91 92
5 0.92 0.92 0.92 91
6 0.92 0.97 0.94 91
7 1.00 0.85 0.92 89
8 0.71 0.89 0.79 88
9 0.89 0.83 0.86 92
avg / total 0.89 0.89 0.89 899
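If only the two numbers are needed, scikit-learn also provides `precision_score` and `recall_score` in `sklearn.metrics`. A minimal sketch, with made-up label arrays standing in for `cv[:,-1]` and `cv_predict` from the question:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical stand-ins for the actual and predicted labels.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])

# By default both functions treat label 1 as the positive class.
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
print(precision, recall)
```

In this toy data there are 3 true positives, 1 false positive and 1 false negative, so both scores come out as 3/4.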