You are working on a spam classification system using regularized logistic regression. “Spam” is the positive class (y = 1) and “not spam” is the negative class (y = 0). You have trained your classifier, and there are m = 1000 examples in the cross-validation set. The chart of predicted class vs. actual class is:
| | Actual class 1 | Actual class 0 |
| Predicted class 1 | 85 | 890 |
| Predicted class 0 | 15 | 10 |
For reference:
Accuracy = (true positives + true negatives) / (total examples)
Precision = (true positives) / (true positives + false positives)
Recall = (true positives) / (true positives + false negatives)
F1 score = (2 * precision * recall) / (precision + recall)
What is the classifier’s F1 score (as a value from 0 to 1)?
NOTE:
Accuracy = (85 + 10) / (1000) = 0.095
Precision = (85) / (85 + 890) = 0.087
Recall = (85) / (85 + 15) = 0.85 (85 true positives and 15 false negatives)
F1 Score = (2 * (0.087 * 0.85)) / (0.087 + 0.85) = 0.16
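The arithmetic above can be double-checked with a few lines of Python. This is only a minimal sketch: the variable names (tp, fp, fn, tn) are illustrative and not part of the original question, and the counts are read directly from the chart, with spam (y = 1) as the positive class.

```python
# Counts read from the predicted-vs-actual chart (positive class = spam, y = 1).
tp = 85   # predicted 1, actual 1 (true positives)
fp = 890  # predicted 1, actual 0 (false positives)
fn = 15   # predicted 0, actual 1 (false negatives)
tn = 10   # predicted 0, actual 0 (true negatives)

total = tp + fp + fn + tn  # should equal m = 1000

# Formulas exactly as given in the reference list of the question.
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = (2 * precision * recall) / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Computed without rounding the intermediate precision, the F1 score comes out to about 0.158, which rounds to the 0.16 given above.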