In this article we will discuss how to measure the performance of a machine learning model using the accuracy metric.
1. Accuracy
Accuracy is one of the simplest techniques for measuring the performance of a model. It is used to measure the performance of classification models. The accuracy value lies between 0 and 1, with 1 being the best and 0 the worst.
$$ Accuracy = \frac{\text{Number of correct classifications}}{\text{Total number of classification instances in the test set}} $$
Example
Let's suppose we have a dataset of 100 images.
- 60 images of cats
- 40 images of dogs
- task = classify each image as cat or dog
- we created a machine learning model, and it classified the images as shown below
Images | Total | Correct Prediction | Incorrect Prediction |
---|---|---|---|
Cat | 60 | 55 | 5 |
Dog | 40 | 36 | 4 |
$$ Accuracy = \frac{\text{Number of correct classifications}}{\text{Total number of classification instances in the test set}} = \frac{55+36}{100} = \frac{91}{100} = 91\% $$
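The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch: the helper name `accuracy` and the label lists `y_true` and `y_pred` are assumptions made for illustration, rebuilt from the counts in the table rather than taken from a real model.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Rebuild the table: 60 cats (55 right, 5 wrong), 40 dogs (36 right, 4 wrong).
# These lists are illustrative stand-ins, not output from a real model.
y_true = ["cat"] * 60 + ["dog"] * 40
y_pred = ["cat"] * 55 + ["dog"] * 5 + ["dog"] * 36 + ["cat"] * 4

acc = accuracy(y_true, y_pred)
print(f"Accuracy = {acc:.2%}")  # prints "Accuracy = 91.00%"
```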
Issues
Failure with imbalanced data
If the data is imbalanced, accuracy will not serve the purpose. Let's suppose that in our previous example there are 90 images of cats rather than 60, and 10 images of dogs instead of 40. Now we create a “stupid” algorithm which classifies any image it is given as a cat. In that case, if our new dataset is given as input, the model will classify
- 90 images correctly
- 10 images incorrectly
- Accuracy = 90%
The algorithm has learned nothing about dogs, yet it still scores 90%. So accuracy should not be used as the measure of a machine learning model's performance on imbalanced data.
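The imbalanced case can be sketched the same way. The function `always_cat` below is a hypothetical stand-in for the “stupid” algorithm, and the per-class breakdown is added only to show what the single accuracy figure hides.

```python
from collections import Counter


def always_cat(image):
    """A deliberately naive classifier: ignores its input and answers 'cat'."""
    return "cat"


# Imbalanced dataset from the example: 90 cats, 10 dogs.
y_true = ["cat"] * 90 + ["dog"] * 10
y_pred = [always_cat(None) for _ in y_true]

# Overall accuracy: correct predictions divided by all predictions.
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Fraction of each class that was classified correctly.
totals = Counter(y_true)
correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
per_class = {label: correct[label] / totals[label] for label in totals}

print(f"Overall accuracy: {overall:.0%}")  # prints "Overall accuracy: 90%"
print(per_class)  # prints {'cat': 1.0, 'dog': 0.0}
```

The overall figure looks respectable, while the per-class breakdown shows that not a single dog image was classified correctly.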
Failure with Probability Score Interpretation
Accuracy is not a suitable metric for evaluating models that return probability scores. Let's understand this with an example.
We have two models, Model 1 and Model 2, each of which returns a probability as its output. The predicted labels y_pred_1 and y_pred_2 are obtained by thresholding: any value > 0.5 is considered as 1 and any value < 0.5 is considered as 0. If we compare model_1 and model_2, both have the same accuracy value. But by looking at the probability scores themselves, we can see that model_1 is better than model_2. Accuracy only looks at the thresholded labels and ignores how far each score is from the threshold, so accuracy cannot be used as a performance measure for models that return probability scores.
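A small sketch makes the point concrete. All numbers below are hypothetical and chosen only for illustration. The mean gap between score and true label is just one simple way to put a number on the "observation", not a metric taken from the example itself. Scores of exactly 0.5 are not covered by the rule above, so the code treats them as 0 as an assumption.

```python
def to_labels(probs, threshold=0.5):
    """Threshold probabilities: > threshold becomes 1, otherwise 0.

    Treating a score of exactly 0.5 as 0 is an assumption; the rule in the
    text does not say what happens at the boundary.
    """
    return [1 if p > threshold else 0 for p in probs]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_gap(y_true, probs):
    """Average absolute distance between each score and its true label."""
    return sum(abs(t - p) for t, p in zip(y_true, probs)) / len(y_true)


# Hypothetical data: true labels and the scores returned by two models.
y_true = [1, 1, 0, 0]
prob_1 = [0.90, 0.80, 0.10, 0.20]  # model_1: confident and correct
prob_2 = [0.60, 0.55, 0.45, 0.40]  # model_2: barely on the right side of 0.5

y_pred_1 = to_labels(prob_1)
y_pred_2 = to_labels(prob_2)

acc_1 = accuracy(y_true, y_pred_1)
acc_2 = accuracy(y_true, y_pred_2)
gap_1 = mean_gap(y_true, prob_1)
gap_2 = mean_gap(y_true, prob_2)

print("accuracy:", acc_1, acc_2)  # identical for both models
print("mean gap:", gap_1, gap_2)  # smaller for model_1
```

Both models get the same accuracy because their thresholded labels are identical, even though the scores from model_1 sit much closer to the true labels.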