Accuracy

In machine learning, accuracy is one of the most straightforward performance metrics: the proportion of predictions a model gets right out of all the predictions it makes.

Why it matters
If a model achieves 90% accuracy, then on average 9 out of every 10 of its predictions are correct. This simplicity makes accuracy an appealing first measure of model performance.
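
As a minimal sketch of that ratio (hypothetical labels, with scikit-learn's accuracy_score assumed for comparison), accuracy is just the number of correct predictions divided by the total number of predictions:

```python
from sklearn.metrics import accuracy_score

# 10 ground-truth labels and the model's predictions (hypothetical values)
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]  # one mistake out of ten

# Accuracy by hand: correct predictions / total predictions
manual = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(manual)                          # 0.9
print(accuracy_score(y_true, y_pred))  # 0.9 -- the same ratio, i.e. 90% accuracy
```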

Limitations
Accuracy can, however, give a false sense of reliability in cases where the data is imbalanced. For example, if 95% of medical test results are negative and only 5% positive, a model that always predicts “negative” will still achieve 95% accuracy — but completely fail at detecting actual positives. That’s why complementary metrics like precision, recall, and F1-score are essential for a complete evaluation.
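
The sketch below, using hypothetical labels and scikit-learn's metrics, reproduces this trap: a classifier that always predicts negative on a 95/5 split reaches 95% accuracy while its recall, precision, and F1-score on the positive class are all zero:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# 95 negative cases (0) and 5 positive cases (1), mirroring the medical example
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a "model" that always predicts negative

print(accuracy_score(y_true, y_pred))                    # 0.95
print(recall_score(y_true, y_pred, zero_division=0))     # 0.0 -- misses every positive
print(precision_score(y_true, y_pred, zero_division=0))  # 0.0
print(f1_score(y_true, y_pred, zero_division=0))         # 0.0
```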

Applications

Accuracy is often described as the most intuitive metric: it simply asks, “How often did the model get it right?” For quick sanity checks, it is invaluable, especially during the early stages of model development when one needs a baseline reference. However, accuracy alone rarely tells the whole story. In practice, practitioners often distinguish between overall accuracy and balanced accuracy, the latter correcting for imbalanced datasets by averaging performance across classes.
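
A short sketch of that distinction, assuming scikit-learn and reusing the imbalanced toy labels from above: balanced accuracy averages per-class recall, so the always-negative baseline no longer looks impressive:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # always-negative baseline

print(accuracy_score(y_true, y_pred))           # 0.95 -- flattered by the majority class
print(balanced_accuracy_score(y_true, y_pred))  # 0.5  -- mean of 100% recall on class 0 and 0% on class 1
```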

In real-world systems, accuracy must be interpreted in context. A 99% accurate handwriting recognition model may be exceptional, but a 99% accurate self-driving car vision system would still be catastrophically unreliable if the 1% includes pedestrians or stop signs. This is why industries under strict safety or regulatory standards often prioritize other metrics—such as recall in medical tests or precision in fraud detection—over plain accuracy.

Another subtlety is that accuracy assumes all errors carry the same cost, which is rarely true. Missing a fraudulent transaction is far more damaging than wrongly flagging a legitimate one. Because of this, practitioners often use confusion matrices to break down true positives, true negatives, false positives, and false negatives, offering a more nuanced understanding than accuracy alone.
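
As a sketch of that breakdown (toy fraud-detection labels, scikit-learn assumed), the confusion matrix makes the four outcome types explicit so their costs can be weighed separately:

```python
from sklearn.metrics import confusion_matrix

# Toy fraud-detection labels: 1 = fraudulent, 0 = legitimate (hypothetical)
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# For binary labels, ravel() flattens the 2x2 matrix into (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"true negatives:  {tn}")  # 5 legitimate transactions correctly passed
print(f"false positives: {fp}")  # 1 legitimate transaction wrongly flagged
print(f"false negatives: {fn}")  # 2 frauds missed -- typically the costlier error
print(f"true positives:  {tp}")  # 2 frauds caught
```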

📚 Further Reading

  • Sokolova, M. & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427–437.
  • Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly Media.