🎓 Learning corner
Evaluation metrics without a statistics background
A metric is a score that answers one specific question about how well a system is doing. Start with the kind of system you are evaluating, then choose the metric family that matches it.
Choose the right starting point:
Classification
Use this path when a system decides whether something belongs to a class: cat or not cat, name or not name, sensitive or not sensitive.
🎯 Precision
🔍 Recall
⚖️ F1
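
To make the three classification metrics concrete, here is a minimal Python sketch that computes them from yes/no decisions. The function name `precision_recall_f1` and the sample data are hypothetical, chosen only for illustration. The sketch assumes each item has one yes/no prediction and one yes/no correct answer, and it returns 0.0 when a score would otherwise divide by zero.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 from two equal-length lists of booleans.

    predicted[i] is True when the system said "yes" for item i.
    actual[i] is True when the correct answer for item i is "yes".
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)       # said yes, was yes
    false_pos = sum(1 for p, a in pairs if p and not a)  # said yes, was no (false alarm)
    false_neg = sum(1 for p, a in pairs if a and not p)  # said no, was yes (missed item)

    said_yes = true_pos + false_pos
    should_be_yes = true_pos + false_neg
    precision = true_pos / said_yes if said_yes else 0.0
    recall = true_pos / should_be_yes if should_be_yes else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 8 items, the system says "yes" to the first 4.
predicted = [True, True, True, True, False, False, False, False]
actual = [True, True, True, False, True, True, True, False]
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(precision, recall, f1)  # precision 0.75, recall 0.5, F1 0.6
```

In this example the system is right three times out of the four times it says yes (precision 0.75) but finds only three of the six items it should have found (recall 0.5). F1 lands between the two, closer to the weaker score.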
Search and ranking
Use this path when a system returns an ordered list: search results, similar documents, recommendations, or ranked answers.
🔍 Recall@1
⚔️ Spearman
🥇 MRR@10
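
The three ranking metrics can be sketched in a few lines of Python. The function names (`recall_at_1`, `mrr_at_k`, `spearman_no_ties`) and the sample data are hypothetical. Two simplifying assumptions apply: each query has exactly one correct item, and the Spearman function uses the shortcut formula that is only valid when neither ranking contains ties. For data with ties, a library routine such as `scipy.stats.spearmanr` is the safer choice.

```python
def recall_at_1(ranked_lists, correct_items):
    """Fraction of queries whose first result is the correct item."""
    hits = sum(
        1
        for ranked, correct in zip(ranked_lists, correct_items)
        if ranked and ranked[0] == correct
    )
    return hits / len(correct_items)


def mrr_at_k(ranked_lists, correct_items, k=10):
    """Mean reciprocal rank, looking only at the top k results per query.

    A query scores 1/position of the correct item (1 for first place,
    1/2 for second, and so on), or 0 if it is not in the top k.
    """
    total = 0.0
    for ranked, correct in zip(ranked_lists, correct_items):
        for position, item in enumerate(ranked[:k], start=1):
            if item == correct:
                total += 1.0 / position
                break
    return total / len(correct_items)


def spearman_no_ties(ranks_a, ranks_b):
    """Spearman correlation for two tie-free rankings of the same n items.

    ranks_a[i] and ranks_b[i] are the positions (1..n) given to item i.
    """
    n = len(ranks_a)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Hypothetical example: three queries, the correct answer is always "a".
results = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "d"]]
answers = ["a", "a", "a"]
print(recall_at_1(results, answers))   # 1 of 3 queries has "a" first
print(mrr_at_k(results, answers, 10))  # (1 + 1/2 + 0) / 3 = 0.5

# Two rankings of the same five items that differ by two neighbor swaps.
print(spearman_no_ties([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]))  # close to 0.8
```

Recall@1 gives no credit to the second query even though the correct item was a near miss in second place, while MRR@10 gives it half credit. Spearman ranges from 1 (identical order) through 0 (no relationship) to -1 (exactly reversed order).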
One-page cheat sheet
| Metric | Plain-English question | Best for |
|---|---|---|
| 🎯 Precision | When the system says yes, how often is it right? | Avoiding false alarms. |
| 🔍 Recall | Of everything we wanted to find, how much did we find? | Avoiding missed items. |
| ⚖️ F1 | Is the system both careful and complete, with neither precision nor recall lagging? | Balancing precision and recall in a single score. |
| 🔍 Recall@1 | How often was the correct item ranked first? | Strict top-result evaluation. |
| ⚔️ Spearman | Do two rankings of the same items mostly agree? | Comparing two orderings of the same items, such as system scores against human judgments. |
| 🥇 MRR@10 | How high was the first correct item within the top 10? | Search, recommendation, and retrieval systems. |