Classification and Decision Making
Classification is what most AI systems do in practice: look at an input and assign it to a category. This lesson covers decision trees, confidence scores, and what happens when the AI gets it wrong.
In 2020, a study published in Nature found that an AI system could detect breast cancer
from mammograms more accurately than radiologists. On the study's US dataset, it reduced
false negatives by 9.4% and false positives by 5.7%. The AI was classifying each scan into
one of two categories: cancer or no cancer.
But here is the uncomfortable part. The AI could not say why it made each decision.
A radiologist could point to the specific area of concern and explain their reasoning.
The AI just gave a probability: 87% likely malignant. You either trusted it or you did not.
McKinney et al., Nature, 2020.
What classification means
Classification is the task of assigning an input to one of several predefined categories. It is probably the most common task in applied machine learning.
Binary classification assigns each input to one of exactly two classes: spam or not spam, fraud or legitimate, cancer or no cancer, pass or fail. The output is a decision plus a confidence score - a probability between 0 and 1 representing how certain the model is.
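One common way to turn a confidence score into a decision is to compare it with a cut-off. The sketch below assumes a hypothetical function name, classify_binary, and a default threshold of 0.5; neither comes from the lesson, and real systems choose the threshold to suit the cost of each kind of mistake.

```python
def classify_binary(probability: float, threshold: float = 0.5) -> str:
    """Turn a confidence score into a spam / not spam decision.

    The threshold of 0.5 is an illustrative assumption, not a fixed rule.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return "spam" if probability >= threshold else "not spam"


print(classify_binary(0.87))                 # spam
print(classify_binary(0.87, threshold=0.9))  # not spam
```

The same score of 0.87 produces a different decision once the threshold is raised, which is why the score and the decision are worth keeping separate.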
Multi-class classification assigns each input to one of several categories: a photo to "cat," "dog," or "rabbit"; a handwritten digit to 0 through 9; a piece of text to one of dozens of topics. The model produces a confidence score for every possible class and picks the class with the highest score.
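Picking the highest-scoring class can be sketched in a few lines. The scores below are made-up numbers for illustration, and pick_class is a hypothetical helper name rather than part of any library.

```python
def pick_class(scores: dict[str, float]) -> tuple[str, float]:
    """Return the class with the highest confidence score, plus that score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


# Illustrative scores for one photo (invented for this example).
scores = {"cat": 0.70, "dog": 0.25, "rabbit": 0.05}
print(pick_class(scores))  # ('cat', 0.7)
```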
Decision trees are one of the oldest and most interpretable classification methods. They work by asking a series of yes/no questions about the input features, following different branches based on the answers, until they reach a leaf node - a final classification. Unlike neural networks, every decision is explainable: you can trace exactly why the model classified something the way it did.
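Precision: Of all the inputs the model flagged as positive, how many really were positive? High precision means few false alarms.
Recall: Of all the actual positives, how many did the model find? High recall means few things missed.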
In cancer screening, you want high recall - missing a real cancer is worse than a false alarm. In spam filtering, you want high precision - blocking a real email is worse than missing spam.
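To make the trade-off concrete, here is a minimal sketch that computes both measures from a list of predictions and a list of true answers. Precision is taken in its usual sense: of everything the model flagged as positive, the share that really was positive. The function name and the eight example cases are assumptions made up for illustration.

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Compute (precision, recall) for a binary classifier."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p and a for p, a in pairs)       # flagged, and really positive
    false_pos = sum(p and not a for p, a in pairs)  # flagged, but actually negative
    false_neg = sum(a and not p for p, a in pairs)  # really positive, but missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Eight invented cases: four real positives, four real negatives.
actual    = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False]
print(precision_recall(predicted, actual))  # (1.0, 0.75)
```

In this example the model never raises a false alarm (precision 1.0) but misses one of the four real positives (recall 0.75). That would suit a spam filter better than a cancer screen.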
A decision tree, visualised
This is how a decision tree classifies an email as spam or not spam. Each node is a question; each branch is a yes/no answer; each leaf is a classification.
Notice how every decision is traceable. You can always answer "why did it classify this as spam?" - which is something neural networks often cannot do.
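The traceability described above can be sketched in code. The three questions below (sender_in_contacts, contains_link, mentions_prize) are hypothetical features chosen for illustration and are not necessarily the ones in the lesson's tree. The point is that the function returns the path it took alongside the classification.

```python
def classify_email(email: dict) -> tuple[str, list[str]]:
    """Classify an email with a tiny hand-written decision tree.

    Returns the label and the list of questions and answers that led to it.
    The questions are illustrative assumptions, not a real spam filter.
    """
    trace = []
    if email["sender_in_contacts"]:
        trace.append("Sender in contacts? yes")
        return "not spam", trace
    trace.append("Sender in contacts? no")

    if not email["contains_link"]:
        trace.append("Contains a link? no")
        return "not spam", trace
    trace.append("Contains a link? yes")

    if email["mentions_prize"]:
        trace.append("Mentions a prize? yes")
        return "spam", trace
    trace.append("Mentions a prize? no")
    return "not spam", trace


label, trace = classify_email(
    {"sender_in_contacts": False, "contains_link": True, "mentions_prize": True}
)
print(label)  # spam
for step in trace:
    print(" ", step)
```

Printing the trace answers "why did it classify this as spam?" directly: each line is one question the tree asked and the branch it followed.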
Walk the tree
You are walking through a decision tree that classifies animals. Answer each question about the animal you are thinking of. The tree will follow your answers to a classification.
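Walking a tree like this can also be sketched with the tree stored as data rather than as if statements. The animals and questions below are invented for illustration and will differ from the lesson's own tree; Leaf, Question, and walk are hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # the final classification


@dataclass
class Question:
    text: str    # a yes/no question about the animal
    yes: "Node"  # branch to follow if the answer is yes
    no: "Node"   # branch to follow if the answer is no


Node = Union[Leaf, Question]


def walk(node: Node, answers: dict[str, bool]) -> tuple[str, list[str]]:
    """Follow yes/no answers from the root down to a leaf, recording the path."""
    path = []
    while isinstance(node, Question):
        answer = answers[node.text]
        path.append(f"{node.text} {'yes' if answer else 'no'}")
        node = node.yes if answer else node.no
    return node.label, path


# A made-up animal tree, assumed for this example only.
ANIMAL_TREE = Question(
    "Does it have feathers?",
    yes=Question("Can it fly?", yes=Leaf("sparrow"), no=Leaf("penguin")),
    no=Question("Does it live in water?", yes=Leaf("fish"), no=Leaf("dog")),
)

result = walk(ANIMAL_TREE, {"Does it have feathers?": True, "Can it fly?": False})
print(result[0])  # penguin
```

Storing the tree as data means the same walk function works for any tree, whether it classifies animals or emails.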
False positive or false negative?
Precision and recall sound abstract. These six real-world scenarios will make them concrete. For each one, decide whether the AI outcome described is a false positive (AI flags something that is not actually there) or a false negative (AI misses something that actually is there).
Questions worth thinking about
What to remember
Explore further
Wikipedia makes an excellent starting point for established computing concepts. For any specific fact or claim, scroll to the References section at the bottom of the article and go to the primary source directly.
Check your understanding
Exam-style practice
Practice what you've learned
Three printable worksheets covering classifiers, decision trees, and false positives at three levels: Recall, Apply, and Exam-style.