Lesson 1 of 6

What is AI?
Rules vs Learning

Discover the fundamental shift between telling a computer exactly what to do and letting it figure things out for itself - and why that difference underpins everything in modern AI.

GCSE & A-Level | Free | 2 activities + quiz

In 1997, a computer called Deep Blue beat world chess champion Garry Kasparov. The world called it artificial intelligence. But Deep Blue didn't learn anything - its programmers, advised by chess grandmasters, had hand-written the rules it used to judge each position. It won by searching through around 200 million positions per second, far more than any human could, not by understanding the game.

Twenty years later, a program called AlphaZero beat Stockfish, one of the strongest chess engines in the world, after learning the game from scratch - with no strategy given to it at all, only the rules of the game. It played against itself for about four hours of training, discovered strategies that surprised even grandmasters, and then went unbeaten in a 100-game match against Stockfish.

- DeepMind, 2017. AlphaZero was given only the legal moves - it worked out everything else alone.

Think: Both programs play chess. Both are called AI. But they work in completely different ways. By the end of this lesson you'll understand exactly what that difference is - and why it matters.

Two completely different ideas

Before we can understand AI, we need to understand what computers normally do - and why that's fundamentally different from what AI does.

In traditional programming, a human writes every single rule. The program does exactly what the programmer tells it - nothing more, nothing less. It's deterministic: the same input always gives the same output, and the programmer has anticipated every situation in advance. If something unexpected happens, the program fails.

In machine learning - the main branch of modern AI - you don't write the rules at all. In its most common form, supervised learning, you give the computer thousands of labelled examples and let it find the patterns itself. The rules emerge from the data; the programmer never writes them explicitly. The model can then make reasonable decisions in situations it has never seen before.

This sounds simple but it's a genuinely radical idea. For most of computing history, the assumption was that you had to tell a computer exactly what to do. Machine learning breaks that assumption entirely.
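To make the contrast concrete, here is a minimal Python sketch. The theme-park height check, the example data, and the function names are all invented for illustration; a real machine learning system learns far more than a single number, but the idea is the same.

```python
# Traditional programming: a human writes the rule.
def can_ride_rule(height_cm):
    return height_cm >= 140  # threshold chosen by the programmer


# Machine learning (toy version): the threshold is found from labelled examples.
# Hypothetical data: (height in cm, was the person allowed on the ride?)
examples = [(120, False), (131, False), (138, False),
            (142, True), (150, True), (165, True)]


def learn_threshold(data):
    """Try every candidate threshold and keep the first one that gets
    the most labelled examples right."""
    best_threshold, best_correct = None, -1
    for candidate in range(100, 201):
        correct = sum((height >= candidate) == label for height, label in data)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


learned_threshold = learn_threshold(examples)


def can_ride_learned(height_cm):
    return height_cm >= learned_threshold


print(learned_threshold)      # 139
print(can_ride_learned(145))  # True, even though 145 was never in the data
```

Nobody typed 139 into the program. It came from the examples, and it is close to the hand-written 140 without matching it exactly, because the data only shows that the cut-off lies somewhere between 138 and 142.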

Traditional Programming

A human defines every rule. The computer executes instructions precisely.

  • Programmer writes explicit rules as code
  • Rules + input data go in
  • Output comes out
  • Can only handle situations the programmer anticipated
  • Transparent - you can read every rule
Example: A calculator. Every operation - add, divide, square root - is written by the programmer as explicit code.
VS
Machine Learning (AI)

The computer finds its own rules from examples. No rules are written explicitly.

  • Labelled training data goes in
  • Computer finds patterns automatically
  • A trained model comes out
  • Can generalise to new, unseen situations
  • Often opaque - even the creators don't know every rule it learned
Example: A spam filter. Trained on millions of emails - nobody wrote "if subject contains FREE then spam." It found that pattern itself.
Key term - Generalisation
A machine learning model that has been trained well can make reasonable predictions on inputs it has never seen before. This is called generalisation. It's what makes ML powerful - and also what makes it unpredictable if the training data was poor.

How does a machine actually learn?

"Learning from data" sounds abstract. Here's what it actually means in practice - broken into four stages that every machine learning system goes through.

1. Training Data - Thousands of labelled examples go in, e.g. photos marked "cat" or "not cat"
2. Model Trains - The model makes predictions and measures how wrong it is, then adjusts
3. Improves - This repeats thousands of times, and each time the model gets slightly better
4. Trained Model - The finished model can now make predictions on new data it has never seen

The key step is in the middle. When the model makes a wrong prediction, it adjusts its internal settings - called parameters or weights - to be slightly less wrong next time. This process happens automatically, millions of times, until the model is accurate enough.
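The adjust-and-repeat loop can be sketched in a few lines of Python. This is a toy with one parameter, not how any particular product is built: the data, the hidden pattern (y = 3 × x), and the learning rate are all assumptions chosen to keep the example small.

```python
# Hypothetical training data that secretly follows the pattern y = 3 * x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]

w = 0.0               # the model's single parameter (weight), starting at a guess
learning_rate = 0.01  # how large each adjustment is (assumed value)

for step in range(200):                 # repeat many times
    for x, y in data:
        prediction = w * x              # 1. the model makes a prediction
        error = prediction - y          # 2. it measures how wrong it was
        w -= learning_rate * error * x  # 3. it nudges w to be slightly less wrong

print(round(w, 3))  # ends very close to 3.0
```

The program is never told that the answer is 3. The weight drifts there because every wrong prediction pushes it a little in the right direction. Real models do the same thing with millions or billions of weights at once.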

No programmer writes down what the model learns. Programmers choose the data and design the training process, but the patterns themselves emerge from the data. This is why the quality and quantity of training data matter so much in AI - the model can only learn what the data contains.

Important implication
If the training data is biased, incomplete, or wrong, the model will learn those flaws too. A model trained only on photos of cats taken in Europe might fail to recognise cats of unfamiliar breeds or in unfamiliar settings. Garbage in, garbage out - but the "garbage" is often invisible until the model fails in the real world. We'll cover this in detail in Lesson 5.
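Here is a small Python sketch of a model inheriting a flaw from its data. It uses a deliberately crude word-counting scheme (an assumption made for brevity, not the method of any real spam filter), and every email in it is made up.

```python
from collections import Counter


def train_word_scores(examples):
    """Score each word: +1 every time it appears in spam, -1 in a legitimate email."""
    scores = Counter()
    for text, is_spam_label in examples:
        for word in set(text.lower().split()):
            scores[word] += 1 if is_spam_label else -1
    return scores


def is_spam(text, scores):
    # Words the model has never seen score 0.
    return sum(scores[word] for word in set(text.lower().split())) > 0


# Biased data: every email containing "free" happens to be spam.
biased_data = [
    ("free prize waiting", True),
    ("free holiday offer", True),
    ("meeting at noon", False),
    ("lunch order confirmed", False),
]

# Broader data: the same emails plus ordinary uses of the word "free".
broader_data = biased_data + [
    ("are you free on friday", False),
    ("feel free to call me", False),
]

message = "are you free for lunch"
print(is_spam(message, train_word_scores(biased_data)))   # True  - wrongly flagged
print(is_spam(message, train_word_scores(broader_data)))  # False - fixed by better data
```

The code is identical in both runs. Only the training data changed, and that alone decided whether an innocent message was flagged.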

A spam filter - step by step

Let's trace through how a machine learning system - a spam filter - is built. The details are simplified, but these are the stages real systems go through.

Building a spam filter with machine learning
A step-by-step walkthrough
Step 1: Collect labelled training data
Engineers collect millions of emails that humans have already labelled as spam or not spam. These labels are essential - the model needs to know the correct answer during training so it can measure how wrong it is.
[SPAM] "Congratulations! You've won £5,000. Click here NOW to claim..."
[LEGIT] "Hi Sarah, meeting moved to 3pm tomorrow - see you then."
[SPAM] "FREE iPhone 15 - limited time offer - act fast"
[LEGIT] "Your order #4821 has been dispatched and will arrive Friday."
Step 2: Convert emails into numbers
Computers only understand numbers. Each email is converted into a numerical representation - for example, which words appear and how often. "FREE", "CONGRATULATIONS" and "CLICK HERE" appear far more in spam. The model will learn this - but nobody tells it. It discovers the pattern.
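One simple way to turn an email into numbers is to count how often each word from a fixed list appears. The sketch below assumes a tiny four-word vocabulary and hypothetical function names; real systems use vocabularies of many thousands of words, or other representations entirely.

```python
from collections import Counter


def email_to_counts(text):
    """Lower-case the email, split it into words, and count each word."""
    words = [word.strip(".,!?\"'") for word in text.lower().split()]
    return Counter(word for word in words if word)


def to_vector(text, vocabulary):
    """Represent the email as a list of numbers: one count per vocabulary word."""
    counts = email_to_counts(text)
    return [counts[word] for word in vocabulary]


vocabulary = ["free", "click", "meeting", "order"]  # assumed, for illustration only

vector = to_vector("FREE iPhone 15 - limited time offer - act fast", vocabulary)
print(vector)  # [1, 0, 0, 0]
```

The email is now just a list of four numbers, which is something a model can do arithmetic on.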
Step 3: Train the model
The model processes each email, predicts "spam" or "not spam", and then checks whether it was right. Every time it's wrong, it adjusts its internal settings slightly. After processing millions of emails this way, it has effectively built up a picture of what spam looks like - without anyone writing a single spam rule.
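The predict, check, adjust cycle can be sketched with a perceptron, one of the simplest learning algorithms. This is an illustrative choice, not a claim about what any real email provider uses. The training set is just the four example emails from step 1 (1 = spam, 0 = legitimate), and the function names are hypothetical.

```python
def tokens(text):
    """The set of distinct lower-cased words in the email."""
    return set(text.lower().split())


def predict(text, weights, bias):
    """Add up the weights of the words present; a positive total means spam."""
    total = bias + sum(weights.get(word, 0) for word in tokens(text))
    return 1 if total > 0 else 0


def train(examples, epochs=10):
    weights, bias = {}, 0
    for _ in range(epochs):
        mistakes = 0
        for text, label in examples:
            error = label - predict(text, weights, bias)  # 0 if right, +1 or -1 if wrong
            if error != 0:
                mistakes += 1
                for word in tokens(text):                 # adjust only when wrong
                    weights[word] = weights.get(word, 0) + error
                bias += error
        if mistakes == 0:  # a full pass with no errors: stop
            break
    return weights, bias


training_emails = [
    ("Congratulations! You've won £5,000. Click here NOW to claim...", 1),
    ("Hi Sarah, meeting moved to 3pm tomorrow - see you then.", 0),
    ("FREE iPhone 15 - limited time offer - act fast", 1),
    ("Your order #4821 has been dispatched and will arrive Friday.", 0),
]

weights, bias = train(training_emails)
print(predict("Click here for a FREE prize", weights, bias))       # 1 (spam)
print(predict("Meeting tomorrow about your order", weights, bias)) # 0 (legitimate)
```

Nowhere in the code is there a line saying that "free" means spam. The positive weight on "free" appeared because the model was wrong about a spam email containing it and corrected itself. Four emails are far too few for a usable filter, of course.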
Step 4: Test on new emails
The trained model is tested on emails it has never seen before. A good model can correctly classify more than 99% of new emails. It can often identify new spam tactics it wasn't trained on, because it learned the underlying patterns of spam, not just specific phrases.
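Testing on unseen emails means holding some labelled emails back from training and measuring accuracy only on those. In the sketch below, the emails, the helper names, and the stand-in `keyword_model` are all invented for illustration; in practice the model under test would be the trained one.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the labelled examples and hold some back for testing."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


def accuracy(model, test_examples):
    """Fraction of held-back emails the model classifies correctly."""
    correct = sum(model(text) == label for text, label in test_examples)
    return correct / len(test_examples)


def keyword_model(text):
    # Stand-in for a trained model, used here only to exercise the functions.
    return "free" in text.lower()


labelled_emails = [
    ("FREE prize inside", True), ("Lunch at noon?", False),
    ("Win a free holiday", True), ("Claim your reward now", True),
    ("Minutes from Monday", False), ("Free gift card", True),
    ("Your invoice is attached", False), ("Team photo on Friday", False),
]

train_set, test_set = train_test_split(labelled_emails)
print(len(train_set), len(test_set))      # 6 2
print(accuracy(keyword_model, test_set))  # score on emails held back from training
```

Measuring accuracy on the training emails would be like marking an exam the student has already seen the answers to. Only the held-back emails show whether the model generalises.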
Step 5: Deploy - and keep learning
The model is deployed to millions of inboxes. When users mark something as spam that the model missed, that becomes new training data, so the model can continue to improve over time. This is one reason spam filters such as Gmail's tend to get better the more people use them.

Sort it out

Now apply what you've learned. Sort each system into the correct category.

Rules-based or Machine Learning?

Some of these will surprise you. Think carefully - the key question is: were the rules written by a programmer, or did the system learn them from data?

Spell checker flagging words not in a dictionary
Netflix recommending your next show
A thermostat turning heating on below 18°C
Face unlock recognising your face
Google Translate converting text
A traffic light switching after 60 seconds
YouTube deciding which video to show you next
A vending machine accepting a £1 coin
Rules-Based
Machine Learning

Questions worth sitting with

These don't have single right answers. Think them through - then reveal a suggested response.

Question 1
A self-driving car uses machine learning to recognise pedestrians. What are the risks of this compared to a rules-based system?
Key points to include: A rules-based system is predictable - you know exactly what it will do in any situation. A machine learning system may behave unpredictably in situations it wasn't trained on (e.g. unusual weather, unusual lighting, unusual clothing). The risk is that the system's failures are hard to anticipate or explain. If it fails, it may be unclear why - and therefore hard to fix. On the other hand, a rules-based system couldn't handle the variety of real-world pedestrians at all - no programmer could write rules for every possible scenario. Neither approach is risk-free.
Question 2
Why might it be impossible to build a face recognition system using traditional programming?
Key points to include: A traditional programmer would need to write rules defining exactly what a face looks like - the shape, the proportions, where the eyes are, what counts as "similar enough." But faces vary enormously in age, ethnicity, lighting, angle, expression, glasses, hair. It would be impossible to write rules that cover every combination. Machine learning sidesteps this entirely: show the model millions of labelled face images and let it find the defining features itself - without anyone specifying what those features are.
Question 3
A bank uses a machine learning model to decide who gets a loan. A customer is rejected but no reason can be given. Is this acceptable? What are the implications?
Key points to include: This is a real and legally contested issue. In many countries, people have a right to understand why automated decisions are made about them - the GDPR in the UK and EU includes this. Machine learning models are often opaque - even the creators cannot fully explain every decision. This creates accountability problems: if the model is discriminatory (e.g. trained on historical data where certain groups were unfairly denied loans), it will perpetuate that discrimination invisibly. Explainability in AI is now an active area of research and regulation precisely because of this tension.

What training data would you need?

Building an ML system starts with data. For each scenario below, choose the most appropriate training dataset.

Choose the right training data
Select the best answer for each scenario
Scenario 1 - Detecting fraudulent bank transactions
A bank wants to build an ML model that flags unusual transactions as potentially fraudulent. What training data should it use?
A list of rules about what fraud looks like, written by security experts
Millions of past transactions, each labelled as "fraudulent" or "legitimate" by humans
Only the fraudulent transactions - so the model learns what fraud looks like
A random sample of transactions with no labels
Scenario 2 - Diagnosing skin cancer from photos
A medical AI needs to classify photos of skin lesions as cancerous or benign. What makes the best training dataset?
500 photos taken by the same doctor under identical lighting conditions
100,000 photos of cancerous lesions only, sourced from hospitals worldwide
100,000+ photos of both cancerous and benign lesions, from diverse patients, verified by dermatologists
Images downloaded from the internet and labelled automatically by another AI
Scenario 3 - Translating English to French
A translation model needs training data. Which dataset would produce the best results?
A French grammar textbook and a French-English dictionary
1,000 professionally translated sentences covering common phrases
10,000 sentences translated by non-native speakers without review
Billions of sentence pairs from books, websites and official documents translated by humans

What to remember

Core takeaways - Lesson 1
1
Traditional programs follow explicit rules written by a programmer. They can only handle what the programmer anticipated and fail on anything else.
2
Machine learning programs learn from labelled data. The rules are never written - they emerge from patterns in examples. The programmer designs the process, not the rules.
3
Training involves repeated correction. The model makes predictions, measures how wrong it is, and adjusts its internal parameters - millions of times - until it is accurate enough.
4
Generalisation is the goal. A good ML model makes sensible predictions on data it has never seen before - not just the data it was trained on.
5
Data quality is everything. A model can only learn what the training data contains. Poor, biased or incomplete data produces a poor, biased or incomplete model.

Check your understanding

5 Questions
Answer all five, then submit for instant feedback
Question 1
Which of the following best describes machine learning?
A program that follows explicit rules written by a programmer
A program that finds patterns in data without being given explicit rules
A program that uses a lookup table to answer questions
A program that can only handle inputs the programmer has seen before
Question 2
A spam filter is trained on 500,000 emails labelled "spam" or "not spam". What type of system is this?
Rules-based programming
Machine learning
A lookup table system
A deterministic algorithm
Question 3
What is the key limitation of traditional programming compared to machine learning?
It runs slower than machine learning systems
It can only handle situations the programmer has explicitly anticipated
It cannot process numbers
It requires more data than machine learning
Question 4
During training, a machine learning model repeatedly measures how wrong its predictions are and adjusts. What are the internal values that get adjusted called?
Rules
Parameters or weights
Labels
Inputs
Question 5
A company trains an image recognition model using only photos of people taken in a professional studio under controlled lighting.
Why might this model perform poorly when deployed in the real world?
The model will have too many parameters and run too slowly
The training data does not represent the variety of real-world conditions, so the model cannot generalise
Studio photos contain too much data for the model to process
The model needs to be retrained every time it sees a new person

Exam-style practice

Write a structured answer
Explain the difference between a rules-based computer system and a machine learning system. Use an example of each in your answer.
[4 marks]
Mark scheme - 4 marks
A rules-based system follows explicit instructions written by a programmer - the programmer defines every possible situation and response. (1 mark)
A machine learning system learns patterns from labelled training data - the rules are not written, they emerge from the examples. (1 mark)
Appropriate example of rules-based: e.g. a thermostat, calculator, traffic light, spell checker - any system where the behaviour is defined by explicit programmed logic. (1 mark)
Appropriate example of ML: e.g. spam filter, face recognition, Netflix recommendations, Google Translate - any system trained on data rather than programmed with rules. (1 mark)
Examiner note: Award full marks for any answer that correctly identifies the key distinction (explicit rules vs. learned patterns from data) and provides a valid example of each. Do not penalise for alternative valid examples.
Printable Worksheets

Practice what you've learned

Three printable worksheets covering rules-based systems and machine learning at three levels: Recall, Apply, and Exam-style.

Exam Practice
Lesson 1: What is AI? Rules vs Learning
GCSE-style written questions covering AI concepts. Work through them like an exam.
Lesson 2
How Machines Learn from Data
Explore how a machine finds patterns in thousands of examples without being explicitly programmed.
Lesson 1 - Teacher Resources
What is AI? Rules vs Learning
Suggested starter (5 min)
Write on the board: "Write a rule that would correctly identify every spam email. You have 90 seconds." Ask who thinks they've written a complete rule. Keep asking what else they'd need to add. Nobody will finish. This surfaces exactly why rules-based approaches fail for complex tasks - without you having to explain it first.
Lesson objectives
1. Explain the difference between a rules-based system and a machine learning system, with a real-world example of each.
2. Explain why machine learning is preferable when the rules governing a task are too complex or numerous to write explicitly.
3. Identify at least one risk of using machine learning in a safety-critical context.
Key vocabulary (board-ready)
Algorithm
A precise, step-by-step set of instructions a computer follows to complete a task.
Rules-based system
A program that makes decisions by checking conditions against a fixed set of human-written rules.
Machine learning
A technique where a system identifies patterns in training data to improve its performance, without being explicitly programmed for each case.
Training data
The dataset used to teach a machine learning model the patterns it needs to make predictions or decisions.
Supervised learning
A type of machine learning where the model is trained on labelled examples - each input is paired with the correct output.
Discussion prompts
Try writing rules to decide whether an email is spam. After 2 minutes: could you cover every possible spam email? What does this tell us about why ML is needed?
A self-driving car trained on UK roads is deployed in a country with different road markings. What might go wrong - and why does this happen with ML systems?
Banks use rules-based fraud detection AND machine learning. Why might they need both rather than choosing one approach?
Common misconceptions
"ML means the computer is thinking for itself" - redirect: it finds statistical patterns in data. There is no reasoning or understanding, only pattern-matching at scale.
"If it makes mistakes, it isn't AI" - all AI systems make errors. The question is whether the error rate is acceptable for the context.
"Rules-based systems are always inferior to ML" - for stable, well-defined problems (password validation, eligibility checks), rules-based systems are often more reliable and transparent.
Exit ticket questions
State one difference between a rules-based system and a machine learning system.
[1 mark]
Give one reason why machine learning is preferred over a rules-based approach for image recognition.
[1 mark]
A spam filter learns from emails users mark as spam. Is this rules-based or machine learning? Explain your answer.
[2 marks]
Homework idea
Find one real-world AI system that uses machine learning. Write three sentences: what it does, what data it likely trains on, and why rules-based programming would not work for this task. Include the name of the system and a source.
Classroom tips
The rules-sorting activity works well as a physical card sort before students begin the lesson. It surfaces prior knowledge and misconceptions quickly.
Students consistently underestimate how many rules are needed for a "simple" task. The starter activity above makes this concrete before any formal teaching.
Timing: 20 minutes independent / 35 minutes with class discussion.