Lesson 5 of 6

Bias, Fairness and Ethics in AI

Biased AI is not a theoretical risk - it is a documented reality with real consequences for real people. This lesson examines where bias comes from, what it looks like, and who is responsible.


Between 2014 and 2017, Amazon built a machine learning system to screen CVs for software engineering roles. It was trained on CVs submitted over the previous ten years - most of which came from men, because the tech industry is majority male.

The model learned that the word "women" was a negative signal. It penalised CVs that mentioned "women's chess club" or "women's coding society." It downgraded graduates from all-women colleges. It had not been told to discriminate - it had simply learned that historically successful candidates were mostly men, and it optimised for that pattern.

Amazon scrapped the system in 2018 when they discovered it was doing this.

Reuters investigation, October 2018.

Think: Nobody told the system to be sexist. It was doing exactly what it was designed to do - find patterns in historical data. Who is responsible for the outcome?

Where bias comes from

Bias in AI does not usually come from malicious programmers. It emerges from the relationship between the model, the data it is trained on, and the world that data describes. There are several distinct types.

Historical bias
The world has been biased. Training data reflects that. A model trained on past hiring decisions learns past hiring practices - including their discriminatory elements.
Representation bias
Some groups are under-represented in training data. Facial recognition systems trained mostly on lighter-skinned faces perform worse on darker-skinned faces. The data reflects who was in the room.
Measurement bias
The proxy used to measure the thing you care about is itself biased. "Number of arrests" is used as a proxy for crime, but arrests reflect policing patterns, not just criminal behaviour.
Feedback loops
AI decisions shape future data. If an AI recommends more police patrols in certain areas, more crimes are detected there, which reinforces the pattern in future training data.
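The feedback loop above can be sketched with a toy simulation. All numbers are invented for illustration: both areas have the same true crime rate, but one starts with more patrols, and each year's patrols are allocated in proportion to the previous year's recorded arrests.

```python
# Toy feedback-loop model (all numbers assumed for illustration).
# Both areas have IDENTICAL true crime rates; area 0 simply starts
# with more patrols than area 1.
true_crime_rate = [0.3, 0.3]   # chance a patrol records a crime, same in both areas
patrols = [60, 40]             # initial allocation of 100 patrols

for year in range(5):
    # Recorded arrests scale with patrol presence, not with the (equal) crime rate.
    arrests = [round(p * r) for p, r in zip(patrols, true_crime_rate)]
    # Next year's 100 patrols are allocated in proportion to recorded arrests.
    total = sum(arrests)
    patrols = [round(100 * a / total) for a in arrests]
    print(f"Year {year + 1}: arrests={arrests}, patrols next year={patrols}")
```

The recorded arrests settle at 18 vs 12 every year, so the data permanently "confirms" that area 0 is riskier - even though the underlying crime rates are identical. The disparity is created and sustained entirely by where the patrols were sent.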
Real-world impact
In 2016, the COMPAS system used by US courts to predict reoffending risk was found to falsely flag Black defendants as high-risk at twice the rate of white defendants. Judges were using this score to influence bail and sentencing decisions. The algorithm was never made fully public, so its workings could not be independently verified.
UK legal context
In the UK, the Equality Act 2010 prohibits discrimination based on protected characteristics (race, sex, age, disability, and others). UK GDPR gives individuals the right to an explanation of automated decisions that significantly affect them. The EU AI Act (2024) classifies AI used in hiring, education and justice as "high-risk" and requires human oversight.

Spot the bias

Bias Scenarios
Identify the type of bias in each scenario and its root cause

For each scenario, choose the most accurate description of the bias present. Then reveal the explanation to check your reasoning.

Scenario 1
The loan approval model
A bank trains an AI to approve or reject loan applications. Training data comes from the last 30 years of decisions. In those decisions, people from certain postcodes were historically rejected at higher rates, partly due to previous discriminatory lending practices. The model learns postcode as a significant negative predictor.
What is the primary type of bias?
Measurement bias - postcode is an inaccurate measure of creditworthiness
Historical bias - the model has learned patterns from past decisions that were themselves discriminatory
Representation bias - people from those postcodes are under-represented in the training data
Historical bias is the primary type. The model is not measuring anything inaccurately - postcode genuinely predicts past loan repayment in this dataset. The problem is that the past data reflects previous discrimination. The model is faithfully reproducing biased history. This is the hardest type to fix because the data is "accurate" - it just describes an unjust world. Note also the element of measurement bias: postcode is being used as a proxy for creditworthiness, but it actually reflects geography and demography more than individual financial behaviour.
Scenario 2
The content moderation system
A social media platform trains a hate speech detector on posts flagged by its existing moderation team. The moderation team works in English and focuses on posts that received high numbers of reports from English-speaking users. Posts in Swahili, Bengali and Arabic are rarely flagged, so the model sees almost no examples of hate speech in those languages.
What is the primary type of bias?
Historical bias - hate speech has been permitted on the platform historically
Representation bias - speakers of non-English languages are under-represented in the training data
Feedback loop - the model will flag more English content, reinforcing the pattern
Representation bias is the primary type. The training data overwhelmingly represents one language group. The model will perform well at detecting hate speech in English but poorly in other languages. This means speakers of those languages receive less protection. The feedback loop element is also real: the model will flag more English content in deployment, creating more English training examples for the next version, compounding the disparity over time.
Scenario 3
The predictive policing tool
A police force uses an AI tool to predict which areas are at highest risk of crime. The model is trained on arrest data from the last decade. Police resources are then concentrated in predicted high-risk areas, leading to more patrols and more arrests in those areas. The following year, those areas show even higher arrest rates, further reinforcing the model's predictions.
What makes this case particularly difficult to address?
The model is too accurate - it correctly identifies crime hotspots
A feedback loop means AI decisions are creating the data used to validate those decisions, making it impossible to know if the predictions are accurate or self-fulfilling
The model cannot be updated with new data
The feedback loop is the core problem. More police in an area always means more arrests in that area. These additional arrests are fed back as training data, confirming the prediction. The model cannot distinguish between "this area has more crime" and "this area has more arrests because we put more police there." The measurement bias is also critical: arrests measure policing activity, not underlying crime. Areas with no police patrols could have high crime that is simply never recorded. The model treats absence of arrest data as evidence of safety.

Questions worth thinking about

Question 1
Can an AI system be completely unbiased? Is that even a meaningful goal?
Key points: Probably not completely, and arguably not meaningfully. Any training dataset reflects the world as it was, not as it should be. Choosing which bias to remove requires a value judgment about what "fair" means - which is itself contested. Fairness can be defined multiple ways (equal error rates across groups, equal outcomes, equal opportunity) and these definitions are mathematically incompatible with each other. The goal is not "no bias" but rather: understand which biases are present, measure their impact on different groups, and make conscious choices about acceptable trade-offs rather than letting the model make those choices invisibly.
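The mathematical incompatibility mentioned above can be shown with invented numbers. The sketch below assumes the same classifier quality (equal true positive and false positive rates) in two groups whose base rates differ, and shows that precision and flag rates then cannot also be equal - the pattern behind the published impossibility results.

```python
# Illustrative numbers (assumed): the classifier behaves identically in
# both groups, but the groups have different base rates of the outcome.
def metrics(n, positives, tpr, fpr):
    """Confusion-matrix metrics for a group of n people."""
    negatives = n - positives
    tp = tpr * positives          # true positives
    fp = fpr * negatives          # false positives
    flagged = tp + fp
    precision = tp / flagged      # how often a "high-risk" flag is correct
    flag_rate = flagged / n       # share of the group flagged at all
    return precision, flag_rate

# Equal error rates in both groups: TPR = 0.8, FPR = 0.2.
prec_a, rate_a = metrics(n=100, positives=50, tpr=0.8, fpr=0.2)  # base rate 50%
prec_b, rate_b = metrics(n=100, positives=10, tpr=0.8, fpr=0.2)  # base rate 10%

print(f"Group A: precision={prec_a:.2f}, flag rate={rate_a:.2f}")
print(f"Group B: precision={prec_b:.2f}, flag rate={rate_b:.2f}")
```

With equal error rates, a flag is correct 80% of the time in group A but only about 31% of the time in group B, and far more of group A is flagged. Equalising any one of these metrics across the groups unavoidably un-equalises another - choosing which to equalise is the value judgment.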
Question 2
Who should be held responsible when a biased AI makes a decision that harms someone - the developer, the company deploying it, or the person who used its output?
Key points: This is a genuinely contested legal and ethical question. Arguments for developer responsibility: they designed the system and chose the training data. Arguments for deploying company responsibility: they chose to use it in this context, at this scale, with these populations. Arguments for individual responsibility: the person who acted on the AI output made the final decision. The EU AI Act 2024 and proposed UK AI legislation are attempting to answer this by assigning responsibility based on who is "placing the system into service" in a given context. Most frameworks require that high-risk AI systems have a human ultimately responsible for decisions, rather than delegating to the machine.
Question 3
If a completely accurate AI predicts that a particular group has a statistically higher risk of a bad outcome, is it ethical to use that prediction to make decisions about individuals in that group?
Key points: This is the "actuarial fairness" debate. Statistical accuracy at the group level does not justify decisions about individuals, because individuals are not groups. A prediction that "people from this background are 30% more likely to reoffend" tells you nothing certain about any specific individual. Using group statistics to make decisions about individuals is exactly what the Equality Act prohibits when those groups correspond to protected characteristics. Furthermore, if such predictions are acted upon (e.g. by restricting opportunities), they can become self-fulfilling - perpetuating the inequality they are measuring. Accuracy and justice are not the same thing.

Rate the Risk: EU AI Act edition

In March 2024, the EU passed the world's first comprehensive AI law. It assigns every AI system to one of four risk categories. Below are five real AI systems. Click each one to assign it to what you think is the correct category, then check your answers.

Unacceptable Risk - Banned
Poses a clear threat to fundamental rights. Prohibited in the EU entirely.
High Risk - Strict Rules
Affects rights or safety in important decisions. Allowed, but needs human oversight, transparency, and audit trails.
Limited Risk - Must Disclose
Interacts with humans but does not make major decisions. Users must be told that they are interacting with AI.
Minimal Risk - No Rules
No significant impact on rights or safety. No specific obligations required.
AI that ranks job applications and produces a shortlist before a human reviews them
A social media recommendation algorithm that decides which videos you see next
Real-time facial recognition scanning crowds in public spaces to identify wanted persons
A customer service chatbot on a retail website that answers questions about orders
An AI that predicts which individuals are likely to commit crimes before they have done anything

What to remember

Core takeaways - Lesson 5
1. Bias does not require intent. A model trained on historical data will faithfully reproduce historical patterns - including discriminatory ones.
2. Types of bias include: historical bias (learning from a biased past), representation bias (under-represented groups in training data), measurement bias (using a flawed proxy), and feedback loops (AI decisions shaping future training data).
3. Accuracy and fairness can conflict. A highly accurate model may be accurate on average while being significantly worse for particular groups. Average performance hides disparities.
4. Legal protections exist in the UK. The Equality Act 2010 and UK GDPR provide frameworks for challenging discriminatory automated decisions, including the right to explanation.
5. Explainability is a safeguard against bias. If a system cannot explain its decisions, bias cannot be identified, challenged, or corrected.
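Takeaway 3 can be made concrete with a short calculation. The group sizes and accuracies below are hypothetical, loosely echoing published facial recognition audits: headline accuracy is a weighted average, so poor accuracy for a small under-represented group barely dents it.

```python
# Hypothetical group sizes and accuracies (assumed for illustration):
# a small under-represented group can perform far below the headline figure.
groups = {
    "lighter-skinned men":  {"n": 900, "accuracy": 0.997},
    "darker-skinned women": {"n": 100, "accuracy": 0.790},
}

total = sum(g["n"] for g in groups.values())
overall = sum(g["n"] * g["accuracy"] for g in groups.values()) / total

print(f"Overall accuracy: {overall:.1%}")   # the weighted average looks impressive
for name, g in groups.items():
    print(f"  {name}: {g['accuracy']:.1%}")
```

The headline figure comes out above 97%, while one group sits at 79%. Reporting only the average would hide a 20-percentage-point disparity - which is why audits report accuracy per group.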

Explore further

Wikipedia makes an excellent starting point for established computing concepts. For any specific fact or claim, scroll to the References section at the bottom of the article and go to the primary source directly.

In The News
The EU AI Act: the world's first comprehensive law regulating artificial intelligence (2024)
March 2024 - fully in force from 2026
In March 2024, the European Union passed the AI Act - the first binding, comprehensive legal framework for AI anywhere in the world. The Act places every AI system into a risk category and sets rules accordingly. Some uses are banned entirely: real-time facial recognition in public spaces (with narrow law-enforcement exceptions), AI that manipulates people through subliminal techniques, and "social scoring" systems that rate citizens based on their behaviour. High-risk systems in areas like hiring, education, policing and medical diagnosis must meet strict transparency and accuracy requirements, with human oversight required for consequential decisions. The UK, outside the EU, has taken a different approach - giving existing regulators sector-by-sector guidance rather than passing one overarching law. This difference has triggered significant debate about how best to balance innovation and protection.
Discussion questions
The EU has banned real-time facial recognition in public in most circumstances. The UK has not. Which approach do you think offers better protection? What are the trade-offs for each?
The Act requires high-risk AI systems to provide explanations for their decisions. A lawyer can explain their reasoning. A neural network with millions of weights cannot. Is requiring explainability technically realistic, and what happens when it cannot be achieved?
Some AI companies argue that heavy regulation will drive AI development to less regulated countries - China, or the US under deregulation. Is "regulatory arbitrage" a good reason to weaken laws, or should governments hold firm?
Read more: EU AI Act (Wikipedia)    BBC News coverage

Check your understanding

5 Questions
Answer all five, then submit for instant feedback
Question 1
What is historical bias in a machine learning model?
A bug introduced by programmers who hold biased views
Bias that emerges when training data reflects past discriminatory practices or inequalities
Bias caused by outdated hardware
Errors introduced when data is collected over a long time period
Question 2
What is a feedback loop in the context of AI bias?
When users send feedback to the AI company about errors
When AI decisions shape future data, which is then used to train the next version of the model, reinforcing the original bias
When the model re-reads the same training data multiple times
When two AI systems share training data with each other
Question 3
A facial recognition system achieves 99% accuracy overall. However, its accuracy for darker-skinned women is only 79%, while for lighter-skinned men it is 99.7%.
What does this suggest about the training data?
The training data was too small overall
The training data likely under-represented darker-skinned women (representation bias)
Facial recognition is inherently impossible for darker-skinned people
The model needs more hidden layers
Question 4
Under GDPR, what right do individuals have regarding automated decisions?
The right to opt out of all AI systems
The right to an explanation for automated decisions that significantly affect them
The right to view the AI's source code
The right to retrain the model on new data
Question 5
Why is "using number of arrests as a proxy for crime" an example of measurement bias?
Because arrests are too expensive to count accurately
Because arrests reflect police activity and prior targeting decisions, not just underlying criminal behaviour - so the measure is skewed by the process used to collect it
Because crime data is protected by privacy law
Because there are too few arrests to train a model on

Exam-style practice

Write a structured answer
A financial company uses a machine learning model to decide whether to approve or reject credit card applications. The model was trained on historical application and repayment data from the past 20 years. Explain two ways in which this model could produce biased decisions, and for each one, describe a step the company could take to reduce that bias. [6 marks]
Mark scheme - 6 marks (3 marks per bias point: 1 identify + 1 explain + 1 mitigation)
Bias 1 - Historical bias: The training data from the past 20 years may reflect discriminatory lending practices. Groups who were historically rejected unfairly would have fewer approved applications in the data, causing the model to continue rejecting them. Mitigation: audit the training data for disparate outcomes by group; remove or re-weight historical decisions known to be discriminatory. (3 marks)
Bias 2 - Proxy / measurement bias: The model may use features like postcode or occupation as proxies for creditworthiness. These correlate with protected characteristics (race, class) rather than individual ability to repay. Mitigation: remove protected characteristics and known proxies from the feature set; test model outcomes across demographic groups to detect disparate impact. (3 marks)
Accept other valid biases (representation bias, feedback loop) with accurate explanation and a plausible mitigation for each. Both a bias and a mitigation are required for each 3-mark point.
Printable Worksheets

Practice what you've learned

Three printable worksheets covering bias types, fairness, transparency, and accountability at three levels: Recall, Apply, and Exam-style.

Exam Practice
Lesson 5: Bias, Fairness and Ethics in AI
GCSE-style written questions covering AI concepts. Work through them like an exam.
Lesson 5 - Teacher Resources
Bias, Fairness and Ethics in AI
Suggested starter (5 min)
Write three scenarios on the board: (A) AI recommends films based on watch history, (B) AI scores job applications and rejects candidates automatically, (C) AI determines bail conditions for criminal defendants. Ask: which of these worries you most - and why? Show of hands for each. Use the spread of answers to open a discussion about what makes an AI decision "high stakes".
Lesson objectives
1. Identify and describe at least three types of bias that can affect AI systems, with a real-world example of each.
2. Explain the real-world consequences of biased AI in at least two of these contexts: criminal justice, healthcare, or employment.
3. Apply the EU AI Act risk classification framework to classify a new AI system and justify the classification.
Key vocabulary (board-ready)
Algorithmic bias
A systematic error in an AI system's output that creates unfair results for particular groups, often arising from biased training data or flawed problem framing.
Proxy discrimination
When an AI discriminates against a protected group without using that variable directly, by relying on correlated variables such as postcode or name format.
Transparency
The ability to understand and explain how an AI system reached a particular decision.
Accountability
The principle that a specific person or organisation can be held responsible for the consequences of an AI system's decisions.
EU AI Act
European Union legislation (2024) that classifies AI systems by risk level and prohibits or restricts certain high-risk and unacceptable-risk applications.
Discussion prompts
The EU AI Act bans real-time facial recognition in public spaces. Should the same apply to historical footage used to investigate a crime after the fact? Is there an ethical difference?
A company's hiring algorithm never uses race as a variable - but its decisions still correlate strongly with race. How is this possible? What is the term for this?
If a company deploys an AI system that turns out to be biased, should they be legally required to compensate affected people - even if the bias was unintentional?
Common misconceptions
✗ "If the algorithm doesn't use race as a variable, it can't discriminate by race" - proxy discrimination means race-correlated variables (postcode, school name, name style) can produce racially biased outcomes without explicitly using race.
✗ "AI bias is caused by deliberately biased programmers" - most algorithmic bias emerges from training data that reflects historical human inequalities, not from malicious intent.
✗ "Fixing bias just means removing protected attributes from the data" - removing a variable often has little effect if correlated proxy variables remain in the dataset.
Exit ticket questions
Define algorithmic bias.
[1 mark]
Explain how a hiring algorithm could produce gender-biased results without using gender as an input variable.
[2 marks]
The EU AI Act classifies some AI applications as 'unacceptable risk'. Give one example and explain why it receives this classification.
[2 marks]
Homework idea
Find a real news article (from the last three years) about an AI system found to be biased or unfair. Write three paragraphs: (1) what the AI was designed to do, (2) what bias was found and how it affected real people, (3) what the company did or should have done in response. Include a link to the article.
Classroom tips
The Rate the Risk classifier works well as a class vote: display each scenario and ask for a show of hands before revealing. Use disagreements as discussion starters.
Proxy discrimination (race inferred via postcode, name format, etc.) often surprises students. Build this in explicitly with the hiring algorithm prompt above.
This lesson pairs directly with the Exam Practice page - Q3, Q4, Q5 and Q7 all draw on lesson 5 content. Consider directing students there after the lesson.
Resources
AI Ethics Exam Practice
Download student worksheet (PDF)