Lesson 1 of 4
Databases: Lesson 1

Database Concepts

Understand what databases are, why flat-file databases cause problems, and how relational databases solve them using tables, records, fields, data types and primary keys.

45 minutes All exam boards

Every time you log into an app, search for a product, or book a ticket, a database is working behind the scenes. Spotify stores millions of songs, artists and users. Amazon stores billions of products, orders and customers. How do they keep all that data organised without it becoming a contradictory mess?

Think about it: If your school stored every student's information in a single spreadsheet, and a teacher changed their email address, how many rows would need to be updated? What happens if someone only updates half of them?
Terms you need to know
Database
An organised collection of structured data stored electronically, designed for efficient retrieval and management.
Flat-file database
A database that stores all data in a single table. Simple but causes redundancy when multiple entities are involved.
Relational database
A database that splits data across multiple linked tables to eliminate redundancy and improve data integrity.
Table
A structured grid of rows and columns. Each table stores data about one type of entity (e.g. students, courses).
Record (row)
A single entry in a table. Each record contains all the data about one instance of the entity.
Field (column)
A single attribute or property stored in the table. Each field has a name and a data type.
Primary key
A field that uniquely identifies each record in a table. No two records can share the same primary key value, and it cannot be empty.
Data redundancy
The unnecessary repetition of the same data across multiple records. Wastes storage and creates inconsistency risks.
All data in one table

A flat-file database stores all data in a single table. For simple data about one type of thing, this works fine - a table of book titles is a flat file. Problems start when your data involves multiple related entities.

The example below stores student enrolments. Notice what happens when three students are on the same course:

StudentIDStudentNameAgeCourseNameCourseCodeTeacherNameTeacherEmail
1Alice Smith15GCSE MathsMA11Mr Jonesjones@school.uk
2Bob Brown16GCSE MathsMA11Mr Jonesjones@school.uk
3Clara Davis15GCSE MathsMA11Mr Jonesjones@school.uk
4Dan Evans15GCSE CSCS11Ms Patelpatel@school.uk

Red cells = redundant data (identical values repeated across multiple rows)

Three problems with flat-file databases

Update anomaly: If Mr Jones changes his email, you must update every row where he appears. Miss one row and the data is contradictory.

Insertion anomaly: You cannot add a new course to the database without also adding a student - there is nowhere else to put course data.

Deletion anomaly: If the last student on a course leaves, deleting their row also deletes all information about the course and teacher.

The building blocks of a database

Whether flat-file or relational, every database is built from tables. A table has columns (fields) and rows (records).

StudentIDFirstNameLastNameDateOfBirthIsEnrolled
1001AliceSmith2009-04-12TRUE
1002BobBrown2008-11-03TRUE
1003ClaraDavis2009-07-22FALSE

Each column is a field - one attribute of every student. Each row is a record - all the data about one specific student. The table above has 5 fields and 3 records.

Data types

Every field must have a data type - this tells the database what kind of data to expect and how to store it.

Integer
Whole numbers. Example: StudentID = 1001, Age = 15
Real / Float
Decimal numbers. Example: Price = 9.99, Score = 78.5
Boolean
TRUE or FALSE only. Example: IsEnrolled = TRUE
Char
A single character. Example: Grade = 'A'
String / Text
A sequence of characters. Example: Name = "Alice Smith"
Date / Time
Date and/or time values. Example: DOB = 2009-04-12
Exam angle

You may be asked to choose an appropriate data type for a given field. Always consider whether the data is a number (will you do maths on it?), text, true/false, or a date. Storing a phone number as an integer is wrong - you cannot do arithmetic on phone numbers, and leading zeros would be lost.

Uniquely identifying every record

A primary key is a field chosen to uniquely identify each record in the table. It is the foundation of database design.

Three rules for primary keys
Unique: No two records can have the same primary key value.
Not null: A primary key field can never be empty (NULL). Every record must have one.
Stable: Primary key values should not change - they are used to link records across tables.

In the Students table above, StudentID is the primary key. A student's name could match another student's name, their date of birth could match, but their ID is generated to be unique. Real-world examples: National Insurance number, ISBN for books, product barcodes.

Exam angle

You may be asked why a student's name is not a suitable primary key. The answer: names are not guaranteed to be unique (two students could be called Alice Smith), and a name field could be left blank. An automatically assigned ID has neither problem.

Test yourself

1. A database stores employee records. If one employee's department name appears in 50 rows and the department moves building, what is the minimum number of rows that must be updated in a flat-file database?

Correct. In a flat-file database, the same department name is stored in every row for every employee in that department. All 50 rows must be updated - and if any is missed, the data becomes inconsistent. A relational database stores the department once, requiring only one update.

2. Which of the following is the most suitable primary key for a table of books?

Correct. ISBN (International Standard Book Number) is assigned uniquely to every book. A title is not guaranteed to be unique (two books can have the same title), an author's name is not unique, and a year is definitely not unique. ISBN satisfies all three primary key rules: unique, not null and stable.

3. What is the correct name for a single row in a database table?

Correct. A row is called a record - it contains all the data about one instance. A column is called a field (or attribute). An entity is the thing being represented (e.g. Student, Course) - the table itself is the entity.
Challenge question

A student suggests using a student's email address as a primary key instead of assigning a numeric ID. Give two reasons why this could cause problems.

An email address might change (the student transfers schools, changes their name, or the school's domain changes) - a primary key should be stable. Second, if the email field is left blank by mistake, the record has no key - a primary key cannot be null. A numeric ID is assigned automatically and never needs to change, making it a safer choice.
Printable Worksheets

Practice what you have learned

Three levels of worksheet for this lesson. Download, print and complete offline.

Recall
Database Fundamentals
Key term definitions, flat-file vs relational identification, data type selection and primary key rules.
Download
Apply
Data Types and Keys
Given a scenario, choose suitable data types and justify. Identify primary key candidates and explain anomalies.
Download
Exam Style
Exam-Style Questions
Extended response questions on redundancy, primary keys and database structure. Mark scheme included.
Download
Lesson 1 - Databases
Database Concepts
Starter activity
Show students a simple spreadsheet of school clubs with teacher names and emails repeated for every student. Ask: "If Ms Patel changes her email address, how many cells need updating?" Lead into the concept of redundancy naturally. This makes the problem feel real before any theory is introduced.
Lesson objectives
1
Define database, flat-file database and relational database.
2
Explain what data redundancy is and why it causes problems.
3
Identify tables, records and fields in a given database structure.
4
Select appropriate data types for given fields and justify the choice.
5
Define primary key and state the three rules it must satisfy.
Key vocabulary
Database
Organised collection of structured data stored electronically for efficient access.
Flat-file database
Single-table database. Suitable only for simple single-entity data. Causes anomalies with related entities.
Data redundancy
Unnecessary repetition of data. Causes update, insertion and deletion anomalies.
Primary key
Unique, non-null, stable field that identifies each record. Foundation of relational database design.
Discussion questions
What real-world databases have you interacted with today without realising it?
Why is a student's name a poor choice for a primary key?
What could go wrong if two records in the same table had the same primary key?
Exit tickets
State two problems caused by data redundancy in a flat-file database. [2 marks]
Give one reason why a student's name is not a suitable primary key. [1 mark]
A table stores customer orders. Name a suitable data type for each: CustomerID, OrderTotal, IsDelivered, DeliveryDate. [4 marks]
Homework suggestion
Research one real-world database system (NHS patient records, Amazon product catalogue, school MIS). Describe: what entities it stores, what fields those entities have, what the primary key is for each table, and what would happen if it were stored as a flat file.