Lesson 1 of 4

Databases: Lesson 1

Database Concepts

Understand what databases are, why flat-file databases cause problems, and how relational databases solve them using tables, records, fields, data types and primary keys.

45 minutes All exam boards

Why this matters

Every time you log into an app, search for a product, or book a ticket, a database is working behind the scenes. Spotify stores millions of songs, artists and users. Amazon stores billions of products, orders and customers. How do they keep all that data organised without it becoming a contradictory mess?

Think about it: If your school stored every student's information in a single spreadsheet, and a teacher changed their email address, how many rows would need to be updated? What happens if someone only updates half of them?

Key terms

Terms you need to know

Database

An organised collection of structured data stored electronically, designed for efficient retrieval and management.

Flat-file database

A database that stores all data in a single table. Simple but causes redundancy when multiple entities are involved.

Relational database

A database that splits data across multiple linked tables to eliminate redundancy and improve data integrity.

Table

A structured grid of rows and columns. Each table stores data about one type of entity (e.g. students, courses).

Record (row)

A single entry in a table. Each record contains all the data about one instance of the entity.

Field (column)

A single attribute or property stored in the table. Each field has a name and a data type.

Primary key

A field that uniquely identifies each record in a table. No two records can share the same primary key value, and it cannot be empty.

Data redundancy

The unnecessary repetition of the same data across multiple records. Wastes storage and creates inconsistency risks.

Flat-file databases

All data in one table

A flat-file database stores all data in a single table. For simple data about one type of thing, this works fine - a table of book titles is a flat file. Problems start when your data involves multiple related entities.

The example below stores student enrolments. Notice what happens when three students are on the same course:

StudentID	StudentName	Age	CourseName	CourseCode	TeacherName	TeacherEmail
1	Alice Smith	15	GCSE Maths	MA11	Mr Jones	jones@school.uk
2	Bob Brown	16	GCSE Maths	MA11	Mr Jones	jones@school.uk
3	Clara Davis	15	GCSE Maths	MA11	Mr Jones	jones@school.uk
4	Dan Evans	15	GCSE CS	CS11	Ms Patel	patel@school.uk

Red cells = redundant data (identical values repeated across multiple rows)

Three problems with flat-file databases

Update anomaly: If Mr Jones changes his email, you must update every row where he appears. Miss one row and the data is contradictory.

Insertion anomaly: You cannot add a new course to the database without also adding a student - there is nowhere else to put course data.

Deletion anomaly: If the last student on a course leaves, deleting their row also deletes all information about the course and teacher.

Tables, records and fields

The building blocks of a database

Whether flat-file or relational, every database is built from tables. A table has columns (fields) and rows (records).

StudentID	FirstName	LastName	DateOfBirth	IsEnrolled
1001	Alice	Smith	2009-04-12	TRUE
1002	Bob	Brown	2008-11-03	TRUE
1003	Clara	Davis	2009-07-22	FALSE

Each column is a field - one attribute of every student. Each row is a record - all the data about one specific student. The table above has 5 fields and 3 records.

Data types

Every field must have a data type - this tells the database what kind of data to expect and how to store it.

Integer

Whole numbers. Example: StudentID = 1001, Age = 15

Real / Float

Decimal numbers. Example: Price = 9.99, Score = 78.5

Boolean

TRUE or FALSE only. Example: IsEnrolled = TRUE

Char

A single character. Example: Grade = 'A'

String / Text

A sequence of characters. Example: Name = "Alice Smith"

Date / Time

Date and/or time values. Example: DOB = 2009-04-12

Exam angle

You may be asked to choose an appropriate data type for a given field. Always consider whether the data is a number (will you do maths on it?), text, true/false, or a date. Storing a phone number as an integer is wrong - you cannot do arithmetic on phone numbers, and leading zeros would be lost.

Primary keys

Uniquely identifying every record

A primary key is a field chosen to uniquely identify each record in the table. It is the foundation of database design.

Three rules for primary keys

Unique: No two records can have the same primary key value.

Not null: A primary key field can never be empty (NULL). Every record must have one.

Stable: Primary key values should not change - they are used to link records across tables.

In the Students table above, StudentID is the primary key. A student's name could match another student's name, their date of birth could match, but their ID is generated to be unique. Real-world examples: National Insurance number, ISBN for books, product barcodes.

Exam angle

You may be asked why a student's name is not a suitable primary key. The answer: names are not guaranteed to be unique (two students could be called Alice Smith), and a name field could be left blank. An automatically assigned ID has neither problem.

Quick quiz

Test yourself

1. A database stores employee records. If one employee's department name appears in 50 rows and the department moves building, what is the minimum number of rows that must be updated in a flat-file database?

Correct. In a flat-file database, the same department name is stored in every row for every employee in that department. All 50 rows must be updated - and if any is missed, the data becomes inconsistent. A relational database stores the department once, requiring only one update.

2. Which of the following is the most suitable primary key for a table of books?

Correct. ISBN (International Standard Book Number) is assigned uniquely to every book. A title is not guaranteed to be unique (two books can have the same title), an author's name is not unique, and a year is definitely not unique. ISBN satisfies all three primary key rules: unique, not null and stable.

3. What is the correct name for a single row in a database table?

Correct. A row is called a record - it contains all the data about one instance. A column is called a field (or attribute). An entity is the thing being represented (e.g. Student, Course) - the table itself is the entity.

Think deeper

Challenge question

A student suggests using a student's email address as a primary key instead of assigning a numeric ID. Give two reasons why this could cause problems.

An email address might change (the student transfers schools, changes their name, or the school's domain changes) - a primary key should be stable. Second, if the email field is left blank by mistake, the record has no key - a primary key cannot be null. A numeric ID is assigned automatically and never needs to change, making it a safer choice.

Flat-file vs Relational Comparator

See the same data in both formats. Redundant cells are highlighted in red. See exactly how foreign keys eliminate repetition.

Series overview Lesson 2: Relational Design

Printable Worksheets

Practice what you have learned

Three levels of worksheet for this lesson. Download, print and complete offline.

Recall

Database Fundamentals

Key term definitions, flat-file vs relational identification, data type selection and primary key rules.

Download

Apply

Data Types and Keys

Given a scenario, choose suitable data types and justify. Identify primary key candidates and explain anomalies.

Download

Exam Style

Exam-Style Questions

Extended response questions on redundancy, primary keys and database structure. Mark scheme included.

Download

Lesson 1 - Databases

Database Concepts

Starter activity

Show students a simple spreadsheet of school clubs with teacher names and emails repeated for every student. Ask: "If Ms Patel changes her email address, how many cells need updating?" Lead into the concept of redundancy naturally. This makes the problem feel real before any theory is introduced.

Lesson objectives

Define database, flat-file database and relational database.

Explain what data redundancy is and why it causes problems.

Identify tables, records and fields in a given database structure.

Select appropriate data types for given fields and justify the choice.

Define primary key and state the three rules it must satisfy.

Key vocabulary

Database

Organised collection of structured data stored electronically for efficient access.

Flat-file database

Single-table database. Suitable only for simple single-entity data. Causes anomalies with related entities.

Data redundancy

Unnecessary repetition of data. Causes update, insertion and deletion anomalies.

Primary key

Unique, non-null, stable field that identifies each record. Foundation of relational database design.

Discussion questions

What real-world databases have you interacted with today without realising it?

Why is a student's name a poor choice for a primary key?

What could go wrong if two records in the same table had the same primary key?

Exit tickets

State two problems caused by data redundancy in a flat-file database. [2 marks]

Give one reason why a student's name is not a suitable primary key. [1 mark]

A table stores customer orders. Name a suitable data type for each: CustomerID, OrderTotal, IsDelivered, DeliveryDate. [4 marks]

Homework suggestion

Research one real-world database system (NHS patient records, Amazon product catalogue, school MIS). Describe: what entities it stores, what fields those entities have, what the primary key is for each table, and what would happen if it were stored as a flat file.

Resources

Flat-file vs Relational tool Worksheet 1 - Recall Worksheet 2 - Apply Worksheet 3 - Exam Style Unit Exam Paper (all 4 lessons)