GCSE CS Tool

Flat-file vs Relational Databases

See the same data as a flat-file and as a relational database. Spot redundancy, understand why it causes problems, and see how foreign keys fix it.

24
Redundant cells
3
Relational tables
2
Foreign keys

StudentEnrollment (flat-file)

All data stored in one single table. Every fact about a student, their course, and their teacher is repeated on every row.
24
Highlighted cells are redundant - the same data repeated unnecessarily
Redundant (repeated) data
Unique per student
student_idstudent_nameage course_namecourse_code teacher_nameteacher_emailclassroom
1Alice Smith15 GCSE MathsMA11 Mr Jonesjones@school.ukRoom 14
2Bob Brown16 GCSE MathsMA11 Mr Jonesjones@school.ukRoom 14
3Clara Davis15 GCSE MathsMA11 Mr Jonesjones@school.ukRoom 14
4Dan Evans15 GCSE CSCS11 Ms Patelpatel@school.ukRoom 22
5Eve Foster16 GCSE CSCS11 Ms Patelpatel@school.ukRoom 22
6Frank Green16 GCSE CSCS11 Ms Patelpatel@school.ukRoom 22
7Grace Hill15 GCSE BiologyBI11 Dr Cartercarter@school.ukLab 3
8Harry Ings16 GCSE BiologyBI11 Dr Cartercarter@school.ukLab 3
24 redundant cells out of 64 total
37.5% of the data is unnecessary repetition
Spot the problem: If Mr Jones moves to Room 15, you must update 3 rows. Miss one, and the database now contains contradictory information. This is called a data integrity problem.

Relational database - same data, three tables

The data is split across three linked tables. Each fact is stored exactly once. Tables are connected using primary keys and foreign keys.
PK = Primary Key FK = Foreign Key Foreign keys point to the primary key of another table
Students table
student_id PKnameagecourse_id FK
1Alice Smith15101
2Bob Brown16101
3Clara Davis15101
4Dan Evans15102
5Eve Foster16102
6Frank Green16102
7Grace Hill15103
8Harry Ings16103
course_id links each student to the Courses table. Each course ID appears once in Courses.
Courses table
course_id PKcourse_namecodeteacher_id FKroom
101GCSE MathsMA11201Room 14
102GCSE CSCS11202Room 22
103GCSE BiologyBI11203Lab 3
Each course stored once only. teacher_id links to the Teachers table.
Teachers table
teacher_id PKnameemail
201Mr Jonesjones@school.uk
202Ms Patelpatel@school.uk
203Dr Cartercarter@school.uk
Each teacher stored once only. No email appears more than once.

How the tables link

Follow the foreign keys to answer a query like: "Which teacher takes Alice Smith?"
Students Alice Smith has course_id = 101
Courses course_id 101 has teacher_id = 201
Teachers teacher_id 201 is Mr Jones
Result: Alice Smith is taught by Mr Jones - found by following two foreign keys, with no data stored twice.

Three ways flat-file databases break

These are called update anomalies, insertion anomalies, and deletion anomalies. All three are caused by data redundancy.

Update anomaly

Scenario: Ms Patel gets married and changes her email to mpatel@school.uk.
In the flat-file, her email appears in 3 rows (students 4, 5, 6). If you update rows 4 and 5 but forget row 6, the database now contains two different emails for the same teacher. The data is inconsistent - you cannot trust it.
In the relational database, Ms Patel's email appears in exactly one row in the Teachers table. One update, no inconsistency.

Insertion anomaly

Scenario: A new course, GCSE Physics, has been added for next year. No students are enrolled yet.
In the flat-file, every row requires a student. You cannot add a course without also creating a student row - which means inventing fake data or leaving fields empty. The database design forces an impossible situation.
In the relational database, you simply add a row to the Courses table. It exists independently of students.

Deletion anomaly

Scenario: Grace Hill and Harry Ings both leave the school. Their rows are deleted.
In the flat-file, they were the only students in GCSE Biology. Deleting them also deletes all information about Dr Carter - their email, the course name, the room. The data is permanently lost.
In the relational database, deleting students only removes rows from the Students table. Dr Carter and GCSE Biology still exist in their own tables.

Flat-file vs Relational - summary

Both store data. Only one does it well.
Flat-file database
  • All data in one table
  • Same data repeated across many rows (redundancy)
  • Updating one fact requires changing many rows
  • Easy to create inconsistencies
  • Cannot add a course without adding a student
  • Deleting a student can destroy course/teacher data
  • Suitable only for simple, single-entity data
Relational database
  • Data split across multiple linked tables
  • Each fact stored exactly once (no redundancy)
  • Updating one fact requires changing one row
  • Data integrity maintained automatically
  • Entities (courses, teachers) exist independently
  • Relationships represented by foreign keys
  • Scalable to any real-world system

Key terms recap

Primary Key (PK)
A field that uniquely identifies each record in a table. No two rows can have the same value. Cannot be NULL.
Foreign Key (FK)
A field in one table that matches the primary key of another table. Used to link tables together and enforce referential integrity.
Data Redundancy
The unnecessary repetition of data in a database. Leads to wasted storage and risks of inconsistency.
Data Integrity
The accuracy and consistency of data. A relational database maintains integrity because each fact has one authoritative location.