Lesson 11 of 13

Records and Structured Data

Group several named fields about one item into a single value so they cannot drift apart. Records are how real systems describe students, products, transactions and any other entity with more than one attribute. Examined by OCR J277 (2.2.3), AQA 8525 (3.2.6) and CIE 0478 (10.1).

Records, fields, dictionaries, data classes, structs

A school stores 800 students. Each has a name, year group, form class and date of birth. You could keep four parallel lists, but the moment one list is sorted on its own the data is silently corrupt: Aisha's name now sits next to Ben's year group, and no error is ever raised. A record bundles every field about one student into a single value so they travel together for life.

Think about it: If parallel lists silently corrupt data, why do beginners reach for them first? What does that tell you about defensive programming?
Record
A structured data type that groups several named fields about one item into a single value.
Field
One named piece of data inside a record, e.g. name or year_group.
Dictionary
Python's most common record-like type. Keys are field names, values are the data.
Data class
A Python class written with the @dataclass decorator, which gives you a record type with named, typed fields.
Struct / class
In C#, struct and class let you declare a record type with typed named fields.
List of records
A list whose elements are all records of the same shape. The standard pattern for storing many entities.
Field access
Reading or writing one field by name, e.g. aisha["name"] or aisha.Name.
Heterogeneous
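A record can hold fields of different types (string, int, date) at the same time. A typed array such as C#'s int[] cannot, and although a Python list can, lists are conventionally used for items of one kind.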

1. Why parallel lists go wrong

Imagine four parallel lists holding 800 students (only three students are shown here):

names       = ["Ben", "Aisha", "Cara"]
year_groups = [11, 10, 10]
forms       = ["11A", "10B", "10C"]
dobs        = ["2009-07-22", "2010-03-14", "2010-11-05"]

# Sort by name alphabetically...
names.sort()   # names is now ["Aisha", "Ben", "Cara"]
# but year_groups, forms and dobs were NOT moved.
# Index 0 now pairs Aisha with Ben's year group (11), form and date of birth. Silent corruption.

// C#: the same mistake with arrays
string[] names   = { "Ben", "Aisha", "Cara" };
int[] yearGroups = { 11, 10, 10 };
string[] forms   = { "11A", "10B", "10C" };

Array.Sort(names);  // names re-ordered, others not. Silent data corruption.
The real cost

Nothing crashes. Nothing throws an error. The program runs, prints results and the marks are awarded to the wrong students for weeks before someone notices.
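
To see how quietly this goes wrong, here is a minimal sketch of the bug with marks. The names and scores are illustrative values, not real data. The program runs to completion and prints a tidy table that happens to be wrong.

```python
# Illustrative parallel lists: scores[i] belongs to names[i].
names = ["Ben", "Aisha", "Cara"]
scores = [64, 78, 91]

names.sort()  # only this one list is re-ordered

# No exception is raised, but the pairings below are no longer trustworthy.
for name, score in zip(names, scores):
    print(name, score)
# Aisha 64   <- this is Ben's score
# Ben 78     <- this is Aisha's score
# Cara 91    <- correct only by coincidence
```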

2. The record solution

A record bundles all fields about one student into a single value. Sorting the list of records moves every field together.

aisha = {
    "name": "Aisha Khan",
    "year_group": 10,
    "form": "10B",
    "dob": "2010-03-14"
}

print(aisha["name"])         # Aisha Khan
print(aisha["year_group"])   # 10
aisha["form"] = "10C"        # update one field, others unchanged

// C#: the same record declared as a struct
struct Student
{
    public string Name;
    public int YearGroup;
    public string Form;
    public string Dob;
}

Student aisha = new Student {
    Name = "Aisha Khan",
    YearGroup = 10,
    Form = "10B",
    Dob = "2010-03-14"
};

Console.WriteLine(aisha.Name);       // Aisha Khan
Console.WriteLine(aisha.YearGroup);  // 10
aisha.Form = "10C";
Why this is structurally safer

All fields belong to one value. Any operation (sort, copy, pass to a function, store in a list) moves every field together. There is no way for the name to drift away from the date of birth.
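
One way to get from parallel lists to records, sketched under the assumption that the lists are still correctly aligned when you start. The variable names and sample values are illustrative. Once each student is a single dictionary, sorting can only move whole students.

```python
# Illustrative parallel lists, still aligned (index i = one student).
names = ["Ben", "Aisha", "Cara"]
year_groups = [11, 10, 10]
forms = ["11A", "10B", "10C"]

# zip() pairs up the items at the same index, so each tuple is one student.
students = [
    {"name": n, "year_group": y, "form": f}
    for n, y, f in zip(names, year_groups, forms)
]

# Sorting the list of records keeps every field with its owner.
students.sort(key=lambda s: s["name"])
print(students[0])  # {'name': 'Aisha', 'year_group': 10, 'form': '10B'}
```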

3. A list of records: the standard pattern

Most exam questions ask you to store many records of the same shape, then loop through them to filter, count or summarise.

students = [
    {"name": "Aisha", "year_group": 10, "score": 78},
    {"name": "Ben",   "year_group": 11, "score": 64},
    {"name": "Cara",  "year_group": 10, "score": 91},
    {"name": "Dev",   "year_group": 10, "score": 55}
]

# How many year-10 students scored over 70?
count = 0
for s in students:
    if s["year_group"] == 10 and s["score"] > 70:
        count = count + 1
print(count)   # 2

# Sort by score, highest first - all fields move together
students.sort(key=lambda s: s["score"], reverse=True)
print(students[0]["name"])   # Cara
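
// C#: assumes the Student struct also declares a public int Score field
// (requires using System.Collections.Generic;)
List<Student> students = new List<Student> {
    new Student { Name="Aisha", YearGroup=10, Score=78 },
    new Student { Name="Ben",   YearGroup=11, Score=64 },
    new Student { Name="Cara",  YearGroup=10, Score=91 },
    new Student { Name="Dev",   YearGroup=10, Score=55 }
};

// How many year-10 students scored over 70?
int count = 0;
foreach (Student s in students)
{
    if (s.YearGroup == 10 && s.Score > 70) count++;
}
Console.WriteLine(count);  // 2

// Sort by score, highest first - all fields move together
students.Sort((a, b) => b.Score.CompareTo(a.Score));
Console.WriteLine(students[0].Name);  // Cara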
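
The counting loop covers "count"; the sketch below covers "filter" and "summarise" on the same four student records. It is one possible approach, not the only one an examiner would accept, and the names year_10, average and top are chosen here for illustration.

```python
students = [
    {"name": "Aisha", "year_group": 10, "score": 78},
    {"name": "Ben",   "year_group": 11, "score": 64},
    {"name": "Cara",  "year_group": 10, "score": 91},
    {"name": "Dev",   "year_group": 10, "score": 55},
]

# Filter: keep only the year-10 records.
year_10 = [s for s in students if s["year_group"] == 10]

# Summarise: average score across the filtered records.
average = sum(s["score"] for s in year_10) / len(year_10)
print(f"Year 10 average: {average:.1f}")  # Year 10 average: 74.7

# Find the highest scorer; the whole record comes back, not just the score.
top = max(students, key=lambda s: s["score"])
print(top["name"], top["score"])  # Cara 91
```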

4. Records inside records (nested data)

A field inside a record can itself be another record, or a list. This is how real systems describe complex things: an order has a customer (a record) and a list of items (each of which is a record).

order = {
    "order_id": 1042,
    "customer": {"name": "Aisha", "email": "a@school.uk"},
    "items": [
        {"sku": "BK-101", "qty": 2, "price": 4.50},
        {"sku": "PN-200", "qty": 5, "price": 0.80}
    ]
}

# Total price across all items in the order
total = sum(item["qty"] * item["price"] for item in order["items"])
print(f"Total: {total:.2f}")        # Total: 13.00
print(order["customer"]["name"])  # Aisha
Reading nested records

In an exam you may be asked to "trace" or "describe" an expression like order["items"][0]["price"]. Read it left to right: take the order, take its items list, take the first item, take that item's price field. Result: 4.50.
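
The same trace can be written out one step per line, which is a useful habit when a nested expression is hard to read. This sketch reuses the order record from this section; the intermediate names items, first and price are introduced here only for illustration.

```python
order = {
    "order_id": 1042,
    "customer": {"name": "Aisha", "email": "a@school.uk"},
    "items": [
        {"sku": "BK-101", "qty": 2, "price": 4.50},
        {"sku": "PN-200", "qty": 5, "price": 0.80},
    ],
}

items = order["items"]   # step 1: the list of item records
first = items[0]         # step 2: the first item record
price = first["price"]   # step 3: that item's price field
print(price)             # 4.5

# The one-line expression evaluates to exactly the same value.
print(price == order["items"][0]["price"])  # True
```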

5. Pseudo-code for records (OCR / AQA / CIE)

Exam pseudo-code declares record types formally. You will not be asked to compile this, but you must read it, identify the fields and explain why grouping them is useful.

record Student
    name : string
    yearGroup : integer
    form : string
endrecord

aisha ← new Student("Aisha Khan", 10, "10B")
print(aisha.name)

Board      Pseudo-code term        Field access
OCR J277   record ... endrecord    dot notation: aisha.name
AQA 8525   RECORD ... ENDRECORD    dot notation: aisha.name
CIE 0478   TYPE ... ENDTYPE        dot notation: aisha.name
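
For comparison, the closest Python equivalent to the pseudo-code declaration is probably a data class, which also gives dot notation for field access. This is a sketch of one possible mapping rather than something an exam board prescribes; the snake_case field name year_group is a Python naming choice made here.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Each annotated name becomes a field of the record.
    name: str
    year_group: int
    form: str


# Fields are filled in the order they were declared, like the pseudo-code.
aisha = Student("Aisha Khan", 10, "10B")
print(aisha.name)    # Aisha Khan

aisha.form = "10C"   # update one field, others unchanged
print(aisha.form)    # 10C
```
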
"A record is just a list"

A list is an ordered collection of values that are usually all the same kind of thing (a list of names, a list of scores). A record holds different kinds of fields about one thing (one student's name, year group, score). They are different data structures with different purposes. The exam will catch you out if you confuse them.

6. A six-mark question, fully marked

Question: A school stores student data. Compare storing this data using parallel lists with storing it using a list of records. [6 marks]

Mark scheme - one mark each, up to 6
  • Parallel lists store each field in a separate list, indexed by position.
  • A record bundles all fields about one student into a single value.
  • If one parallel list is sorted on its own, the data becomes corrupt because the other lists are not re-ordered.
  • With a list of records, sorting moves every field together, so the data stays consistent.
  • Records access fields by name, which is more readable than remembering that index 2 is the year group.
  • Records make adding a new field (e.g. email) easier because the change is in one place, not spread across several lists.

Beyond the basics

A junior developer says "we don't need records, we can just use a list of lists like [["Aisha", 10, 78], ["Ben", 11, 64]]". Give two concrete reasons this is worse than a list of records, and one situation where the list-of-lists choice would actually be reasonable.
Two reasons it is worse:
1. Readability. student[1] tells the reader nothing. student["year_group"] reads itself. Six months later you will not remember which index meant what.
2. Change-resistance. If the school adds a new field, every line of code that reads student[2] for the score is now broken because the score moved to index 3. With named fields, adding a new field affects nothing else.

When list-of-lists is reasonable: when the inner lists really do represent the same kind of thing in fixed positions and there are no field names at all, e.g. a 2D grid of pixels, a chess board, or rows of CSV being processed once and discarded. Use named records for entities; use lists-of-lists for matrices.
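
Both reasons can be seen in a few lines. The sketch below turns list-of-lists rows into records by pairing each position with a field name; FIELDS is a hypothetical list of names chosen for this example.

```python
FIELDS = ["name", "year_group", "score"]  # hypothetical field names
rows = [["Aisha", 10, 78], ["Ben", 11, 64]]

# Positional access: the reader has to remember that index 2 is the score.
print(rows[0][2])            # 78

# dict(zip(...)) pairs each field name with the value in the same position.
records = [dict(zip(FIELDS, row)) for row in rows]
print(records[0]["score"])   # 78

# Adding a field to a record leaves existing named lookups untouched.
records[0]["email"] = "a@school.uk"
print(records[0]["score"])   # 78
```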
Q1. Which statement best describes a record?
A record bundles related fields (name, year, form, etc.) about one entity into one value so they stay tied together.
Q2. Why is a list of records safer than four parallel lists?
All fields belong to the same record, so when one record moves every field moves with it. Parallel lists silently corrupt when one is re-ordered.
Q3. In student["form"] = "10C", what is "form"?
"form" is the field name (key); "10C" is the value being stored against it.
Q4. Given order["items"][1]["qty"], what does this expression return?
Read it left to right: order, then its items list, then index 1 (the second item), then that item's qty field. Result: 5.
Q5. State one advantage of records over list-of-lists for storing student data. [1 mark]
Named field access is the headline advantage. Adding a new field does not break existing code that reads other fields.
Programming - Lesson 11
Starter activity
Write four parallel lists on the board: names, ages, year groups, predicted grades, with five entries each. Sort the names alphabetically without telling the class. Ask: "Did anything bad happen?" Reveal that the other three lists were not sorted and use this to motivate why records exist. Most students will not see the bug until you spell it out, which is exactly the danger.
Lesson objectives
1. Define a record and list its key properties (named fields, fixed shape, heterogeneous types).
2. Build a record using a Python dictionary or a C# struct/class with at least four fields.
3. Write a list of records and loop over it to filter, count and sort.
4. Read nested-record expressions like order["items"][0]["price"] and explain what they evaluate to.
5. Compare records with parallel lists and with list-of-lists, justifying which is best for a given scenario.
6. Read OCR/AQA/CIE pseudo-code that declares a record type.
Key vocabulary
record, field, dictionary, key, value, struct, class, data class, parallel lists, nested record, heterogeneous, dot notation
Discussion questions
Why do most beginners default to parallel lists even though they are unsafe?
When would a list-of-lists actually be the correct choice over a list of records?
A record's shape is fixed at design time. What problem does that cause if the school adds a new field after the program is in use?
How does a record relate to a row in a database table? Where does the analogy break down?
Exit tickets
Define a record. [1 mark]
Give two advantages of using a list of records instead of parallel lists. [2 marks]
Write the pseudo-code (or Python/C#) for a record that stores a book's title, author and price. [3 marks]
A program needs to store 50 cars, each with a registration, owner and mileage. Justify a suitable data structure. [4 marks]
Homework suggestion
Design a record type for a library book. Choose at least five fields. Then write a Python or C# program that stores ten books in a list of records, prints all books by a given author, and reports the average price across the list. Hand in the code and a paragraph explaining why a list of records was a better choice than parallel lists for this task.