Lesson 5 of 6

Series 7 - Lesson 5

Translators: Compilers, Interpreters and Assemblers

You cannot write Python and hand it to a CPU. Something has to translate it. That something is a translator - and the type you use changes how your program is distributed, how it is debugged, and how fast it runs.

45 minutes | GCSE | Translators

Start here

When you run Python, the CPU never sees Python

A CPU executes binary machine code instructions. Full stop. It does not know what Python is. It does not know what Java is. It only knows operations like "load this value", "add these registers", "jump to this memory address".

So when you press Run in Python, what happens? A piece of software called a translator steps in. A translator reads source code and either converts all of it to machine code before the program runs (a compiler) or translates and executes it line by line (an interpreter). Either way, the CPU never actually runs Python - it only ever runs machine code.

Think: If you compile a Python program on a Windows machine, could you send the compiled output to someone with a Mac and have it run? Why or why not?

Core content

The three types of translator

Compiler

Translates an entire high-level language (HLL) program into machine code before execution begins. Produces a standalone executable. Languages: C, C++, Rust.

Interpreter

Translates and executes an HLL program one line at a time. No standalone executable is produced. Languages: Python, JavaScript, Ruby.

Assembler

Translates assembly language into machine code. Each assembly instruction converts to one machine code instruction. One-to-one mapping.
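The one-to-one mapping can be sketched with a toy assembler. The instruction set below is hypothetical (the mnemonics LDA, ADD, STA, HLT and the 4-bit opcode plus 4-bit operand layout are assumptions made for illustration, not a real CPU), but it shows the key idea: each line of assembly becomes exactly one machine code instruction.

```python
# Hypothetical 8-bit instruction format: 4-bit opcode followed by 4-bit operand.
# These opcodes are invented for illustration and do not match any real CPU.
OPCODES = {"LDA": 0b0001, "ADD": 0b0010, "STA": 0b0011, "HLT": 0b1111}


def assemble(source):
    """Translate each assembly line into one 8-bit machine code string."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0]
        # Instructions without an operand (such as HLT) get operand 0.
        operand = int(parts[1]) if len(parts) > 1 else 0
        machine_code.append(f"{OPCODES[mnemonic]:04b}{operand:04b}")
    return machine_code


program = """
LDA 5
ADD 6
STA 7
HLT
"""

for asm, binary in zip(program.strip().splitlines(), assemble(program)):
    print(f"{asm:<6} -> {binary}")
# LDA 5  -> 00010101
# ADD 6  -> 00100110
# STA 7  -> 00110111
# HLT    -> 11110000
```

Four lines of assembly go in and four machine code instructions come out, which is why an assembler is a much simpler translator than a compiler.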

Compiler

Translates the whole program first, then runs.

Advantages:

Produces fast executable - no translation overhead at runtime

Source code is hidden - distribute the .exe without revealing your code

All syntax errors found before running

Disadvantages:

Longer development cycle - must recompile after every change

Platform-specific - must recompile for each OS/architecture

All errors reported at once - can be overwhelming to debug

Interpreter

Translates and runs one line at a time.

Advantages:

Easier to debug - stops at first error, shows exactly where

Cross-platform - any machine with the interpreter can run it

No compile step - run immediately, great for development

Disadvantages:

Slower execution - translation happens at runtime every time

Source code must be distributed with the program

Interpreter must be installed on the user's machine

How a compiler works

Inside the compilation process: step by step

Compilation is not one operation - it is a pipeline of stages, and each stage does something different to your code.

Lexical Analysis (Tokenisation)

Source code is broken into tokens

The lexer reads your source code character by character and groups the characters into meaningful units called tokens: keywords (if, while, return), identifiers (variable names), operators (+, -, =), numbers, strings and punctuation. Whitespace between tokens is normally discarded. Unknown characters cause a lexical error.

x = 5 + y
Tokens: [IDENTIFIER:x] [OPERATOR:=] [NUMBER:5] [OPERATOR:+] [IDENTIFIER:y]
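A minimal sketch of a lexer for this example, written in Python with regular expressions. The token names follow the example above; the function name tokenise and the very small set of recognised symbols are assumptions chosen for illustration, and a real lexer would handle far more cases (keywords, strings, comments).

```python
import re

# Each token type is paired with a pattern that recognises it.
# Order matters: earlier patterns are tried first.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[+\-*/=]"),
    ("SKIP", r"\s+"),      # whitespace is recognised, then thrown away
    ("UNKNOWN", r"."),     # anything else is a lexical error
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenise(source):
    """Split one line of source code into (type, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "UNKNOWN":
            raise ValueError(f"lexical error: unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens


print(tokenise("x = 5 + y"))
# [('IDENTIFIER', 'x'), ('OPERATOR', '='), ('NUMBER', '5'), ('OPERATOR', '+'), ('IDENTIFIER', 'y')]
```

Calling tokenise("x = 5 $ y") raises a ValueError, because $ matches none of the known token patterns in this sketch.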

Syntax Analysis (Parsing)

Tokens are checked against grammar rules

The parser takes the token stream and checks whether it conforms to the grammar rules of the language. It builds a parse tree (usually simplified into an abstract syntax tree) representing the program's structure. If tokens are in the wrong order or something is missing - for example, a closing bracket - a syntax error is reported here.
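Python exposes its own parser through the standard ast module, so this stage can be observed directly. The sketch below parses the earlier example and then a line with a missing closing bracket; the helper name check_syntax is an assumption for illustration.

```python
import ast

# Parse a valid statement and inspect the shape of the tree.
tree = ast.parse("x = 5 + y")
assignment = tree.body[0]
print(type(assignment).__name__)            # Assign
print(type(assignment.value).__name__)      # BinOp
print(type(assignment.value.op).__name__)   # Add


def check_syntax(source):
    """Report whether the source obeys Python's grammar rules."""
    try:
        ast.parse(source)
        return "ok"
    except SyntaxError as err:
        return f"syntax error: {err.msg}"


print(check_syntax("x = 5 + y"))
print(check_syntax("x = (5 + y"))   # the closing bracket is missing
```

The tree says "an assignment whose value is an addition", which is the structure later stages work from. The exact wording of the syntax error message depends on the Python version.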

Semantic Analysis

Meaning is checked for logical consistency

Even syntactically correct code can be meaningless. The semantic analyser checks that variables are declared before use, that types are used correctly (you cannot add a number to a string in many languages), and that function calls match their definitions. These are semantic errors, distinct from syntax errors.
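One of these checks, "variables are declared before use", can be sketched as a toy analyser. This is only an illustration under simplifying assumptions: it handles straight-line code with no functions, loops or imports, and the function name undefined_names is hypothetical. Python itself does not run this check ahead of time; it raises a NameError at runtime instead.

```python
import ast
import builtins

# Names such as print and len exist without being assigned by the program.
BUILTIN_NAMES = set(dir(builtins))


def undefined_names(source):
    """Return (line, name) pairs for names read before they are assigned."""
    defined = set()
    problems = []
    for statement in ast.parse(source).body:
        # First look at every name this statement reads ...
        for node in ast.walk(statement):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                if node.id not in defined and node.id not in BUILTIN_NAMES:
                    problems.append((node.lineno, node.id))
        # ... then record the names this statement assigns.
        for node in ast.walk(statement):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                defined.add(node.id)
    return problems


program = "x = 10\nz = x + w\nprint(z)"
print(undefined_names(program))   # [(2, 'w')]
```

The program is syntactically valid, so the parser accepts it. Only a check on meaning reveals that w is used on line 2 without ever being given a value.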

Code Generation and Optimisation

Machine code is produced and improved

The compiler generates machine code instructions from the parse tree. It then optimises this code - removing redundant operations, reordering instructions for the CPU pipeline, and taking advantage of specific hardware features. A heavily optimised compiled program can run many times faster than a naive translation.
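"Removing redundant operations" can be illustrated with constant folding, one common optimisation: arithmetic on fixed numbers is worked out once at compile time instead of every time the program runs. The sketch below is a toy version that rewrites Python source rather than machine code; the class name ConstantFolder and the helper fold are assumptions for illustration, and ast.unparse needs Python 3.9 or later.

```python
import ast
import operator

# Operators this toy optimiser knows how to evaluate ahead of time.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on two constants with the computed result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the sub-expressions first
        op = FOLDABLE.get(type(node.op))
        both_constant = isinstance(node.left, ast.Constant) and isinstance(
            node.right, ast.Constant
        )
        if op and both_constant:
            left, right = node.left.value, node.right.value
            if isinstance(left, (int, float)) and isinstance(right, (int, float)):
                return ast.copy_location(ast.Constant(value=op(left, right)), node)
        return node


def fold(source):
    """Return the source with constant arithmetic pre-computed."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold("z = x + 2 * 3 * 4"))   # z = x + 24
```

The optimised program performs one addition at runtime instead of one addition and two multiplications, and produces the same result.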

Linking and Output

Final executable is assembled

The linker combines the compiled object code with any libraries the program uses (standard library functions, third-party packages) and produces the final executable file. This .exe or binary can be distributed and run on any compatible system without the original source code or compiler.

Compiler vs interpreter: see the difference

The compiler reads the entire program first, checks all of it, then runs the output in one go. The interpreter translates and executes one line at a time. The most important exam difference appears when line 3 of the program below contains a bug: the compiler reports the error before anything runs, whereas the interpreter runs lines 1 and 2 and only stops when it reaches line 3.

1 x = 10
2 y = 20
3 z = x + y
4 print(z)

Exam focus - the big comparison table

This is one of the most common exam topics. Be ready to state three differences between a compiler and an interpreter, with reasons. Key points:

(1) A compiler translates the whole program first; an interpreter translates line by line.

(2) Compiled code runs faster, as there is no translation overhead at runtime.

(3) An interpreter stops at the first error, making debugging easier.

(4) A compiler produces a standalone executable; an interpreted program requires the interpreter software to be installed.

(5) Source code is hidden after compilation but visible with an interpreter.
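Points (1) and (3) can be simulated in a few lines of Python. This is a sketch of the exam model, not of how Python really runs a file: the function names compiler_style and interpreter_style are hypothetical, the line-by-line version only works for single-line statements, and real Python checks a whole file for syntax errors before running any of it.

```python
# A four-line program with a deliberate syntax error on line 3.
PROGRAM = """print("start")
total = 10 + 20
print(total +)
print("end")
"""


def compiler_style(source):
    """Translate the whole program first; run it only if that succeeds."""
    try:
        code = compile(source, "<demo>", "exec")
    except SyntaxError as err:
        return f"error on line {err.lineno}; nothing was run"
    exec(code, {})
    return "ran to completion"


def interpreter_style(source):
    """Translate and run one line at a time, stopping at the first error."""
    variables = {}
    for number, line in enumerate(source.splitlines(), start=1):
        try:
            exec(line, variables)
        except Exception:
            return f"stopped at line {number}; earlier lines already ran"
    return "ran to completion"


print(compiler_style(PROGRAM))
print(interpreter_style(PROGRAM))
```

The compiler-style run prints nothing from the program itself, because the error on line 3 is found before execution begins. The interpreter-style run prints start and then stops at line 3, so the programmer sees exactly how far the program got.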

Think deeper

Python is described as an interpreted language, but modern Python uses a hybrid approach: it first compiles your code to bytecode (.pyc files), then the Python interpreter runs that bytecode. Why might this be faster than pure interpretation? And why is it still not as fast as compiled languages like C?

Bytecode is a compact intermediate representation that the interpreter can process much faster than raw source text. Pre-compiling to bytecode means the lexing and parsing steps (which are slow text operations) happen once rather than every time a line is reached, and the cached .pyc file lets Python skip them on later runs of an unchanged imported module. However, bytecode is not machine code - the Python virtual machine still has to decode each bytecode instruction and carry it out at runtime. Languages like C compile directly to native machine code for a specific CPU, so at runtime there is no translation overhead.
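The bytecode for a single line can be inspected with Python's standard dis module. This is a small sketch for exploration; the instruction names printed depend on the Python version, so none are listed here.

```python
import dis

source = "z = x + y"

# compile() runs the lexing, parsing and bytecode generation steps once.
code = compile(source, "<demo>", "exec")

# The raw bytecode is just a sequence of bytes, not CPU machine code.
print(code.co_code.hex())

# dis decodes those bytes into readable virtual machine instructions.
for instruction in dis.get_instructions(code):
    print(instruction.opname, instruction.argrepr)
```

Each printed instruction is a step that the Python virtual machine must decode and carry out at runtime, which is the overhead a natively compiled C program avoids.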


Lesson 5 - Software Series


Starter activity

Ask two students to translate a sentence from English to French. One translates each word as they hear it (interpreter). The other listens to the whole sentence first, then gives a fluent translation (compiler). Which can start sooner? Which is more accurate? Which needs the whole input before it can begin?

Lesson objectives

Explain what a compiler does and describe what a compiled executable contains.

Explain what an interpreter does and how it differs from a compiler at runtime.

State at least four differences between a compiler and an interpreter.

Explain what an assembler does and state what it translates from and to.

Key vocabulary

Compiler

Translates entire source code into machine code before execution. Produces a standalone executable file.

Interpreter

Translates and executes source code line by line at runtime. No standalone executable produced.

Assembler

Translates assembly language mnemonics into machine code. One mnemonic becomes one machine instruction.

Token

The smallest meaningful unit a translator identifies: keywords, identifiers, operators, literals, and punctuation.

Bytecode

Intermediate compiled code for a virtual machine, e.g. Python .pyc files or Java .class files. Not native machine code.

Discussion questions

Python first compiles to bytecode (.pyc) then interprets it. Does this make Python compiled or interpreted? What does this tell us about the distinction?

Why would a game company prefer to compile their game to an executable rather than distribute source code with an interpreter?

When debugging a large program, would you prefer to use a compiled or interpreted language? Justify your answer.

Exit tickets

State three differences between a compiler and an interpreter. [3 marks]

Explain why a compiler is preferred over an interpreter for a finished commercial software product. [3 marks]

Describe what an assembler does. State one type of software where assembly language would be appropriate. [3 marks]

Homework suggestion

Research Python bytecode. Find out what a .pyc file contains and how the Python virtual machine uses it. Write a paragraph explaining why Python is a hybrid approach, and whether this makes it faster or slower than a fully compiled language.
