Lesson 5 of 6
Series 7 - Lesson 5

Translators: Compilers, Interpreters and Assemblers

You cannot write Python and hand it to a CPU. Something has to translate it. That something is a translator - and the type you use changes how your program is distributed, how it is debugged, and how fast it runs.

45 minutes GCSE Translators
When you run Python, the CPU never sees Python

A CPU executes binary machine code instructions. Full stop. It does not know what Python is. It does not know what Java is. It only knows operations like "load this value", "add these registers", "jump to this memory address".

So when you press Run in Python, what happens? A piece of software called a translator steps in. It reads your source code and either converts all of it to machine code before the program runs (a compiler) or translates and executes it one line at a time (an interpreter). Either way, the CPU never runs Python itself - it only ever runs machine code.

Think: If you compile a Python program on a Windows machine, could you send the compiled output to someone with a Mac and have it run? Why or why not?
The three types of translator
Compiler
Translates an entire high-level language (HLL) program into machine code in one go, before execution. Produces a standalone executable. Languages: C, C++, Rust.
Interpreter
Translates and executes an HLL program one line at a time. No standalone executable is produced. Languages: Python, JavaScript, Ruby.
Assembler
Translates assembly language into machine code. Each assembly instruction converts to one machine code instruction. One-to-one mapping.
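
The one-to-one mapping can be sketched with a toy assembler. The instruction set below (LDA, ADD, STA, HLT and their opcode numbers) is invented for illustration and does not belong to any real CPU. The sketch also assumes every instruction is one opcode byte followed by one operand byte.

```python
# Toy assembler for a made-up 8-bit CPU (all opcodes are hypothetical).
OPCODES = {
    "LDA": 0x01,  # load the value at a memory address into the accumulator
    "ADD": 0x02,  # add the value at a memory address to the accumulator
    "STA": 0x03,  # store the accumulator at a memory address
    "HLT": 0xFF,  # stop the program
}


def assemble(source):
    """Translate assembly text into a list of (opcode, operand) pairs."""
    machine_code = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic = parts[0].upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"Unknown mnemonic: {mnemonic}")
        operand = int(parts[1]) if len(parts) > 1 else 0
        machine_code.append((OPCODES[mnemonic], operand))
    return machine_code


program = """
LDA 10
ADD 11
STA 12
HLT
"""

for opcode, operand in assemble(program):
    # Show each instruction as two 8-bit binary numbers.
    print(f"{opcode:08b} {operand:08b}")
```

Four assembly instructions go in and four machine instructions come out, which is the one-to-one mapping that separates an assembler from a compiler.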
Compiler

Translates whole program first, then runs.

Advantages
Produces fast executable - no translation overhead at runtime
Source code is hidden - distribute the .exe without revealing your code
All syntax errors found before running
Disadvantages
Longer development cycle - must recompile after every change
Platform-specific - must recompile for each OS/architecture
All errors reported at once - can be overwhelming to debug
Interpreter

Translates and runs one line at a time.

Advantages
Easier to debug - stops at first error, shows exactly where
Cross-platform - any machine with the interpreter can run it
No compile step - run immediately, great for development
Disadvantages
Slower execution - translation happens at runtime every time
Source code must be distributed with the program
Interpreter must be installed on the user's machine
Inside the compilation process: step by step

Compilation is not one operation - it is a pipeline of stages. Each stage does something different to your code.

1. Lexical Analysis (Tokenisation)
Source code is broken into tokens
The lexer reads your source code character by character and groups the characters into meaningful units called tokens: keywords (if, while, return), identifiers (variable names), operators (+, -, =), numbers, strings and punctuation. Whitespace is usually discarded (Python is an exception, because its indentation is significant). Unknown characters cause a lexical error.
x = 5 + y
Tokens: [IDENTIFIER:x] [OPERATOR:=] [NUMBER:5] [OPERATOR:+] [IDENTIFIER:y]
2. Syntax Analysis (Parsing)
Tokens are checked against grammar rules
The parser takes the token stream and checks whether it conforms to the grammar rules of the language. It builds a parse tree, often simplified into an abstract syntax tree (AST), representing the program's structure. If tokens are in the wrong order or incomplete - for example, a missing closing bracket - a syntax error is reported here.
3. Semantic Analysis
Meaning is checked for logical consistency
Even syntactically correct code can be meaningless. The semantic analyser checks that variables are declared before use, that types are used correctly (you cannot add a number to a string in many languages), and that function calls match their definitions. These are semantic errors, distinct from syntax errors.
4. Code Generation and Optimisation
Machine code is produced and improved
The compiler generates machine code instructions from the parse tree. It then optimises this code - removing redundant operations, reordering instructions for the CPU pipeline, and taking advantage of specific hardware features. A heavily optimised compiled program can run many times faster than a naive translation.
5. Linking and Output
Final executable is assembled
The linker combines the compiled object code with any libraries the program uses (standard library functions, third-party packages) and produces the final executable file. This .exe or binary can be distributed and run on any compatible system without the original source code or compiler.
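
Stages 2 and 3 can be sketched with Python's standard ast module, which exposes the parser that Python itself uses. The helper names check_syntax and find_undefined_names are invented for this illustration. The undefined-name check is a deliberately simplified stand-in for semantic analysis: it only handles straight-line code with simple assignments, and Python itself does not perform this check before running (it raises a NameError at runtime instead).

```python
import ast
import builtins


def check_syntax(source):
    """Stage 2: return an AST, or None if the parser reports a syntax error."""
    try:
        return ast.parse(source)
    except SyntaxError as error:
        print(f"Syntax error on line {error.lineno}: {error.msg}")
        return None


def find_undefined_names(tree):
    """Simplified stage 3: list names that are read before any assignment."""
    defined = set(dir(builtins))  # print, len, ... are always available
    problems = []
    for statement in tree.body:
        # First check every name that this statement reads...
        for node in ast.walk(statement):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                if node.id not in defined:
                    problems.append((node.lineno, node.id))
        # ...then record every name that it assigns.
        for node in ast.walk(statement):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                defined.add(node.id)
    return problems


# A missing closing bracket is caught by the parser (a syntax error).
check_syntax("z = (x + y")

# This program parses correctly, but y is read on line 2 without being assigned.
tree = check_syntax("x = 10\nz = x + y\nprint(z)")
print(find_undefined_names(tree))  # [(2, 'y')]
```

The first call fails at the syntax stage, so no tree is built. The second program is grammatically valid, so only the later check can spot that y has no value.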
Tokeniser demo: stage 1 of compilation

The lexer breaks a line of code into tokens as it reads. Each token belongs to one type: keyword, identifier, number, operator, string or punctuation.
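
A minimal tokeniser can be sketched with regular expressions. The token patterns and the short keyword list below are assumptions chosen for this illustration; a real lexer for any language handles many more cases.

```python
import re

# A small, illustrative keyword list (not the full list for any real language).
KEYWORDS = {"if", "else", "while", "for", "def", "return"}

# Each pair is (token type, regular expression). Order matters:
# earlier patterns are tried first.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r'"[^"]*"'),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"==|!=|<=|>=|[+\-*/=<>]"),
    ("PUNCTUATION", r"[()\[\]{},:]"),
    ("WHITESPACE", r"\s+"),
    ("UNKNOWN", r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenise(line):
    """Split one line of code into (type, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "WHITESPACE":
            continue  # whitespace is discarded
        if kind == "UNKNOWN":
            raise SyntaxError(f"Lexical error: unexpected character {text!r}")
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"  # reserved words look like identifiers at first
        tokens.append((kind, text))
    return tokens


for kind, text in tokenise("x = 5 + y"):
    print(f"[{kind}:{text}]")
```

Running it on x = 5 + y produces the same five tokens shown in stage 1, and a character the patterns do not recognise (such as $) raises a lexical error.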
Compiler vs interpreter: watch the difference

The compiler reads the entire program first, checks all of it, then runs the output in one go. The interpreter translates and executes one line at a time. Both are given the same four-line program:

x = 10
y = 20
z = x + y
print(z)

With no errors, both translators output 30. The most important exam difference appears when line 3 contains a syntax error: the compiler reports the error before anything runs, so no line is executed, whereas the interpreter executes lines 1 and 2 and then stops at line 3.
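
The two behaviours can be imitated in a short sketch. The functions run_compiler_style and run_interpreter_style are invented for this illustration and model the textbook definitions only. Real Python checks the syntax of a whole file before running any of it, and the line-by-line approach here assumes every statement fits on one line.

```python
def check_line(line):
    """Return an error message if the line has a syntax error, else None."""
    try:
        compile(line, "<program>", "exec")
        return None
    except SyntaxError as error:
        return error.msg


def run_compiler_style(lines):
    """Check the whole program first; run it only if no errors were found."""
    errors = []
    for number, line in enumerate(lines, start=1):
        if check_line(line):
            errors.append(f"line {number}: syntax error")
    if errors:
        return errors  # every error is reported and nothing is executed
    exec("\n".join(lines), {})
    return ["program ran"]


def run_interpreter_style(lines):
    """Check and execute one line at a time, stopping at the first error."""
    log = []
    variables = {}
    for number, line in enumerate(lines, start=1):
        if check_line(line):
            log.append(f"line {number}: syntax error, stopping")
            break
        exec(line, variables)
        log.append(f"line {number}: executed")
    return log


buggy = ["x = 10", "y = 20", "z = x +", "print(z)"]  # bug on line 3
print(run_compiler_style(buggy))
print(run_interpreter_style(buggy))
```

With the bug on line 3, the compiler-style run reports the error without executing anything, while the interpreter-style run executes lines 1 and 2 before stopping.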
Exam focus - the big comparison table

This is one of the most common exam topics. Be ready to state three differences between a compiler and an interpreter, with reasons. Key points:
(1) A compiler translates the whole program before it runs; an interpreter translates line by line.
(2) Compiled code runs faster because there is no translation overhead at runtime.
(3) An interpreter stops at the first error, which makes debugging easier.
(4) A compiler produces a standalone executable; an interpreted program needs the interpreter software to be installed.
(5) Source code is hidden after compilation but is visible when a program is distributed for an interpreter.

Think deeper

Python is described as an interpreted language, but modern Python uses a hybrid approach: it first compiles your code to bytecode (.pyc files), then the Python interpreter runs that bytecode. Why might this be faster than pure interpretation? And why is it still not as fast as compiled languages like C?

Bytecode is a compact intermediate representation that the Python virtual machine can process much faster than raw source text. Pre-compiling to bytecode means the lexing and parsing steps (which are slow text operations) only happen once per file. However, bytecode is not machine code - the Python virtual machine still has to decode each bytecode instruction and carry it out at runtime. Languages like C compile directly to native machine code for a specific CPU, so at runtime there is no translation overhead.
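
The hybrid approach can be observed with Python's built-in compile() function and the standard dis module. This is a small sketch: the exact bytecode instructions printed vary between Python versions, so no expected listing is shown.

```python
import dis

source = "z = x + y"

# compile() runs the lexer, parser and bytecode generator once...
code_object = compile(source, "<example>", "exec")

# ...and the resulting code object can be executed many times
# without reading the source text again.
for x, y in [(10, 20), (1, 2)]:
    variables = {"x": x, "y": y}
    exec(code_object, variables)
    print(variables["z"])

# Show the bytecode instructions the Python virtual machine steps through.
dis.dis(code_object)
```

The loop prints 30 and then 3. The listing from dis.dis shows that even a one-line addition becomes several bytecode instructions, each of which the virtual machine must decode at runtime.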
Starter activity
Ask two students to translate a sentence from English to French. One translates each word as they hear it (interpreter). The other listens to the whole sentence first, then gives a fluent translation (compiler). Which can start sooner? Which is more accurate? Which needs the whole input before it can begin?
Lesson objectives
1. Explain what a compiler does and describe what a compiled executable contains.
2. Explain what an interpreter does and how it differs from a compiler at runtime.
3. State at least four differences between a compiler and an interpreter.
4. Explain what an assembler does and state what it translates from and to.
Key vocabulary
Compiler
Translates entire source code into machine code before execution. Produces a standalone executable file.
Interpreter
Translates and executes source code line by line at runtime. No standalone executable produced.
Assembler
Translates assembly language mnemonics into machine code. One mnemonic becomes one machine instruction.
Token
The smallest meaningful unit a translator identifies: keywords, identifiers, operators, literals, and punctuation.
Bytecode
Intermediate compiled code for a virtual machine, e.g. Python .pyc files or Java .class files. Not native machine code.
Discussion questions
Python first compiles to bytecode (.pyc) then interprets it. Does this make Python compiled or interpreted? What does this tell us about the distinction?
Why would a game company prefer to compile their game to an executable rather than distribute source code with an interpreter?
When debugging a large program, would you prefer to use a compiled or interpreted language? Justify your answer.
Exit tickets
State three differences between a compiler and an interpreter. [3 marks]
Explain why a compiler is preferred over an interpreter for a finished commercial software product. [3 marks]
Describe what an assembler does. State one type of software where assembly language would be appropriate. [3 marks]
Homework suggestion
Research Python bytecode. Find out what a .pyc file contains and how the Python virtual machine uses it. Write a paragraph explaining why Python is described as using a hybrid approach, and whether this makes it faster or slower than a fully compiled language.