Lesson 7 of 8
Algorithms: Lesson 7

Insertion Sort and Merge Sort

Two more powerful sorting algorithms. Insertion sort builds a sorted section one element at a time. Merge sort uses a divide-and-conquer strategy to sort in O(n log n) - dramatically better than bubble sort for large lists.

55 minutes Insertion sort step-through visualiser
Pseudocode style: Saved automatically

Bubble sort works, but it is slow for large lists. Its O(n2) worst case means that doubling the list size quadruples the time. For a list of 1,000,000 items, bubble sort would need up to a trillion operations. Insertion sort is still O(n2) in the worst case, but it is significantly more efficient in practice for nearly-sorted data. Merge sort solves the problem entirely with O(n log n) - fast enough to sort a million items in about 20 million operations instead of a trillion.

Think about it: How do you sort a hand of playing cards? You probably pick up each card and insert it into its correct position among the ones you are already holding. You have just described insertion sort. Now think about a different approach: split the deck in half, sort each half separately, then merge the two sorted halves together. That is merge sort.
Terms you need to know
Insertion sort
Builds a sorted sub-list one element at a time by inserting each new element into its correct position among the already-sorted elements.
Sorted sub-list
The left portion of the list that is already in sorted order. After inserting element i, the first i+1 elements are sorted.
Key element
The element currently being inserted into the sorted sub-list. It is compared to elements to its left until its correct position is found.
Merge sort
A divide-and-conquer sorting algorithm that recursively splits the list in half, sorts each half, then merges the two sorted halves.
Divide and conquer
A problem-solving strategy that splits a problem into smaller sub-problems, solves each, then combines the results.
Merge
Combining two sorted lists into one sorted list. The key operation in merge sort: compare the smallest elements of each list and take the smaller one.
Building order one element at a time

Insertion sort maintains a sorted sub-list at the left of the array. At the start, the sorted sub-list contains just the first element (a single element is trivially sorted). The algorithm then takes each subsequent element - the key - and inserts it into the correct position in the sorted sub-list by shifting larger elements one position to the right.

This is exactly how most people sort a hand of cards: the sorted sub-list is the cards you are already holding in order, and the key is the next card you pick up.

Step-by-step trace

Trace insertion sort on [5, 2, 8, 1, 9]:

StepKeySorted sub-listActionList after
Start-[5]First element is trivially sorted[5, 2, 8, 1, 9]
i=12[5]2 < 5 - shift 5 right, insert 2[2, 5, 8, 1, 9]
i=28[2,5]8 > 5 - insert 8 at end[2, 5, 8, 1, 9]
i=31[2,5,8]1 < 8, 1 < 5, 1 < 2 - shift all three, insert 1[1, 2, 5, 8, 9]
i=49[1,2,5,8]9 > 8 - insert at end[1, 2, 5, 8, 9]

Bold = confirmed sorted portion. The list is fully sorted after 4 insertions (n-1 insertions for n elements).

Pseudocode
// Insertion sort - ascending order (AQA)
SUBROUTINE insertionSort(myList)
    n  LEN(myList)
    FOR i  1 TO n - 1
        key  myList[i]    // Element to insert
        j  i - 1
        WHILE j >= 0 AND myList[j] > key
            myList[j+1]  myList[j]    // Shift right
            j  j - 1
        ENDWHILE
        myList[j+1]  key    // Insert key in correct position
    ENDFOR
    RETURN myList
ENDSUBROUTINE
# Insertion sort - ascending order (OCR)
procedure insertionSort(myList)
    n = len(myList)
    for i = 1 to n - 1
        key = myList[i]
        j = i - 1
        while j >= 0 and myList[j] > key
            myList[j+1] = myList[j]
            j = j - 1
        end while
        myList[j+1] = key
    next i
    return myList
endprocedure
Exam angle

Key exam fact: insertion sort has O(n2) worst case (reverse sorted input) but O(n) best case (already sorted input - the inner WHILE loop never executes). This makes it significantly better than bubble sort on nearly-sorted data. Exam questions often ask you to show the list after inserting each key element - always show the shifting of elements to the right, not just the final position.

Step through insertion sort

Green = sorted sub-list. Orange = key element being inserted. Purple = element being compared to the key.

Press Start to begin insertion sort on [5, 2, 8, 1, 9, 3].
Divide, sort, merge

Merge sort is a divide-and-conquer algorithm. It works in two phases:

Phase 1 - Divide: Repeatedly split the list in half until you have sub-lists of length 1. A list of one element is trivially sorted.

Phase 2 - Merge: Repeatedly merge pairs of sorted sub-lists into larger sorted lists. When merging two sorted lists, always take the smallest remaining element from the front of either list.

Divide-and-conquer diagram

Sorting [38, 27, 43, 3, 9, 82, 10]:

Step 1: Divide
[38, 27, 43, 3, 9, 82, 10]
Split ↓
[38, 27, 43]
[3, 9, 82, 10]
Split ↓
[38]
[27, 43]
[3, 9]
[82, 10]
Split ↓
[38]
[27]
[43]
[3]
[9]
[82]
[10]
Step 2: Merge (recombine in sorted order)
[27, 38]
[43]
[3, 9]
[10, 82]
Merge ↓
[27, 38, 43]
[3, 9, 10, 82]
Merge ↓
[3, 9, 10, 27, 38, 43, 82]
The merge operation

The key to merge sort is the merge step: given two sorted lists, produce one sorted list. Compare the first element of each list; take whichever is smaller and add it to the result. Repeat until one list is empty, then append the remainder of the other list.

Merging [3, 9] and [10, 82] step by step

Compare 3 and 10: 3 is smaller, take 3. Result: [3]
Compare 9 and 10: 9 is smaller, take 9. Result: [3, 9]
First list is empty. Append remainder of second list: [10, 82]
Final merged result: [3, 9, 10, 82]

Both input lists were sorted, so the comparison always correctly identifies the next-smallest element.

// Merge two sorted sub-lists into one sorted list (AQA)
FUNCTION merge(left, right)
    result  []
    WHILE LEN(left) > 0 AND LEN(right) > 0
        IF left[0] <= right[0] THEN
            result.append(left[0])
            left  left[1:]    // Remove first element
        ELSE
            result.append(right[0])
            right  right[1:]
        ENDIF
    ENDWHILE
    result  result + left + right    // Append any remaining
    RETURN result
ENDFUNCTION
# Merge two sorted lists (OCR)
function merge(left, right)
    result = []
    while len(left) > 0 and len(right) > 0
        if left[0] <= right[0] then
            result.append(left[0])
            left = left[1:]
        else
            result.append(right[0])
            right = right[1:]
        end if
    end while
    result = result + left + right
    return result
endfunction
Exam angle

Merge sort exam questions often ask you to: (1) Show the divide stage for a given list. (2) Trace the merge of two given sorted lists. (3) State why merge sort is more efficient than bubble sort. For (3), the answer is complexity: merge sort is O(n log n) in all cases, bubble sort is O(n2) worst case. For a list of 100 items, that is ~665 vs ~5000 operations.

Comparing all four sorting algorithms
AlgorithmBest caseWorst caseAvg caseRequires sorted?Good for
Bubble sortO(n) *O(n2)O(n2)NoEducation, small lists, nearly-sorted data
Insertion sortO(n)O(n2)O(n2)NoNearly-sorted data, small lists, online sorting
Merge sortO(n log n)O(n log n)O(n log n)NoLarge lists, stable sort required, predictable performance
Binary search *O(1)O(log n)O(log n)YesSearching sorted lists efficiently

* Bubble sort best case is O(n) only with early termination. * Binary search is a searching algorithm, not a sorting algorithm - included for reference.

Insertion sort vs bubble sort - which is better?

Both are O(n2) in the worst case, but insertion sort is generally preferred over bubble sort for the following reasons:

1. Insertion sort has O(n) best case (already sorted list). Bubble sort also has O(n) best case with early termination, but without it, runs O(n2).
2. Insertion sort makes approximately half as many comparisons as bubble sort on random data in practice.
3. Insertion sort is an "online" algorithm - it can sort data as it arrives, one element at a time. Bubble sort requires all data to be present first.

Test yourself

1. After inserting the 3rd element during insertion sort on [7, 3, 5, 1, 8], what does the sorted sub-list contain?

Start: sorted = [7]. Insert 3: 3 < 7, shift, insert. Sorted = [3,7]. Insert 5: 5 < 7 but 5 > 3, insert between. Sorted = [3,5,7]. After inserting the 3rd element (5, at index i=2), the sorted sub-list is [3,5,7].

2. In the divide phase of merge sort on [4, 7, 2, 1], what are the final sub-lists before merging begins?

Merge sort divides until all sub-lists have length 1. [4,7,2,1] splits to [4,7] and [2,1], then each splits again to [4],[7],[2],[1]. Single-element lists are trivially sorted and the merge phase begins.

3. What is the worst-case complexity of merge sort?

Merge sort is O(n log n) in all cases (best, average and worst). This makes it dramatically more efficient than bubble or insertion sort (both O(n2) worst case) for large lists.

4. When merging two sorted lists [2, 5, 9] and [3, 6, 8], what are the first two elements added to the result?

Compare first elements: 2 (from left) vs 3 (from right). 2 is smaller, take 2. Result: [2]. Now compare 5 vs 3. 3 is smaller, take 3. Result: [2,3]. Continue comparing front elements of each list.

5. Which sorting algorithm is most efficient for sorting a list that is already nearly in order?

Insertion sort is particularly efficient for nearly-sorted data. When an element is already close to its correct position, the inner WHILE loop makes very few iterations. In the best case (already sorted), insertion sort runs in O(n) - just one pass to confirm everything is in place. Merge sort always does O(n log n) work even on sorted data.
Challenge question

Merge sort is described as a "divide and conquer" algorithm. Explain what this means in the context of merge sort, and identify one advantage and one disadvantage of merge sort compared to insertion sort.

Divide and conquer: The problem (sorting a list) is divided into smaller sub-problems (sorting each half), each sub-problem is solved independently (recursively, until single elements), and the solutions are combined (merged) to produce the final result. The key insight is that merging two sorted lists is easy and efficient - the hard work of sorting is done by reducing to the trivial base case.

Advantage of merge sort over insertion sort: O(n log n) worst case vs O(n2). For large lists, merge sort is dramatically faster. Sorting 1,000,000 items: merge sort needs ~20,000,000 operations; insertion sort needs up to 1,000,000,000,000 (a trillion) in the worst case.

Disadvantage of merge sort: Merge sort requires additional memory to store the merged sub-lists (O(n) extra space). Insertion sort is an in-place algorithm - it only needs the space of the original list plus one key variable. For very memory-constrained environments, insertion sort may be preferred.
Printable Worksheets

Practice what you have learned

Three levelled worksheets. Download, print and complete offline.

Recall
Insertion and Merge Sort Basics
Define key terms, describe each algorithm in plain English, and draw the divide stage for a given list.
Download
Apply
Tracing Both Algorithms
Show the list after each insertion step. Complete the full divide-and-merge diagram for a given list.
Download
Exam Style
Exam-Style Questions
Compare algorithms, trace a merge, write insertion sort pseudocode and justify algorithm choice for a given scenario.
Download
Lesson 7 - Algorithms
Insertion Sort and Merge Sort
Starter activity
Give each student a hand of 5 shuffled numbered cards. Ask them to sort the cards without looking at more than one unsorted card at a time - they can only look at the next card and find where it fits among the ones they are already holding. Debrief: what they just did is insertion sort. Then ask: what if you could split the deck in two, give half to a friend, sort each half separately, then merge them together? Which approach was faster?
Lesson objectives
1
Describe insertion sort and explain the roles of the key element and sorted sub-list.
2
Trace insertion sort step by step showing the list state after each insertion.
3
Explain the divide-and-conquer strategy of merge sort using a diagram.
4
Trace the merge of two sorted lists into one sorted list.
5
Compare bubble sort, insertion sort and merge sort in terms of efficiency and use cases.
Key vocabulary
Insertion sort
Builds sorted sub-list one element at a time. O(n) best case (sorted), O(n2) worst case.
Key element
The element being inserted. Compared to elements to its left; larger ones shift right.
Merge sort
Divide-and-conquer. Split to single elements, then merge pairs. O(n log n) always.
Merge
Combine two sorted lists into one by always taking the smaller front element.
Discussion questions
Why is insertion sort called an "online" algorithm? What does this mean in practice?
Merge sort needs extra memory to store the merged lists. Is this always a problem?
If both insertion sort and bubble sort are O(n2), why might a real-world developer choose insertion sort?
Exit tickets
Show the list [6,1,4,2] after each insertion step of insertion sort. [3 marks]
State the worst-case complexity of merge sort and explain why it is more efficient than bubble sort. [2 marks]
Trace the merge of [1,4,7] and [2,5,6]. [2 marks]
Homework suggestion
Complete the full merge sort diagram for [9,3,7,1,5,8,2,6] - show every split step and every merge step. Then count how many comparisons were made in total during all merge operations. Compare this to how many comparisons bubble sort would make on the same list in the worst case.