Binary Search
The divide-and-conquer search algorithm that eliminates half the remaining candidates at every step. Far faster than linear search on large sorted lists - but it comes with a strict requirement.
Imagine you are searching for a word in a printed dictionary. You do not start at page 1 and read every word - you open roughly in the middle. If your word would come before that page, you discard the second half entirely. If it would come after, you discard the first half. You repeat this, each time halving the remaining pages, until you find the word. You have just performed binary search. This is why it can find any word in a 1,000-page dictionary in just 10 steps.
Binary search maintains three pointers: low (leftmost boundary), high (rightmost boundary) and mid (middle index, calculated as (low + high) DIV 2). At each step:
| Step | Action | Result |
|---|---|---|
| 1 | Calculate mid = (low + high) DIV 2 | Identifies the middle element |
| 2a | If myList[mid] = target | Found! Return mid and stop. |
| 2b | If target < myList[mid] | Target is in the left half. Set high = mid - 1. |
| 2c | If target > myList[mid] | Target is in the right half. Set low = mid + 1. |
| 3 | If low > high | Search space is empty. Target not found. Return -1. |
| 4 (loop) | Return to step 1 | Repeat with the new, smaller search range. |
Binary search relies entirely on the comparison in steps 2b and 2c. When we see that target < myList[mid], we conclude the target must be in the left half - but only if the list is sorted in ascending order. If the list is unsorted, elements to the left of mid could be larger than target, and we would be discarding half the list incorrectly. The entire algorithm breaks down on an unsorted list.
// Binary search - returns index of target or -1 (AQA) SUBROUTINE binarySearch(myList, target) low ← 0 high ← LEN(myList) - 1 found ← FALSE index ← -1 WHILE low <= high AND found = FALSE mid ← (low + high) DIV 2 IF myList[mid] = target THEN found ← TRUE index ← mid ELSEIF target < myList[mid] THEN high ← mid - 1 // Discard right half ELSE low ← mid + 1 // Discard left half ENDIF ENDWHILE RETURN index ENDSUBROUTINE
# Binary search - returns index of target or -1 (OCR) function binarySearch(myList, target) low = 0 high = len(myList) - 1 found = false index = -1 while low <= high and found == false mid = (low + high) DIV 2 if myList[mid] == target then found = true index = mid elseif target < myList[mid] then high = mid - 1 else low = mid + 1 end if end while return index endfunction
The most common exam error is setting high = mid instead of high = mid - 1 (and low = mid instead of low = mid + 1). Using mid instead of mid ± 1 can cause an infinite loop because the search space never shrinks when low = mid = high. Always use mid - 1 and mid + 1. Also know: DIV means integer division (discard the remainder). (7 + 4) DIV 2 = 5, not 5.5.
Trace binary search for target = 23 in myList = [2, 5, 8, 12, 16, 23, 38, 47]:
| Iteration | low | high | mid | myList[mid] | Comparison | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 7 | (0+7) DIV 2 = 3 | 12 | 23 > 12 | low = 3 + 1 = 4 |
| 2 | 4 | 7 | (4+7) DIV 2 = 5 | 23 | 23 = 23 | Found! Return index 5. |
Only 2 comparisons to find 23 in an 8-element list. Linear search would have needed 6 comparisons to reach index 5.
Now trace for target = 10 (not in list): myList = [2, 5, 8, 12, 16, 23, 38, 47]
| Iteration | low | high | mid | myList[mid] | Comparison | Action |
|---|---|---|---|---|---|---|
| 1 | 0 | 7 | 3 | 12 | 10 < 12 | high = 3 - 1 = 2 |
| 2 | 0 | 2 | 1 | 5 | 10 > 5 | low = 1 + 1 = 2 |
| 3 | 2 | 2 | 2 | 8 | 10 > 8 | low = 2 + 1 = 3 |
| Exit | 3 | 2 | - | - | low > high (3 > 2) | Return -1 (not found) |
| Property | Linear search | Binary search |
|---|---|---|
| Requirement | None - works on any list | List must be sorted in order |
| Best case | 1 comparison (target is first) | 1 comparison (target is the first mid) |
| Worst case | n comparisons | log2(n) comparisons |
| Average case | n/2 comparisons | log2(n) comparisons |
| n = 100 | Up to 100 | Up to 7 |
| n = 1,000,000 | Up to 1,000,000 | Up to 20 |
| Complexity | O(n) - linear | O(log n) - logarithmic |
The most-tested comparison question: "State one advantage of binary search over linear search and one advantage of linear search over binary search."
Binary search advantage: Much fewer comparisons needed for large lists (O(log n) vs O(n)).
Linear search advantage: Does not require the list to be sorted.
The list is already sorted. Enter a target and press Start to watch binary search step through. Orange = mid (being compared). Blue = low marker. Green = high marker. Faded = eliminated from search range.
Test yourself
1. Binary search is performed on [1, 4, 7, 10, 15, 19, 25, 30]. What is the first mid index calculated? (low=0, high=7)
2. A binary search is looking for target = 4 in [1, 4, 7, 10, 15, 19, 25, 30]. After comparing to myList[3]=10, what happens?
3. What is the maximum number of comparisons binary search needs to find an item in a sorted list of 128 items?
4. Why does binary search require the list to be sorted?
5. When binary search exits because low > high, what does this mean?
A developer has a list of 10,000 customer records sorted by customer ID. New customers are added daily in random order (so the list must be resorted each time before binary search can be used). At what point would it be more efficient to stop resorting for binary search and just use linear search instead? What factors would affect this decision?
Binary search: Requires sorting first (sorting 10,000 items using an efficient sort is approximately 10,000 x log(10,000) = ~130,000 operations), then each search is ~14 comparisons.
Linear search: No sorting needed. Each search is up to 10,000 comparisons.
Key insight: If you only search the list once after each sort, binary search may not be worth it (130,000 operations to sort + 14 to search vs 10,000 for linear). But if you search many times between updates, binary search wins massively. The break-even point depends on: (1) how often the list is searched between updates, (2) the cost of the sorting algorithm used, and (3) whether the list can be maintained in sorted order as items are inserted (insertion into a sorted list can be done efficiently).
Practice what you have learned
Three levelled worksheets. Download, print and complete offline.